


Jonathan Le Roux
2020 – today
- 2023
- [j15]Zhong-Qiu Wang, Gordon Wichern, Shinji Watanabe, Jonathan Le Roux:
STFT-Domain Neural Speech Enhancement With Very Low Algorithmic Latency. IEEE ACM Trans. Audio Speech Lang. Process. 31: 397-410 (2023) - [j14]Darius Petermann, Gordon Wichern, Aswin Shanmugam Subramanian, Zhong-Qiu Wang, Jonathan Le Roux:
Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks. IEEE ACM Trans. Audio Speech Lang. Process. 31: 2592-2605 (2023) - [c107]Ke Chen, Gordon Wichern, François G. Germain, Jonathan Le Roux:
Paᗧ-HuBERT: Self-Supervised Music Source Separation Via Primitive Auditory Clustering And Hidden-Unit Bert. ICASSP Workshops 2023: 1-5 - [c106]François G. Germain, Gordon Wichern, Jonathan Le Roux:
Hyperbolic Unsupervised Anomalous Sound Detection. WASPAA 2023: 1-5 - [c105]Ricardo Falcón Pérez, Gordon Wichern, François G. Germain, Jonathan Le Roux:
Location as Supervision for Weakly Supervised Multi-Channel Source Separation of Machine Sounds. WASPAA 2023: 1-5 - [i59]Christoph Böddeker, Aswin Shanmugam Subramanian, Gordon Wichern, Reinhold Haeb-Umbach, Jonathan Le Roux:
TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings. CoRR abs/2303.03849 (2023) - [i58]Ke Chen, Gordon Wichern, François G. Germain, Jonathan Le Roux:
Pac-HuBERT: Self-Supervised Music Source Separation via Primitive Auditory Clustering and Hidden-Unit BERT. CoRR abs/2304.02160 (2023) - [i57]Chiori Hori, Puyuan Peng, David Harwath, Xinyu Liu, Kei Ota, Siddarth Jain, Radu Corcodel, Devesh K. Jha, Diego Romeres, Jonathan Le Roux:
Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos. CoRR abs/2306.15644 (2023) - [i56]Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada P. Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman A. Solovyev, Alexander L. Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji:
The Sound Demixing Challenge 2023 - Cinematic Demixing Track. CoRR abs/2308.06981 (2023) - 2022
- [j13]Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori:
Momentum Pseudo-Labeling: Semi-Supervised ASR With Continuously Improving Pseudo-Labels. IEEE J. Sel. Top. Signal Process. 16(6): 1424-1438 (2022) - [c104]Anoop Cherian, Chiori Hori, Tim K. Marks, Jonathan Le Roux:
(2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering. AAAI 2022: 444-453 - [c103]Satvik Venkatesh, Gordon Wichern, Aswin Shanmugam Subramanian, Jonathan Le Roux:
Improved Domain Generalization via Disentangled Multi-Task Learning in Unsupervised Anomalous Sound Detection. DCASE 2022 - [c102]Darius Petermann, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux:
The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks. ICASSP 2022: 526-530 - [c101]Olga Slizovskaia, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux:
Locate This, Not That: Class-Conditioned Sound Event DOA Estimation. ICASSP 2022: 711-715 - [c100]Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux:
Sequence Transduction with Graph-Based Supervision. ICASSP 2022: 7212-7216 - [c99]Xuankai Chang, Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux:
Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR. ICASSP 2022: 7322-7326 - [c98]Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori:
Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy. ICASSP 2022: 7672-7676 - [c97]Ankit P. Shah, Shijie Geng, Peng Gao, Anoop Cherian, Takaaki Hori, Tim K. Marks, Jonathan Le Roux, Chiori Hori:
Audio-Visual Scene-Aware Dialog and Reasoning Using Audio-Visual Transformers with Joint Student-Teacher Learning. ICASSP 2022: 7732-7736 - [c96]Efthymios Tzinis, Gordon Wichern, Aswin Shanmugam Subramanian, Paris Smaragdis, Jonathan Le Roux:
Heterogeneous Target Speech Separation. INTERSPEECH 2022: 1796-1800 - [c95]Chiori Hori, Takaaki Hori, Jonathan Le Roux:
Low-Latency Online Streaming VideoQA Using Audio-Visual Transformers. INTERSPEECH 2022: 4511-4515 - [i55]Anoop Cherian, Chiori Hori, Tim K. Marks, Jonathan Le Roux:
(2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering. CoRR abs/2202.09277 (2022) - [i54]Xuankai Chang, Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux:
Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR. CoRR abs/2203.00232 (2022) - [i53]Olga Slizovskaia, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux:
Locate This, Not That: Class-Conditioned Sound Event DOA Estimation. CoRR abs/2203.04197 (2022) - [i52]Efthymios Tzinis, Gordon Wichern, Aswin Shanmugam Subramanian, Paris Smaragdis, Jonathan Le Roux:
Heterogeneous Target Speech Separation. CoRR abs/2204.03594 (2022) - [i51]Zhong-Qiu Wang, Gordon Wichern, Shinji Watanabe, Jonathan Le Roux:
STFT-Domain Neural Speech Enhancement with Very Low Algorithmic Latency. CoRR abs/2204.09911 (2022) - [i50]Zexu Pan, Gordon Wichern, François G. Germain, Aswin Shanmugam Subramanian, Jonathan Le Roux:
Towards End-to-end Speaker Diarization in the Wild. CoRR abs/2211.01299 (2022) - [i49]Hao Yen, François G. Germain, Gordon Wichern, Jonathan Le Roux:
Cold Diffusion for Speech Enhancement. CoRR abs/2211.02527 (2022) - [i48]Efthymios Tzinis, Gordon Wichern, Paris Smaragdis, Jonathan Le Roux:
Optimal Condition Training for Target Source Separation. CoRR abs/2211.05927 (2022) - [i47]Rohith Aralikatti, Christoph Böddeker, Gordon Wichern, Aswin Shanmugam Subramanian, Jonathan Le Roux:
Reverberation as Supervision for Speech Separation. CoRR abs/2211.08303 (2022) - [i46]Dimitrios Bralios, Efthymios Tzinis, Gordon Wichern, Paris Smaragdis, Jonathan Le Roux:
Latent Iterative Refinement for Modular Source Separation. CoRR abs/2211.11917 (2022) - [i45]Darius Petermann, Gordon Wichern, Aswin Shanmugam Subramanian, Jonathan Le Roux:
Hyperbolic Audio Source Separation. CoRR abs/2212.05008 (2022) - [i44]Darius Petermann, Gordon Wichern, Aswin Shanmugam Subramanian, Zhong-Qiu Wang, Jonathan Le Roux:
Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks. CoRR abs/2212.07327 (2022) - 2021
- [j12]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
On the Compensation Between Magnitude and Phase in Speech Separation. IEEE Signal Process. Lett. 28: 2018-2022 (2021) - [j11]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3476-3490 (2021) - [c94]Shijie Geng, Peng Gao, Moitreya Chatterjee, Chiori Hori, Jonathan Le Roux, Yongfeng Zhang, Hongsheng Li, Anoop Cherian:
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers. AAAI 2021: 1415-1423 - [c93]Yun-Ning Hung, Gordon Wichern, Jonathan Le Roux:
Transcription Is All You Need: Learning To Separate Musical Mixtures With Score As Supervision. ICASSP 2021: 46-50 - [c92]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Capturing Multi-Resolution Context by Dilated Self-Attention. ICASSP 2021: 5869-5873 - [c91]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Semi-Supervised Speech Recognition Via Graph-Based Temporal Classification. ICASSP 2021: 6548-6552 - [c90]Sameer Khurana, Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training. ICASSP 2021: 6553-6557 - [c89]Moitreya Chatterjee, Jonathan Le Roux, Narendra Ahuja, Anoop Cherian:
Visual Scene Graphs for Audio Source Separation. ICCV 2021: 1184-1193 - [c88]Chiori Hori, Takaaki Hori, Jonathan Le Roux:
Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers. Interspeech 2021: 586-590 - [c87]Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori:
Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition. Interspeech 2021: 726-730 - [c86]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition. Interspeech 2021: 1822-1826 - [c85]Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux:
Advanced Long-Context End-to-End Speech Recognition Using Context-Expanded Transformers. Interspeech 2021: 2097-2101 - [c84]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
Convolutive Prediction for Reverberant Speech Separation. WASPAA 2021: 56-60 - [c83]Gordon Wichern, Ankush Chakrabarty, Zhong-Qiu Wang, Jonathan Le Roux:
Anomalous Sound Detection Using Attentive Neural Processes. WASPAA 2021: 186-190 - [i43]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Capturing Multi-Resolution Context by Dilated Self-Attention. CoRR abs/2104.02858 (2021) - [i42]Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux:
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers. CoRR abs/2104.09426 (2021) - [i41]Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori:
Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition. CoRR abs/2106.08922 (2021) - [i40]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition. CoRR abs/2107.01269 (2021) - [i39]Chiori Hori, Takaaki Hori, Jonathan Le Roux:
Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers. CoRR abs/2108.02147 (2021) - [i38]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
On The Compensation Between Magnitude and Phase in Speech Separation. CoRR abs/2108.05470 (2021) - [i37]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
Convolutive Prediction for Reverberant Speech Separation. CoRR abs/2108.07194 (2021) - [i36]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation. CoRR abs/2108.07376 (2021) - [i35]Moitreya Chatterjee, Jonathan Le Roux, Narendra Ahuja, Anoop Cherian:
Visual Scene Graphs for Audio Source Separation. CoRR abs/2109.11955 (2021) - [i34]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
Leveraging Low-Distortion Target Estimates for Improved Speech Enhancement. CoRR abs/2110.00570 (2021) - [i33]Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori:
Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy. CoRR abs/2110.04948 (2021) - [i32]Ankit P. Shah, Shijie Geng, Peng Gao, Anoop Cherian, Takaaki Hori, Tim K. Marks, Jonathan Le Roux, Chiori Hori:
Audio-Visual Scene-Aware Dialog and Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning. CoRR abs/2110.06894 (2021) - [i31]Darius Petermann, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux:
The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks. CoRR abs/2110.09958 (2021) - [i30]Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux:
Sequence Transduction with Graph-based Supervision. CoRR abs/2111.01272 (2021) - 2020
- [j10]Fatemeh Pishdadian, Gordon Wichern, Jonathan Le Roux:
Finding Strength in Weakness: Learning to Separate Sounds With Weak Supervision. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2386-2399 (2020) - [c82]Fatemeh Pishdadian, Gordon Wichern, Jonathan Le Roux:
Learning to Separate Sounds from Weakly Labeled Scenes. ICASSP 2020: 91-95 - [c81]Matthew Maciejewski, Gordon Wichern, Emmett McQuinn, Jonathan Le Roux:
WHAMR!: Noisy and Reverberant Single-Channel Speech Separation. ICASSP 2020: 696-700 - [c80]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Streaming Automatic Speech Recognition with the Transformer Model. ICASSP 2020: 6074-6078 - [c79]Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe:
End-To-End Multi-Speaker Speech Recognition With Transformer. ICASSP 2020: 6134-6138 - [c78]Leda Sari, Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Unsupervised Speaker Adaptation Using Attention-Based Speaker Memory for End-to-End ASR. ICASSP 2020: 7384-7388 - [c77]Niko Moritz, Gordon Wichern, Takaaki Hori, Jonathan Le Roux:
All-in-One Transformer: Unifying Speech Recognition, Audio Tagging, and Event Detection. INTERSPEECH 2020: 3112-3116 - [c76]Tejas Jayashankar, Jonathan Le Roux, Pierre Moulin:
Detecting Audio Attacks on ASR Systems with Dropout Uncertainty. INTERSPEECH 2020: 4671-4675 - [c75]Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux:
Transformer-Based Long-Context End-to-End Speech Recognition. INTERSPEECH 2020: 5011-5015 - [c74]Ethan Manilow, Gordon Wichern, Jonathan Le Roux:
Hierarchical Musical Instrument Separation. ISMIR 2020: 376-383 - [c73]Prem Seetharaman, Gordon Wichern, Bryan Pardo, Jonathan Le Roux:
Autoclip: Adaptive Gradient Clipping for Source Separation Networks. MLSP 2020: 1-6 - [i29]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Streaming automatic speech recognition with the transformer model. CoRR abs/2001.02674 (2020) - [i28]Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe:
End-to-End Multi-speaker Speech Recognition with Transformer. CoRR abs/2002.03921 (2020) - [i27]Leda Sari, Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Unsupervised Speaker Adaptation using Attention-based Speaker Memory for End-to-End ASR. CoRR abs/2002.06165 (2020) - [i26]Tejas Jayashankar, Jonathan Le Roux, Pierre Moulin:
Detecting Audio Attacks on ASR Systems with Dropout Uncertainty. CoRR abs/2006.01906 (2020) - [i25]Shijie Geng, Peng Gao, Chiori Hori, Jonathan Le Roux, Anoop Cherian:
Spatio-Temporal Scene Graphs for Video Dialog. CoRR abs/2007.03848 (2020) - [i24]Prem Seetharaman, Gordon Wichern, Bryan Pardo, Jonathan Le Roux:
AutoClip: Adaptive Gradient Clipping for Source Separation Networks. CoRR abs/2007.14469 (2020) - [i23]Peng Gao, Chiori Hori, Shijie Geng, Takaaki Hori, Jonathan Le Roux:
Multi-Pass Transformer for Machine Translation. CoRR abs/2009.11382 (2020) - [i22]Yun-Ning Hung, Gordon Wichern, Jonathan Le Roux:
Transcription Is All You Need: Learning to Separate Musical Mixtures with Score as Supervision. CoRR abs/2010.11904 (2020) - [i21]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Semi-Supervised Speech Recognition via Graph-based Temporal Classification. CoRR abs/2010.15653 (2020) - [i20]Sameer Khurana, Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training. CoRR abs/2011.13439 (2020)
2010 – 2019
- 2019
- [j9]Jonathan Le Roux, Gordon Wichern, Shinji Watanabe, Andy M. Sarroff, John R. Hershey:
Phasebook and Friends: Leveraging Discrete Representations for Source Separation. IEEE J. Sel. Top. Signal Process. 13(2): 370-382 (2019) - [c72]Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe:
MIMO-Speech: End-to-End Multi-Channel Multi-Speaker Speech Recognition. ASRU 2019: 237-244 - [c71]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Streaming End-to-End Speech Recognition with Joint CTC-Attention Based Models. ASRU 2019: 936-943 - [c70]Jonathan Le Roux, Gordon Wichern, Shinji Watanabe, Andy M. Sarroff, John R. Hershey:
The Phasebook: Building Complex Masks via Discrete Representations for Source Separation. ICASSP 2019: 66-70 - [c69]Prem Seetharaman, Gordon Wichern, Shrikant Venkataramani, Jonathan Le Roux:
Class-conditional Embeddings for Music Source Separation. ICASSP 2019: 301-305 - [c68]Prem Seetharaman, Gordon Wichern, Jonathan Le Roux, Bryan Pardo:
Bootstrapping Single-channel Source Separation via Unsupervised Spatial Clustering on Stereo Mixtures. ICASSP 2019: 356-360 - [c67]Jonathan Le Roux, Scott Wisdom, Hakan Erdogan, John R. Hershey:
SDR - Half-baked or Well Done? ICASSP 2019: 626-630 - [c66]Ryo Aihara, Toshiyuki Hanazawa, Yohei Okato, Gordon Wichern, Jonathan Le Roux:
Teacher-student Deep Clustering for Low-delay Single Channel Speech Separation. ICASSP 2019: 690-694 - [c65]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Triggered Attention for End-to-end Speech Recognition. ICASSP 2019: 5666-5670 - [c64]Takaaki Hori, Ramón Fernandez Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, Jonathan Le Roux:
Cycle-consistency Training for End-to-end Speech Recognition. ICASSP 2019: 6271-6275 - [c63]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Unidirectional Neural Network Architectures for End-to-End Automatic Speech Recognition. INTERSPEECH 2019: 76-80 - [c62]Gordon Wichern, Joe Antognini, Michael Flynn, Licheng Richard Zhu, Emmett McQuinn, Dwight Crow, Ethan Manilow, Jonathan Le Roux:
WHAM!: Extending Speech Separation to Noisy Environments. INTERSPEECH 2019: 1368-1372 - [c61]Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
End-to-End Multilingual Multi-Speaker Speech Recognition. INTERSPEECH 2019: 3755-3759 - [c60]Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Niko Moritz, Jonathan Le Roux:
Vectorized Beam Search for CTC-Attention-Based Speech Recognition. INTERSPEECH 2019: 3825-3829 - [c59]Ethan Manilow, Gordon Wichern, Prem Seetharaman, Jonathan Le Roux:
Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity. WASPAA 2019: 45-49 - [c58]Ilya Kavalerov, Scott Wisdom, Hakan Erdogan, Brian Patton, Kevin W. Wilson, Jonathan Le Roux, John R. Hershey:
Universal Sound Separation. WASPAA 2019: 175-179 - [i19]Ilya Kavalerov, Scott Wisdom, Hakan Erdogan, Brian Patton, Kevin W. Wilson, Jonathan Le Roux, John R. Hershey:
Universal Sound Separation. CoRR abs/1905.03330 (2019) - [i18]Gordon Wichern, Joe Antognini, Michael Flynn, Licheng Richard Zhu, Emmett McQuinn, Dwight Crow, Ethan Manilow, Jonathan Le Roux:
WHAM!: Extending Speech Separation to Noisy Environments. CoRR abs/1907.01160 (2019) - [i17]Ethan Manilow, Gordon Wichern, Prem Seetharaman, Jonathan Le Roux:
Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity. CoRR abs/1909.08494 (2019) - [i16]Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe:
MIMO-SPEECH: End-to-End Multi-Channel Multi-Speaker Speech Recognition. CoRR abs/1910.06522 (2019) - [i15]Matthew Maciejewski, Gordon Wichern, Emmett McQuinn, Jonathan Le Roux:
WHAMR!: Noisy and Reverberant Single-Channel Speech Separation. CoRR abs/1910.10279 (2019) - [i14]Prem Seetharaman, Gordon Wichern, Jonathan Le Roux, Bryan Pardo:
Bootstrapping deep music separation from primitive auditory grouping principles. CoRR abs/1910.11133 (2019) - [i13]Fatemeh Pishdadian, Gordon Wichern, Jonathan Le Roux:
Finding Strength in Weakness: Learning to Separate Sounds with Weak Supervision. CoRR abs/1911.02182 (2019) - 2018
- [c57]Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
A Purely End-to-End System for Multi-speaker Speech Recognition. ACL (1) 2018: 2620-2630 - [c56]Zhong-Qiu Wang, Jonathan Le Roux, John R. Hershey:
Multi-Channel Deep Clustering: Discriminative Spectral and Spatial Embeddings for Speaker-Independent Speech Separation. ICASSP 2018: 1-5 - [c55]Zhong-Qiu Wang, Jonathan Le Roux, John R. Hershey:
Alternative Objective Functions for Deep Clustering. ICASSP 2018: 686-690 - [c54]Shane Settle, Jonathan Le Roux, Takaaki Hori, Shinji Watanabe, John R. Hershey:
End-to-End Multi-Speaker Speech Recognition. ICASSP 2018: 4819-4823 - [c53]Hiroshi Seki, Shinji Watanabe, Takaaki Hori, Jonathan Le Roux, John R. Hershey:
An End-to-End Language-Tracking Speech Recognizer for Mixed-Language Speech. ICASSP 2018: 4919-4923 - [c52]Zhong-Qiu Wang, Jonathan Le Roux, DeLiang Wang, John R. Hershey:
End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction. INTERSPEECH 2018: 2708-2712 - [c51]Gordon Wichern, Jonathan Le Roux:
Phase Reconstruction with Learned Time-Frequency Representations for Single-Channel Speech Separation. IWAENC 2018: 396-400 - [i12]Zhong-Qiu Wang, Jonathan Le Roux, DeLiang Wang, John R. Hershey:
End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction. CoRR abs/1804.10204 (2018) - [i11]Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
A Purely End-to-end System for Multi-speaker Speech Recognition. CoRR abs/1805.05826 (2018) - [i10]Jonathan Le Roux, Gordon Wichern, Shinji Watanabe, Andy M. Sarroff, John R. Hershey:
Phasebook and Friends: Leveraging Discrete Representations for Source Separation. CoRR abs/1810.01395 (2018) - [i9]Takaaki Hori, Ramón Fernandez Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, Jonathan Le Roux:
Cycle-consistency training for end-to-end speech recognition. CoRR abs/1811.01690 (2018) - [i8]Prem Seetharaman, Gordon Wichern, Jonathan Le Roux, Bryan Pardo:
Bootstrapping single-channel source separation via unsupervised spatial clustering on stereo mixtures. CoRR abs/1811.02130 (2018) - [i7]Jonathan Le Roux, Scott Wisdom, Hakan Erdogan, John R. Hershey:
SDR - half-baked or well done? CoRR abs/1811.02508 (2018) - [i6]Prem Seetharaman, Gordon Wichern, Shrikant Venkataramani, Jonathan Le Roux:
Class-conditional embeddings for music source separation. CoRR abs/1811.03076 (2018) - 2017
- [j8]