- Shota Horiguchi, Yusuke Fujita, Kenji Nagamatsu:
Block-Online Guided Source Separation. SLT 2021: 236-242
- Delowar Hossain, Yoshinao Sato:
Efficient corpus design for wake-word detection. SLT 2021: 1094-1100
- Wei-Ning Hsu, Ann Lee, Gabriel Synnaeve, Awni Y. Hannun:
Semi-Supervised end-to-end Speech Recognition via Local Prior Matching. SLT 2021: 125-132
- Qiong Hu, Tobias Bleisch, Petko Petkov, Tuomo Raitio, Erik Marchi, Varun Lakshminarasimhan:
Whispered and Lombard Neural Speech Synthesis. SLT 2021: 454-461
- Ke Hu, Ruoming Pang, Tara N. Sainath, Trevor Strohman:
Transformer Based Deliberation for Two-Pass Speech Recognition. SLT 2021: 68-74
- Yushi Hu, Shane Settle, Karen Livescu:
Acoustic Span Embeddings for Multilingual Query-by-Example Search. SLT 2021: 935-942
- Yiling Huang, Yutian Chen, Jason Pelecanos, Quan Wang:
Synth2Aug: Cross-Domain Speaker Recognition with TTS Synthesized Speech. SLT 2021: 316-322
- Tzu-hsien Huang, Jheng-Hao Lin, Hung-yi Lee:
How Far Are We from Robust Voice Conversion: A Survey. SLT 2021: 514-521
- Chien-yu Huang, Yist Y. Lin, Hung-yi Lee, Lin-Shan Lee:
Defending Your Voice: Adversarial Attack on Voice Conversion. SLT 2021: 552-559
- Jaesung Huh, Minjae Lee, Heesoo Heo, Seongkyu Mun, Joon Son Chung:
Metric Learning for Keyword Spotting. SLT 2021: 133-140
- Christiaan Jacobs, Yevgen Matusevych, Herman Kamper:
Acoustic Word Embeddings for Zero-Resource Languages Using Self-Supervised Contrastive Learning and Multilingual Adaptation. SLT 2021: 919-926
- Aditya Jayasimha, Periyasamy Paramasivam:
Personalizing Speech Start Point and End Point Detection in ASR Systems from Speaker Embeddings. SLT 2021: 771-777
- Hiroto Kai, Shinnosuke Takamichi, Sayaka Shiota, Hitoshi Kiya:
Lightweight Voice Anonymization Based on Data-Driven Optimization of Cascaded Voice Modification Modules. SLT 2021: 560-566
- Naoyuki Kanda, Xuankai Chang, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka:
Investigation of End-to-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings. SLT 2021: 809-816
- Fang Kang, Feiran Yang, Jun Yang:
Real-Time Independent Vector Analysis with a Deep-Learning-Based Source Model. SLT 2021: 665-669
- Takatomo Kano, Sakriani Sakti, Satoshi Nakamura:
Transformer-Based Direct Speech-To-Speech Translation with Transcoder. SLT 2021: 958-965
- Subash Khanal, Michael T. Johnson, Narjes Bozorg:
Articulatory Comparison of L1 and L2 Speech for Mispronunciation Diagnosis. SLT 2021: 693-697
- Aparna Khare, Srinivas Parthasarathy, Shiva Sundaram:
Self-Supervised Learning with Cross-Modal Transformers for Emotion Recognition. SLT 2021: 381-388
- Eugene Kharitonov, Morgane Rivière, Gabriel Synnaeve, Lior Wolf, Pierre-Emmanuel Mazaré, Matthijs Douze, Emmanuel Dupoux:
Data Augmenting Contrastive Learning of Speech Representations in the Time Domain. SLT 2021: 215-222
- You Jin Kim, Hee Soo Heo, Soo-Whan Chung, Bong-Jin Lee:
End-To-End Lip Synchronisation Based on Pattern Classification. SLT 2021: 598-605
- Yuxiang Kong, Jian Wu, Quandong Wang, Peng Gao, Weiji Zhuang, Yujun Wang, Lei Xie:
Multi-Channel Automatic Speech Recognition Using Deep Complex Unet. SLT 2021: 104-110
- Suchitra Krishnamachari, Manoj Kumar, So Hyun Kim, Catherine Lord, Shrikanth Narayanan:
Developing Neural Representations for Robust Child-Adult Diarization. SLT 2021: 590-597
- Youngki Kwon, Hee Soo Heo, Jaesung Huh, Bong-Jin Lee, Joon Son Chung:
Look Who's Not Talking. SLT 2021: 567-573
- Seong Min Kye, Joon Son Chung, Hoirin Kim:
Supervised Attention for Speaker Recognition. SLT 2021: 286-293
- Seong Min Kye, Yoohwan Kwon, Joon Son Chung:
Cross Attentive Pooling for Speaker Verification. SLT 2021: 294-300
- Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu:
Effective Low-Cost Time-Domain Audio Separation Using Globally Attentive Locally Recurrent Networks. SLT 2021: 801-808
- Duc Le, Gil Keren, Julian Chan, Jay Mahadeokar, Christian Fuegen, Michael L. Seltzer:
Deep Shallow Fusion for RNN-T Personalization. SLT 2021: 251-257
- Shi-wook Lee:
Domain Generalization with Triplet Network for Cross-Corpus Speech Emotion Recognition. SLT 2021: 389-396
- Yi Lei, Shan Yang, Lei Xie:
Fine-Grained Emotion Strength Transfer, Control and Prediction for Emotional Speech Synthesis. SLT 2021: 423-430
- Chenda Li, Jing Shi, Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Naoyuki Kamo, Moto Hira, Tomoki Hayashi, Christoph Böddeker, Zhuo Chen, Shinji Watanabe:
ESPnet-SE: End-To-End Speech Enhancement and Separation Toolkit Designed for ASR Integration. SLT 2021: 785-792