Yui Sudo
2020 – today
- 2024
- [c17] Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe:
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification. ACL (1) 2024: 10192-10209
- [c16] Muhammad Shakeel, Yui Sudo, Yifan Peng, Shinji Watanabe:
Joint Optimization of Streaming and Non-Streaming Automatic Speech Recognition with Multi-Decoder and Knowledge Distillation. ICASSP Workshops 2024: 570-574
- [c15] Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Yifan Peng, Shinji Watanabe:
Contextualized Automatic Speech Recognition With Attention-Based Bias Phrase Boosted Beam Search. ICASSP 2024: 10896-10900
- [c14] Takahiro Osaki, Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai:
Improving Noise Robustness of Automatic Speech Recognition Based on a Parallel Adapter Model with Near-Identity Initialization. IEA/AIE 2024: 454-466
- [i11] Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Yifan Peng, Shinji Watanabe:
Contextualized Automatic Speech Recognition with Attention-Based Bias Phrase Boosted Beam Search. CoRR abs/2401.10449 (2024)
- [i10] Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, Yui Sudo, Muhammad Shakeel, Kwanghee Choi, Jiatong Shi, Xuankai Chang, Jee-weon Jung, Shinji Watanabe:
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer. CoRR abs/2401.16658 (2024)
- [i9] Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe:
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification. CoRR abs/2402.12654 (2024)
- [i8] Yui Sudo, Yosuke Fukumoto, Muhammad Shakeel, Yifan Peng, Shinji Watanabe:
Contextualized Automatic Speech Recognition with Dynamic Vocabulary. CoRR abs/2405.13344 (2024)
- [i7] Muhammad Shakeel, Yui Sudo, Yifan Peng, Shinji Watanabe:
Joint Optimization of Streaming and Non-Streaming Automatic Speech Recognition with Multi-Decoder and Knowledge Distillation. CoRR abs/2405.13514 (2024)
- [i6] Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Brian Yan, Jiatong Shi, Yifan Peng, Shinji Watanabe:
4D ASR: Joint Beam Search Integrating CTC, Attention, Transducer, and Mask Predict Decoders. CoRR abs/2406.02950 (2024)
- [i5] Muhammad Shakeel, Yui Sudo, Yifan Peng, Shinji Watanabe:
Contextualized End-to-end Automatic Speech Recognition with Intermediate Biasing Loss. CoRR abs/2406.16120 (2024)
- 2023
- [c13] Ryu Takeda, Yui Sudo, Kazunori Komatani:
Flexible Evidence Model to Reduce Uncertainty Mismatch Between Speech Enhancement and ASR Based on Encoder-Decoder Architecture. APSIPA ASC 2023: 1830-1837
- [c12] Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan S. Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-Weon Jung, Soumi Maiti, Shinji Watanabe:
Reproducing Whisper-Style Training Using An Open-Source Toolkit And Publicly Available Data. ASRU 2023: 1-8
- [c11] Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe:
DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models. INTERSPEECH 2023: 62-66
- [c10] Yui Sudo, Kazuya Hata, Kazuhiro Nakadai:
Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation. INTERSPEECH 2023: 491-495
- [c9] Yui Sudo, Muhammad Shakeel, Brian Yan, Jiatong Shi, Shinji Watanabe:
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders. INTERSPEECH 2023: 3312-3316
- [c8] Yui Sudo, Muhammad Shakeel, Yifan Peng, Shinji Watanabe:
Time-synchronous one-pass Beam Search for Parallel Online and Offline Transducers with Dynamic Block Training. INTERSPEECH 2023: 4479-4483
- [c7] Yui Sudo, Masayuki Takigahira, Hideo Tsuru, Kazuhiro Nakadai, Hirofumi Nakajima:
Online Adaptation of Fourier Series Based Acoustic Transfer Function Model to Improve Sound Source Localization and Separation. RO-MAN 2023: 2058-2063
- [i4] Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe:
DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models. CoRR abs/2305.17651 (2023)
- [i3] Yui Sudo, Kazuya Hata, Kazuhiro Nakadai:
Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation. CoRR abs/2305.17846 (2023)
- [i2] Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan S. Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-weon Jung, Soumi Maiti, Shinji Watanabe:
Reproducing Whisper-Style Training Using an Open-Source Toolkit and Publicly Available Data. CoRR abs/2309.13876 (2023)
- 2022
- [c6] Ryu Takeda, Yui Sudo, Kazuhiro Nakadai, Kazunori Komatani:
Empirical Sampling from Latent Utterance-wise Evidence Model for Missing Data ASR based on Neural Encoder-Decoder Model. INTERSPEECH 2022: 3789-3793
- [c5] Yui Sudo, Muhammad Shakeel, Kazuhiro Nakadai, Jiatong Shi, Shinji Watanabe:
Streaming Automatic Speech Recognition with Re-blocking Processing Based on Integrated Voice Activity Detection. INTERSPEECH 2022: 4641-4645
- [i1] Yui Sudo, Muhammad Shakeel, Brian Yan, Jiatong Shi, Shinji Watanabe:
4D ASR: Joint modeling of CTC, Attention, Transducer, and Mask-Predict decoders. CoRR abs/2212.10818 (2022)
- 2021
- [j2] Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai:
Multichannel environmental sound segmentation. Appl. Intell. 51(11): 8245-8259 (2021)
- [c4] Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai:
Multi-channel Environmental Sound Segmentation utilizing Sound Source Localization and Separation U-Net. SII 2021: 382-387
- 2020
- [j1] Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai:
Sound event aware environmental sound segmentation with Mask U-Net. Adv. Robotics 34(20): 1280-1290 (2020)
- [c3] Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai:
Multi-channel Environmental sound segmentation. SII 2020: 820-825
2010 – 2019
- 2019
- [c2] Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai:
Improvement of DOA Estimation by using Quaternion Output in Sound Event Localization and Detection. DCASE 2019: 244-247
- [c1] Yui Sudo, Katsutoshi Itoyama, Kenji Nishida, Kazuhiro Nakadai:
Environmental sound segmentation utilizing Mask U-Net. IROS 2019: 5340-5345
last updated on 2024-10-07 22:08 CEST by the dblp team
all metadata released as open data under CC0 1.0 license