Hung-yi Lee
- 2025
- [i287]Sung-Feng Huang, Heng-Cheng Kuo, Zhehuai Chen, Xuesong Yang, Chao-Han Huck Yang, Yu Tsao, Yu-Chiang Frank Wang, Hung-yi Lee, Szu-Wei Fu:
Detecting the Undetectable: Assessing the Efficacy of Current Spoof Detection Methods Against Seamless Speech Edits. CoRR abs/2501.03805 (2025) - [i286]Zhe Li, Man-Wai Mak, Mert Pilanci, Hung-yi Lee, Helen Meng:
Spectral-Aware Low-Rank Adaptation for Speaker Verification. CoRR abs/2501.03829 (2025) - [i285]Jiawei Du, Xuanjun Chen, Haibin Wu, Lin Zhang, I-Ming Lin, I-Hsiang Chiu, Wenze Ren, Yuan Tseng, Yu Tsao, Jyh-Shing Roger Jang, Hung-yi Lee:
CodecFake-Omni: A Large-Scale Codec-based Deepfake Speech Dataset. CoRR abs/2501.08238 (2025) - [i284]Chao-Chung Wu, Zhi Rui Tam, Chieh-Yen Lin, Hung-yi Lee, Yun-Nung Chen:
Clear Minds Think Alike: What Makes LLM Fine-tuning Robust? A Study of Token Perplexity. CoRR abs/2501.14315 (2025) - [i283]Chan-Jan Hsu, Yi-Cheng Lin, Chia-Chun Lin, Wei-Chih Chen, Ho-Lam Chung, Chen-An Li, Yi-Chang Chen, Chien-Yu Yu, Ming-Ji Lee, Chien-Cheng Chen, Ru-Heng Huang, Hung-yi Lee, Da-Shan Shiu:
BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation - Challenges and Insights. CoRR abs/2501.17790 (2025) - [i282]Chun-Yi Kuan, Hung-yi Lee:
Gender Bias in Instruction-Guided Speech Synthesis Models. CoRR abs/2502.05649 (2025) - [i281]Yu-Xiang Lin, Chih-Kai Yang, Wei-Chih Chen, Chen-An Li, Chien-yu Huang, Xuanjun Chen, Hung-yi Lee:
A Preliminary Exploration with GPT-4o Voice Mode. CoRR abs/2502.09940 (2025)
- 2024
- [j27]Shu-Wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee:
A Large-Scale Evaluation of Speech Foundation Models. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2884-2899 (2024) - [j26]Kai-Wei Chang, Haibin Wu, Yu-Kai Wang, Yuan-Kuei Wu, Hua Shen, Wei-Cheng Tseng, Iu-thing Kang, Shang-wen Li, Hung-Yi Lee:
SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3730-3744 (2024) - [j25]Shensian Syu, Juncheng Xie, Hung-yi Lee:
Improving Non-Autoregressive Translation Quality With Pretrained Language Model, Embedding Distillation and Upsampling Strategy for CTC. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4121-4133 (2024) - [c242]Cheng-Han Chiang, Hung-yi Lee:
Merging Facts, Crafting Fallacies: Evaluating the Contradictory Nature of Aggregated Factual Claims in Long-Form Generations. ACL (Findings) 2024: 2734-2751 - [c241]Guan-Ting Lin, Cheng-Han Chiang, Hung-yi Lee:
Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations. ACL (1) 2024: 6626-6642 - [c240]Haibin Wu, Ho-Lam Chung, Yi-Cheng Lin, Yuan-Kuei Wu, Xuanjun Chen, Yu-Chi Pai, Hsiu-Hsuan Wang, Kai-Wei Chang, Alexander H. Liu, Hung-yi Lee:
Codec-SUPERB: An In-Depth Analysis of Sound Codec Models. ACL (Findings) 2024: 10330-10348 - [c239]Shih-Cheng Huang, Pin-Zu Li, Yu-Chi Hsu, Kuang-Ming Chen, Yu-Tung Lin, Shih-Kai Hsiao, Richard Tzong-Han Tsai, Hung-yi Lee:
Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages. ACL (1) 2024: 10943-10959 - [c238]Siddhant Arora, Ankita Pasad, Chung-Ming Chien, Jionghao Han, Roshan S. Sharma, Jee-weon Jung, Hira Dhamyal, William Chen, Suwon Shon, Hung-yi Lee, Karen Livescu, Shinji Watanabe:
On the Evaluation of Speech Foundation Models for Spoken Language Understanding. ACL (Findings) 2024: 11923-11938 - [c237]Wenze Ren, Yi-Cheng Lin, Huang-Cheng Chou, Haibin Wu, Yi-Chiao Wu, Chi-Chun Lee, Hung-Yi Lee, Hsin-Min Wang, Yu Tsao:
EMO-Codec: An In-Depth Look at Emotion Preservation Capacity of Legacy and Neural Codec Models with Subjective and Objective Evaluations. APSIPA 2024: 1-6 - [c236]Haibin Wu, Huang-Cheng Chou, Kai-Wei Chang, Lucas Goncalves, Jiawei Du, Jyh-Shing Roger Jang, Chi-Chun Lee, Hung-Yi Lee:
Empower Typed Descriptions by Large Language Models for Speech Emotion Recognition. APSIPA 2024: 1-6 - [c235]Cheng-Han Chiang, Hung-yi Lee:
Over-Reasoning and Redundant Calculation of Large Language Models. EACL (2) 2024: 161-169 - [c234]Zhi Rui Tam, Cheng-Kuang Wu, Yi-Lin Tsai, Chieh-Yen Lin, Hung-yi Lee, Yun-Nung Chen:
Let Me Speak Freely? A Study On The Impact Of Format Restrictions On Large Language Model Performance. EMNLP (Industry Track) 2024: 1218-1236 - [c233]Cheng-Kuang Wu, Zhi Rui Tam, Chao-Chung Wu, Chieh-Yen Lin, Hung-yi Lee, Yun-Nung Chen:
I Need Help! Evaluating LLM's Ability to Ask for Users' Support: A Case Study on Text-to-SQL Generation. EMNLP 2024: 2191-2199 - [c232]Cheng-Han Chiang, Wei-Chih Chen, Chun-Yi Kuan, Chienchou Yang, Hung-yi Lee:
Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course. EMNLP 2024: 2489-2513 - [c231]Hsuan Su, Hua Farn, Fan-Yun Sun, Shang-Tse Chen, Hung-yi Lee:
Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition. EMNLP 2024: 8905-8915 - [c230]Guan-Ting Lin, Hung-yi Lee:
Can LLMs Understand the Implication of Emphasized Sentences in Dialogue? EMNLP (Findings) 2024: 13391-13401 - [c229]Hung-Ting Su, Ya-Ching Hsu, Xudong Lin, Xiang Qian Shi, Yulei Niu, Han-Yuan Hsu, Hung-yi Lee, Winston H. Hsu:
Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses. EMNLP (Findings) 2024: 14839-14854 - [c228]Tzu-Han Lin, Chen-An Li, Hung-yi Lee, Yun-Nung Chen:
DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging. EMNLP 2024: 15506-15524 - [c227]Guan-Ting Lin, Wei Huang, Hung-yi Lee:
Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech. EMNLP 2024: 20003-20015 - [c226]Hsuan-Fu Wang, Yi-Jen Shih, Heng-Jui Chang, Layne Berry, Puyuan Peng, Hung-Yi Lee, Hsin-Min Wang, David Harwath:
SpeechCLIP+: Self-Supervised Multi-Task Representation Learning for Speech Via Clip and Speech-Image Data. ICASSP Workshops 2024: 465-469 - [c225]Fabian Ritter Gutierrez, Kuan-Po Huang, Dianwen Ng, Jeremy H. M. Wong, Hung-Yi Lee, Eng Siong Chng, Nancy F. Chen:
Noise Robust Distillation of Self-Supervised Speech Models via Correlation Metrics. ICASSP Workshops 2024: 495-499 - [c224]Chih-Kai Yang, Kuan-Po Huang, Ke-Han Lu, Chun-Yi Kuan, Chi-Yuan Hsiao, Hung-Yi Lee:
Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR And Speech-to-Text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision. ICASSP Workshops 2024: 540-544 - [c223]Hung-Chieh Fang, Nai-Xuan Ye, Yi-Jen Shih, Puyuan Peng, Hsuan-Fu Wang, Layne Berry, Hung-Yi Lee, David Harwath:
Integrating Self-Supervised Speech Model with Pseudo Word-Level Targets from Visually-Grounded Speech Model. ICASSP Workshops 2024: 645-649 - [c222]Tzu-Han Lin, How-Shing Wang, Hao-Yung Weng, Kuang-Chen Peng, Zih-Ching Chen, Hung-Yi Lee:
PEFT for Speech: Unveiling Optimal Placement, Merging Strategies, and Ensemble Techniques. ICASSP Workshops 2024: 705-709 - [c221]Haibin Wu, Heng-Cheng Kuo, Yu Tsao, Hung-Yi Lee:
Scalable Ensemble-Based Detection Method Against Adversarial Attacks For Speaker Verification. ICASSP 2024: 4670-4674 - [c220]Yuan Tseng, Layne Berry, Yiting Chen, I-Hsiang Chiu, Hsuan-Hao Lin, Max Liu, Puyuan Peng, Yi-Jen Shih, Hung-Yu Wang, Haibin Wu, Poyao Huang, Chun-Mao Lai, Shang-Wen Li, David Harwath, Yu Tsao, Abdelrahman Mohamed, Chi-Luen Feng, Hung-Yi Lee:
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models. ICASSP 2024: 6890-6894 - [c219]Xuanjun Chen, Haibin Wu, Chung-Che Wang, Hung-Yi Lee, Jyh-Shing Roger Jang:
Multimodal Transformer Distillation for Audio-Visual Synchronization. ICASSP 2024: 7755-7759 - [c218]Kuan-Po Huang, Chih-Kai Yang, Yu-Kuan Fu, Ewan Dunbar, Hung-Yi Lee:
Zero Resource Code-Switched Speech Benchmark Using Speech Utterance Pairs for Multiple Spoken Languages. ICASSP 2024: 10006-10010 - [c217]Guan-Ting Lin, Prashanth Gurunath Shivakumar, Ankur Gandhe, Chao-Han Huck Yang, Yile Gu, Shalini Ghosh, Andreas Stolcke, Hung-Yi Lee, Ivan Bulyko:
Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue. ICASSP 2024: 10316-10320 - [c216]Chien-Yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan S. Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, Hung-Yi Lee:
Dynamic-Superb: Towards a Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark For Speech. ICASSP 2024: 12136-12140 - [c215]Chyi-Jiunn Lin, Guan-Ting Lin, Yung-Sung Chuang, Wei-Lun Wu, Shang-Wen Li, Abdelrahman Mohamed, Hung-Yi Lee, Lin-Shan Lee:
SpeechDPR: End-To-End Spoken Passage Retrieval For Open-Domain Spoken Question Answering. ICASSP 2024: 12476-12480 - [c214]Kevin Everson, Yile Gu, Chao-Han Huck Yang, Prashanth Gurunath Shivakumar, Guan-Ting Lin, Jari Kolehmainen, Ivan Bulyko, Ankur Gandhe, Shalini Ghosh, Wael Hamza, Hung-Yi Lee, Ariya Rastrow, Andreas Stolcke:
Towards ASR Robust Spoken Language Understanding Through in-Context Learning with Word Confusion Networks. ICASSP 2024: 12856-12860 - [c213]Pei-Jun Liao, Hung-Yi Lee, Hsin-Min Wang:
Ensemble Knowledge Distillation from Speech SSL Models Considering Inter-Teacher Differences. ISCSLP 2024: 716-720 - [c212]Shih-Cheng Huang, Shih-Heng Wang, Min-Han Shih, Saurav Sahay, Hung-yi Lee:
Systematic Analysis for Pretrained Language Model Priming for Parameter-Efficient Fine-tuning. NAACL (Student Research Workshop) 2024: 1-7 - [c211]Yun-Yen Chuang, Hung-Min Hsu, Kevin Lin, Chen-Sheng Gu, Ling Zhen Li, Ray-I Chang, Hung-yi Lee:
Meta-DiffuB: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration. NeurIPS 2024 - [c210]Liang-Hsuan Tseng, En-Pei Hu, Cheng-Han Chiang, Yuan Tseng, Hung-yi Lee, Lin-Shan Lee, Shao-Hua Sun:
REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR. NeurIPS 2024 - [c209]Cheng-Kuang Wu, Zhi Rui Tam, Chieh-Yen Lin, Yun-Nung Chen, Hung-yi Lee:
StreamBench: Towards Benchmarking Continuous Improvement of Language Agents. NeurIPS 2024 - [c208]Chih-Kai Yang, Kuan-Po Huang, Hung-Yi Lee:
Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper. SLT 2024: 1-8 - [c207]Liang-Hsuan Tseng, Zih-Ching Chen, Wei-Shun Chang, Cheng-Kuang Lee, Tsung-Ren Huang, Hung-Yi Lee:
Leave No Knowledge Behind During Knowledge Distillation: Towards Practical and Effective Knowledge Distillation For Code-Switching ASR Using Realistic Data. SLT 2024: 118-125 - [c206]Shih-Heng Wang, Jiatong Shi, Chien-Yu Huang, Shinji Watanabe, Hung-Yi Lee:
Fusion Of Discrete Representations and Self-Augmented Representations for Multilingual Automatic Speech Recognition. SLT 2024: 247-254 - [c205]Tzu-Quan Lin, Guan-Ting Lin, Hung-Yi Lee, Hao Tang:
Property Neurons in Self-Supervised Speech Transformers. SLT 2024: 401-408 - [c204]Yi-Cheng Lin, Tzu-Quan Lin, Chih-Kai Yang, Ke-Han Lu, Wei-Chih Chen, Chun-Yi Kuan, Hung-Yi Lee:
Listen and Speak Fairly: a Study on Semantic Gender Bias in Speech Integrated Large Language Models. SLT 2024: 439-446 - [c203]Huang-Cheng Chou, Haibin Wu, Lucas Goncalves, Seong-Gyun Leem, Ali Salman, Carlos Busso, Hung-Yi Lee, Chi-Chun Lee:
Embracing Ambiguity And Subjectivity Using The All-Inclusive Aggregation Rule For Evaluating Multi-Label Speech Emotion Recognition Systems. SLT 2024: 502-509 - [c202]Haibin Wu, Huang-Cheng Chou, Kai-Wei Chang, Lucas Goncalves, Jiawei Du, Jyh-Shing Roger Jang, Chi-Chun Lee, Hung-Yi Lee:
Open-Emotion: A Reproducible EMO-Superb For Speech Emotion Recognition Systems. SLT 2024: 510-517 - [c201]Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kaiwei Chang, Jiawei Du, Ke-Han Lu, Alexander H. Liu, Ho-Lam Chung, Yuan-Kuei Wu, Dongchao Yang, Songxiang Liu, Yi-Chiao Wu, Xu Tan, James R. Glass, Shinji Watanabe, Hung-Yi Lee:
Codec-Superb @ SLT 2024: A Lightweight Benchmark For Neural Audio Codec Models. SLT 2024: 570-577 - [c200]Sung-Feng Huang, Heng-Cheng Kuo, Zhehuai Chen, Xuesong Yang, Chao-Han Huck Yang, Yu Tsao, Yu-Chiang Frank Wang, Hung-Yi Lee, Szu-Wei Fu:
Detecting the Undetectable: Assessing the Efficacy of Current Spoof Detection Methods Against Seamless Speech Edits. SLT 2024: 652-659 - [c199]Yi-Cheng Lin, Wei-Chih Chen, Hung-Yi Lee:
Spoken Stereoset: on Evaluating Social Bias Toward Speaker in Speech Large Language Models. SLT 2024: 871-878 - [c198]Jiawei Du, I-Ming Lin, I-Hsiang Chiu, Xuanjun Chen, Haibin Wu, Wenze Ren, Yu Tsao, Hung-Yi Lee, Jyh-Shing Roger Jang:
DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset. SLT 2024: 921-928 - [c197]Andy T. Liu, Yi-Cheng Lin, Haibin Wu, Stefan Winkler, Hung-Yi Lee:
Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget. SLT 2024: 961-968 - [c196]Chun-Yi Kuan, Chih-Kai Yang, Wei-Ping Huang, Ke-Han Lu, Hung-Yi Lee:
Speech-Copilot: Leveraging Large Language Models for Speech Processing Via Task Decomposition, Modularization, and Program Generation. SLT 2024: 1060-1067 - [i280]Chih-Kai Yang, Kuan-Po Huang, Ke-Han Lu, Chun-Yi Kuan, Chi-Yuan Hsiao, Hung-yi Lee:
Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision. CoRR abs/2401.00273 (2024) - [i279]Tzu-Han Lin, How-Shing Wang, Hao-Yung Weng, Kuang-Chen Peng, Zih-Ching Chen, Hung-yi Lee:
PEFT for Speech: Unveiling Optimal Placement, Merging Strategies, and Ensemble Techniques. CoRR abs/2401.02122 (2024) - [i278]Kevin Everson, Yile Gu, Chao-Han Huck Yang, Prashanth Gurunath Shivakumar, Guan-Ting Lin, Jari Kolehmainen, Ivan Bulyko, Ankur Gandhe, Shalini Ghosh, Wael Hamza, Hung-yi Lee, Ariya Rastrow, Andreas Stolcke:
Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks. CoRR abs/2401.02921 (2024) - [i277]Chen-An Li, Hung-Yi Lee:
Examining Forgetting in Continual Pre-training of Aligned Large Language Models. CoRR abs/2401.03129 (2024) - [i276]David Cheng-Han Chiang, Hung-Yi Lee:
Over-Reasoning and Redundant Calculation of Large Language Models. CoRR abs/2401.11467 (2024) - [i275]Chyi-Jiunn Lin, Guan-Ting Lin, Yung-Sung Chuang, Wei-Lun Wu, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Lin-Shan Lee:
SpeechDPR: End-to-End Spoken Passage Retrieval for Open-Domain Spoken Question Answering. CoRR abs/2401.13463 (2024) - [i274]Wei-Ping Huang, Sung-Feng Huang, Hung-yi Lee:
Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization. CoRR abs/2402.01692 (2024) - [i273]Liang-Hsuan Tseng, En-Pei Hu, David Cheng-Han Chiang, Yuan Tseng, Hung-yi Lee, Lin-Shan Lee, Shao-Hua Sun:
REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR. CoRR abs/2402.03988 (2024) - [i272]David Cheng-Han Chiang, Hung-yi Lee:
Merging Facts, Crafting Fallacies: Evaluating the Contradictory Nature of Aggregated Factual Claims in Long-Form Generations. CoRR abs/2402.05629 (2024) - [i271]Hung-Chieh Fang, Nai-Xuan Ye, Yi-Jen Shih, Puyuan Peng, Hsuan-Fu Wang, Layne Berry, Hung-yi Lee, David Harwath:
Integrating Self-supervised Speech Model with Pseudo Word-level Targets from Visually-grounded Speech Model. CoRR abs/2402.05819 (2024) - [i270]Hsuan-Fu Wang, Yi-Jen Shih, Heng-Jui Chang, Layne Berry, Puyuan Peng, Hung-yi Lee, Hsin-Min Wang, David Harwath:
SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data. CoRR abs/2402.06959 (2024) - [i269]Guan-Ting Lin, David Cheng-Han Chiang, Hung-yi Lee:
Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations. CoRR abs/2402.12786 (2024) - [i268]Haibin Wu, Huang-Cheng Chou, Kai-Wei Chang, Lucas Goncalves, Jiawei Du
, Jyh-Shing Roger Jang, Chi-Chun Lee, Hung-Yi Lee:
EMO-SUPERB: An In-depth Look at Speech Emotion Recognition. CoRR abs/2402.13018 (2024) - [i267]Haibin Wu, Ho-Lam Chung, Yi-Cheng Lin, Yuan-Kuei Wu, Xuanjun Chen, Yu-Chi Pai, Hsiu-Hsuan Wang, Kai-Wei Chang, Alexander H. Liu, Hung-yi Lee:
Codec-SUPERB: An In-Depth Analysis of Sound Codec Models. CoRR abs/2402.13071 (2024) - [i266]Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kai-Wei Chang, Ho-Lam Chung, Alexander H. Liu, Hung-yi Lee:
Towards audio language modeling - an overview. CoRR abs/2402.13236 (2024) - [i265]Shu-Wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe
, Hung-yi Lee:
A Large-Scale Evaluation of Speech Foundation Models. CoRR abs/2404.09385 (2024) - [i264]Li-Chun Lu, Shou-Jen Chen, Tsung-Min Pai, Chan-Hung Yu, Hung-yi Lee, Shao-Hua Sun:
LLM Discussion: Enhancing the Creativity of Large Language Models via Discussion Framework and Role-Play. CoRR abs/2405.06373 (2024) - [i263]Kuang-Ming Chen, Hung-yi Lee:
InstructionCP: A fast approach to transfer Large Language Models into target language. CoRR abs/2405.20175 (2024) - [i262]Hsuan Su, Hua Farn, Fan-Yun Sun, Shang-Tse Chen, Hung-yi Lee:
Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition. CoRR abs/2406.02925 (2024) - [i261]Fabian Ritter Gutierrez, Kuan-Po Huang, Jeremy H. M. Wong, Dianwen Ng, Hung-yi Lee, Nancy F. Chen, Eng Siong Chng:
Dataset-Distillation Generative Model for Speech Emotion Recognition. CoRR abs/2406.02963 (2024) - [i260]Xuanjun Chen, Jiawei Du, Haibin Wu, Jyh-Shing Roger Jang, Hung-yi Lee:
Neural Codec-based Adversarial Sample Detection for Speaker Verification. CoRR abs/2406.04582 (2024) - [i259]Yi-Cheng Lin, Tzu-Quan Lin, Hsi-Che Lin, Andy T. Liu, Hung-yi Lee:
On the social bias of speech self-supervised models. CoRR abs/2406.04997 (2024) - [i258]Tzu-Quan Lin, Hung-yi Lee, Hao Tang:
DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models. CoRR abs/2406.05464 (2024) - [i257]Chih-Kai Yang, Kuan-Po Huang, Hung-yi Lee:
Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper. CoRR abs/2406.05806 (2024) - [i256]Haibin Wu, Yuan Tseng, Hung-yi Lee:
CodecFake: Enhancing Anti-Spoofing Models Against Deepfake Audios from Codec-Based Speech Synthesis Systems. CoRR abs/2406.07237 (2024) - [i255]Chun-Yi Kuan, Wei-Ping Huang, Hung-yi Lee:
Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models. CoRR abs/2406.08402 (2024) - [i254]Jiatong Shi, Shih-Heng Wang, William Chen, Martijn Bartelds, Vanya Bannihatti Kumar, Jinchuan Tian, Xuankai Chang, Dan Jurafsky, Karen Livescu, Hung-yi Lee, Shinji Watanabe:
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets. CoRR abs/2406.08641 (2024) - [i253]Cheng-Kuang Wu, Zhi Rui Tam, Chieh-Yen Lin, Yun-Nung Chen, Hung-yi Lee:
StreamBench: Towards Benchmarking Continuous Improvement of Language Agents. CoRR abs/2406.08747 (2024) - [i252]Siddhant Arora, Ankita Pasad, Chung-Ming Chien, Jionghao Han, Roshan S. Sharma, Jee-weon Jung, Hira Dhamyal, William Chen, Suwon Shon, Hung-yi Lee, Karen Livescu, Shinji Watanabe:
On the Evaluation of Speech Foundation Models for Spoken Language Understanding. CoRR abs/2406.10083 (2024) - [i251]Hung-Ting Su, Chun-Tong Chao, Ya-Ching Hsu, Xudong Lin, Yulei Niu, Hung-Yi Lee, Winston H. Hsu:
Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies. CoRR abs/2406.10923 (2024) - [i250]Guan-Ting Lin, Wei-Ping Huang, Hung-yi Lee:
Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech. CoRR abs/2406.11064 (2024) - [i249]Guan-Ting Lin, Hung-yi Lee:
Can LLMs Understand the Implication of Emphasized Sentences in Dialogue? CoRR abs/2406.11065 (2024) - [i248]Ke-Han Lu, Zhehuai Chen, Szu-Wei Fu, He Huang, Boris Ginsburg, Yu-Chiang Frank Wang, Hung-yi Lee:
DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment. CoRR abs/2406.18871 (2024) - [i247]Tzu-Han Lin, Chen-An Li, Hung-yi Lee, Yun-Nung Chen:
DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging. CoRR abs/2407.01470 (2024) - [i246]Yu-Kuan Fu, Cheng-Kuang Lee, Hsiu-Hsuan Wang, Hung-yi Lee:
Investigating the Effects of Large-Scale Pseudo-Stereo Data and Different Speech Foundation Model on Dialogue Generative Spoken Language Model. CoRR abs/2407.01911 (2024) - [i245]Cheng-Han Chiang, Wei-Chih Chen, Chun-Yi Kuan, Chienchou Yang, Hung-yi Lee:
Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course. CoRR abs/2407.05216 (2024) - [i244]Yi-Cheng Lin, Tzu-Quan Lin, Chih-Kai Yang, Ke-Han Lu, Wei-Chih Chen, Chun-Yi Kuan, Hung-yi Lee:
Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models. CoRR abs/2407.06957 (2024) - [i243]Chun-Yi Kuan, Chih-Kai Yang, Wei-Ping Huang, Ke-Han Lu, Hung-yi Lee:
Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation. CoRR abs/2407.09886 (2024) - [i242]Liang-Hsuan Tseng, Zih-Ching Chen, Wei-Shun Chang, Cheng-Kuang Lee, Tsung-Ren Huang, Hung-yi Lee:
Leave No Knowledge Behind During Knowledge Distillation: Towards Practical and Effective Knowledge Distillation for Code-Switching ASR Using Realistic Data. CoRR abs/2407.10603 (2024) - [i241]Cheng-Kuang Wu, Zhi Rui Tam, Chao-Chung Wu, Chieh-Yen Lin, Hung-yi Lee, Yun-Nung Chen:
I Need Help! Evaluating LLM's Ability to Ask for Users' Support: A Case Study on Text-to-SQL Generation. CoRR abs/2407.14767 (2024) - [i240]Wenze Ren, Yi-Cheng Lin, Huang-Cheng Chou, Haibin Wu, Yi-Chiao Wu, Chi-Chun Lee, Hung-yi Lee, Yu Tsao:
EMO-Codec: An In-Depth Look at Emotion Preservation capacity of Legacy and Neural Codec Models With Subjective and Objective Evaluations. CoRR abs/2407.15458 (2024) - [i239]Zhi Rui Tam, Cheng-Kuang Wu, Yi-Lin Tsai, Chieh-Yen Lin, Hung-yi Lee, Yun-Nung Chen:
Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models. CoRR abs/2408.02442 (2024) - [i238]Shachi H. Kumar, Saurav Sahay, Sahisnu Mazumder, Eda Okur, Ramesh Manuvinakurike, Nicole Beckage, Hsuan Su, Hung-yi Lee, Lama Nachman:
Decoding Biases: Automated Methods and LLM Judges for Gender Bias Detection in Language Models. CoRR abs/2408.03907 (2024) - [i237]Yi-Cheng Lin, Wei-Chih Chen, Hung-yi Lee:
Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models. CoRR abs/2408.07665 (2024) - [i236]Kai-Wei Chang, Haibin Wu, Yu-Kai Wang, Yuan-Kuei Wu, Hua Shen, Wei-Cheng Tseng, Iu-thing Kang, Shang-Wen Li, Hung-yi Lee:
SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks. CoRR abs/2408.13040 (2024) - [i235]Chien-yu Huang, Min-Han Shih, Ke-Han Lu, Chi-Yuan Hsiao, Hung-yi Lee:
SpeechCaps: Advancing Instruction-Based Universal Speech Models with Multi-Talker Speaking Style Captioning. CoRR abs/2408.13891 (2024) - [i234]Juncheng Xie, Shensian Syu, Hung-yi Lee:
Non-instructional Fine-tuning: Enabling Instruction-Following Capabilities in Pre-trained Language Models without Instruction-Following Data. CoRR abs/2409.00096 (2024) - [i233]Tzu-Quan Lin, Guan-Ting Lin, Hung-yi Lee, Hao Tang:
Property Neurons in Self-Supervised Speech Transformers. CoRR abs/2409.05910 (2024) - [i232]Jiawei Du
, I-Ming Lin, I-Hsiang Chiu
, Xuanjun Chen, Haibin Wu, Wenze Ren, Yu Tsao, Hung-yi Lee, Jyh-Shing Roger Jang:
DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset. CoRR abs/2409.08731 (2024) - [i231]Ming-Hao Hsu, Kuan Po Huang, Hung-yi Lee:
Meta-Whisper: Speech-Based Meta-ICL for ASR on Low-Resource Languages. CoRR abs/2409.10429 (2024) - [i230]Hsi-Che Lin, Yi-Cheng Lin, Huang-Cheng Chou, Hung-yi Lee:
Improving Speech Emotion Recognition in Under-Resourced Languages via Speech-to-Speech Translation with Bootstrapping Data Selection. CoRR abs/2409.10985 (2024) - [i229]Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kai-Wei Chang, Jiawei Du
, Ke-Han Lu, Alexander H. Liu, Ho-Lam Chung, Yuan-Kuei Wu, Dongchao Yang, Songxiang Liu, Yi-Chiao Wu, Xu Tan, James R. Glass, Shinji Watanabe, Hung-yi Lee:
Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models. CoRR abs/2409.14085 (2024) - [i228]Hung-Ting Su, Ya-Ching Hsu, Xudong Lin, Xiang Qian Shi, Yulei Niu, Han-Yuan Hsu, Hung-yi Lee, Winston H. Hsu:
Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses. CoRR abs/2409.14324 (2024) - [i227]Andy T. Liu, Yi-Cheng Lin, Haibin Wu, Stefan Winkler, Hung-yi Lee:
Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget. CoRR abs/2409.16295 (2024) - [i226]Ke-Han Lu, Zhehuai Chen, Szu-Wei Fu, Chao-Han Huck Yang, Jagadeesh Balam, Boris Ginsburg, Yu-Chiang Frank Wang, Hung-yi Lee:
Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data. CoRR abs/2409.20007 (2024) - [i225]Yun-Yen Chuang, Hung-Min Hsu, Kevin Lin, Chen-Sheng Gu, Ling Zhen Li, Ray-I Chang, Hung-yi Lee:
Meta-DiffuB: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration. CoRR abs/2410.13201 (2024) - [i224]Chun-Yi Kuan, Hung-yi Lee:
Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning. CoRR abs/2410.16130 (2024) - [i223]Guan-Ting Lin, Prashanth Gurunath Shivakumar, Aditya Gourav, Yile Gu, Ankur Gandhe, Hung-yi Lee, Ivan Bulyko:
Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback. CoRR abs/2411.01834 (2024) - [i222]Chien-yu Huang, Wei-Chih Chen, Shu-Wen Yang, Andy T. Liu, Chen-An Li, Yu-Xiang Lin, Wei-Cheng Tseng, Anuj Diwan, Yi-Jen Shih, Jiatong Shi, William Chen, Xuanjun Chen, Chi-Yuan Hsiao, Puyuan Peng, Shih-Heng Wang, Chun-Yi Kuan, Ke-Han Lu, Kai-Wei Chang, Chih-Kai Yang, Fabian Ritter Gutierrez, Ming To Chuang, Kuan-Po Huang, Siddhant Arora, You-Kuan Lin, Eunjung Yeo, Kalvin Chang, Chung-Ming Chien, Kwanghee Choi, Cheng-Hsiu Hsieh, Yi-Cheng Lin, Chee-En Yu, I-Hsiang Chiu, Heitor R. Guimarães, Jionghao Han, Tzu-Quan Lin, Tzu-Yuan Lin, Homu Chang, Ting-Wu Chang, Chun Wei Chen, Shou-Jen Chen, Yu-Hua Chen, Hsi-Chun Cheng, Kunal Dhawan, Jia-Lin Fang, Shi-Xin Fang, Kuan-Yu Fang Chiang, Chi An Fu, Hsien-Fu Hsiao, Ching Yu Hsu, Shao-Syuan Huang, Lee Chen Wei, Hsi-Che Lin, Hsuan-Hao Lin, Hsuan-Ting Lin, Jian-Ren Lin, Ting-Chun Liu, Li-Chun Lu, Tsung-Min Pai, Ankita Pasad, Shih-Yun Shan Kuan, Suwon Shon, Yuxun Tang, Yun-Shao Tsai, Jui-Chiang Wei, Tzu-Chieh Wei, Chengxi Wu, Dien-Ruei Wu, Chao-Han Huck Yang, Chieh-Chi Yang, Jia Qi Yip, Shao-Xiang Yuan, Vahid Noroozi, Zhehuai Chen, Haibin Wu, Karen Livescu, David Harwath, Shinji Watanabe, Hung-yi Lee:
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks. CoRR abs/2411.05361 (2024) - [i221]Chih-Kai Yang, Yu-Kuan Fu, Chen-An Li, Yi-Cheng Lin, Yu-Xiang Lin, Wei-Chih Chen, Ho-Lam Chung, Chun-Yi Kuan, Wei-Ping Huang, Ke-Han Lu, Tzu-Quan Lin, Hsiu-Hsuan Wang, En-Pei Hu, Chan-Jan Hsu, Liang-Hsuan Tseng, I-Hsiang Chiu, Ulin Sanga, Xuanjun Chen, Po-Chun Hsu, Shu-Wen Yang, Hung-yi Lee:
Building a Taiwanese Mandarin Spoken Language Model: A First Attempt. CoRR abs/2411.07111 (2024) - [i220]Shih-Heng Wang, Jiatong Shi, Chien-yu Huang, Shinji Watanabe, Hung-yi Lee:
Fusion of Discrete Representations and Self-Augmented Representations for Multilingual Automatic Speech Recognition. CoRR abs/2411.18107 (2024) - [i219]Shih-Heng Wang, Zih-Ching Chen, Jiatong Shi, Ming-To Chuang, Guan-Ting Lin, Kuan-Po Huang, David Harwath, Shang-Wen Li, Hung-yi Lee:
How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario. CoRR abs/2411.18217 (2024) - [i218]Shao-Syuan Huang, Kuan-Po Huang, Andy T. Liu, Hung-yi Lee:
Enhancing Multilingual ASR for Unseen Languages via Language Embedding Modeling. CoRR abs/2412.16474 (2024) - [i217]Hua Farn, Hsuan Su, Shachi H. Kumar, Saurav Sahay, Shang-Tse Chen, Hung-yi Lee:
Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging. CoRR abs/2412.19512 (2024)
- 2023
- [j24]Po-Chun Hsu, Da-Rong Liu, Andy T. Liu, Hung-yi Lee:
Parallel Synthesis for Autoregressive Speech Generation. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3095-3111 (2023) - [j23]Yun-Yen Chuang, Hung-Min Hsu, Kevin Lin, Ray-I Chang, Hung-Yi Lee:
MetaEx-GAN: Meta Exploration to Improve Natural Language Generation via Generative Adversarial Networks. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3968-3980 (2023) - [c195]David Cheng-Han Chiang, Hung-yi Lee:
Are Synonym Substitution Attacks Really Synonym Substitution Attacks? ACL (Findings) 2023: 1853-1878 - [c194]Suwon Shon, Siddhant Arora, Chyi-Jiunn Lin, Ankita Pasad, Felix Wu, Roshan S. Sharma, Wei-Lun Wu, Hung-yi Lee, Karen Livescu, Shinji Watanabe:
SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks. ACL (1) 2023: 8906-8937 - [c193]Derek Xu, Shuyan Dong, Changhan Wang, Suyoun Kim, Zhaojiang Lin, Bing Liu, Akshat Shrivastava, Shang-Wen Li, Liang-Hsuan Tseng, Guan-Ting Lin, Alexei Baevski, Hung-yi Lee, Yizhou Sun, Wei Wang:
Introducing Semantics into Speech Encoders. ACL (1) 2023: 11413-11429 - [c192]David Cheng-Han Chiang, Hung-yi Lee:
Can Large Language Models Be an Alternative to Human Evaluations? ACL (1) 2023: 15607-15631 - [c191]Kai-Wei Chang, Ming-Hsin Chen, Yun-Ping Lin, Jing Neng Hsu, Paul Kuo-Ming Huang, Chien-Yu Huang, Shang-Wen Li, Hung-Yi Lee:
Prompting and Adapter Tuning For Self-Supervised Encoder-Decoder Speech Model. ASRU 2023: 1-8 - [c190]Wei-Ping Huang, Sung-Feng Huang, Hung-Yi Lee:
Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization. ASRU 2023: 1-8 - [c189]Chun-Yi Kuan, Chen-An Li, Tsu-Yuan Hsu, Tse-Yang Lin, Ho-Lam Chung, Kai-Wei Chang, Shuo-Yiin Chang, Hung-Yi Lee:
Towards General-Purpose Text-Instruction-Guided Voice Conversion. ASRU 2023: 1-8 - [c188]Tzu-Quan Lin, Hung-Yi Lee, Hao Tang:
MelHuBERT: A Simplified Hubert on Mel Spectrograms. ASRU 2023: 1-8 - [c187]Jiatong Shi, William Chen, Dan Berrebbi, Hsiu-Hsuan Wang, Wei-Ping Huang, En-Pei Hu, Ho-Lam Chuang, Xuankai Chang, Yuxun Tang, Shang-Wen Li, Abdelrahman Mohamed, Hung-Yi Lee, Shinji Watanabe
:
Findings of the 2023 ML-Superb Challenge: Pre-Training And Evaluation Over More Languages And Beyond. ASRU 2023: 1-8 - [c186]Yu-Hsiang Wang, Huang-Yu Chen, Kai-Wei Chang, Winston H. Hsu, Hung-Yi Lee:
Minisuperb: Lightweight Benchmark for Self-Supervised Speech Models. ASRU 2023: 1-8 - [c185]Jun-You Wang, Hung-Yi Lee, Jyh-Shing Roger Jang, Li Su:
Zero-Shot Singing Voice Synthesis from Musical Score. ASRU 2023: 1-8 - [c184]Haibin Wu, Jiawen Kang, Lingwei Meng, Helen Meng, Hung-yi Lee:
The Defender's Perspective on Automatic Speaker Verification: An Overview. DADA@IJCAI 2023: 6-11 - [c183]David Cheng-Han Chiang, Hung-yi Lee:
A Closer Look into Using Large Language Models for Automatic Evaluation. EMNLP (Findings) 2023: 8928-8942 - [c182]Layne Berry, Yi-Jen Shih, Hsuan-Fu Wang, Heng-Jui Chang, Hung-Yi Lee, David Harwath:
M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval. ICASSP 2023: 1-5 - [c181]Hsuan-Jui Chen, Yen Meng, Hung-yi Lee:
Once-for-All Sequence Compression for Self-Supervised Speech Models. ICASSP 2023: 1-5 - [c180]Zih-Ching Chen, Yu-Shun Sung, Hung-Yi Lee:
CHAPTER: Exploiting Convolutional Neural Network Adapters for Self-Supervised Speech Models. ICASSP Workshops 2023: 1-5 - [c179]Dongji Gao, Jiatong Shi, Shun-Po Chuang, Leibny Paola García, Hung-Yi Lee, Shinji Watanabe, Sanjeev Khudanpur:
EURO: ESPnet Unsupervised ASR Open-Source Toolkit. ICASSP 2023: 1-5 - [c178]Chan-Jan Hsu, Ho-Lam Chung, Hung-Yi Lee, Yu Tsao:
T5lephone: Bridging Speech and Text Self-Supervised Models for Spoken Language Understanding Via Phoneme Level T5. ICASSP 2023: 1-5 - [c177]Sung-Feng Huang, Chia-Ping Chen, Zhi-Sheng Chen, Yu-Pao Tsai, Hung-Yi Lee:
Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning. ICASSP 2023: 1-5 - [c176]Kuan-Po Huang, Tzu-hsun Feng, Yu-Kuan Fu, Tsu-Yuan Hsu, Po-Chieh Yen, Wei-Cheng Tseng, Kai-Wei Chang, Hung-Yi Lee:
Ensemble Knowledge Distillation of Self-Supervised Speech Models. ICASSP 2023: 1-5 - [c175]Jiatong Shi, Chan-Jan Hsu, Ho-Lam Chung, Dongji Gao, Paola García, Shinji Watanabe, Ann Lee, Hung-Yi Lee:
Bridging Speech and Textual Pre-Trained Models With Unsupervised ASR. ICASSP 2023: 1-5 - [c174]Yuan Tseng, Cheng-I Jeff Lai, Hung-Yi Lee:
Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences. ICASSP 2023: 1-5 - [c173]Guan-Ting Liu, En-Pei Hu, Pu-Jen Cheng, Hung-Yi Lee, Shao-Hua Sun:
Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs. ICML 2023: 21672-21697 - [c172]Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-yi Lee, Tara N. Sainath:
How to Estimate Model Transferability of Pre-Trained Speech Models? INTERSPEECH 2023: 456-460 - [c171]Jiatong Shi, Dan Berrebbi, William Chen, En-Pei Hu, Wei-Ping Huang, Ho-Lam Chung, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe:
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark. INTERSPEECH 2023: 884-888 - [c170]Guan-Wei Wu, Guan-Ting Lin, Shang-Wen Li, Hung-yi Lee:
Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target. INTERSPEECH 2023: 1503-1507 - [c169]Cheng-Han Chiang, Wei-Ping Huang, Hung-yi Lee:
Why We Should Report the Details in Subjective Evaluation of TTS More Rigorously. INTERSPEECH 2023: 5551-5555 - [c168]David Cheng-Han Chiang, Hung-yi Lee, Yung-Sung Chuang, James R. Glass:
Revealing the Blind Spot of Sentence Encoder Evaluation by HEROS. RepL4NLP@ACL 2023: 289-302 - [e2]C. Maria Keet, Hung-Yi Lee, Sina Zarrieß:
Proceedings of the 16th International Natural Language Generation Conference, INLG 2023, Prague, Czechia, September 11 - 15, 2023. Association for Computational Linguistics 2023, ISBN 979-8-89176-001-1 [contents] - [i216]Guan-Ting Liu, En-Pei Hu, Pu-Jen Cheng, Hung-Yi Lee, Shao-Hua Sun:
Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs. CoRR abs/2301.12950 (2023) - [i215]Hsuan Su, Shachi H. Kumar, Sahisnu Mazumder, Wenda Chen, Ramesh Manuvinakurike, Eda Okur, Saurav Sahay, Lama Nachman, Shang-Tse Chen, Hung-yi Lee:
Position Matters! Empirical Study of Order Effect in Knowledge-grounded Dialogue. CoRR abs/2302.05888 (2023) - [i214]Kuan-Po Huang, Tzu-hsun Feng, Yu-Kuan Fu, Tsu-Yuan Hsu, Po-Chieh Yen, Wei-Cheng Tseng, Kai-Wei Chang, Hung-yi Lee:
Ensemble knowledge distillation of self-supervised speech models. CoRR abs/2302.12757 (2023) - [i213]Kai-Wei Chang, Yu-Kai Wang, Hua Shen, Iu-thing Kang, Wei-Cheng Tseng, Shang-Wen Li, Hung-yi Lee:
SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks. CoRR abs/2303.00733 (2023) - [i212]Yuan Tseng, Cheng-I Lai, Hung-yi Lee:
Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences. CoRR abs/2303.08809 (2023) - [i211]Sung-Feng Huang, Chia-Ping Chen, Zhi-Sheng Chen, Yu-Pao Tsai, Hung-yi Lee:
Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning. CoRR abs/2303.11816 (2023) - [i210]David Cheng-Han Chiang, Hung-yi Lee:
Can Large Language Models Be an Alternative to Human Evaluations? CoRR abs/2305.01937 (2023) - [i209]Yu-Kuan Fu, Liang-Hsuan Tseng, Jiatong Shi, Chen-An Li, Tsu-Yuan Hsu, Shinji Watanabe, Hung-Yi Lee:
Improving Cascaded Unsupervised Speech Translation with Denoising Back-translation. CoRR abs/2305.07455 (2023) - [i208]Jiatong Shi, Dan Berrebbi, William Chen, Ho-Lam Chung, En-Pei Hu, Wei-Ping Huang, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe
:
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark. CoRR abs/2305.10615 (2023) - [i207]Haibin Wu, Jiawen Kang, Lingwei Meng, Helen Meng, Hung-yi Lee:
The defender's perspective on automatic speaker verification: An overview. CoRR abs/2305.12804 (2023) - [i206]Guan-Wei Wu, Guan-Ting Lin, Shang-Wen Li, Hung-yi Lee:
Improving Textless Spoken Language Understanding with Discrete Units as Intermediate Target. CoRR abs/2305.18096 (2023) - [i205]Yu-Hsiang Wang, Huang-Yu Chen, Kai-Wei Chang, Winston H. Hsu, Hung-yi Lee:
MiniSUPERB: Lightweight Benchmark for Self-supervised Speech Models. CoRR abs/2305.19011 (2023) - [i204]Zih-Ching Chen, Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Shuo-Yiin Chang, Rohit Prabhavalkar, Hung-yi Lee, Tara N. Sainath:
How to Estimate Model Transferability of Pre-Trained Speech Models? CoRR abs/2306.01015 (2023) - [i203]Haibin Wu, Kai-Wei Chang, Yuan-Kuei Wu, Hung-yi Lee:
SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts. CoRR abs/2306.02207 (2023) - [i202]David Cheng-Han Chiang, Yung-Sung Chuang, James R. Glass, Hung-yi Lee:
Revealing the Blind Spot of Sentence Encoder Evaluation by HEROS. CoRR abs/2306.05083 (2023) - [i201]Shen-sian Syu, Juncheng Xie, Hung-yi Lee:
Improving Non-autoregressive Translation Quality with Pretrained Language Model, Embedding Distillation and Upsampling Strategy for CTC. CoRR abs/2306.06345 (2023) - [i200]Chien-yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan S. Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, Hung-yi Lee:
Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech. CoRR abs/2309.09510 (2023) - [i199]Yuan Tseng, Layne Berry, Yi-Ting Chen, I-Hsiang Chiu, Hsuan-Hao Lin, Max Liu, Puyuan Peng, Yi-Jen Shih, Hung-Yu Wang, Haibin Wu, Po-Yao Huang, Chun-Mao Lai, Shang-Wen Li, David Harwath, Yu Tsao, Shinji Watanabe, Abdelrahman Mohamed, Chi-Luen Feng, Hung-yi Lee:
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models. CoRR abs/2309.10787 (2023) - [i198]Chun-Yi Kuan, Chen-An Li, Tsu-Yuan Hsu, Tse-Yang Lin, Ho-Lam Chung, Kai-Wei Chang, Shuo-Yiin Chang, Hung-yi Lee:
Towards General-Purpose Text-Instruction-Guided Voice Conversion. CoRR abs/2309.14324 (2023) - [i197]Dennis Y. Menn, Tzu-hsun Feng, Sriram Vishwanath, Hung-yi Lee:
Investigating Human-Identifiable Features Hidden in Adversarial Perturbations. CoRR abs/2309.16878 (2023) - [i196]Po-Chun Hsu, Ali Elkahky, Wei-Ning Hsu, Yossi Adi, Tu Anh Nguyen, Jade Copet, Emmanuel Dupoux, Hung-yi Lee, Abdelrahman Mohamed:
Low-Resource Self-Supervised Learning with SSL-Enhanced TTS. CoRR abs/2309.17020 (2023) - [i195]Kai-Wei Chang, Ming-Hsin Chen, Yun-Ping Lin, Jing Neng Hsu, Paul Kuo-Ming Huang, Chien-Yu Huang, Shang-Wen Li, Hung-yi Lee:
Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech Model. CoRR abs/2310.02971 (2023) - [i194]Kuan-Po Huang, Chih-Kai Yang, Yu-Kuan Fu, Ewan Dunbar, Hung-yi Lee:
Zero Resource Code-switched Speech Benchmark Using Speech Utterance Pairs For Multiple Spoken Languages. CoRR abs/2310.03018 (2023) - [i193]Shih-Cheng Huang, Pin-Zu Li, Yu-Chi Hsu, Kuang-Ming Chen, Yu-Tung Lin, Shih-Kai Hsiao, Richard Tzong-Han Tsai, Hung-yi Lee:
Chat Vector: A Simple Approach to Equip LLMs With New Language Chat Capabilities. CoRR abs/2310.04799 (2023) - [i192]Jiatong Shi, William Chen, Dan Berrebbi, Hsiu-Hsuan Wang, Wei-Ping Huang, En-Pei Hu, Ho-Lam Chung, Xuankai Chang, Yuxun Tang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe:
Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond. CoRR abs/2310.05513 (2023) - [i191]David Cheng-Han Chiang, Hung-yi Lee:
A Closer Look into Automatic Evaluation Using Large Language Models. CoRR abs/2310.05657 (2023) - [i190]Hsuan Su, Cheng-Chu Cheng, Hua Farn, Shachi H. Kumar, Saurav Sahay, Shang-Tse Chen, Hung-yi Lee:
Learning from Red Teaming: Gender Bias Provocation and Mitigation in Large Language Models. CoRR abs/2310.11079 (2023) - [i189]Ming-Hao Hsu, Kai-Wei Chang, Shang-Wen Li, Hung-yi Lee:
An Exploration of In-Context Learning for Speech Language Model. CoRR abs/2310.12477 (2023) - [i188]Hsuan Su, Rebecca Qian, Chinnadhurai Sankar, Shahin Shayandeh, Shang-Tse Chen, Hung-yi Lee, Daniel M. Bikel:
Step by Step to Fairness: Attributing Societal Bias in Task-oriented Dialogue Systems. CoRR abs/2311.06513 (2023) - [i187]Haibin Wu, Heng-Cheng Kuo, Yu Tsao, Hung-yi Lee:
Scalable Ensemble-based Detection Method against Adversarial Attacks for speaker verification. CoRR abs/2312.08622 (2023) - [i186]Min-Han Shih, Ho-Lam Chung, Yu-Chi Pai, Ming-Hao Hsu, Guan-Ting Lin, Shang-Wen Li, Hung-Yi Lee:
GSQA: An End-to-End Model for Generative Spoken Question Answering. CoRR abs/2312.09781 (2023) - [i185]Fabian Ritter Gutierrez, Kuan-Po Huang, Dianwen Ng, Jeremy Heng Meng Wong, Hung-yi Lee, Eng Siong Chng, Nancy F. Chen:
Noise robust distillation of self-supervised speech models via correlation metrics. CoRR abs/2312.12153 (2023) - [i184]Guan-Ting Lin, Prashanth Gurunath Shivakumar, Ankur Gandhe, Chao-Han Huck Yang, Yile Gu, Shalini Ghosh, Andreas Stolcke, Hung-yi Lee, Ivan Bulyko:
Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue. CoRR abs/2312.15316 (2023) - 2022
- [j22]Hung-Yi Lee, Shinji Watanabe, Karen Livescu, Abdelrahman Mohamed, Tara N. Sainath:
Editorial of Special Issue on Self-Supervised Learning for Speech and Audio Processing. IEEE J. Sel. Top. Signal Process. 16(6): 1174-1178 (2022) - [j21]Abdelrahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob D. Havtorn, Joakim Edin, Christian Igel, Katrin Kirchhoff, Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe:
Self-Supervised Speech Representation Learning: A Review. IEEE J. Sel. Top. Signal Process. 16(6): 1179-1210 (2022) - [j20]Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-Yi Lee:
Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning. IEEE ACM Trans. Audio Speech Lang. Process. 30: 202-217 (2022) - [j19]Da-Rong Liu, Po-Chun Hsu, Yi-Chen Chen, Sung-Feng Huang, Shun-Po Chuang, Da-Yi Wu, Hung-yi Lee:
Learning Phone Recognition From Unpaired Audio and Phone Sequences Based on Generative Adversarial Network. IEEE ACM Trans. Audio Speech Lang. Process. 30: 230-243 (2022) - [j18]Sung-Feng Huang, Chyi-Jiunn Lin, Da-Rong Liu, Yi-Chen Chen, Hung-yi Lee:
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1558-1571 (2022) - [j17]Yi-Long Liou, Jui-Yang Hsu, Chen-Sheng Chen, Alexander H. Liu, Hung-Yi Lee, Tsung-Te Liu:
A Fully Integrated 1.7mW Attention-Based Automatic Speech Recognition Processor. IEEE Trans. Circuits Syst. II Express Briefs 69(10): 4178-4182 (2022) - [c167]David Cheng-Han Chiang, Hung-Yi Lee:
On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets. AAAI 2022: 10518-10525 - [c166]Chan-Jan Hsu, Hung-yi Lee, Yu Tsao:
XDBERT: Distilling Visual Information to BERT from Cross-Modal Systems to Improve Language Understanding. ACL (2) 2022: 479-489 - [c165]Hsiang-Sheng Tsai, Heng-Jui Chang, Wen-Chin Huang, Zili Huang, Kushal Lakhotia, Shu-Wen Yang, Shuyan Dong, Andy T. Liu, Cheng-I Lai, Jiatong Shi, Xuankai Chang, Phil Hall, Hsuan-Jui Chen, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee:
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities. ACL (1) 2022: 8479-8492 - [c164]Haibin Wu, Po-Chun Hsu, Ji Gao, Shanshan Zhang, Shen Huang, Jian Kang, Zhiyong Wu, Helen Meng, Hung-Yi Lee:
Adversarial Sample Detection for Speaker Verification by Neural Vocoders. ICASSP 2022: 236-240 - [c163]Haibin Wu, Bo Zheng, Xu Li, Xixin Wu, Hung-Yi Lee, Helen Meng:
Characterizing the Adversarial Vulnerability of Speech self-Supervised Learning. ICASSP 2022: 3164-3168 - [c162]Yen Meng, Yi-Hui Chou, Andy T. Liu, Hung-yi Lee:
Don't Speak Too Fast: The Impact of Data Bias on Self-Supervised Speech Models. ICASSP 2022: 3258-3262 - [c161]Wen-Chin Huang, Shu-Wen Yang, Tomoki Hayashi, Hung-Yi Lee, Shinji Watanabe, Tomoki Toda:
S3PRL-VC: Open-Source Voice Conversion Framework with Self-Supervised Speech Representations. ICASSP 2022: 6552-6556 - [c160]Chien-yu Huang, Kai-Wei Chang, Hung-Yi Lee:
Toward Degradation-Robust Voice Conversion. ICASSP 2022: 6777-6781 - [c159]Heng-Jui Chang, Shu-Wen Yang, Hung-yi Lee:
DistilHuBERT: Speech Representation Learning by Layer-Wise Distillation of Hidden-Unit BERT. ICASSP 2022: 7087-7091 - [c158]Guan-Ting Lin, Chan-Jan Hsu, Da-Rong Liu, Hung-Yi Lee, Yu Tsao:
Analyzing The Robustness of Unsupervised Speech Recognition. ICASSP 2022: 8202-8206 - [c157]Haibin Wu, Heng-Cheng Kuo, Naijun Zheng, Kuo-Hsuan Hung, Hung-Yi Lee, Yu Tsao, Hsin-Min Wang, Helen Meng:
Partially Fake Audio Detection by Self-Attention-Based Fake Span Discovery. ICASSP 2022: 9236-9240 - [c156]Yang Zhang, Zhiqiang Lv, Haibin Wu, Shanshan Zhang, Pengfei Hu, Zhiyong Wu, Hung-yi Lee, Helen Meng:
MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification. INTERSPEECH 2022: 306-310 - [c155]Kuan-Po Huang, Yu-Kuan Fu, Yu Zhang, Hung-yi Lee:
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation. INTERSPEECH 2022: 2193-2197 - [c154]Guan-Ting Lin, Shang-Wen Li, Hung-yi Lee:
Listen, Adapt, Better WER: Source-free Single-utterance Test-time Adaptation for Automatic Speech Recognition. INTERSPEECH 2022: 2198-2202 - [c153]Haibin Wu, Lingwei Meng, Jiawen Kang, Jinchao Li, Xu Li, Xixin Wu, Hung-yi Lee, Helen Meng:
Spoofing-Aware Speaker Verification by Multi-Level Fusion. INTERSPEECH 2022: 4357-4361 - [c152]Wei-Cheng Tseng, Wei-Tsung Kao, Hung-yi Lee:
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores. INTERSPEECH 2022: 4541-4545 - [c151]Wei-Ping Huang, Po-Chun Chen, Sung-Feng Huang, Hung-yi Lee:
Few Shot Cross-Lingual TTS Using Transferable Phoneme Embedding. INTERSPEECH 2022: 4566-4570 - [c150]Kai-Wei Chang, Wei-Cheng Tseng, Shang-Wen Li, Hung-yi Lee:
An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks. INTERSPEECH 2022: 5005-5009 - [c149]Wei-Cheng Tseng, Wei-Tsung Kao, Hung-yi Lee:
Membership Inference Attacks Against Self-supervised Speech Models. INTERSPEECH 2022: 5040-5044 - [c148]Guan-Ting Lin, Yung-Sung Chuang, Ho-Lam Chung, Shu-Wen Yang, Hsuan-Jui Chen, Shuyan Annie Dong, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Lin-Shan Lee:
DUAL: Discrete Spoken Unit Adaptive Learning for Textless Spoken Question Answering. INTERSPEECH 2022: 5165-5169 - [c147]Chih-Chiang Chang, Hung-yi Lee:
Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation. INTERSPEECH 2022: 5175-5179 - [c146]Chih-Chiang Chang, Shun-Po Chuang, Hung-yi Lee:
Anticipation-Free Training for Simultaneous Machine Translation. IWSLT@ACL 2022: 43-61 - [c145]Hung-yi Lee, Shang-Wen Li, Thang Vu:
Meta Learning for Natural Language Processing: A Survey. NAACL-HLT 2022: 666-684 - [c144]Chin-Lun Fu, Zih-Ching Chen, Yun-Ru Lee, Hung-yi Lee:
AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks. NAACL-HLT (Findings) 2022: 2608-2621 - [c143]Haibin Wu, Jiawen Kang, Lingwei Meng, Yang Zhang, Xixin Wu, Zhiyong Wu, Hung-yi Lee, Helen Meng:
Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion. Odyssey 2022: 92-99 - [c142]Wei-Tsung Kao, Yuan-Kuei Wu, Chia-Ping Chen, Zhi-Sheng Chen, Yu-Pao Tsai, Hung-Yi Lee:
On the Efficiency of Integrating Self-Supervised Learning and Meta-Learning for User-Defined Few-Shot Keyword Spotting. SLT 2022: 414-421 - [c141]Xuanjun Chen, Haibin Wu, Helen Meng, Hung-yi Lee, Jyh-Shing Roger Jang:
Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection. SLT 2022: 692-699 - [c140]Yi-Jen Shih, Hsuan-Fu Wang, Heng-Jui Chang, Layne Berry, Hung-yi Lee, David Harwath:
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model. SLT 2022: 715-722 - [c139]Tzu-hsun Feng, Shuyan Annie Dong, Ching-Feng Yeh, Shu-Wen Yang, Tzu-Quan Lin, Jiatong Shi, Kai-Wei Chang, Zili Huang, Haibin Wu, Xuankai Chang, Shinji Watanabe, Abdelrahman Mohamed, Shang-Wen Li, Hung-yi Lee:
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning. SLT 2022: 1096-1103 - [c138]Guan-Ting Lin, Chi-Luen Feng, Wei-Ping Huang, Yuan Tseng, Tzu-Han Lin, Chen-An Li, Hung-yi Lee, Nigel G. Ward:
On the Utility of Self-Supervised Models for Prosody-Related Tasks. SLT 2022: 1104-1111 - [c137]Kuan-Po Huang, Yu-Kuan Fu, Tsu-Yuan Hsu, Fabian Ritter Gutierrez, Fan-Lin Wang, Liang-Hsuan Tseng, Yu Zhang, Hung-yi Lee:
Improving Generalizability of Distilled Self-Supervised Speech Processing Models Under Distorted Settings. SLT 2022: 1112-1119 - [c136]Zih-Ching Chen, Chin-Lun Fu, Chih-Ying Liu, Shang-Wen (Daniel) Li, Hung-yi Lee:
Exploring Efficient-Tuning Methods in Self-Supervised Speech Models. SLT 2022: 1120-1127 - [c135]Yen Meng, Hsuan-Jui Chen, Jiatong Shi, Shinji Watanabe, Paola García, Hung-yi Lee, Hao Tang:
On Compressing Sequences for Self-Supervised Speech Models. SLT 2022: 1128-1135 - [e1]Kong Aik Lee, Hung-yi Lee, Yanfeng Lu, Minghui Dong:
13th International Symposium on Chinese Spoken Language Processing, ISCSLP 2022, Singapore, December 11-14, 2022. IEEE 2022, ISBN 979-8-3503-9796-3 [contents] - [i183]Chih-Chiang Chang, Shun-Po Chuang, Hung-yi Lee:
Anticipation-free Training for Simultaneous Translation. CoRR abs/2201.12868 (2022) - [i182]Haibin Wu, Heng-Cheng Kuo, Naijun Zheng, Kuo-Hsuan Hung, Hung-Yi Lee, Yu Tsao, Hsin-Min Wang, Helen Meng:
Partially Fake Audio Detection by Self-attention-based Fake Span Discovery. CoRR abs/2202.06684 (2022) - [i181]Guan-Ting Lin, Yung-Sung Chuang, Ho-Lam Chung, Shu-Wen Yang, Hsuan-Jui Chen, Shuyan Dong, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Lin-Shan Lee:
DUAL: Discrete Spoken Unit Adaptive Learning for Textless Spoken Question Answering. CoRR abs/2203.04911 (2022) - [i180]Kuan-Po Huang, Yuan-Kuei Wu, Hung-yi Lee:
Improving the transferability of speech separation by meta-learning. CoRR abs/2203.05882 (2022) - [i179]Hsiang-Sheng Tsai, Heng-Jui Chang, Wen-Chin Huang, Zili Huang, Kushal Lakhotia, Shu-Wen Yang, Shuyan Dong, Andy T. Liu, Cheng-I Jeff Lai, Jiatong Shi, Xuankai Chang, Phil Hall, Hsuan-Jui Chen, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee:
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities. CoRR abs/2203.06849 (2022) - [i178]Guan-Ting Lin, Shang-Wen Li, Hung-yi Lee:
Listen, Adapt, Better WER: Source-free Single-utterance Test-time Adaptation for Automatic Speech Recognition. CoRR abs/2203.14222 (2022) - [i177]Yang Zhang, Zhiqiang Lv, Haibin Wu, Shanshan Zhang, Pengfei Hu, Zhiyong Wu, Hung-Yi Lee, Helen Meng:
MFA-Conformer: Multi-scale Feature Aggregation Conformer for Automatic Speaker Verification. CoRR abs/2203.15249 (2022) - [i176]Haibin Wu, Lingwei Meng, Jiawen Kang, Jinchao Li, Xu Li, Xixin Wu, Hung-yi Lee, Helen Meng:
Spoofing-Aware Speaker Verification by Multi-Level Fusion. CoRR abs/2203.15377 (2022) - [i175]Kuan-Po Huang, Yu-Kuan Fu, Yu Zhang, Hung-yi Lee:
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation. CoRR abs/2203.16104 (2022) - [i174]Kai-Wei Chang, Wei-Cheng Tseng, Shang-Wen Li, Hung-yi Lee:
An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks. CoRR abs/2203.16773 (2022) - [i173]Fan-Lin Wang, Po-Chun Hsu, Da-Rong Liu, Hung-yi Lee:
Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis. CoRR abs/2204.00170 (2022) - [i172]Wei-Tsung Kao, Yuen-Kwei Wu, Chia-Ping Chen, Zhi-Sheng Chen, Yu-Pao Tsai, Hung-Yi Lee:
On the Efficiency of Integrating Self-supervised Learning and Meta-learning for User-defined Few-shot Keyword Spotting. CoRR abs/2204.00352 (2022) - [i171]Wei-Cheng Tseng, Wei-Tsung Kao, Hung-yi Lee:
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores. CoRR abs/2204.03219 (2022) - [i170]David Cheng-Han Chiang, Hung-Yi Lee:
Understanding, Detecting, and Separating Out-of-Distribution Samples and Adversarial Samples in Text Classification. CoRR abs/2204.04458 (2022) - [i169]David Cheng-Han Chiang, Hung-Yi Lee:
Re-Examining Human Annotations for Interpretable NLP. CoRR abs/2204.04580 (2022) - [i168]Chan-Jan Hsu, Hung-yi Lee, Yu Tsao:
XDBERT: Distilling Visual Information to BERT from Cross-Modal Systems to Improve Language Understanding. CoRR abs/2204.07316 (2022) - [i167]Chih-Chiang Chang, Hung-yi Lee:
Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation. CoRR abs/2204.09595 (2022) - [i166]Po-Chun Hsu, Da-Rong Liu, Andy T. Liu, Hung-yi Lee:
Parallel Synthesis for Autoregressive Speech Generation. CoRR abs/2204.11806 (2022) - [i165]Chin-Lun Fu, Zih-Ching Chen, Yun-Ru Lee, Hung-yi Lee:
AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks. CoRR abs/2205.00305 (2022) - [i164]Hung-yi Lee, Shang-Wen Li, Ngoc Thang Vu:
Meta Learning for Natural Language Processing: A Survey. CoRR abs/2205.01500 (2022) - [i163]Chi-Luen Feng, Po-Chun Hsu, Hung-yi Lee:
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information. CoRR abs/2205.03759 (2022) - [i162]Abdelrahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob D. Havtorn, Joakim Edin, Christian Igel, Katrin Kirchhoff, Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe:
Self-Supervised Speech Representation Learning: A Review. CoRR abs/2205.10643 (2022) - [i161]Chi-Liang Liu, Hung-yi Lee, Wen-tau Yih:
Structured Prompt Tuning. CoRR abs/2205.12309 (2022) - [i160]Dennis Y. Menn, Hung-yi Lee:
Searching for the Essence of Adversarial Perturbations. CoRR abs/2205.15357 (2022) - [i159]Hsuan Su, Po-Han Chi, Shih-Cheng Huang, Ho-Lam Chung, Saurav Sahay, Shang-Tse Chen, Hung-Yi Lee:
Few-shot Prompting Towards Controllable Response Generation. CoRR abs/2206.03931 (2022) - [i158]Haibin Wu, Jiawen Kang, Lingwei Meng, Yang Zhang, Xixin Wu, Zhiyong Wu, Hung-yi Lee, Helen Meng:
Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion. CoRR abs/2206.09131 (2022) - [i157]Wei-Ping Huang, Po-Chun Chen, Sung-Feng Huang, Hung-yi Lee:
Few-Shot Cross-Lingual TTS Using Transferable Phoneme Embedding. CoRR abs/2206.15427 (2022) - [i156]Da-Rong Liu, Po-Chun Hsu, Yi-Chen Chen, Sung-Feng Huang, Shun-Po Chuang, Da-Yi Wu, Hung-yi Lee:
Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network. CoRR abs/2207.14568 (2022) - [i155]Tung-Yu Wu, Chen-An Li, Tzu-Han Lin, Tsu-Yuan Hsu, Hung-Yi Lee:
The Ability of Self-Supervised Speech Models for Audio Representations. CoRR abs/2209.12900 (2022) - [i154]Yi-Jen Shih, Hsuan-Fu Wang, Heng-Jui Chang, Layne Berry, Hung-yi Lee, David Harwath:
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model. CoRR abs/2210.00705 (2022) - [i153]Xuanjun Chen, Haibin Wu, Helen Meng, Hung-yi Lee, Jyh-Shing Roger Jang:
Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection. CoRR abs/2210.00753 (2022) - [i152]David Cheng-Han Chiang, Hung-yi Lee:
How Far Are We from Real Synonym Substitution Attacks? CoRR abs/2210.02844 (2022) - [i151]Zih-Ching Chen, Chin-Lun Fu, Chih-Ying Liu, Shang-Wen Li, Hung-yi Lee:
Exploring Efficient-tuning Methods in Self-supervised Speech Models. CoRR abs/2210.06175 (2022) - [i150]Guan-Ting Lin, Chi-Luen Feng, Wei-Ping Huang, Yuan Tseng, Tzu-Han Lin, Chen-An Li, Hung-yi Lee, Nigel G. Ward:
On the Utility of Self-supervised Models for Prosody-related Tasks. CoRR abs/2210.07185 (2022) - [i149]Yen Meng, Hsuan-Jui Chen, Jiatong Shi, Shinji Watanabe, Paola García, Hung-yi Lee, Hao Tang:
On Compressing Sequences for Self-Supervised Speech Models. CoRR abs/2210.07189 (2022) - [i148]Kuan-Po Huang, Yu-Kuan Fu, Tsu-Yuan Hsu, Fabian Ritter Gutierrez, Fan-Lin Wang, Liang-Hsuan Tseng, Yu Zhang, Hung-yi Lee:
Improving generalizability of distilled self-supervised speech processing models under distorted settings. CoRR abs/2210.07978 (2022) - [i147]Tzu-hsun Feng, Shuyan Annie Dong, Ching-Feng Yeh, Shu-Wen Yang, Tzu-Quan Lin, Jiatong Shi, Kai-Wei Chang, Zili Huang, Haibin Wu, Xuankai Chang, Shinji Watanabe, Abdelrahman Mohamed, Shang-Wen Li, Hung-yi Lee:
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning. CoRR abs/2210.08634 (2022) - [i146]Xuanjun Chen, Haibin Wu, Chung-Che Wang, Hung-yi Lee, Jyh-Shing Roger Jang:
Multimodal Transformer Distillation for Audio-Visual Synchronization. CoRR abs/2210.15563 (2022) - [i145]Chan-Jan Hsu, Ho-Lam Chung, Hung-yi Lee, Yu Tsao:
T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5. CoRR abs/2211.00586 (2022) - [i144]Layne Berry, Yi-Jen Shih, Hsuan-Fu Wang, Heng-Jui Chang, Hung-yi Lee, David Harwath:
M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval. CoRR abs/2211.01180 (2022) - [i143]Hsuan-Jui Chen, Yen Meng, Hung-yi Lee:
Once-for-All Sequence Compression for Self-Supervised Speech Models. CoRR abs/2211.02332 (2022) - [i142]Jiatong Shi, Chan-Jan Hsu, Ho-Lam Chung, Dongji Gao, Paola García, Shinji Watanabe, Ann Lee, Hung-yi Lee:
Bridging Speech and Textual Pre-trained Models with Unsupervised ASR. CoRR abs/2211.03025 (2022) - [i141]Derek Xu, Shuyan Dong, Changhan Wang, Suyoun Kim, Zhaojiang Lin, Akshat Shrivastava, Shang-Wen Li, Liang-Hsuan Tseng, Alexei Baevski, Guan-Ting Lin, Hung-yi Lee, Yizhou Sun, Wei Wang:
Introducing Semantics into Speech Encoders. CoRR abs/2211.08402 (2022) - [i140]Tzu-Quan Lin, Hung-yi Lee, Hao Tang:
MelHuBERT: A simplified HuBERT on Mel spectrogram. CoRR abs/2211.09944 (2022) - [i139]Tzu-Quan Lin, Tsung-Huan Yang, Chun-Yao Chang, Kuang-Ming Chen, Tzu-hsun Feng, Hung-yi Lee, Hao Tang:
Compressing Transformer-based self-supervised models for speech processing. CoRR abs/2211.09949 (2022) - [i138]Tsu-Yuan Hsu, Chen-An Li, Tung-Yu Wu, Hung-yi Lee:
Model Extraction Attack against Self-supervised Speech Models. CoRR abs/2211.16044 (2022) - [i137]Dongji Gao, Jiatong Shi, Shun-Po Chuang, Leibny Paola García, Hung-yi Lee, Shinji Watanabe, Sanjeev Khudanpur:
EURO: ESPnet Unsupervised ASR Open-source Toolkit. CoRR abs/2211.17196 (2022) - [i136]Shih-Cheng Huang, Shih-Heng Wang, Min-Han Shih, Saurav Sahay, Hung-yi Lee:
General Framework for Self-Supervised Model Priming for Parameter-Efficient Fine-tuning. CoRR abs/2212.01032 (2022) - [i135]Zih-Ching Chen, Yu-Shun Sung, Hung-yi Lee:
CHAPTER: Exploiting Convolutional Neural Network Adapters for Self-supervised Speech Models. CoRR abs/2212.01282 (2022) - [i134]Suwon Shon, Siddhant Arora, Chyi-Jiunn Lin, Ankita Pasad, Felix Wu, Roshan S. Sharma, Wei-Lun Wu, Hung-Yi Lee, Karen Livescu, Shinji Watanabe:
SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks. CoRR abs/2212.10525 (2022) - 2021
- [j16]Shun-Po Chuang, Alexander H. Liu, Tzu-Wei Sung, Hung-yi Lee:
Improving Automatic Speech Recognition and Speech Translation via Word Embedding Prediction. IEEE ACM Trans. Audio Speech Lang. Process. 29: 93-105 (2021) - [j15]Andy T. Liu, Shang-Wen Li, Hung-yi Lee:
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech. IEEE ACM Trans. Audio Speech Lang. Process. 29: 2351-2366 (2021) - [c134]Yu-Ching Chiu, Bo-Hao Chang, Tzu-Yu Chen, Cheng-Fu Yang, Nanyi Bi, Richard Tzong-Han Tsai, Hung-yi Lee, Jane Yung-jen Hsu:
Multi-modal User Intent Classification Under the Scenario of Smart Factory (Student Abstract). AAAI 2021: 15771-15772 - [c133]Shun-Po Chuang, Yung-Sung Chuang, Chih-Chiang Chang, Hung-yi Lee:
Investigating the Reordering Capability in CTC-based Non-Autoregressive End-to-End Speech Translation. ACL/IJCNLP (Findings) 2021: 1068-1077 - [c132]Xuankai Chang, Takashi Maekaku, Pengcheng Guo, Jing Shi, Yen-Ju Lu, Aswin Shanmugam Subramanian, Tianzi Wang, Shu-Wen Yang, Yu Tsao, Hung-yi Lee, Shinji Watanabe:
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition. ASRU 2021: 228-235 - [c131]Shun-Po Chuang, Heng-Jui Chang, Sung-Feng Huang, Hung-yi Lee:
Non-Autoregressive Mandarin-English Code-Switching Speech Recognition. ASRU 2021: 465-472 - [c130]Wei-Tsung Kao, Hung-yi Lee:
Is BERT a Cross-Disciplinary Knowledge Learner? A Surprising Finding of Pre-trained Models' Transferability. EMNLP (Findings) 2021: 2195-2208 - [c129]Tung-Yu Wu, Tsu-Yuan Hsu, Chen-An Li, Tzu-Han Lin, Hung-yi Lee:
The Efficacy of Self-Supervised Speech Models for Audio Representations. HEAR@NeurIPS 2021: 90-110 - [c128]Yuan-Kuei Wu, Kuan-Po Huang, Yu Tsao, Hung-yi Lee:
One Shot Learning for Speech Separation. ICASSP 2021: 5769-5773 - [c127]Yist Y. Lin, Chung-Ming Chien, Jheng-Hao Lin, Hung-yi Lee, Lin-Shan Lee:
FragmentVC: Any-To-Any Voice Conversion by End-To-End Extracting and Fusing Fine-Grained Voice Fragments with Attention. ICASSP 2021: 5939-5943 - [c126]Yen-Hao Chen, Da-Yi Wu, Tsung-Han Wu, Hung-yi Lee:
AGAIN-VC: A One-Shot Voice Conversion Using Activation Guidance and Adaptive Instance Normalization. ICASSP 2021: 5954-5958 - [c125]Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-yi Lee:
Adversarial Defense for Automatic Speaker Verification by Cascaded Self-Supervised Learning Models. ICASSP 2021: 6718-6722 - [c124]Cheng-I Lai, Yung-Sung Chuang, Hung-Yi Lee, Shang-Wen Li, James R. Glass:
Semi-Supervised Spoken Language Understanding via Self-Supervised Speech and Language Model Pretraining. ICASSP 2021: 7468-7472 - [c123]Chung-Ming Chien, Jheng-Hao Lin, Chien-yu Huang, Po-Chun Hsu, Hung-yi Lee:
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech. ICASSP 2021: 8588-8592 - [c122]Jheng-Hao Lin, Yist Y. Lin, Chung-Ming Chien, Hung-yi Lee:
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations. Interspeech 2021: 836-840 - [c121]Shu-Wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Jeff Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Ko-tik Lee, Da-Rong Liu, Zili Huang, Shuyan Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee:
SUPERB: Speech Processing Universal PERformance Benchmark. Interspeech 2021: 1194-1198 - [c120]Heng-Jui Chang, Hung-yi Lee, Lin-Shan Lee:
Towards Lifelong Learning of End-to-End ASR. Interspeech 2021: 2551-2555 - [c119]Wei-Cheng Tseng, Chien-yu Huang, Wei-Tsung Kao, Yist Y. Lin, Hung-yi Lee:
Utilizing Self-Supervised Representations for MOS Prediction. Interspeech 2021: 2781-2785 - [c118]Sung-Feng Huang, Shun-Po Chuang, Da-Rong Liu, Yi-Chen Chen, Gene-Ping Yang, Hung-yi Lee:
Stabilizing Label Assignment for Speech Separation by Self-Supervised Pre-Training. Interspeech 2021: 3056-3060 - [c117]Jingsong Wang, Yuxuan He, Chunyu Zhao, Qijie Shao, Wei-Wei Tu, Tom Ko, Hung-yi Lee, Lei Xie:
Auto-KWS 2021 Challenge: Task, Datasets, and Baselines. Interspeech 2021: 4244-4248 - [c116]Haibin Wu, Yang Zhang, Zhiyong Wu, Dong Wang, Hung-yi Lee:
Voting for the Right Answer: Adversarial Defense for Speaker Verification. Interspeech 2021: 4294-4298 - [c115]Hsuan Su, Jiun-Hao Jhan, Fan-Yun Sun, Saurav Sahay, Hung-yi Lee:
Put Chatbot into Its Interlocutor's Shoes: New Framework to Learn Chatbot Responding with Intention. NAACL-HLT 2021: 1559-1569 - [c114]Heng-Jui Chang, Alexander H. Liu, Hung-yi Lee, Lin-Shan Lee:
End-to-End Whispered Speech Recognition with Frequency-Weighted Approaches and Pseudo Whisper Pre-training. SLT 2021: 186-193 - [c113]Po-Han Chi, Pei-Hung Chung, Tsung-Han Wu, Chun-Cheng Hsieh, Yen-Hao Chen, Shang-Wen Li, Hung-yi Lee:
Audio Albert: A Lite Bert for Self-Supervised Learning of Audio Representation. SLT 2021: 344-350 - [c112]Chung-Ming Chien, Hung-yi Lee:
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis. SLT 2021: 446-453 - [c111]Tzu-hsien Huang, Jheng-Hao Lin, Hung-yi Lee:
How Far Are We from Robust Voice Conversion: A Survey. SLT 2021: 514-521 - [c110]Chien-yu Huang, Yist Y. Lin, Hung-yi Lee, Lin-Shan Lee:
Defending Your Voice: Adversarial Attack on Voice Conversion. SLT 2021: 552-559 - [i133]Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-yi Lee:
Adversarial defense for automatic speaker verification by cascaded self-supervised learning models. CoRR abs/2102.07047 (2021) - [i132]Chung-Ming Chien, Jheng-Hao Lin, Chien-yu Huang, Po-Chun Hsu, Hung-yi Lee:
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech. CoRR abs/2103.04088 (2021) - [i131]Wei-Tsung Kao, Hung-Yi Lee:
Is BERT a Cross-Disciplinary Knowledge Learner? A Surprising Finding of Pre-trained Models' Transferability. CoRR abs/2103.07162 (2021) - [i130]Hsuan Su, Jiun-Hao Jhan, Fan-Yun Sun, Saurav Sahay, Hung-yi Lee:
Put Chatbot into Its Interlocutor's Shoes: New Framework to Learn Chatbot Responding with Intention. CoRR abs/2103.16429 (2021) - [i129]Jingsong Wang, Yuxuan He, Chunyu Zhao, Qijie Shao, Wei-Wei Tu, Tom Ko, Hung-yi Lee, Lei Xie:
Auto-KWS 2021 Challenge: Task, Datasets, and Baselines. CoRR abs/2104.00513 (2021) - [i128]Heng-Jui Chang, Hung-yi Lee, Lin-Shan Lee:
Towards Lifelong Learning of End-to-end ASR. CoRR abs/2104.01616 (2021) - [i127]Shun-Po Chuang, Heng-Jui Chang, Sung-Feng Huang, Hung-yi Lee:
Non-autoregressive Mandarin-English Code-switching Speech Recognition with Pinyin Mask-CTC and Word Embedding Regularization. CoRR abs/2104.02258 (2021) - [i126]Jheng-Hao Lin, Yist Y. Lin, Chung-Ming Chien, Hung-yi Lee:
S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations. CoRR abs/2104.02901 (2021) - [i125]Wei-Cheng Tseng, Chien-yu Huang, Wei-Tsung Kao, Yist Y. Lin, Hung-yi Lee:
Utilizing Self-supervised Representations for MOS Prediction. CoRR abs/2104.03017 (2021) - [i124]Shu-Wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Jeff Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Ko-tik Lee, Da-Rong Liu, Zili Huang, Shuyan Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, Hung-yi Lee:
SUPERB: Speech processing Universal PERformance Benchmark. CoRR abs/2105.01051 (2021) - [i123]Yi-Chen Chen, Po-Han Chi, Shu-Wen Yang, Kai-Wei Chang, Jheng-Hao Lin, Sung-Feng Huang, Da-Rong Liu, Chi-Liang Liu, Cheng-Kuang Lee, Hung-yi Lee:
SpeechNet: A Universal Modularized Model for Speech Processing Tasks. CoRR abs/2105.03070 (2021) - [i122]Wei-Cheng Huang, Chien-yu Huang, Hung-yi Lee:
Improving Cross-Lingual Reading Comprehension with Self-Training. CoRR abs/2105.03627 (2021) - [i121]Shun-Po Chuang, Yung-Sung Chuang, Chih-Chiang Chang, Hung-yi Lee:
Investigating the Reordering Capability in CTC-based Non-Autoregressive End-to-End Speech Translation. CoRR abs/2105.04840 (2021) - [i120]Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-yi Lee:
Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning. CoRR abs/2106.00273 (2021) - [i119]Yung-Sung Chuang, Mingye Gao, Hongyin Luo, James R. Glass, Hung-Yi Lee, Yun-Nung Chen, Shang-Wen Li:
Mitigating Biases in Toxic Language Detection through Invariant Rationalization. CoRR abs/2106.07240 (2021) - [i118]Haibin Wu, Yang Zhang, Zhiyong Wu, Dong Wang, Hung-yi Lee:
Voting for the right answer: Adversarial defense for speaker verification. CoRR abs/2106.07868 (2021) - [i117]Kuan-Po Huang, Yuan-Kuei Wu, Hung-yi Lee:
Multi-accent Speech Separation with One Shot Learning. CoRR abs/2106.11713 (2021) - [i116]Haibin Wu, Po-Chun Hsu, Ji Gao, Shanshan Zhang, Shen Huang, Jian Kang, Zhiyong Wu, Helen Meng, Hung-yi Lee:
Spotting adversarial samples for speaker verification by neural vocoders. CoRR abs/2107.00309 (2021) - [i115]Zih-Yun Chiu, Yi-Lin Tuan, Hung-yi Lee, Li-Chen Fu:
Parallelized Reverse Curriculum Generation. CoRR abs/2108.02128 (2021) - [i114]David Cheng-Han Chiang, Hung-yi Lee:
On the Transferability of Pre-trained Language Models: A Study from Artificial Datasets. CoRR abs/2109.03537 (2021) - [i113]Heng-Jui Chang, Shu-Wen Yang, Hung-yi Lee:
DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT. CoRR abs/2110.01900 (2021) - [i112]Liang-Hsuan Tseng, Yu-Kuan Fu, Heng-Jui Chang, Hung-yi Lee:
Mandarin-English Code-switching Speech Recognition with Self-supervised Speech Representation Models. CoRR abs/2110.03504 (2021) - [i111]Jiun-Hao Jhan, Chao-Peng Liu, Shyh-Kang Jeng, Hung-Yi Lee:
CheerBots: Chatbots toward Empathy and Emotion using Reinforcement Learning. CoRR abs/2110.03949 (2021) - [i110]Xuankai Chang, Takashi Maekaku, Pengcheng Guo, Jing Shi, Yen-Ju Lu, Aswin Shanmugam Subramanian, Tianzi Wang, Shu-Wen Yang, Yu Tsao, Hung-yi Lee, Shinji Watanabe:
An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition. CoRR abs/2110.04590 (2021) - [i109]Wen-Chin Huang, Shu-Wen Yang, Tomoki Hayashi, Hung-Yi Lee, Shinji Watanabe, Tomoki Toda:
S3PRL-VC: Open-source Voice Conversion Framework with Self-supervised Speech Representations. CoRR abs/2110.06280 (2021) - [i108]Chien-yu Huang, Kai-Wei Chang, Hung-yi Lee:
Toward Degradation-Robust Voice Conversion. CoRR abs/2110.07537 (2021) - [i107]Yen Meng, Yi-Hui Chou, Andy T. Liu, Hung-yi Lee:
Don't speak too fast: The impact of data bias on self-supervised speech models. CoRR abs/2110.07957 (2021) - [i106]Yi-Chen Chen, Shu-Wen Yang, Cheng-Kuang Lee, Simon See, Hung-yi Lee:
Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning. CoRR abs/2110.09930 (2021) - [i105]Sung-Feng Huang, Chyi-Jiunn Lin, Hung-yi Lee:
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech. CoRR abs/2111.04040 (2021) - [i104]Haibin Wu, Bo Zheng, Xu Li, Xixin Wu, Hung-yi Lee, Helen Meng:
Characterizing the adversarial vulnerability of speech self-supervised learning. CoRR abs/2111.04330 (2021) - [i103]Wei-Cheng Tseng, Wei-Tsung Kao, Hung-yi Lee:
Membership Inference Attacks Against Self-supervised Speech Models. CoRR abs/2111.05113 (2021) - 2020
- [j14]Zixing Zhang, Dimitris N. Metaxas, Hung-yi Lee, Björn W. Schuller:
Guest Editorial Special Issue on Adversarial Learning in Computational Intelligence. IEEE Trans. Emerg. Top. Comput. Intell. 4(4): 414-416 (2020) - [c109]Che-Ping Tsai, Hung-yi Lee:
Order-Free Learning Alleviating Exposure Bias in Multi-Label Classification. AAAI 2020: 6038-6045 - [c108]Shun-Po Chuang, Tzu-Wei Sung, Alexander H. Liu, Hung-yi Lee:
Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation. ACL 2020: 5998-6003 - [c107]David Cheng-Han Chiang, Sung-Feng Huang, Hung-yi Lee:
Pretrained Language Model Embryology: The Birth of ALBERT. EMNLP (1) 2020: 6813-6828 - [c106]Chun-Hao Chao, Pin-Lun Hsu, Hung-Yi Lee, Yu-Chiang Frank Wang:
Self-Supervised Deep Learning for Fisheye Image Rectification. ICASSP 2020: 2248-2252 - [c105]Gene-Ping Yang, Szu-Lin Wu, Yao-Wen Mao, Hung-yi Lee, Lin-Shan Lee:
Interrupted and Cascaded Permutation Invariant Training for Speech Separation. ICASSP 2020: 6369-6373 - [c104]Andy T. Liu, Shu-Wen Yang, Po-Han Chi, Po-Chun Hsu, Hung-yi Lee:
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders. ICASSP 2020: 6419-6423 - [c103]Chung-Yi Li, Pei-Chieh Yuan, Hung-yi Lee:
What Does a Network Layer Hear? Analyzing Hidden Representations of End-to-End ASR Through Speech Synthesis. ICASSP 2020: 6434-6438 - [c102]Haibin Wu, Songxiang Liu, Helen Meng, Hung-yi Lee:
Defense Against Adversarial Attacks on Spoofing Countermeasures of ASV. ICASSP 2020: 6564-6568 - [c101]Alexander H. Liu, Tao Tu, Hung-yi Lee, Lin-Shan Lee:
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning. ICASSP 2020: 7259-7263 - [c100]Da-Yi Wu, Hung-Yi Lee:
One-Shot Voice Conversion by Vector Quantization. ICASSP 2020: 7734-7738 - [c99]Jui-Yang Hsu, Yuan-Jui Chen, Hung-yi Lee:
Meta Learning for End-To-End Low-Resource Speech Recognition. ICASSP 2020: 7844-7848 - [c98]Alexander H. Liu, Tzu-Wei Sung, Shun-Po Chuang, Hung-yi Lee, Lin-Shan Lee:
Sequence-to-Sequence Automatic Speech Recognition with Word Embedding Regularization and Fused Decoding. ICASSP 2020: 7879-7883 - [c97]Shun-Po Chuang, Tzu-Wei Sung, Hung-yi Lee:
Training Code-Switching Language Model with Monolingual Data. ICASSP 2020: 7949-7953 - [c96]Fan-Keng Sun, Cheng-Hao Ho, Hung-Yi Lee:
LAMOL: LAnguage MOdeling for Lifelong Language Learning. ICLR 2020 - [c95]Po-Chun Hsu, Hung-yi Lee:
WG-WaveNet: Real-Time High-Fidelity Speech Synthesis Without GPU. INTERSPEECH 2020: 210-214 - [c94]Yi-Chen Chen, Jui-Yang Hsu, Cheng-Kuang Lee, Hung-yi Lee:
DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation. INTERSPEECH 2020: 1803-1807 - [c93]Tao Tu, Yuan-Jui Chen, Alexander H. Liu, Hung-yi Lee:
Semi-Supervised Learning for Multi-Speaker Text-to-Speech Synthesis Using Discrete Speech Representation. INTERSPEECH 2020: 3191-3195 - [c92]Haibin Wu, Andy T. Liu, Hung-yi Lee:
Defense for Black-Box Attacks on Anti-Spoofing Models by Self-Supervised Learning. INTERSPEECH 2020: 3780-3784 - [c91]Shu-Wen Yang, Andy T. Liu, Hung-yi Lee:
Understanding Self-Attention of Self-Supervised Audio Transformers. INTERSPEECH 2020: 3785-3789 - [c90]Yung-Sung Chuang, Chi-Liang Liu, Hung-yi Lee, Lin-Shan Lee:
SpeechBERT: An Audio-and-Text Jointly Learned Language Model for End-to-End Spoken Question Answering. INTERSPEECH 2020: 4168-4172 - [c89]Da-Yi Wu, Yen-Hao Chen, Hung-yi Lee:
VQVC+: One-Shot Voice Conversion by Vector Quantization and U-Net Architecture. INTERSPEECH 2020: 4691-4695 - [c88]Chun-Hsing Lin, Siang-Ruei Wu, Hung-yi Lee, Yun-Nung Chen:
TaylorGAN: Neighbor-Augmented Policy Update Towards Sample-Efficient Natural Language Generation. NeurIPS 2020 - [i102]Wei-Tsung Kao, Tsung-Han Wu, Po-Han Chi, Chun-Cheng Hsieh, Hung-yi Lee:
Further Boosting BERT-based Models by Duplicating Existing Layers: Some Intriguing Phenomena inside BERT. CoRR abs/2001.09309 (2020) - [i101]Haibin Wu, Songxiang Liu, Helen Meng, Hung-yi Lee:
Defense against adversarial attacks on spoofing countermeasures of ASV. CoRR abs/2003.03065 (2020) - [i100]Chi-Liang Liu, Tsung-Yuan Hsu, Yung-Sung Chuang, Hung-yi Lee:
A Study of Cross-Lingual Ability and Language-specific Information in Multilingual BERT. CoRR abs/2004.09205 (2020) - [i99]Yau-Shian Wang, Hung-yi Lee, Yun-Nung Chen:
Learning Interpretable and Discrete Representations with Adversarial Training for Unsupervised Text Classification. CoRR abs/2004.13255 (2020) - [i98]Heng-Jui Chang, Alexander H. Liu, Hung-yi Lee, Lin-Shan Lee:
End-to-end Whispered Speech Recognition with Frequency-weighted Approaches and Layer-wise Transfer Learning. CoRR abs/2005.01972 (2020) - [i97]Yi-Chen Chen, Jui-Yang Hsu, Cheng-Kuang Lee, Hung-yi Lee:
DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation. CoRR abs/2005.07029 (2020) - [i96]Po-Chun Hsu, Hung-yi Lee:
WG-WaveNet: Real-Time High-Fidelity Speech Synthesis without GPU. CoRR abs/2005.07412 (2020) - [i95]Tao Tu, Yuan-Jui Chen, Alexander H. Liu, Hung-yi Lee:
Semi-supervised Learning for Multi-speaker Text-to-speech Synthesis Using Discrete Speech Representation. CoRR abs/2005.08024 (2020) - [i94]Po-Han Chi, Pei-Hung Chung, Tsung-Han Wu, Chun-Cheng Hsieh, Shang-wen Li, Hung-yi Lee:
Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation. CoRR abs/2005.08575 (2020) - [i93]Chien-yu Huang, Yist Y. Lin, Hung-yi Lee, Lin-Shan Lee:
Defending Your Voice: Adversarial Attack on Voice Conversion. CoRR abs/2005.08781 (2020) - [i92]Yuan-Kuei Wu, Chao-I Tuan, Hung-yi Lee, Yu Tsao:
SADDEL: Joint Speech Separation and Denoising Model based on Multitask Learning. CoRR abs/2005.09966 (2020) - [i91]Shun-Po Chuang, Tzu-Wei Sung, Alexander H. Liu, Hung-yi Lee:
Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation. CoRR abs/2005.10678 (2020) - [i90]Haibin Wu, Andy T. Liu, Hung-yi Lee:
Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning. CoRR abs/2006.03214 (2020) - [i89]Shu-Wen Yang, Andy T. Liu, Hung-yi Lee:
Understanding Self-Attention of Self-Supervised Audio Transformers. CoRR abs/2006.03265 (2020) - [i88]Da-Yi Wu, Yen-Hao Chen, Hung-Yi Lee:
VQVC+: One-Shot Voice Conversion by Vector Quantization and U-Net architecture. CoRR abs/2006.04154 (2020) - [i87]Tsung-Han Wu, Chun-Cheng Hsieh, Yen-Hao Chen, Po-Han Chi, Hung-yi Lee:
Hand-crafted Attention is All You Need? A Study of Attention on Self-supervised Audio Transformer. CoRR abs/2006.05174 (2020) - [i86]Andy T. Liu, Shang-wen Li, Hung-yi Lee:
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech. CoRR abs/2007.06028 (2020) - [i85]Hung-yi Lee, Cheng-Hao Ho, Chien-Fu Lin, Chiung-Chih Chang, Chih-Wei Lee, Yau-Shian Wang, Tsung-Yuan Hsu, Kuan-Yu Chen:
Investigation of Sentiment Controllable Chatbot. CoRR abs/2007.07196 (2020) - [i84]David Cheng-Han Chiang, Sung-Feng Huang, Hung-yi Lee:
Pretrained Language Model Embryology: The Birth of ALBERT. CoRR abs/2010.02480 (2020) - [i83]Chi-Liang Liu, Tsung-Yuan Hsu, Yung-Sung Chuang, Chung-Yi Li, Hung-yi Lee:
Language Representation in Multilingual BERT and its applications to improve Cross-lingual Generalization. CoRR abs/2010.10041 (2020) - [i82]Chi-Liang Liu, Tsung-Yuan Hsu, Yung-Sung Chuang, Hung-yi Lee:
What makes multilingual BERT multilingual? CoRR abs/2010.10938 (2020) - [i81]Chi-Liang Liu, Hung-yi Lee:
Unsupervised Deep Learning based Multiple Choices Question Answering: Start Learning from Basic Knowledge. CoRR abs/2010.11003 (2020) - [i80]Cheng-I Lai, Yung-Sung Chuang, Hung-yi Lee, Shang-wen Li, James R. Glass:
Semi-Supervised Spoken Language Understanding via Self-Supervised Speech and Language Model Pretraining. CoRR abs/2010.13826 (2020) - [i79]Yist Y. Lin, Chung-Ming Chien, Jheng-Hao Lin, Hung-yi Lee, Lin-Shan Lee:
FragmentVC: Any-to-Any Voice Conversion by End-to-End Extracting and Fusing Fine-Grained Voice Fragments With Attention. CoRR abs/2010.14150 (2020) - [i78]Sung-Feng Huang, Shun-Po Chuang, Da-Rong Liu, Yi-Chen Chen, Gene-Ping Yang, Hung-yi Lee:
Self-supervised Pre-training Reduces Label Permutation Instability of Speech Separation. CoRR abs/2010.15366 (2020) - [i77]Yen-Hao Chen, Da-Yi Wu, Tsung-Han Wu, Hung-yi Lee:
AGAIN-VC: A One-shot Voice Conversion using Activation Guidance and Adaptive Instance Normalization. CoRR abs/2011.00316 (2020) - [i76]Chung-Ming Chien, Hung-yi Lee:
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis. CoRR abs/2011.06465 (2020) - [i75]Yuan-Kuei Wu, Kuan-Po Huang, Yu Tsao, Hung-yi Lee:
One Shot Learning for Speech Separation. CoRR abs/2011.10233 (2020) - [i74]Tzu-hsien Huang, Jheng-Hao Lin, Chien-yu Huang, Hung-yi Lee:
How Far Are We from Robust Voice Conversion: A Survey. CoRR abs/2011.12063 (2020) - [i73]Chun-Hsing Lin, Siang-Ruei Wu, Hung-yi Lee, Yun-Nung Chen:
TaylorGAN: Neighbor-Augmented Policy Update for Sample-Efficient Natural Language Generation. CoRR abs/2011.13527 (2020) - [i72]David Cheng-Han Chiang, Hung-yi Lee:
Pre-Training a Language Model Without Human Language. CoRR abs/2012.11995 (2020)
2010 – 2019
- 2019
- [j13]Shun-Yao Shih, Fan-Keng Sun, Hung-Yi Lee:
Temporal pattern attention for multivariate time series forecasting. Mach. Learn. 108(8-9): 1421-1441 (2019) - [j12]Yi-Lin Tuan, Hung-yi Lee:
Improving Conditional Sequence Generative Adversarial Networks by Stepwise Evaluation. IEEE ACM Trans. Audio Speech Lang. Process. 27(4): 788-798 (2019) - [j11]Chia-Hsuan Lee, Hung-yi Lee, Szu-Lin Wu, Chi-Liang Liu, Wei Fang, Juei-Yang Hsu, Bo-Hsiang Tseng:
Machine Comprehension of Spoken Content: TOEFL Listening Test and Spoken SQuAD. IEEE ACM Trans. Audio Speech Lang. Process. 27(9): 1469-1480 (2019) - [j10]Yi-Chen Chen, Sung-Feng Huang, Hung-yi Lee, Yu-Hsuan Wang, Chia-Hao Shen:
Audio Word2vec: Sequence-to-Sequence Autoencoding for Unsupervised Learning of Audio Segmentation and Representation. IEEE ACM Trans. Audio Speech Lang. Process. 27(9): 1481-1493 (2019) - [c87]Songxiang Liu, Haibin Wu, Hung-yi Lee, Helen Meng:
Adversarial Attacks on Spoofing Countermeasures of Automatic Speaker Verification. ASRU 2019: 312-319 - [c86]Yau-Shian Wang, Hung-yi Lee, Yun-Nung Chen:
Tree Transformer: Integrating Tree Structures into Self-Attention. EMNLP/IJCNLP (1) 2019: 1061-1070 - [c85]Yi-Lin Tuan, Yun-Nung Chen, Hung-yi Lee:
DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs. EMNLP/IJCNLP (1) 2019: 1855-1865 - [c84]Tsung-Yuan Hsu, Chi-Liang Liu, Hung-yi Lee:
Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model. EMNLP/IJCNLP (1) 2019: 5932-5939 - [c83]Hongren Mao, Hung-yi Lee:
Polly Want a Cracker: Analyzing Performance of Parroting on Paraphrase Generation Datasets. EMNLP/IJCNLP (1) 2019: 5959-5967 - [c82]Chia-Hung Wan, Shun-Po Chuang, Hung-yi Lee:
Towards Audio to Scene Image Synthesis Using Generative Adversarial Network. ICASSP 2019: 496-500 - [c81]Che-Ping Tsai, Hung-yi Lee:
Adversarial Learning of Label Dependency: A Novel Framework for Multi-class Classification. ICASSP 2019: 3847-3851 - [c80]Alexander H. Liu, Hung-yi Lee, Lin-Shan Lee:
Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model. ICASSP 2019: 6176-6180 - [c79]Tzu-Wei Sung, Jun-You Liu, Hung-yi Lee, Lin-Shan Lee:
Towards End-to-end Speech-to-text Translation with Two-pass Decoding. ICASSP 2019: 7175-7179 - [c78]Chia-Hsuan Lee, Yun-Nung Chen, Hung-yi Lee:
Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation. ICASSP 2019: 7300-7304 - [c77]Richard Tzong-Han Tsai, Chia-Hao Chen, Chun-Kai Wu, Yu-Cheng Hsiao, Hung-yi Lee:
Using Deep-Q Network to Select Candidates from N-best Speech Recognition Hypotheses for Enhancing Dialogue State Tracking. ICASSP 2019: 7375-7379 - [c76]Ching-Ting Chang, Shun-Po Chuang, Hung-yi Lee:
Code-Switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation. INTERSPEECH 2019: 554-558 - [c75]Ju-Chieh Chou, Hung-yi Lee:
One-Shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization. INTERSPEECH 2019: 664-668 - [c74]Li-Wei Chen, Hung-yi Lee, Yu Tsao:
Generative Adversarial Networks for Unpaired Voice Transformation on Impaired Speech. INTERSPEECH 2019: 719-723 - [c73]Andy T. Liu, Po-Chun Hsu, Hung-yi Lee:
Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion. INTERSPEECH 2019: 1108-1112 - [c72]Gene-Ping Yang, Chao-I Tuan, Hung-yi Lee, Lin-Shan Lee:
Improved Speech Separation with Time-and-Frequency Cross-Domain Joint Embedding and Clustering. INTERSPEECH 2019: 1363-1367 - [c71]Kuan-Yu Chen, Che-Ping Tsai, Da-Rong Liu, Hung-yi Lee, Lin-Shan Lee:
Completely Unsupervised Phoneme Recognition by a Generative Adversarial Network Harmonized with Iteratively Refined Hidden Markov Models. INTERSPEECH 2019: 1856-1860 - [c70]Yuan-Jui Chen, Tao Tu, Cheng-chieh Yeh, Hung-yi Lee:
End-to-End Text-to-Speech for Low-Resource Languages by Cross-Lingual Transfer Learning. INTERSPEECH 2019: 2075-2079 - [c69]Chien-Feng Liao, Yu Tsao, Hung-yi Lee, Hsin-Min Wang:
Noise Adaptive Speech Enhancement Using Domain Adversarial Training. INTERSPEECH 2019: 3148-3152 - [c68]Feng-Guang Su, Aliyah R. Hsu, Yi-Lin Tuan, Hung-yi Lee:
Personalized Dialogue Response Generation Learned from Monologues. INTERSPEECH 2019: 4160-4164 - [i71]Kuan-Yu Chen, Che-Ping Tsai, Da-Rong Liu, Hung-yi Lee, Lin-Shan Lee:
Completely Unsupervised Phoneme Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models. CoRR abs/1904.04100 (2019) - [i70]Yi-Chen Chen, Sung-Feng Huang, Hung-yi Lee, Lin-Shan Lee:
From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings. CoRR abs/1904.05078 (2019) - [i69]Ju-Chieh Chou, Cheng-chieh Yeh, Hung-yi Lee:
One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization. CoRR abs/1904.05742 (2019) - [i68]Tao Tu, Yuan-Jui Chen, Cheng-chieh Yeh, Hung-yi Lee:
End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning. CoRR abs/1904.06508 (2019) - [i67]Gene-Ping Yang, Chao-I Tuan, Hung-yi Lee, Lin-Shan Lee:
Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering. CoRR abs/1904.07845 (2019) - [i66]Chia-Hsuan Lee, Yun-Nung Chen, Hung-yi Lee:
Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation. CoRR abs/1904.07904 (2019) - [i65]Andy T. Liu, Po-Chun Hsu, Hung-yi Lee:
Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion. CoRR abs/1905.11563 (2019) - [i64]Chia-Hsuan Lee, Hung-yi Lee:
Cross-Lingual Transfer Learning for Question Answering. CoRR abs/1907.06042 (2019) - [i63]Hongren Mao, Hung-yi Lee:
Polly Want a Cracker: Analyzing Performance of Parroting on Paraphrase Generation Datasets. CoRR abs/1908.07831 (2019) - [i62]Fan-Keng Sun, Cheng-Hao Ho, Hung-Yi Lee:
LAMAL: LAnguage Modeling Is All You Need for Lifelong Language Learning. CoRR abs/1909.03329 (2019) - [i61]Che-Ping Tsai, Hung-yi Lee:
Order-free Learning Alleviating Exposure Bias in Multi-label Classification. CoRR abs/1909.03434 (2019) - [i60]Yau-Shian Wang, Hung-yi Lee, Yun-Nung Chen:
Tree Transformer: Integrating Tree Structures into Self-Attention. CoRR abs/1909.06639 (2019) - [i59]Tsung-Yuan Hsu, Chi-Liang Liu, Hung-yi Lee:
Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model. CoRR abs/1909.09587 (2019) - [i58]Yi-Lin Tuan, Yun-Nung Chen, Hung-yi Lee:
DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs. CoRR abs/1910.00610 (2019) - [i57]Songxiang Liu, Haibin Wu, Hung-yi Lee, Helen Meng:
Adversarial Attacks on Spoofing Countermeasures of automatic speaker verification. CoRR abs/1910.08716 (2019) - [i56]Yung-Sung Chuang, Chi-Liang Liu, Hung-yi Lee:
SpeechBERT: Cross-Modal Pre-trained Language Model for End-to-end Spoken Question Answering. CoRR abs/1910.11559 (2019) - [i55]Jui-Yang Hsu, Yuan-Jui Chen, Hung-yi Lee:
Meta Learning for End-to-End Low-Resource Speech Recognition. CoRR abs/1910.12094 (2019) - [i54]Andy T. Liu, Shu-Wen Yang, Po-Han Chi, Po-Chun Hsu, Hung-yi Lee:
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders. CoRR abs/1910.12638 (2019) - [i53]Gene-Ping Yang, Szu-Lin Wu, Yao-Wen Mao, Hung-yi Lee, Lin-Shan Lee:
Interrupted and cascaded permutation invariant training for speech separation. CoRR abs/1910.12706 (2019) - [i52]Alexander H. Liu, Tao Tu, Hung-yi Lee, Lin-Shan Lee:
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning. CoRR abs/1910.12729 (2019) - [i51]Alexander H. Liu, Tzu-Wei Sung, Shun-Po Chuang, Hung-yi Lee, Lin-Shan Lee:
Sequence-to-sequence Automatic Speech Recognition with Word Embedding Regularization and Fused Decoding. CoRR abs/1910.12740 (2019) - [i50]Chung-Yi Li, Pei-Chieh Yuan, Hung-yi Lee:
What does a network layer hear? Analyzing hidden representations of end-to-end ASR through speech synthesis. CoRR abs/1911.01102 (2019) - [i49]Shun-Po Chuang, Tzu-Wei Sung, Hung-yi Lee:
Training a code-switching language model with monolingual data. CoRR abs/1911.06003 (2019) - [i48]Ya-Liang Chang, Kuan-Ying Lee, Po-Yu Wu, Hung-yi Lee, Winston H. Hsu:
Deep Long Audio Inpainting. CoRR abs/1911.06476 (2019) - [i47]Bo-Wen Chen, Yen-Min Hsu, Hung-Yi Lee:
J-Net: Randomly weighted U-Net for audio source separation. CoRR abs/1911.12926 (2019) - [i46]Po-Chun Hsu, Chun-hsuan Wang, Andy T. Liu, Hung-yi Lee:
Towards Robust Neural Vocoding for Speech Generation: A Survey. CoRR abs/1912.02461 (2019) - [i45]Chao-I Tuan, Yuan-Kuei Wu, Hung-yi Lee, Yu Tsao:
MITAS: A Compressed Time-Domain Audio Separation Network with Parameter Sharing. CoRR abs/1912.03884 (2019) - 2018
- [j9]Hung-yi Lee, Pei-Hung Chung, Yen-Chen Wu, Tzu-Hsiang Lin, Tsung-Hsien Wen:
Interactive Spoken Content Retrieval by Deep Reinforcement Learning. IEEE ACM Trans. Audio Speech Lang. Process. 26(12): 2447-2459 (2018) - [c67]Yau-Shian Wang, Hung-yi Lee:
Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks. EMNLP 2018: 4187-4195 - [c66]Chia-Hao Shen, Janet Y. Sung, Hung-yi Lee:
Language Transfer of Audio Word2Vec: Learning Audio Segment Representations Without Target Language Data. ICASSP 2018: 2231-2235 - [c65]Hsien-Chin Lin, Chi-Yu Yang, Hung-yi Lee, Lin-Shan Lee:
Domain Independent Key Term Extraction from Spoken Content Based on Context and Term Location Information in the Utterances. ICASSP 2018: 6044-6048 - [c64]Chih-Wei Lee, Yau-Shian Wang, Tsung-Yuan Hsu, Kuan-Yu Chen, Hung-yi Lee, Lin-Shan Lee:
Scalable Sentiment for Sequence-to-Sequence Chatbot Response with Performance Analysis. ICASSP 2018: 6164-6168 - [c63]Chia-Wei Ao, Hung-yi Lee:
Query-by-Example Spoken Term Detection Using Attention-Based Multi-Hop Networks. ICASSP 2018: 6264-6268 - [c62]Yu-Hsuan Wang, Hung-yi Lee, Lin-Shan Lee:
Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection. ICASSP 2018: 6269-6273 - [c61]Ju-Chieh Chou, Cheng-chieh Yeh, Hung-yi Lee, Lin-Shan Lee:
Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations. INTERSPEECH 2018: 501-505 - [c60]Pei-Hung Chung, Kuan Tung, Ching-Lun Tai, Hung-yi Lee:
Joint Learning of Interactive Spoken Content Retrieval and Trainable User Simulator. INTERSPEECH 2018: 2032-2036 - [c59]Chia-Hsuan Li, Szu-Lin Wu, Chi-Liang Liu, Hung-yi Lee:
Spoken SQuAD: A Study of Mitigating the Impact of Speech Recognition Errors on Listening Comprehension. INTERSPEECH 2018: 3459-3463 - [c58]Da-Rong Liu, Kuan-Yu Chen, Hung-yi Lee, Lin-Shan Lee:
Completely Unsupervised Phoneme Recognition by Adversarially Learning Mapping Relationships from Audio Embeddings. INTERSPEECH 2018: 3748-3752 - [c57]Yu-An Chung, Hung-yi Lee, James R. Glass:
Supervised and Unsupervised Transfer Learning for Question Answering. NAACL-HLT 2018: 1585-1594 - [c56]Cheng-chieh Yeh, Po-Chun Hsu, Ju-Chieh Chou, Hung-yi Lee, Lin-Shan Lee:
Rhythm-Flexible Voice Conversion Without Parallel Data Using Cycle-GAN Over Phoneme Posteriorgram Sequences. SLT 2018: 274-281 - [c55]Da-Rong Liu, Chi-Yu Yang, Szu-Lin Wu, Hung-yi Lee:
Improving Unsupervised Style Transfer in end-to-end Speech Synthesis with end-to-end Speech Recognition. SLT 2018: 640-647 - [c54]Yi-Chen Chen, Sung-Feng Huang, Chia-Hao Shen, Hung-yi Lee, Lin-Shan Lee:
Phonetic-and-Semantic Embedding of Spoken words with Applications in Spoken Content Retrieval. SLT 2018: 941-948 - [c53]Chia-Hsuan Lee, Shang-Ming Wang, Huan-Cheng Chang, Hung-yi Lee:
ODSQA: Open-Domain Spoken Question Answering Dataset. SLT 2018: 949-956 - [i44]Yi-Chen Chen, Chia-Hao Shen, Sung-Feng Huang, Hung-yi Lee:
Towards Unsupervised Automatic Speech Recognition Trained by Unaligned Speech and Text only. CoRR abs/1803.10952 (2018) - [i43]Da-Rong Liu, Kuan-Yu Chen, Hung-yi Lee, Lin-Shan Lee:
Completely Unsupervised Phoneme Recognition by Adversarially Learning Mapping Relationships from Audio Embeddings. CoRR abs/1804.00316 (2018) - [i42]Pei-Hung Chung, Kuan Tung, Ching-Lun Tai, Hung-yi Lee:
Joint Learning of Interactive Spoken Content Retrieval and Trainable User Simulator. CoRR abs/1804.00318 (2018) - [i41]Chia-Hsuan Li, Szu-Lin Wu, Chi-Liang Liu, Hung-yi Lee:
Spoken SQuAD: A Study of Mitigating the Impact of Speech Recognition Errors on Listening Comprehension. CoRR abs/1804.00320 (2018) - [i40]Chih-Wei Lee, Yau-Shian Wang, Tsung-Yuan Hsu, Kuan-Yu Chen, Hung-yi Lee, Lin-Shan Lee:
Scalable Sentiment for Sequence-to-sequence Chatbot Response with Performance Analysis. CoRR abs/1804.02504 (2018) - [i39]Ju-Chieh Chou, Cheng-chieh Yeh, Hung-yi Lee, Lin-Shan Lee:
Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations. CoRR abs/1804.02812 (2018) - [i38]Chien-Feng Liao, Yu Tsao, Hung-yi Lee, Hsin-Min Wang:
Noise Adaptive Speech Enhancement using Domain Adversarial Training. CoRR abs/1807.07501 (2018) - [i37]Yi-Chen Chen, Sung-Feng Huang, Chia-Hao Shen, Hung-yi Lee, Lin-Shan Lee:
Phonetic-and-Semantic Embedding of Spoken Words with Applications in Spoken Content Retrieval. CoRR abs/1807.08089 (2018) - [i36]Yu-Hsuan Wang, Hung-yi Lee, Lin-Shan Lee:
Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection. CoRR abs/1808.02228 (2018) - [i35]Chia-Hsuan Lee, Shang-Ming Wang, Huan-Cheng Chang, Hung-yi Lee:
ODSQA: Open-domain Spoken Question Answering Dataset. CoRR abs/1808.02280 (2018) - [i34]Cheng-chieh Yeh, Po-Chun Hsu, Ju-Chieh Chou, Hung-yi Lee, Lin-Shan Lee:
Rhythm-Flexible Voice Conversion without Parallel Data Using Cycle-GAN over Phoneme Posteriorgram Sequences. CoRR abs/1808.03113 (2018) - [i33]Chia-Hung Wan, Shun-Po Chuang, Hung-yi Lee:
Towards Audio to Scene Image Synthesis using Generative Adversarial Network. CoRR abs/1808.04108 (2018) - [i32]Yi-Lin Tuan, Hung-yi Lee:
Improving Conditional Sequence Generative Adversarial Networks by Stepwise Evaluation. CoRR abs/1808.05599 (2018) - [i31]Yi-Lin Tuan, Jinzhi Zhang, Yujia Li, Hung-yi Lee:
Proximal Policy Optimization and its Dynamic Version for Sequence Generation. CoRR abs/1808.07982 (2018) - [i30]Shun-Yao Shih, Fan-Keng Sun, Hung-Yi Lee:
Temporal Pattern Attention for Multivariate Time Series Forecasting. CoRR abs/1809.04206 (2018) - [i29]Yau-Shian Wang, Hung-yi Lee:
Learning to Encode Text as Human-Readable Summaries using Generative Adversarial Networks. CoRR abs/1810.02851 (2018) - [i28]Yi-Chen Chen, Chia-Hao Shen, Sung-Feng Huang, Hung-yi Lee, Lin-Shan Lee:
Almost-unsupervised Speech Recognition with Close-to-zero Resource Based on Phonetic Structures Learned from Very Small Unpaired Speech and Text Data. CoRR abs/1810.12566 (2018) - [i27]Li-Wei Chen, Hung-yi Lee, Yu Tsao:
Generative Adversarial Networks for Unpaired Voice Transformation on Impaired Speech. CoRR abs/1810.12656 (2018) - [i26]Alexander H. Liu, Hung-yi Lee, Lin-Shan Lee:
Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model. CoRR abs/1811.00787 (2018) - [i25]Ching-Ting Chang, Shun-Po Chuang, Hung-yi Lee:
Code-switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation. CoRR abs/1811.02356 (2018) - [i24]Sung-Feng Huang, Yi-Chen Chen, Hung-yi Lee, Lin-Shan Lee:
Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection. CoRR abs/1811.02775 (2018) - [i23]Che-Ping Tsai, Hung-yi Lee:
Adversarial Learning of Label Dependency: A Novel Framework for Multi-class Classification. CoRR abs/1811.04689 (2018) - 2017
- [j8]Hung-yi Lee, Bo-Hsiang Tseng, Tsung-Hsien Wen, Yu Tsao:
Personalizing Recurrent-Neural-Network-Based Language Model by Social Network. IEEE ACM Trans. Audio Speech Lang. Process. 25(3): 519-530 (2017) - [c52]Shun-Po Chuang, Chia-Hung Wan, Pang-Chi Huang, Chi-Yu Yang, Hung-yi Lee:
Seeing and hearing too: Audio representation for video captioning. ASRU 2017: 381-388 - [c51]Pin-Jung Chen, I-Hung Hsu, Yi Yao Huang, Hung-yi Lee:
Mitigating the impact of speech recognition errors on chatbot using sequence-to-sequence model. ASRU 2017: 497-503 - [c50]Zih-Wei Lin, Tzu-Wei Sung, Hung-yi Lee, Lin-Shan Lee:
Personalized word representations carrying personalized semantics learned from social network posts. ASRU 2017: 533-540 - [c49]Tzu-Ray Su, Hung-yi Lee:
Learning Chinese Word Representations From Glyphs Of Characters. EMNLP 2017: 264-273 - [c48]Cheng-Kuang Wei, Cheng-Tao Chung, Hung-yi Lee, Lin-Shan Lee:
Personalized acoustic modeling by weakly supervised multi-task deep learning using acoustic tokens discovered from unlabeled data. ICASSP 2017: 5165-5169 - [c47]Wei-Jen Ko, Bo-Hsiang Tseng, Hung-yi Lee:
Recurrent Neural Network based language modeling with controllable external Memory. ICASSP 2017: 5705-5709 - [c46]Bo-Ru Lu, Frank Shyu, Yun-Nung Chen, Hung-yi Lee, Lin-Shan Lee:
Order-Preserving Abstractive Summarization for Spoken Content Based on Connectionist Temporal Classification. INTERSPEECH 2017: 2899-2903 - [c45]Yu-Hsuan Wang, Cheng-Tao Chung, Hung-yi Lee:
Gate Activation Signal Analysis for Gated Recurrent Neural Networks and its Correlation with Phoneme Boundaries. INTERSPEECH 2017: 3822-3826 - [i22]Yu-Hsuan Wang, Cheng-Tao Chung, Hung-yi Lee:
Gate Activation Signal Analysis for Gated Recurrent Neural Networks and Its Correlation with Phoneme Boundaries. CoRR abs/1703.07588 (2017) - [i21]Cheng-Kuang Wei, Cheng-Tao Chung, Hung-yi Lee, Lin-Shan Lee:
Personalized Acoustic Modeling by Weakly Supervised Multi-Task Deep Learning using Acoustic Tokens Discovered from Unlabeled Data. CoRR abs/1706.07793 (2017) - [i20]Chia-Hao Shen, Janet Y. Sung, Hung-yi Lee:
Language Transfer of Audio Word2Vec: Learning Audio Segment Representations without Target Language Data. CoRR abs/1707.06519 (2017) - [i19]Tzu-Ray Su, Hung-yi Lee:
Learning Chinese Word Representations From Glyphs Of Characters. CoRR abs/1708.04755 (2017) - [i18]Chia-Wei Ao, Hung-yi Lee:
Query-by-example Spoken Term Detection using Attention-based Multi-hop Networks. CoRR abs/1709.00354 (2017) - [i17]Tzu-Chien Liu, Yu-Hsueh Wu, Hung-yi Lee:
Attention-based CNN Matching Net. CoRR abs/1709.05036 (2017) - [i16]Bo-Ru Lu, Frank Shyu, Yun-Nung Chen, Hung-yi Lee, Lin-Shan Lee:
Order-Preserving Abstractive Summarization for Spoken Content Based on Connectionist Temporal Classification. CoRR abs/1709.05475 (2017) - [i15]Pin-Jung Chen, I-Hung Hsu, Yi Yao Huang, Hung-yi Lee:
Mitigating the Impact of Speech Recognition Errors on Chatbot using Sequence-to-sequence Model. CoRR abs/1709.07862 (2017) - [i14]Zih-Wei Lin, Tzu-Wei Sung, Hung-yi Lee, Lin-Shan Lee:
Personalized word representations Carrying Personalized Semantics Learned from Social Network Posts. CoRR abs/1710.10574 (2017) - [i13]Yu-An Chung, Hung-yi Lee, James R. Glass:
Supervised and Unsupervised Transfer Learning for Question Answering. CoRR abs/1711.05345 (2017) - 2016
- [c44]Yu-An Chung, Chao-Chung Wu, Chia-Hao Shen, Hung-yi Lee, Lin-Shan Lee:
Audio Word2Vec: Unsupervised Learning of Audio Segment Representations Using Sequence-to-Sequence Autoencoder. INTERSPEECH 2016: 765-769 - [c43]Yen-Chen Wu, Tzu-Hsiang Lin, Yang-De Chen, Hung-yi Lee, Lin-Shan Lee:
Interactive Spoken Content Retrieval by Deep Reinforcement Learning. INTERSPEECH 2016: 943-947 - [c42]Sheng-syun Shen, Hung-yi Lee:
Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection. INTERSPEECH 2016: 2716-2720 - [c41]Bo-Hsiang Tseng, Sheng-syun Shen, Hung-yi Lee, Lin-Shan Lee:
Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine. INTERSPEECH 2016: 2731-2735 - [c40]Lang-Chi Yu, Hung-yi Lee, Lin-Shan Lee:
Abstractive headline generation for spoken content by attentive recurrent neural networks with ASR error modeling. SLT 2016: 151-157 - [c39]Wei Fang
, Juei-Yang Hsu, Hung-yi Lee, Lin-Shan Lee:
Hierarchical attention model for improved machine comprehension of spoken content. SLT 2016: 232-238 - [i12]Cheng-Tao Chung, Cheng-Yu Tsai, Hsiang-Hung Lu, Chia-Hsiang Liu, Hung-yi Lee, Lin-Shan Lee:
An Iterative Deep Learning Framework for Unsupervised Discovery of Speech Features and Linguistic Units with Applications on Spoken Term Detection. CoRR abs/1602.00426 (2016) - [i11]Yu-An Chung, Chao-Chung Wu, Chia-Hao Shen, Hung-yi Lee, Lin-Shan Lee:
Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder. CoRR abs/1603.00982 (2016) - [i10]Sheng-syun Shen, Hung-yi Lee:
Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection. CoRR abs/1604.00077 (2016) - [i9]Bo-Hsiang Tseng, Sheng-syun Shen, Hung-yi Lee, Lin-Shan Lee:
Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine. CoRR abs/1608.06378 (2016) - [i8]Wei Fang, Juei-Yang Hsu, Hung-yi Lee, Lin-Shan Lee:
Hierarchical Attention Model for Improved Machine Comprehension of Spoken Content. CoRR abs/1608.07775 (2016) - [i7]Yen-Chen Wu, Tzu-Hsiang Lin, Yang-De Chen, Hung-yi Lee, Lin-Shan Lee:
Interactive Spoken Content Retrieval by Deep Reinforcement Learning. CoRR abs/1609.05234 (2016) - [i6]Da-Rong Liu, Shun-Po Chuang, Hung-yi Lee:
Attention-based Memory Selection Recurrent Network for Language Modeling. CoRR abs/1611.08656 (2016) - [i5]Lang-Chi Yu, Hung-yi Lee, Lin-Shan Lee:
Abstractive Headline Generation for Spoken Content by Attentive Recurrent Neural Networks with ASR Error Modeling. CoRR abs/1612.08375 (2016) - 2015
- [j7]Lin-Shan Lee, James R. Glass, Hung-yi Lee, Chun-an Chan:
Spoken Content Retrieval - Beyond Cascading Speech Recognition with Text Retrieval. IEEE ACM Trans. Audio Speech Lang. Process. 23(9): 1389-1420 (2015) - [c38]Bo-Hsiang Tseng, Hung-yi Lee, Lin-Shan Lee:
Personalizing universal recurrent neural network language model with user characteristic features by social network crowdsourcing. ASRU 2015: 84-91 - [c37]Yi-Hsiu Liao, Hung-yi Lee, Lin-Shan Lee:
Towards structured deep neural network for automatic speech recognition. ASRU 2015: 137-144 - [c36]Cheng-Tao Chung, Cheng-Yu Tsai, Hsiang-Hung Lu, Chia-Hsiang Liu, Hung-yi Lee, Lin-Shan Lee:
An iterative deep learning framework for unsupervised discovery of speech features and linguistic units with applications on spoken term detection. ASRU 2015: 245-251 - [c35]Hung-tsung Lu, Yuan-ming Liou, Hung-yi Lee, Lin-Shan Lee:
Semantic retrieval of personal photos using a deep autoencoder fusing visual features with speech annotations represented as word/paragraph vectors. INTERSPEECH 2015: 140-144 - [c34]Sheng-syun Shen, Hung-yi Lee, Shang-wen Li, Victor Zue, Lin-Shan Lee:
Structuring lectures in massive open online courses (MOOCs) for efficient learning by linking similar sections and predicting prerequisites. INTERSPEECH 2015: 1363-1367 - [c33]Ching-feng Yeh, Yuan-ming Liou, Hung-yi Lee, Lin-Shan Lee:
Personalized speech recognizer with keyword-based personalized lexicon and language model using word vector representations. INTERSPEECH 2015: 3521-3525 - [i4]Yi-Hsiu Liao, Hung-yi Lee, Lin-Shan Lee:
Towards Structured Deep Neural Network for Automatic Speech Recognition. CoRR abs/1506.01163 (2015) - [i3]Bo-Hsiang Tseng, Hung-yi Lee, Lin-Shan Lee:
Personalizing a Universal Recurrent Neural Network Language Model with User Characteristic Features by Crowdsouring over Social Networks. CoRR abs/1506.01192 (2015) - [i2]Cheng-Tao Chung, Cheng-Yu Tsai, Hsiang-Hung Lu, Yuan-ming Liou, Yen-Chen Wu, Yen-Ju Lu, Hung-yi Lee, Lin-Shan Lee:
A Multi-layered Acoustic Tokenizing Deep Neural Network (MAT-DNN) for Unsupervised Discovery of Linguistic Units and Generation of High Quality Features. CoRR abs/1506.02327 (2015) - [i1]Yi-Hsiu Liao, Hung-yi Lee, Lin-Shan Lee:
Towards Structured Deep Neural Network for Automatic Speech Recognition. CoRR abs/1511.02506 (2015) - 2014
- [j6]Hung-yi Lee, Po-wei Chou, Lin-Shan Lee:
Improved open-vocabulary spoken content retrieval with word and subword lattices using acoustic feature similarity. Comput. Speech Lang. 28(5): 1045-1065 (2014) - [j5]Hung-yi Lee, Lin-Shan Lee:
Improved Semantic Retrieval of Spoken Content by Document/Query Expansion with Random Walk Over Acoustic Similarity Graphs. IEEE ACM Trans. Audio Speech Lang. Process. 22(1): 80-94 (2014) - [j4]Hung-yi Lee, Sz-Rung Shiang, Ching-feng Yeh, Yun-Nung Chen, Yu Huang, Sheng-yi Kong, Lin-Shan Lee:
Spoken Knowledge Organization by Semantic Structuring and a Prototype Course Lecture System for Personalized Learning. IEEE ACM Trans. Audio Speech Lang. Process. 22(5): 881-896 (2014) - [c32]Hung-yi Lee, Han-Pang Huang, Huan-Kun Hsu:
Lifting motion planning for humanoid robots. CASE 2014: 1174-1179 - [c31]Sz-Rung Shiang, Hung-yi Lee, Lin-Shan Lee:
Spoken question answering using tree-structured conditional random fields and two-layer random walk. INTERSPEECH 2014: 263-267 - [c30]Han Lu, Sheng-syun Shen, Sz-Rung Shiang, Hung-yi Lee, Lin-Shan Lee:
Alignment of spoken utterances with slide content for easier learning with recorded lectures using structured support vector machine (SVM). INTERSPEECH 2014: 1473-1477 - [c29]Yuan-ming Liou, Yi-Sheng Fu, Hung-yi Lee, Lin-Shan Lee:
Semantic retrieval of personal photos using matrix factorization and two-layer random walk fusing sparse speech annotations with visual features. INTERSPEECH 2014: 1762-1766 - [c28]Hung-yi Lee, Yu Zhang, Ekapol Chuangsuwanich, James R. Glass:
Graph-based re-ranking using acoustic feature similarity between search results for spoken term detection on low-resource languages. INTERSPEECH 2014: 2479-2483 - 2013
- [j3]Hung-yi Lee, Lin-Shan Lee:
Enhanced Spoken Term Detection Using Support Vector Machines and Weighted Pseudo Examples. IEEE Trans. Speech Audio Process. 21(6): 1272-1284 (2013) - [c27]Yun-Chiao Li, Hung-yi Lee, Cheng-Tao Chung, Chun-an Chan, Lin-Shan Lee:
Towards unsupervised semantic retrieval of spoken content with query expansion based on automatically discovered acoustic patterns. ASRU 2013: 198-203 - [c26]Hung-yi Lee, Yun-Chiao Li, Cheng-Tao Chung, Lin-Shan Lee:
Enhancing query expansion for semantic retrieval of spoken content with automatically discovered acoustic patterns. ICASSP 2013: 8297-8301 - [c25]Hung-yi Lee, Yu-Yu Chou, Yow-Bang Wang, Lin-Shan Lee:
Unsupervised domain adaptation for spoken document summarization with structured support vector machine. ICASSP 2013: 8347-8351 - [c24]Tsung-Hsien Wen, Hung-yi Lee, Pei-hao Su, Lin-Shan Lee:
Interactive spoken content retrieval by extended query model and continuous state space Markov Decision Process. ICASSP 2013: 8510-8514 - [c23]Hung-yi Lee, Ting-Yao Hu, How Jing, Yun-Fan Chang, Yu Tsao, Yu-Cheng Kao, Tsang-Long Pao:
Ensemble of machine learning and acoustic segment model techniques for speech emotion and autism spectrum disorders recognition. INTERSPEECH 2013: 215-219 - [c22]Ching-feng Yeh, Hung-yi Lee, Lin-Shan Lee:
Speaking rate normalization with lattice-based context-dependent phoneme duration modeling for personalized speech recognizers on mobile devices. INTERSPEECH 2013: 1741-1745 - [c21]Tsung-Hsien Wen, Aaron Heidel, Hung-yi Lee, Yu Tsao, Lin-Shan Lee:
Recurrent neural network based language model personalization by social network crowdsourcing. INTERSPEECH 2013: 2703-2707 - [c20]Sz-Rung Shiang, Hung-yi Lee, Lin-Shan Lee:
Supervised spoken document summarization based on structured support vector machine with utterance clusters as hidden variables. INTERSPEECH 2013: 2728-2732 - 2012
- [j2]Yi-Cheng Pan, Hung-yi Lee, Lin-Shan Lee:
Interactive Spoken Document Retrieval With Suggested Key Terms Ranked by a Markov Decision Process. IEEE Trans. Speech Audio Process. 20(2): 632-645 (2012) - [j1]Hung-yi Lee, Chia-Ping Chen, Lin-Shan Lee:
Integrating Recognition and Retrieval With Relevance Feedback for Spoken Term Detection. IEEE Trans. Speech Audio Process. 20(7): 2095-2110 (2012) - [c19]Ching-feng Yeh, Aaron Heidel, Hung-yi Lee, Lin-Shan Lee:
Recognition of highly imbalanced code-mixed bilingual speech with frame-level language detection based on blurred posteriorgram. ICASSP 2012: 4873-4876 - [c18]Yun-Nung Chen, Yu Huang, Hung-yi Lee, Lin-Shan Lee:
Unsupervised two-stage keyword extraction from spoken documents by topic coherence and support vector machine. ICASSP 2012: 5041-5044 - [c17]Hung-yi Lee, Yun-Nung Chen, Lin-Shan Lee:
Utterance-level latent topic transition modeling for spoken documents and its application in automatic summarization. ICASSP 2012: 5065-5068 - [c16]Tsung-wei Tu, Hung-yi Lee, Yu-Yu Chou, Lin-Shan Lee:
Semantic query expansion and context-based discriminative term modeling for spoken document retrieval. ICASSP 2012: 5085-5088 - [c15]Hung-yi Lee, Po-wei Chou, Lin-Shan Lee:
Open-Vocabulary Retrieval of Spoken Content with Shorter/Longer Queries Considering Word/Subword-based Acoustic Feature Similarity. INTERSPEECH 2012: 2077-2080 - [c14]Hung-yi Lee, Yu-Yu Chou, Yow-Bang Wang, Lin-Shan Lee:
Supervised Spoken Document Summarization jointly Considering Utterance Importance and Redundancy by Structured Support Vector Machine. INTERSPEECH 2012: 2342-2345 - [c13]Tsung-Hsien Wen, Hung-yi Lee, Lin-Shan Lee:
Interactive Spoken Content Retrieval with Different Types of Actions Optimized By a Markov Decision Process. INTERSPEECH 2012: 2458-2461 - [c12]Hung-yi Lee
, Tsung-Hsien Wen, Lin-Shan Lee:
Improved semantic retrieval of spoken content by language models enhanced with acoustic similarity graph. SLT 2012: 182-187 - [c11]Tsung-Hsien Wen, Hung-yi Lee
, Tai-Yuan Chen, Lin-Shan Lee:
Personalized language modeling by crowd sourcing with social network data for voice access of cloud applications. SLT 2012: 188-193 - 2011
- [c10]Tsung-wei Tu, Hung-yi Lee
, Lin-Shan Lee:
Improved spoken term detection using support vector machines with acoustic and context features from pseudo-relevance feedback. ASRU 2011: 383-388 - [c9]Yun-Nung Chen, Chia-Ping Chen, Hung-yi Lee
, Chun-an Chan, Lin-Shan Lee:
Improved spoken term detection with graph-based re-ranking in feature space. ICASSP 2011: 5644-5647 - [c8]Hung-yi Lee
, Tsung-wei Tu, Chia-Ping Chen, Chao-Yu Huang, Lin-Shan Lee:
Improved spoken term detection using support vector machines based on lattice context consistency. ICASSP 2011: 5648-5651 - 2010
- [c7]Yu-Hui Chen, Chia-Chen Chou, Hung-yi Lee, Lin-Shan Lee:
An initial attempt to improve spoken term detection by learning optimal weights for different indexing features. ICASSP 2010: 5278-5281 - [c6]Hung-yi Lee, Lin-Shan Lee:
Integrating recognition and retrieval with user feedback: A new framework for spoken term detection. ICASSP 2010: 5290-5293 - [c5]Hung-yi Lee, Chia-Ping Chen, Ching-feng Yeh, Lin-Shan Lee:
Improved spoken term detection by discriminative training of acoustic models based on user relevance feedback. INTERSPEECH 2010: 1273-1276 - [c4]Chia-Ping Chen, Hung-yi Lee, Ching-feng Yeh, Lin-Shan Lee:
Improved spoken term detection by feature space pseudo-relevance feedback. INTERSPEECH 2010: 1672-1675 - [c3]Hung-yi Lee
, Chia-Ping Chen, Ching-feng Yeh, Lin-Shan Lee:
A framework integrating different relevance feedback scenarios and approaches for spoken term detection. SLT 2010: 389-394
2000 – 2009
- 2009
- [c2]Hung-yi Lee, Yueh-Lien Tang, Hao Tang, Lin-Shan Lee:
Spoken term detection from bilingual spontaneous speech using code-switched lattice-based structures for words and subword units. ASRU 2009: 410-415 - [c1]Chao-Hong Meng, Hung-yi Lee, Lin-Shan Lee:
Improved lattice-based spoken document retrieval by directly learning from the evaluation measures. ICASSP 2009: 4893-4896
Coauthor Index
Kai-Wei Chang
aka: Kaiwei Chang
Shuyan Dong
aka: Shuyan Annie Dong
David F. Harwath
aka: David Harwath
Chien-Yu Huang
aka: Chien-yu Huang
Cheng-I Lai
aka: Cheng-I Jeff Lai
Helen M. Meng
aka: Helen Meng
Abdel-rahman Mohamed
aka: Abdelrahman Mohamed