Yuan Gong 0001
Person information
- affiliation: Massachusetts Institute of Technology, Cambridge, MA, USA
Other persons with the same name
- Yuan Gong — disambiguation page
- Yuan Gong 0002 — Tsinghua University, Beijing, China
- Yuan Gong 0003 — Ohio State University, Columbus, OH, USA
- Yuan Gong 0004 — Peking University, School of Software and Microelectronics, Beijing, China
- Yuan Gong 0005 — University of Electronic Science & Technology of China, Chengdu, China (and 1 more)
2020 – today
- 2024
- [c20] Yuan Gong, Hongyin Luo, Alexander H. Liu, Leonid Karlinsky, James R. Glass:
Listen, Think, and Understand. ICLR 2024
- [c19] Tianhua Zhang, Jiaxin Ge, Hongyin Luo, Yung-Sung Chuang, Mingye Gao, Yuan Gong, Yoon Kim, Xixin Wu, Helen Meng, Jim Glass:
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning. NAACL-HLT (Findings) 2024: 4131-4155
- [i24] Andrew Rouditchenko, Yuan Gong, Samuel Thomas, Leonid Karlinsky, Hilde Kuehne, Rogério Feris, James R. Glass:
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation. CoRR abs/2406.10082 (2024)
- [i23] Liming Wang, Yuan Gong, Nauman Dawalatabad, Marco Vilela, Katerina Placek, Brian Tracey, Yishu Gong, Alan Premasiri, Fernando Vieira, James R. Glass:
Automatic Prediction of Amyotrophic Lateral Sclerosis Progression using Longitudinal Speech Transformer. CoRR abs/2406.18625 (2024)
- 2023
- [c18] Yuan Gong, Alexander H. Liu, Hongyin Luo, Leonid Karlinsky, James R. Glass:
Joint Audio and Speech Understanding. ASRU 2023: 1-8
- [c17] Hongyin Luo, Tianhua Zhang, Yung-Sung Chuang, Yuan Gong, Yoon Kim, Xixin Wu, Helen Meng, James R. Glass:
Search Augmented Instruction Learning. EMNLP (Findings) 2023: 3717-3729
- [c16] Yuan Gong, Andrew Rouditchenko, Alexander H. Liu, David Harwath, Leonid Karlinsky, Hilde Kuehne, James R. Glass:
Contrastive Audio-Visual Masked Autoencoder. ICLR 2023
- [c15] Yuan Gong, Sameer Khurana, Leonid Karlinsky, James R. Glass:
Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers. INTERSPEECH 2023: 2798-2802
- [i22] Yuan Gong, Hongyin Luo, Alexander H. Liu, Leonid Karlinsky, James R. Glass:
Listen, Think, and Understand. CoRR abs/2305.10790 (2023)
- [i21] Hongyin Luo, Yung-Sung Chuang, Yuan Gong, Tianhua Zhang, Yoon Kim, Xixin Wu, Danny Fox, Helen Meng, James R. Glass:
SAIL: Search-Augmented Instruction Learning. CoRR abs/2305.15225 (2023)
- [i20] Yuan Gong, Sameer Khurana, Leonid Karlinsky, James R. Glass:
Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers. CoRR abs/2307.03183 (2023)
- [i19] Tianhua Zhang, Jiaxin Ge, Hongyin Luo, Yung-Sung Chuang, Mingye Gao, Yuan Gong, Xixin Wu, Yoon Kim, Helen Meng, James R. Glass:
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning. CoRR abs/2309.10814 (2023)
- [i18] Yuan Gong, Alexander H. Liu, Hongyin Luo, Leonid Karlinsky, James R. Glass:
Joint Audio and Speech Understanding. CoRR abs/2309.14405 (2023)
- 2022
- [j3] Yuan Gong, Alexander H. Liu, Andrew Rouditchenko, James R. Glass:
UAVM: Towards Unifying Audio and Visual Models. IEEE Signal Process. Lett. 29: 2437-2441 (2022)
- [c14] Yuan Gong, Cheng-I Lai, Yu-An Chung, James R. Glass:
SSAST: Self-Supervised Audio Spectrogram Transformer. AAAI 2022: 10699-10709
- [c13] Nauman Dawalatabad, Yuan Gong, Sameer Khurana, Rhoda Au, James R. Glass:
Detecting Dementia from Long Neuropsychological Interviews. EMNLP (Findings) 2022: 5270-5283
- [c12] Yuan Gong, Jin Yu, James R. Glass:
Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition. ICASSP 2022: 151-155
- [c11] Yuan Gong, Ziyi Chen, Iek-Heng Chu, Peng Chang, James R. Glass:
Transformer-Based Multi-Aspect Multi-Granularity Non-Native English Speaker Pronunciation Assessment. ICASSP 2022: 7262-7266
- [i17] Yuan Gong, Sameer Khurana, Andrew Rouditchenko, James R. Glass:
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification. CoRR abs/2203.06760 (2022)
- [i16] Yuan Gong, Ziyi Chen, Iek-Heng Chu, Peng Chang, James R. Glass:
Transformer-Based Multi-Aspect Multi-Granularity Non-Native English Speaker Pronunciation Assessment. CoRR abs/2205.03432 (2022)
- [i15] Yuan Gong, Jin Yu, James R. Glass:
Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition. CoRR abs/2205.03433 (2022)
- [i14] Yuan Gong, Alexander H. Liu, Andrew Rouditchenko, James R. Glass:
UAVM: A Unified Model for Audio-Visual Learning. CoRR abs/2208.00061 (2022)
- [i13] Yuan Gong, Andrew Rouditchenko, Alexander H. Liu, David Harwath, Leonid Karlinsky, Hilde Kuehne, James R. Glass:
Contrastive Audio-Visual Masked Autoencoder. CoRR abs/2210.07839 (2022)
- 2021
- [j2] Yuan Gong, Yu-An Chung, James R. Glass:
PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3292-3306 (2021)
- [c10] Yuan Gong, Yu-An Chung, James R. Glass:
AST: Audio Spectrogram Transformer. Interspeech 2021: 571-575
- [i12] Yuan Gong, Yu-An Chung, James R. Glass:
PSLA: Improving Audio Event Classification with Pretraining, Sampling, Labeling, and Aggregation. CoRR abs/2102.01243 (2021)
- [i11] Yuan Gong, Yu-An Chung, James R. Glass:
AST: Audio Spectrogram Transformer. CoRR abs/2104.01778 (2021)
- [i10] Yuan Gong, Cheng-I Jeff Lai, Yu-An Chung, James R. Glass:
SSAST: Self-Supervised Audio Spectrogram Transformer. CoRR abs/2110.09784 (2021)
- 2020
- [j1] Yuan Gong, Jian Yang, Christian Poellabauer:
Detecting Replay Attacks Using Multi-Channel Audio: A Neural Network-Based Method. IEEE Signal Process. Lett. 27: 920-924 (2020)
- [d1] Yuan Gong, Jian Yang, Jacob Huber, Mitchell MacKnight, Christian Poellabauer:
ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems. IEEE DataPort, 2020
- [i9] Yuan Gong, Jian Yang, Christian Poellabauer:
Detecting Replay Attacks Using Multi-Channel Audio: A Neural Network-Based Method. CoRR abs/2003.08225 (2020)
2010 – 2019
- 2019
- [c9] Bryan (Ning) Xia, Yuan Gong, Yizhe Zhang, Christian Poellabauer:
Second-Order Non-Local Attention Networks for Person Re-Identification. ICCV 2019: 3759-3768
- [c8] Yuan Gong, Boyang Li, Christian Poellabauer, Yiyu Shi:
Real-Time Adversarial Attacks. IJCAI 2019: 4672-4680
- [c7] Yuan Gong, Jian Yang, Jacob Huber, Mitchell MacKnight, Christian Poellabauer:
ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems. INTERSPEECH 2019: 2355-2359
- [i8] Yuan Gong, Jian Yang, Jacob Huber, Mitchell MacKnight, Christian Poellabauer:
ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems. CoRR abs/1904.03365 (2019)
- [i7] Yuan Gong, Boyang Li, Christian Poellabauer, Yiyu Shi:
Real-Time Adversarial Attacks. CoRR abs/1905.13399 (2019)
- [i6] Bryan (Ning) Xia, Yuan Gong, Yizhe Zhang, Christian Poellabauer:
Second-order Non-local Attention Networks for Person Re-identification. CoRR abs/1909.00295 (2019)
- 2018
- [c6] Yuan Gong, Hasini Yatawatte, Christian Poellabauer, Sandra L. Schneider, Susan Latham:
Automatic Autism Spectrum Disorder Detection Using Everyday Vocalizations Captured by Smart Devices. BCB 2018: 465-473
- [c5] Yuan Gong, Kevin Shin, Christian Poellabauer:
Improving LIWC Using Soft Word Matching. BCB 2018: 523
- [c4] Yuan Gong, Christian Poellabauer:
Protecting Voice Controlled Systems Using Sound Source Identification Based on Acoustic Cues. ICCCN 2018: 1-9
- [c3] Yuan Gong, Christian Poellabauer:
Impact of Aliasing on Deep CNN-Based End-to-End Acoustic Models. INTERSPEECH 2018: 2698-2702
- [i5] Yuan Gong, Christian Poellabauer:
An Overview of Vulnerabilities of Voice Controlled Systems. CoRR abs/1803.09156 (2018)
- [i4] Yuan Gong, Christian Poellabauer:
Topic Modeling Based Multi-modal Depression Detection. CoRR abs/1803.10384 (2018)
- [i3] Yuan Gong, Christian Poellabauer:
Towards Learning Fine-Grained Disentangled Representations from Speech. CoRR abs/1808.02939 (2018)
- [i2] Yuan Gong, Christian Poellabauer:
Protecting Voice Controlled Systems Using Sound Source Identification Based on Acoustic Cues. CoRR abs/1811.07018 (2018)
- 2017
- [c2] Yuan Gong, Christian Poellabauer:
Continuous Assessment of Children's Emotional States Using Acoustic Analysis. ICHI 2017: 171-178
- [c1] Yuan Gong, Christian Poellabauer:
Topic Modeling Based Multi-modal Depression Detection. AVEC@ACM Multimedia 2017: 69-76
- [i1] Yuan Gong, Christian Poellabauer:
Crafting Adversarial Examples For Speech Paralinguistics Applications. CoRR abs/1711.03280 (2017)
last updated on 2024-10-23 21:27 CEST by the dblp team
all metadata released as open data under CC0 1.0 license