Florian Metze
Person information
- affiliation: Carnegie Mellon University, Pittsburgh, USA
2020 – today
- 2024
- [c215]Jackson Michaels, Juncheng B. Li, Laura Yao, Lijun Yu, Zach Wood-Doughty, Florian Metze:
Audio-Journey: Open Domain Latent Diffusion Based Text-To-Audio Generation. ICASSP 2024: 6960-6964
- 2023
- [j16]Siddharth Dalmia, Dmytro Okhonko, Mike Lewis, Sergey Edunov, Shinji Watanabe, Florian Metze, Luke Zettlemoyer, Abdelrahman Mohamed:
LegoNN: Building Modular Encoder-Decoder Models. IEEE ACM Trans. Audio Speech Lang. Process. 31: 3112-3126 (2023) - [c214]Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, Alan W. Black, Shinji Watanabe:
CTC Alignments Improve Autoregressive Translation. EACL 2023: 1615-1631
- 2022
- [c213]Xinjian Li, Florian Metze, David R. Mortensen, Shinji Watanabe, Alan W. Black:
Zero-shot Learning for Grapheme to Phoneme Conversion with Language Ensemble. ACL (Findings) 2022: 2106-2115 - [c212]Triantafyllos Afouras, Yuki M. Asano, Francois Fagan, Andrea Vedaldi, Florian Metze:
Self-supervised object detection from audio-visual correspondence. CVPR 2022: 10565-10576 - [c211]Yookoon Park, Mahmoud Azab, Seungwhan Moon, Bo Xiong, Florian Metze, Gourab Kundu, Kirmani Ahmed:
Normalized Contrastive Learning for Text-Video Retrieval. EMNLP 2022: 248-260 - [c210]Shruti Palaskar, Akshita Bhagia, Yonatan Bisk, Florian Metze, Alan W. Black, Ana Marasovic:
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization. EMNLP (Findings) 2022: 2644-2657 - [c209]Siddhant Arora, Siddharth Dalmia, Brian Yan, Florian Metze, Alan W. Black, Shinji Watanabe:
Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models. EMNLP (Findings) 2022: 5419-5429 - [c208]Juncheng B. Li, Shuhui Qu, Xinjian Li, Bernie Po-Yao Huang, Florian Metze:
On Adversarial Robustness Of Large-Scale Audio Visual Learning. ICASSP 2022: 231-235 - [c207]Roshan Sharma, Shruti Palaskar, Alan W. Black, Florian Metze:
End-to-End Speech Summarization Using Restricted Self-Attention. ICASSP 2022: 8072-8076 - [c206]Juncheng Li, Shuhui Qu, Po-Yao Huang, Florian Metze:
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification. INTERSPEECH 2022: 1521-1525 - [c205]Xinjian Li, Florian Metze, David R. Mortensen, Alan W. Black, Shinji Watanabe:
ASR2K: Speech Recognition for Around 2000 Languages without Audio. INTERSPEECH 2022: 4885-4889 - [c204]Xinjian Li, Florian Metze, David R. Mortensen, Alan W. Black, Shinji Watanabe:
Phone Inventories and Recognition for Every Language. LREC 2022: 1061-1067 - [c203]Po-Yao Huang, Hu Xu, Juncheng Li, Alexei Baevski, Michael Auli, Wojciech Galuba, Florian Metze, Christoph Feichtenhofer:
Masked Autoencoders that Listen. NeurIPS 2022 - [i78]Juncheng B. Li, Shuhui Qu, Xinjian Li, Po-Yao Huang, Florian Metze:
On Adversarial Robustness of Large-scale Audio Visual Learning. CoRR abs/2203.12122 (2022) - [i77]Juncheng B. Li, Shuhui Qu, Po-Yao Huang, Florian Metze:
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification. CoRR abs/2203.13448 (2022) - [i76]Juncheng B. Li, Shuhui Qu, Florian Metze:
Robustness of Neural Architectures for Audio Event Detection. CoRR abs/2205.03268 (2022) - [i75]Shruti Palaskar, Akshita Bhagia, Yonatan Bisk, Florian Metze, Alan W. Black, Ana Marasovic:
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization. CoRR abs/2205.11686 (2022) - [i74]Siddharth Dalmia, Dmytro Okhonko, Mike Lewis, Sergey Edunov, Shinji Watanabe, Florian Metze, Luke Zettlemoyer, Abdelrahman Mohamed:
LegoNN: Building Modular Encoder-Decoder Models. CoRR abs/2206.03318 (2022) - [i73]Po-Yao Huang, Hu Xu, Juncheng Li, Alexei Baevski, Michael Auli, Wojciech Galuba, Florian Metze, Christoph Feichtenhofer:
Masked Autoencoders that Listen. CoRR abs/2207.06405 (2022) - [i72]Xinjian Li, Florian Metze, David R. Mortensen, Alan W. Black, Shinji Watanabe:
ASR2K: Speech Recognition for Around 2000 Languages without Audio. CoRR abs/2209.02842 (2022) - [i71]Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, Alan W. Black, Shinji Watanabe:
CTC Alignments Improve Autoregressive Translation. CoRR abs/2210.05200 (2022) - [i70]Zheng Wang, Juncheng B. Li, Shuhui Qu, Florian Metze, Emma Strubell:
SQuAT: Sharpness- and Quantization-Aware Training for BERT. CoRR abs/2210.07171 (2022) - [i69]Siddhant Arora, Siddharth Dalmia, Brian Yan, Florian Metze, Alan W. Black, Shinji Watanabe:
Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models. CoRR abs/2210.15734 (2022) - [i68]Zheng Wang, Juncheng B. Li, Shuhui Qu, Florian Metze, Emma Strubell:
Error-aware Quantization through Noise Tempering. CoRR abs/2212.05603 (2022) - [i67]Yookoon Park, Mahmoud Azab, Bo Xiong, Seungwhan Moon, Florian Metze, Gourab Kundu, Kirmani Ahmed:
Normalized Contrastive Learning for Text-Video Retrieval. CoRR abs/2212.11790 (2022)
- 2021
- [c202]Hu Xu, Gargi Ghosh, Po-Yao Huang, Prahal Arora, Masoumeh Aminzadeh, Christoph Feichtenhofer, Florian Metze, Luke Zettlemoyer:
VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding. ACL/IJCNLP (Findings) 2021: 4227-4239 - [c201]Amanda Cardoso Duarte, Shruti Palaskar, Lucas Ventura, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, Xavier Giró-i-Nieto:
How2Sign: A Large-Scale Multimodal Dataset for Continuous American Sign Language. CVPR 2021: 2735-2744 - [c200]Abhilasha Ravichander, Siddharth Dalmia, Maria Ryskina, Florian Metze, Eduard H. Hovy, Alan W. Black:
NoiseQA: Challenge Set Evaluation for User-Centric Question Answering. EACL 2021: 2976-2992 - [c199]Hu Xu, Gargi Ghosh, Po-Yao Huang, Dmytro Okhonko, Armen Aghajanyan, Florian Metze, Luke Zettlemoyer, Christoph Feichtenhofer:
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding. EMNLP (1) 2021: 6787-6800 - [c198]Juncheng B. Li, Kaixin Ma, Shuhui Qu, Po-Yao Huang, Florian Metze:
Audio-Visual Event Recognition Through the Lens of Adversary. ICASSP 2021: 616-620 - [c197]Xinjian Li, David R. Mortensen, Florian Metze, Alan W. Black:
Multilingual Phonetic Dataset for Low Resource Speech Recognition. ICASSP 2021: 6958-6962 - [c196]Xinjian Li, Juncheng Li, Jiali Yao, Alan W. Black, Florian Metze:
Phone Distribution Estimation for Low Resource Languages. ICASSP 2021: 7233-7237 - [c195]Mandela Patrick, Po-Yao Huang, Ishan Misra, Florian Metze, Andrea Vedaldi, Yuki M. Asano, João F. Henriques:
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning. ICCV 2021: 10540-10552 - [c194]Mandela Patrick, Po-Yao Huang, Yuki Markus Asano, Florian Metze, Alexander G. Hauptmann, João F. Henriques, Andrea Vedaldi:
Support-set bottlenecks for video-text representation learning. ICLR 2021 - [c193]Shruti Palaskar, Ruslan Salakhutdinov, Alan W. Black, Florian Metze:
Multimodal Speech Summarization Through Semantic Concept Learning. Interspeech 2021: 791-795 - [c192]Siddhant Arora, Alissa Ostapenko, Vijay Viswanathan, Siddharth Dalmia, Florian Metze, Shinji Watanabe, Alan W. Black:
Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding. Interspeech 2021: 1264-1268 - [c191]Xinjian Li, Juncheng Li, Florian Metze, Alan W. Black:
Hierarchical Phone Recognition with Compositional Phonetics. Interspeech 2021: 2461-2465 - [c190]Brian Yan, Siddharth Dalmia, David R. Mortensen, Florian Metze, Shinji Watanabe:
Differentiable Allophone Graphs for Language-Universal Speech Recognition. Interspeech 2021: 2471-2475 - [c189]Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze, Shinji Watanabe:
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks. NAACL-HLT 2021: 1882-1896 - [c188]Poyao Huang, Mandela Patrick, Junjie Hu, Graham Neubig, Florian Metze, Alex Hauptmann:
Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models. NAACL-HLT 2021: 2443-2459 - [c187]Mandela Patrick, Dylan Campbell, Yuki M. Asano, Ishan Misra, Florian Metze, Christoph Feichtenhofer, Andrea Vedaldi, João F. Henriques:
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers. NeurIPS 2021: 12493-12506 - [e5]Heng Tao Shen, Yueting Zhuang, John R. Smith, Yang Yang, Pablo César, Florian Metze, Balakrishnan Prabhakaran:
MM '21: ACM Multimedia Conference, Virtual Event, China, October 20 - 24, 2021. ACM 2021, ISBN 978-1-4503-8651-7 [contents] - [i66]Abhilasha Ravichander, Siddharth Dalmia, Maria Ryskina, Florian Metze, Eduard H. Hovy, Alan W. Black:
NoiseQA: Challenge Set Evaluation for User-Centric Question Answering. CoRR abs/2102.08345 (2021) - [i65]Po-Yao Huang, Mandela Patrick, Junjie Hu, Graham Neubig, Florian Metze, Alexander G. Hauptmann:
Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models. CoRR abs/2103.08849 (2021) - [i64]Mandela Patrick, Yuki Markus Asano, Bernie Huang, Ishan Misra, Florian Metze, João F. Henriques, Andrea Vedaldi:
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning. CoRR abs/2103.10211 (2021) - [i63]Triantafyllos Afouras, Yuki Markus Asano, Francois Fagan, Andrea Vedaldi, Florian Metze:
Self-supervised object detection from audio-visual correspondence. CoRR abs/2104.06401 (2021) - [i62]Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze, Shinji Watanabe:
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks. CoRR abs/2105.00573 (2021) - [i61]Hu Xu, Gargi Ghosh, Po-Yao Huang, Prahal Arora, Masoumeh Aminzadeh, Christoph Feichtenhofer, Florian Metze, Luke Zettlemoyer:
VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding. CoRR abs/2105.09996 (2021) - [i60]Mandela Patrick, Dylan Campbell, Yuki Markus Asano, Ishan Misra, Florian Metze, Christoph Feichtenhofer, Andrea Vedaldi, João F. Henriques:
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers. CoRR abs/2106.05392 (2021) - [i59]Siddhant Arora, Alissa Ostapenko, Vijay Viswanathan, Siddharth Dalmia, Florian Metze, Shinji Watanabe, Alan W. Black:
Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding. CoRR abs/2106.15065 (2021) - [i58]Brian Yan, Siddharth Dalmia, David R. Mortensen, Florian Metze, Shinji Watanabe:
Differentiable Allophone Graphs for Language-Universal Speech Recognition. CoRR abs/2107.11628 (2021) - [i57]Hu Xu, Gargi Ghosh, Po-Yao Huang, Dmytro Okhonko, Armen Aghajanyan, Florian Metze, Luke Zettlemoyer, Christoph Feichtenhofer:
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding. CoRR abs/2109.14084 (2021) - [i56]Roshan Sharma, Shruti Palaskar, Alan W. Black, Florian Metze:
Speech Summarization using Restricted Self-Attention. CoRR abs/2110.06263 (2021)
- 2020
- [j15]Shruti Palaskar, Ramon Sanabria, Florian Metze:
Transfer learning for multimodal dialog. Comput. Speech Lang. 64: 101093 (2020) - [j14]Lucia Specia, Loïc Barrault, Ozan Caglayan, Amanda Cardoso Duarte, Desmond Elliott, Spandana Gella, Nils Holzenberger, Chiraag Lala, Sun Jae Lee, Jindrich Libovický, Pranava Madhyastha, Florian Metze, Karl Mulligan, Alissa Ostapenko, Shruti Palaskar, Ramon Sanabria, Josiah Wang, Raman Arora:
Grounded Sequence to Sequence Transduction. IEEE J. Sel. Top. Signal Process. 14(3): 577-591 (2020) - [j13]Odette Scharenborg, Lucas Ondel, Shruti Palaskar, Philip Arthur, Francesco Ciannella, Mingxing Du, Elin Larsen, Danny Merkx, Rachid Riad, Liming Wang, Emmanuel Dupoux, Laurent Besacier, Alan W. Black, Mark Hasegawa-Johnson, Florian Metze, Graham Neubig, Sebastian Stüker, Pierre Godard, Markus Müller:
Speech Technology for Unwritten Languages. IEEE ACM Trans. Audio Speech Lang. Process. 28: 964-975 (2020) - [j12]Fengquan Dong, Kun Qian, Zhao Ren, Alice Baird, Xinjian Li, Zhenyu Dai, Bo Dong, Florian Metze, Yoshiharu Yamamoto, Björn W. Schuller:
Machine Listening for Heart Status Monitoring: Introducing and Benchmarking HSS - The Heart Sounds Shenzhen Corpus. IEEE J. Biomed. Health Informatics 24(7): 2082-2092 (2020) - [c186]Xinjian Li, Siddharth Dalmia, David R. Mortensen, Juncheng Li, Alan W. Black, Florian Metze:
Towards Zero-Shot Learning for Automatic Phonemic Transcription. AAAI 2020: 8261-8268 - [c185]Zhong Zhou, Isak Czeresnia Etinger, Florian Metze, Alexander Hauptmann, Alexander Waibel:
Gun Source and Muzzle Head Detection. IMAWM 2020: 1-11 - [c184]Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott:
Fine-Grained Grounding for Multimodal Speech Recognition. EMNLP (Findings) 2020: 2667-2677 - [c183]Vikas Raunak, Siddharth Dalmia, Vivek Gupta, Florian Metze:
On Long-Tailed Phenomena in Neural Machine Translation. EMNLP (Findings) 2020: 3088-3095 - [c182]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Looking Enhances Listening: Recovering Missing Speech Using Images. ICASSP 2020: 6304-6308 - [c181]Anirudh Mani, Shruti Palaskar, Nimshi Venkat Meripo, Sandeep Konam, Florian Metze:
ASR Error Correction and Domain Adaptation Using Machine Translation. ICASSP 2020: 6344-6348 - [c180]Xinjian Li, Siddharth Dalmia, Juncheng Li, Matthew Lee, Patrick Littell, Jiali Yao, Antonios Anastasopoulos, David R. Mortensen, Graham Neubig, Alan W. Black, Florian Metze:
Universal Phone Recognition with a Multilingual Allophone System. ICASSP 2020: 8249-8253 - [c179]Mahaveer Jain, Gil Keren, Jay Mahadeokar, Geoffrey Zweig, Florian Metze, Yatharth Saraf:
Contextual RNN-T for Open Domain ASR. INTERSPEECH 2020: 11-15 - [c178]Zimeng Qiu, Yiyuan Li, Xinjian Li, Florian Metze, William M. Campbell:
Towards Context-Aware End-to-End Code-Switching Speech Recognition. INTERSPEECH 2020: 4776-4780 - [c177]David R. Mortensen, Xinjian Li, Patrick Littell, Alexis Michaud, Shruti Rijhwani, Antonios Anastasopoulos, Alan W. Black, Florian Metze, Graham Neubig:
AlloVera: A Multilingual Allophone Database. LREC 2020: 5329-5336 - [c176]Vikas Raunak, Vaibhav Kumar, Vivek Gupta, Florian Metze:
On Dimensional Linguistic Properties of the Word Embedding Space. RepL4NLP@ACL 2020: 156-165 - [i55]Zhong Zhou, Isak Czeresnia Etinger, Florian Metze, Alexander G. Hauptmann, Alexander Waibel:
Gun Source and Muzzle Head Detection. CoRR abs/2001.11120 (2020) - [i54]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Looking Enhances Listening: Recovering Missing Speech Using Images. CoRR abs/2002.05639 (2020) - [i53]Xinjian Li, Siddharth Dalmia, David R. Mortensen, Juncheng Li, Alan W. Black, Florian Metze:
Towards Zero-shot Learning for Automatic Phonemic Transcription. CoRR abs/2002.11781 (2020) - [i52]Xinjian Li, Siddharth Dalmia, Juncheng Li, Matthew Lee, Patrick Littell, Jiali Yao, Antonios Anastasopoulos, David R. Mortensen, Graham Neubig, Alan W. Black, Florian Metze:
Universal Phone Recognition with a Multilingual Allophone System. CoRR abs/2002.11800 (2020) - [i51]Anirudh Mani, Shruti Palaskar, Nimshi Venkat Meripo, Sandeep Konam, Florian Metze:
ASR Error Correction and Domain Adaptation Using Machine Translation. CoRR abs/2003.07692 (2020) - [i50]David R. Mortensen, Xinjian Li, Patrick Littell, Alexis Michaud, Shruti Rijhwani, Antonios Anastasopoulos, Alan W. Black, Florian Metze, Graham Neubig:
AlloVera: A Multilingual Allophone Database. CoRR abs/2004.08031 (2020) - [i49]Amanda Cardoso Duarte, Shruti Palaskar, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, Xavier Giró-i-Nieto:
How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language. CoRR abs/2008.08143 (2020) - [i48]Ze Cheng, Juncheng Li, Chenxu Wang, Jixuan Gu, Hao Xu, Xinjian Li, Florian Metze:
Revisiting Factorizing Aggregated Posterior in Learning Disentangled Representations. CoRR abs/2009.05739 (2020) - [i47]Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott:
Fine-Grained Grounding for Multimodal Speech Recognition. CoRR abs/2010.02384 (2020) - [i46]Mandela Patrick, Po-Yao Huang, Yuki Markus Asano, Florian Metze, Alexander G. Hauptmann, João F. Henriques, Andrea Vedaldi:
Support-set bottlenecks for video-text representation learning. CoRR abs/2010.02824 (2020) - [i45]Vikas Raunak, Siddharth Dalmia, Vivek Gupta, Florian Metze:
On Long-Tailed Phenomena in Neural Machine Translation. CoRR abs/2010.04924 (2020) - [i44]Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott:
Multimodal Speech Recognition with Unstructured Audio Masking. CoRR abs/2010.08642 (2020) - [i43]Juncheng B. Li, Kaixin Ma, Shuhui Qu, Po-Yao Huang, Florian Metze:
Audio-Visual Event Recognition through the lens of Adversary. CoRR abs/2011.07430 (2020)
2010 – 2019
- 2019
- [j11]Niluthpol Chowdhury Mithun, Juncheng Li, Florian Metze, Amit K. Roy-Chowdhury:
Joint embeddings with multimodal cues for video-text retrieval. Int. J. Multim. Inf. Retr. 8(1): 3-18 (2019) - [j10]Okko Räsänen, Shreyas Seshadri, Julien Karadayi, Eric Riebling, John P. Bunce, Alejandrina Cristià, Florian Metze, Marisa Casillas, Celia Rosemberg, Elika Bergelson, Melanie Soderstrom:
Automatic word count estimation from daylong child-centered recordings in various language environments using language-independent syllabification of speech. Speech Commun. 113: 63-80 (2019) - [c175]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion. ACL (1) 2019: 1131-1141 - [c174]Shruti Palaskar, Jindrich Libovický, Spandana Gella, Florian Metze:
Multimodal Abstractive Summarization for How2 Videos. ACL (1) 2019: 6587-6596 - [c173]Yun Wang, Juncheng Li, Florian Metze:
A Comparison of Five Multiple Instance Learning Pooling Functions for Sound Event Detection with Weak Labeling. ICASSP 2019: 31-35 - [c172]Yun Wang, Florian Metze:
Connectionist Temporal Localization for Sound Event Detection with Sequential Labeling. ICASSP 2019: 745-749 - [c171]Siddharth Dalmia, Xinjian Li, Alan W. Black, Florian Metze:
Phoneme Level Language Models for Sequence Based Low Resource ASR. ICASSP 2019: 6091-6095 - [c170]Shruti Palaskar, Vikas Raunak, Florian Metze:
Learned in Speech Recognition: Contextual Acoustic Word Embeddings. ICASSP 2019: 6530-6534 - [c169]Nils Holzenberger, Shruti Palaskar, Pranava Madhyastha, Florian Metze, Raman Arora:
Learning from Multiview Correlations in Open-domain Videos. ICASSP 2019: 8628-8632 - [c168]Ozan Caglayan, Ramon Sanabria, Shruti Palaskar, Loïc Barrault, Florian Metze:
Multimodal Grounding for Sequence-to-sequence Speech Recognition. ICASSP 2019: 8648-8652 - [c167]Vikas Raunak, Sang Keun Choe, Quanyang Lu, Yi Xu, Florian Metze:
On Leveraging the Visual Modality for Neural Machine Translation. INLG 2019: 147-151 - [c166]Xinjian Li, Siddharth Dalmia, Alan W. Black, Florian Metze:
Multilingual Speech Recognition with Corpus Relatedness Sampling. INTERSPEECH 2019: 2120-2124 - [c165]Xinjian Li, Zhong Zhou, Siddharth Dalmia, Alan W. Black, Florian Metze:
SANTLR: Speech Annotation Toolkit for Low Resource Languages. INTERSPEECH 2019: 3681-3682 - [c164]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Cross-Attention End-to-End ASR for Two-Party Conversations. INTERSPEECH 2019: 4380-4384 - [c163]Florian Metze:
Survey Talk: Multimodal Processing of Speech and Language. INTERSPEECH 2019 - [c162]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
CMU's Machine Translation System for IWSLT 2019. IWSLT 2019 - [c161]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Multitask Learning For Different Subword Segmentations In Neural Machine Translation. IWSLT 2019 - [c160]Yasufumi Moriya, Ramon Sanabria, Florian Metze, Gareth J. F. Jones:
MediaEval 2019: Eyes and Ears Together. MediaEval 2019 - [c159]Suyoun Kim, Florian Metze:
Acoustic-to-Word Models with Conversational Context Information. NAACL-HLT (1) 2019: 2766-2771 - [c158]Juncheng Li, Shuhui Qu, Xinjian Li, Joseph Szurley, J. Zico Kolter, Florian Metze:
Adversarial Music: Real world Audio Adversary against Wake-word Detection System. NeurIPS 2019: 11908-11918 - [c157]Vikas Raunak, Vivek Gupta, Florian Metze:
Effective Dimensionality Reduction for Word Embeddings. RepL4NLP@ACL 2019: 235-243 - [i42]Eduard H. Hovy, Jaime G. Carbonell, Hans Chalupsky, Anatole Gershman, Alex Hauptmann, Florian Metze, Teruko Mitamura, Zaid Sheikh, Ankit Dangi, Aditi Chaudhary, Xianyang Chen, Xiang Kong, Bernie Huang, Salvador Medina, Hector Liu, Xuezhe Ma, Maria Ryskina, Ramon Sanabria, Varun Gangal:
OPERA: Operations-oriented Probabilistic Extraction, Reasoning, and Analysis. TAC 2019 - [i41]Shruti Palaskar, Vikas Raunak, Florian Metze:
Learned In Speech Recognition: Contextual Acoustic Word Embeddings. CoRR abs/1902.06833 (2019) - [i40]Siddharth Dalmia, Xinjian Li, Alan W. Black, Florian Metze:
Phoneme Level Language Models for Sequence Based Low Resource ASR. CoRR abs/1902.07613 (2019) - [i39]Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori S. Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard H. Hovy, Alan W. Black, Jaime G. Carbonell, Graham Horwood, Shabnam Tafreshi, Mona T. Diab, Efsun Sarioglu Kayi, Noura Farra, Kathleen R. McKeown:
The ARIEL-CMU Systems for LoReHLT18. CoRR abs/1902.08899 (2019) - [i38]Suyoun Kim, Florian Metze:
Acoustic-to-Word Models with Conversational Context Information. CoRR abs/1905.08796 (2019) - [i37]Yasufumi Moriya, Ramon Sanabria, Florian Metze, Gareth J. F. Jones:
Grounding Object Detections With Transcriptions. CoRR abs/1906.06147 (2019) - [i36]Shruti Palaskar, Jindrich Libovický, Spandana Gella, Florian Metze:
Multimodal Abstractive Summarization for How2 Videos. CoRR abs/1906.07901 (2019) - [i35]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion. CoRR abs/1906.11604 (2019) - [i34]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Analyzing Utility of Visual Context in Multimodal Speech Recognition Under Noisy Conditions. CoRR abs/1907.00477 (2019) - [i33]Suyoun Kim, Siddharth Dalmia, Florian Metze: