Daan van Esch
Journal Articles
- 2022
- [j1]Julia Kreutzer, Isaac Caswell, Lisa Wang, Ahsan Wahab, Daan van Esch, Nasanbayar Ulzii-Orshikh, Allahsera Tapo, Nishant Subramani, Artem Sokolov, Claytone Sikasote, Monang Setyawan, Supheakmungkol Sarin, Sokhar Samb, Benoît Sagot, Clara Rivera, Annette Rios, Isabel Papadimitriou, Salomey Osei, Pedro Javier Ortiz Suárez, Iroro Orife, Kelechi Ogueji, Andre Niyongabo Rubungo, Toan Q. Nguyen, Mathias Müller, André Müller, Shamsuddeen Hassan Muhammad, Nanda Muhammad, Ayanda Mnyakeni, Jamshidbek Mirzakhalov, Tapiwanashe Matangira, Colin Leong, Nze Lawson, Sneha Kudugunta, Yacine Jernite, Mathias Jenny, Orhan Firat, Bonaventure F. P. Dossou, Sakhile Dlamini, Nisansa de Silva, Sakine Çabuk Balli, Stella Biderman, Alessia Battisti, Ahmed Baruwa, Ankur Bapna, Pallavi Baljekar, Israel Abebe Azime, Ayodele Awokoya, Duygu Ataman, Orevaoghene Ahia, Oghenefego Ahia, Sweta Agrawal, Mofetoluwa Adeyemi:
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets. Trans. Assoc. Comput. Linguistics 10: 50-72 (2022)
Conference and Workshop Papers
- 2024
- [c15]Daan van Esch, Sandy Ritchie, Sebastian Ruder, Julia Kreutzer, Clara Rivera, Ishank Saxena, Isaac Caswell:
Connecting Language Technologies with Rich, Diverse Data Sources Covering Thousands of Languages. LREC/COLING 2024: 3729-3746
- [c14]Sandy Ritchie, Daan van Esch, Uche Okonkwo, Shikhar Vashishth, Emily Drummond:
LinguaMeta: Unified Metadata for Thousands of Languages. LREC/COLING 2024: 10530-10538
- [c13]Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa:
Multimodal Modeling for Spoken Language Identification. ICASSP 2024: 11526-11530
- 2022
- [c12]Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson:
XTREME-S: Evaluating Cross-lingual Speech Representations. INTERSPEECH 2022: 3248-3252
- [c11]Daan van Esch, Tamar Lucassen, Sebastian Ruder, Isaac Caswell, Clara Rivera:
Writing System and Speaker Metadata for 2,800+ Language Varieties. LREC 2022: 5035-5046
- 2020
- [c10]Isaac Caswell, Theresa Breiner, Daan van Esch, Ankur Bapna:
Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus. COLING 2020: 6588-6608
- [c9]Sandy Ritchie, Eoin Mahon, Kim Heiligenstein, Nikos Bampounis, Daan van Esch, Christian Schallhart, Jonas Fromseier Mortensen, Benoît Brard:
Data-Driven Parametric Text Normalization: Rapidly Scaling Finite-State Transduction Verbalizers to New Languages. SLTU-CCURL@LREC 2020: 218-225
- 2019
- [c8]Manasa Prasad, Daan van Esch, Sandy Ritchie, Jonas Fromseier Mortensen:
Building Large-Vocabulary ASR Systems for Languages Without Any Audio Training Data. INTERSPEECH 2019: 271-275
- [c7]Harry Bleyan, Sandy Ritchie, Jonas Fromseier Mortensen, Daan van Esch:
Developing Pronunciation Models in New Languages Faster by Exploiting Common Grapheme-to-Phoneme Correspondences Across Languages. INTERSPEECH 2019: 2100-2104
- [c6]Sandy Ritchie, Richard Sproat, Kyle Gorman, Daan van Esch, Christian Schallhart, Nikos Bampounis, Benoît Brard, Jonas Fromseier Mortensen, Millie Holt, Eoin Mahon:
Unified Verbalization for Speech Recognition & Synthesis Across Languages. INTERSPEECH 2019: 3530-3534
- 2018
- [c5]Mason Chua, Daan van Esch, Noah Coccaro, Eunjoon Cho, Sujeet Bhandari, Libin Jia:
Text Normalization Infrastructure that Scales to Hundreds of Language Varieties. LREC 2018
- [c4]Manasa Prasad, Theresa Breiner, Daan van Esch:
Mining Training Data for Language Modeling Across the World's Languages. SLTU 2018: 61-65
- [c3]Ben Foley, Joshua T. Arnold, Rolando Coto-Solano, Gautier Durantin, T. Mark Ellison, Daan van Esch, Scott Heath, Frantisek Kratochvil, Zara Maxwell-Smith, David Nash, Ola Olsson, Mark Richards, Nay San, Hywel Stoakes, Nick Thieberger, Janet Wiles:
Building Speech Recognition Systems for Language Documentation: The CoEDL Endangered Language Pipeline and Inference System (ELPIS). SLTU 2018: 205-209
- 2017
- [c2]Daan van Esch, Richard Sproat:
An Expanded Taxonomy of Semiotic Classes for Text Normalization. INTERSPEECH 2017: 4016-4020
- 2016
- [c1]Daan van Esch, Mason Chua, Kanishka Rao:
Predicting Pronunciations with Syllabification and Stress with Recurrent Neural Networks. INTERSPEECH 2016: 2841-2845
Informal and Other Publications
- 2023
- [i11]Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa:
Multimodal Modeling For Spoken Language Identification. CoRR abs/2309.10567 (2023)
- 2022
- [i10]Andreas Kabel, Keith B. Hall, Tom Ouyang, David Rybach, Daan van Esch, Françoise Beaufays:
Handling Compounding in Mobile Keyboard Input. CoRR abs/2201.06469 (2022)
- [i9]Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson:
XTREME-S: Evaluating Cross-lingual Speech Representations. CoRR abs/2203.10752 (2022)
- [i8]Ankur Bapna, Isaac Caswell, Julia Kreutzer, Orhan Firat, Daan van Esch, Aditya Siddhant, Mengmeng Niu, Pallavi Baljekar, Xavier Garcia, Wolfgang Macherey, Theresa Breiner, Vera Axelrod, Jason Riesa, Yuan Cao, Mia Xu Chen, Klaus Macherey, Maxim Krikun, Pidong Wang, Alexander Gutkin, Apurva Shah, Yanping Huang, Zhifeng Chen, Yonghui Wu, Macduff Hughes:
Building Machine Translation Systems for the Next Thousand Languages. CoRR abs/2205.03983 (2022)
- [i7]Alëna Aksënova, Zhehuai Chen, Chung-Cheng Chiu, Daan van Esch, Pavel Golik, Wei Han, Levi King, Bhuvana Ramabhadran, Andrew Rosenberg, Suzan Schwartz, Gary Wang:
Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data. CoRR abs/2205.08014 (2022)
- [i6]Sandy Ritchie, You-Chi Cheng, Mingqing Chen, Rajiv Mathews, Daan van Esch, Bo Li, Khe Chai Sim:
Large vocabulary speech recognition for languages of Africa: multilingual modeling and self-supervised learning. CoRR abs/2208.03067 (2022)
- 2021
- [i5]Tania Chakraborty, Manasa Prasad, Theresa Breiner, Sandy Ritchie, Daan van Esch:
Mining Large-Scale Low-Resource Pronunciation Data From Wikipedia. CoRR abs/2101.11575 (2021)
- [i4]Isaac Caswell, Julia Kreutzer, Lisa Wang, Ahsan Wahab, Daan van Esch, Nasanbayar Ulzii-Orshikh, Allahsera Tapo, Nishant Subramani, Artem Sokolov, Claytone Sikasote, Monang Setyawan, Supheakmungkol Sarin, Sokhar Samb, Benoît Sagot, Clara Rivera, Annette Rios, Isabel Papadimitriou, Salomey Osei, Pedro Javier Ortiz Suárez, Iroro Orife, Kelechi Ogueji, Rubungo Andre Niyongabo, Toan Q. Nguyen, Mathias Müller, André Müller, Shamsuddeen Hassan Muhammad, Nanda Muhammad, Ayanda Mnyakeni, Jamshidbek Mirzakhalov, Tapiwanashe Matangira, Colin Leong, Nze Lawson, Sneha Kudugunta, Yacine Jernite, Mathias Jenny, Orhan Firat, Bonaventure F. P. Dossou, Sakhile Dlamini, Nisansa de Silva, Sakine Çabuk Balli, Stella Biderman, Alessia Battisti, Ahmed Baruwa, Ankur Bapna, Pallavi Baljekar, Israel Abebe Azime, Ayodele Awokoya, Duygu Ataman, Orevaoghene Ahia, Oghenefego Ahia, Sweta Agrawal, Mofetoluwa Adeyemi:
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets. AfricaNLP 2021
- 2020
- [i3]Isaac Caswell, Theresa Breiner, Daan van Esch, Ankur Bapna:
Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus. CoRR abs/2010.14571 (2020)
- 2019
- [i2]Theresa Breiner, Chieu Nguyen, Daan van Esch, Jeremy O'Brien:
Automatic Keyboard Layout Design for Low-Resource Latin-Script Languages. CoRR abs/1901.06039 (2019)
- [i1]Daan van Esch, Elnaz Sarbar, Tamar Lucassen, Jeremy O'Brien, Theresa Breiner, Manasa Prasad, Evan Crew, Chieu Nguyen, Françoise Beaufays:
Writing Across the World's Languages: Deep Internationalization for Gboard, the Google Keyboard. CoRR abs/1912.01218 (2019)
last updated on 2024-10-07 22:05 CEST by the dblp team
all metadata released as open data under CC0 1.0 license