Orhan Firat
2020 – today
- 2024
- [c70]Biao Zhang, Zhongtao Liu, Colin Cherry, Orhan Firat:
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method. ICLR 2024 - [i75]Biao Zhang, Zhongtao Liu, Colin Cherry, Orhan Firat:
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method. CoRR abs/2402.17193 (2024) - [i74]Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry Lepikhin, Timothy P. Lillicrap, Jean-Baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser, Ioannis Antonoglou, Rohan Anil, Sebastian Borgeaud, Andrew M. Dai, Katie Millican, Ethan Dyer, Mia Glaese, Thibault Sottiaux, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, James Molloy, Jilin Chen, Michael Isard, Paul Barham, Tom Hennigan, Ross McIlroy, Melvin Johnson, Johan Schalkwyk, Eli Collins, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, Clemens Meyer, Gregory Thornton, Zhen Yang, Henryk Michalewski, Zaheer Abbas, Nathan Schucher, Ankesh Anand, Richard Ives, James Keeling, Karel Lenc, Salem Haykal, Siamak Shakeri, Pranav Shyam, Aakanksha Chowdhery, Roman Ring, Stephen Spencer, Eren Sezener, et al.:
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context. CoRR abs/2403.05530 (2024) - [i73]Bernd Bohnet, Kevin Swersky, Rosanne Liu, Pranjal Awasthi, Azade Nova, Javier Snaider, Hanie Sedghi, Aaron T. Parisi, Michael Collins, Angeliki Lazaridou, Orhan Firat, Noah Fiedel:
Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation. CoRR abs/2406.00179 (2024) - [i72]Biao Zhang, Garrett Tanzer, Orhan Firat:
Scaling Sign Language Translation. CoRR abs/2407.11855 (2024)
- 2023
- [j6]Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, Noah Fiedel:
PaLM: Scaling Language Modeling with Pathways. J. Mach. Learn. Res. 24: 240:1-240:113 (2023) - [j5]Parker Riley, Timothy Dozat, Jan A. Botha, Xavier Garcia, Dan Garrette, Jason Riesa, Orhan Firat, Noah Constant:
FRMT: A Benchmark for Few-Shot Region-Aware Machine Translation. Trans. Assoc. Comput. Linguistics 11: 671-685 (2023) - [c69]Alexander Jones, Isaac Caswell, Orhan Firat, Ishank Saxena:
GATITOS: Using a New Multilingual Lexicon for Low-resource Machine Translation. EMNLP 2023: 371-405 - [c68]Hyung Won Chung, Xavier Garcia, Adam Roberts, Yi Tay, Orhan Firat, Sharan Narang, Noah Constant:
UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining. ICLR 2023 - [c67]Patrick Fernandes, Behrooz Ghorbani, Xavier Garcia, Markus Freitag, Orhan Firat:
Scaling Laws for Multilingual Neural Machine Translation. ICML 2023: 10053-10071 - [c66]Xavier Garcia, Yamini Bansal, Colin Cherry, George F. Foster, Maxim Krikun, Melvin Johnson, Orhan Firat:
The Unreasonable Effectiveness of Few-shot Learning for Machine Translation. ICML 2023: 10867-10878 - [c65]Jonathan Pilault, Xavier Garcia, Arthur Brazinskas, Orhan Firat:
Interactive-Chain-Prompting: Ambiguity Resolution for Crosslingual Conditional Generation with Interaction. IJCNLP (1) 2023: 455-483 - [c64]Yichi Zhang, Ankush Garg, Yuan Cao, Lukasz Lew, Behrooz Ghorbani, Zhiru Zhang, Orhan Firat:
Binarized Neural Machine Translation. NeurIPS 2023 - [c63]Dami Choi, Derrick Xin, Hamid Dadkhahi, Justin Gilmer, Ankush Garg, Orhan Firat, Chih-Kuan Yeh, Andrew M. Dai, Behrooz Ghorbani:
Order Matters in the Presence of Dataset Imbalance for Multilingual Learning. NeurIPS 2023 - [c62]Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, Orhan Firat:
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset. NeurIPS 2023 - [c61]Jonathan Pilault, Mahan Fathi, Orhan Firat, Chris Pal, Pierre-Luc Bacon, Ross Goroshin:
Block-State Transformers. NeurIPS 2023 - [c60]Patrick Fernandes, Daniel Deutsch, Mara Finkelstein, Parker Riley, André Martins, Graham Neubig, Ankush Garg, Jonathan H. Clark, Markus Freitag, Orhan Firat:
The Devil Is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation. WMT 2023: 1066-1083 - [i71]Jonathan Pilault, Xavier Garcia, Arthur Brazinskas, Orhan Firat:
Interactive-Chain-Prompting: Ambiguity Resolution for Crosslingual Conditional Generation with Interaction. CoRR abs/2301.10309 (2023) - [i70]Xavier Garcia, Yamini Bansal, Colin Cherry, George F. Foster, Maxim Krikun, Fangxiaoyu Feng, Melvin Johnson, Orhan Firat:
The unreasonable effectiveness of few-shot learning for machine translation. CoRR abs/2302.01398 (2023) - [i69]Yichi Zhang, Ankush Garg, Yuan Cao, Lukasz Lew, Behrooz Ghorbani, Zhiru Zhang, Orhan Firat:
Binarized Neural Machine Translation. CoRR abs/2302.04907 (2023) - [i68]Patrick Fernandes, Behrooz Ghorbani, Xavier Garcia, Markus Freitag, Orhan Firat:
Scaling Laws for Multilingual Neural Machine Translation. CoRR abs/2302.09650 (2023) - [i67]Alexander Jones, Isaac Caswell, Ishank Saxena, Orhan Firat:
Bilex Rx: Lexical Data Augmentation for Massively Multilingual Machine Translation. CoRR abs/2303.15265 (2023) - [i66]Hyung Won Chung, Noah Constant, Xavier Garcia, Adam Roberts, Yi Tay, Sharan Narang, Orhan Firat:
UniMax: Fairer and more Effective Language Sampling for Large-Scale Multilingual Pretraining. CoRR abs/2304.09151 (2023) - [i65]Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernández Ábrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan A. Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vladimir Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, et al.:
PaLM 2 Technical Report. CoRR abs/2305.10403 (2023) - [i64]Andrea Schioppa, Xavier Garcia, Orhan Firat:
Cross-Lingual Supervision improves Large Language Models Pre-training. CoRR abs/2305.11778 (2023) - [i63]Mahan Fathi, Jonathan Pilault, Pierre-Luc Bacon, Christopher Pal, Orhan Firat, Ross Goroshin:
Block-State Transformer. CoRR abs/2306.09539 (2023) - [i62]Patrick Fernandes, Daniel Deutsch, Mara Finkelstein, Parker Riley, André F. T. Martins, Graham Neubig, Ankush Garg, Jonathan H. Clark, Markus Freitag, Orhan Firat:
The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation. CoRR abs/2308.07286 (2023) - [i61]Çaglar Gülçehre, Tom Le Paine, Srivatsan Srinivasan, Ksenia Konyushkova, Lotte Weerts, Abhishek Sharma, Aditya Siddhant, Alex Ahern, Miaosen Wang, Chenjie Gu, Wolfgang Macherey, Arnaud Doucet, Orhan Firat, Nando de Freitas:
Reinforced Self-Training (ReST) for Language Modeling. CoRR abs/2308.08998 (2023) - [i60]Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Christopher A. Choquette-Choo, Katherine Lee, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, Orhan Firat:
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset. CoRR abs/2309.04662 (2023) - [i59]Dami Choi, Derrick Xin, Hamid Dadkhahi, Justin Gilmer, Ankush Garg, Orhan Firat, Chih-Kuan Yeh, Andrew M. Dai, Behrooz Ghorbani:
Order Matters in the Presence of Dataset Imbalance for Multilingual Learning. CoRR abs/2312.06134 (2023) - [i58]Rohan Anil, Sebastian Borgeaud, Yonghui Wu, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Slav Petrov, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy P. Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul Ronald Barham, Tom Hennigan, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, Ryan Doherty, Eli Collins, Clemens Meyer, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, George Tucker, Enrique Piqueras, Maxim Krikun, Iain Barr, Nikolay Savinov, Ivo Danihelka, Becca Roelofs, Anaïs White, Anders Andreassen, Tamara von Glehn, Lakshman Yagati, Mehran Kazemi, Lucas Gonzalez, Misha Khalman, Jakub Sygnowski, et al.:
Gemini: A Family of Highly Capable Multimodal Models. CoRR abs/2312.11805 (2023)
- 2022
- [j4]Julia Kreutzer, Isaac Caswell, Lisa Wang, Ahsan Wahab, Daan van Esch, Nasanbayar Ulzii-Orshikh, Allahsera Tapo, Nishant Subramani, Artem Sokolov, Claytone Sikasote, Monang Setyawan, Supheakmungkol Sarin, Sokhar Samb, Benoît Sagot, Clara Rivera, Annette Rios, Isabel Papadimitriou, Salomey Osei, Pedro Javier Ortiz Suárez, Iroro Orife, Kelechi Ogueji, Andre Niyongabo Rubungo, Toan Q. Nguyen, Mathias Müller, André Müller, Shamsuddeen Hassan Muhammad, Nanda Muhammad, Ayanda Mnyakeni, Jamshidbek Mirzakhalov, Tapiwanashe Matangira, Colin Leong, Nze Lawson, Sneha Kudugunta, Yacine Jernite, Mathias Jenny, Orhan Firat, Bonaventure F. P. Dossou, Sakhile Dlamini, Nisansa de Silva, Sakine Çabuk Balli, Stella Biderman, Alessia Battisti, Ahmed Baruwa, Ankur Bapna, Pallavi Baljekar, Israel Abebe Azime, Ayodele Awokoya, Duygu Ataman, Orevaoghene Ahia, Oghenefego Ahia, Sweta Agrawal, Mofetoluwa Adeyemi:
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets. Trans. Assoc. Comput. Linguistics 10: 50-72 (2022) - [c59]Yong Cheng, Ankur Bapna, Orhan Firat, Yuan Cao, Pidong Wang, Wolfgang Macherey:
Multilingual Mix: Example Interpolation Improves Multilingual Neural Machine Translation. ACL (1) 2022: 4092-4102 - [c58]Biao Zhang, Ankur Bapna, Melvin Johnson, Ali Dabirmoghaddam, Naveen Arivazhagan, Orhan Firat:
Multilingual Document-Level Translation Enables Zero-Shot Transfer From Sentences to Documents. ACL (1) 2022: 4176-4192 - [c57]Behrooz Ghorbani, Orhan Firat, Markus Freitag, Ankur Bapna, Maxim Krikun, Xavier Garcia, Ciprian Chelba, Colin Cherry:
Scaling Laws for Neural Machine Translation. ICLR 2022 - [c56]Justin Gilmer, Behrooz Ghorbani, Ankush Garg, Sneha Kudugunta, Behnam Neyshabur, David Cardoze, George Edward Dahl, Zachary Nado, Orhan Firat:
A Loss Curvature Perspective on Training Instabilities of Deep Learning Models. ICLR 2022 - [c55]Yamini Bansal, Behrooz Ghorbani, Ankush Garg, Biao Zhang, Colin Cherry, Behnam Neyshabur, Orhan Firat:
Data Scaling Laws in NMT: The Effect of Noise and Architecture. ICML 2022: 1466-1482 - [c54]Nan Du, Yanping Huang, Andrew M. Dai, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat, Barret Zoph, Liam Fedus, Maarten P. Bosma, Zongwei Zhou, Tao Wang, Yu Emma Wang, Kellie Webster, Marie Pellat, Kevin Robinson, Kathleen S. Meier-Hellstern, Toju Duke, Lucas Dixon, Kun Zhang, Quoc V. Le, Yonghui Wu, Zhifeng Chen, Claire Cui:
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts. ICML 2022: 5547-5569 - [c53]Biao Zhang, Behrooz Ghorbani, Ankur Bapna, Yong Cheng, Xavier Garcia, Jonathan Shen, Orhan Firat:
Examining Scaling and Transfer of Language Model Architectures for Machine Translation. ICML 2022: 26176-26192 - [c52]Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson:
XTREME-S: Evaluating Cross-lingual Speech Representations. INTERSPEECH 2022: 3248-3252 - [c51]Derrick Xin, Behrooz Ghorbani, Justin Gilmer, Ankush Garg, Orhan Firat:
Do Current Multi-Task Optimization Methods in Deep Learning Even Help? NeurIPS 2022 - [i57]Aditya Siddhant, Ankur Bapna, Orhan Firat, Yuan Cao, Mia Xu Chen, Isaac Caswell, Xavier Garcia:
Towards the Next 1000 Languages in Multilingual Machine Translation: Exploring the Synergy Between Supervised and Self-Supervised Learning. CoRR abs/2201.03110 (2022) - [i56]Biao Zhang, Behrooz Ghorbani, Ankur Bapna, Yong Cheng, Xavier Garcia, Jonathan Shen, Orhan Firat:
Examining Scaling and Transfer of Language Model Architectures for Machine Translation. CoRR abs/2202.00528 (2022) - [i55]Yamini Bansal, Behrooz Ghorbani, Ankush Garg, Biao Zhang, Maxim Krikun, Colin Cherry, Behnam Neyshabur, Orhan Firat:
Data Scaling Laws in NMT: The Effect of Noise and Architecture. CoRR abs/2202.01994 (2022) - [i54]Xavier Garcia, Orhan Firat:
Using natural language prompts for machine translation. CoRR abs/2202.11822 (2022) - [i53]Yong Cheng, Ankur Bapna, Orhan Firat, Yuan Cao, Pidong Wang, Wolfgang Macherey:
Multilingual Mix: Example Interpolation Improves Multilingual Neural Machine Translation. CoRR abs/2203.07627 (2022) - [i52]Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson:
XTREME-S: Evaluating Cross-lingual Speech Representations. CoRR abs/2203.10752 (2022) - [i51]Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, Noah Fiedel:
PaLM: Scaling Language Modeling with Pathways. CoRR abs/2204.02311 (2022) - [i50]Ankur Bapna, Isaac Caswell, Julia Kreutzer, Orhan Firat, Daan van Esch, Aditya Siddhant, Mengmeng Niu, Pallavi Baljekar, Xavier Garcia, Wolfgang Macherey, Theresa Breiner, Vera Axelrod, Jason Riesa, Yuan Cao, Mia Xu Chen, Klaus Macherey, Maxim Krikun, Pidong Wang, Alexander Gutkin, Apurva Shah, Yanping Huang, Zhifeng Chen, Yonghui Wu, Macduff Hughes:
Building Machine Translation Systems for the Next Thousand Languages. CoRR abs/2205.03983 (2022) - [i49]Derrick Xin, Behrooz Ghorbani, Ankush Garg, Orhan Firat, Justin Gilmer:
Do Current Multi-Task Optimization Methods in Deep Learning Even Help? CoRR abs/2209.11379 (2022) - [i48]Parker Riley, Timothy Dozat, Jan A. Botha, Xavier Garcia, Dan Garrette, Jason Riesa, Orhan Firat, Noah Constant:
FRMT: A Benchmark for Few-Shot Region-Aware Machine Translation. CoRR abs/2210.00193 (2022)
- 2021
- [c50]Sneha Kudugunta, Yanping Huang, Ankur Bapna, Maxim Krikun, Dmitry Lepikhin, Minh-Thang Luong, Orhan Firat:
Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference. EMNLP (Findings) 2021: 3577-3599 - [c49]Jamshidbek Mirzakhalov, Anoop Babu, Duygu Ataman, Sherzod Kariev, Francis M. Tyers, Otabek Abduraufov, Mammad Hajili, Sardana Ivanova, Abror Khaytbaev, Antonio Laverghetta Jr., Behzodbek Moydinboyev, Esra Onal, Shaxnoza Pulatova, Ahsan Wahab, Orhan Firat, Sriram Chellappan:
A Large-Scale Study of Machine Translation in Turkic Languages. EMNLP (1) 2021: 5876-5890 - [c48]Sebastian Ruder, Noah Constant, Jan A. Botha, Aditya Siddhant, Orhan Firat, Jinlan Fu, Pengfei Liu, Junjie Hu, Dan Garrette, Graham Neubig, Melvin Johnson:
XTREME-R: Towards More Challenging and Nuanced Multilingual Evaluation. EMNLP (1) 2021: 10215-10245 - [c47]Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, Zhifeng Chen:
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding. ICLR 2021 - [c46]Zirui Wang, Yulia Tsvetkov, Orhan Firat, Yuan Cao:
Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models. ICLR 2021 - [c45]Biao Zhang, Ankur Bapna, Rico Sennrich, Orhan Firat:
Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation. ICLR 2021 - [c44]Xavier Garcia, Aditya Siddhant, Orhan Firat, Ankur P. Parikh:
Harnessing Multilinguality in Unsupervised Machine Translation for Rare Languages. NAACL-HLT 2021: 1126-1137 - [c43]Xavier Garcia, Noah Constant, Ankur P. Parikh, Orhan Firat:
Towards Continual Learning for Multilingual Machine Translation via Vocabulary Substitution. NAACL-HLT 2021: 1184-1192 - [c42]Junjie Hu, Melvin Johnson, Orhan Firat, Aditya Siddhant, Graham Neubig:
Explicit Alignment Objectives for Multilingual Bidirectional Encoders. NAACL-HLT 2021: 3633-3643 - [c41]Jamshidbek Mirzakhalov, Anoop Babu, Aigiz Kunafin, Ahsan Wahab, Behzodbek Moydinboyev, Sardana Ivanova, Mokhiyakhon Uzokova, Shaxnoza Pulatova, Duygu Ataman, Julia Kreutzer, Francis M. Tyers, Orhan Firat, John Licato, Sriram Chellappan:
Evaluating Multiway Multilingual NMT in the Turkic Languages. WMT@EMNLP 2021: 518-530 - [i47]Xinyi Wang, Ankur Bapna, Melvin Johnson, Orhan Firat:
Gradient-guided Loss Masking for Neural Machine Translation. CoRR abs/2102.13549 (2021) - [i46]Xavier Garcia, Noah Constant, Ankur P. Parikh, Orhan Firat:
Towards Continual Learning for Multilingual Machine Translation via Vocabulary Substitution. CoRR abs/2103.06799 (2021) - [i45]Isaac Caswell, Julia Kreutzer, Lisa Wang, Ahsan Wahab, Daan van Esch, Nasanbayar Ulzii-Orshikh, Allahsera Tapo, Nishant Subramani, Artem Sokolov, Claytone Sikasote, Monang Setyawan, Supheakmungkol Sarin, Sokhar Samb, Benoît Sagot, Clara Rivera, Annette Rios, Isabel Papadimitriou, Salomey Osei, Pedro Javier Ortiz Suárez, Iroro Orife, Kelechi Ogueji, Rubungo Andre Niyongabo, Toan Q. Nguyen, Mathias Müller, André Müller, Shamsuddeen Hassan Muhammad, Nanda Muhammad, Ayanda Mnyakeni, Jamshidbek Mirzakhalov, Tapiwanashe Matangira, Colin Leong, Nze Lawson, Sneha Kudugunta, Yacine Jernite, Mathias Jenny, Orhan Firat, Bonaventure F. P. Dossou, Sakhile Dlamini, Nisansa de Silva, Sakine Çabuk Balli, Stella Biderman, Alessia Battisti, Ahmed Baruwa, Ankur Bapna, Pallavi Baljekar, Israel Abebe Azime, Ayodele Awokoya, Duygu Ataman, Orevaoghene Ahia, Oghenefego Ahia, Sweta Agrawal, Mofetoluwa Adeyemi:
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets. AfricaNLP 2021 - [i44]Sebastian Ruder, Noah Constant, Jan A. Botha, Aditya Siddhant, Orhan Firat, Jinlan Fu, Pengfei Liu, Junjie Hu, Graham Neubig, Melvin Johnson:
XTREME-R: Towards More Challenging and Nuanced Multilingual Evaluation. CoRR abs/2104.07412 (2021) - [i43]Xavier Garcia, Noah Constant, Mandy Guo, Orhan Firat:
Towards Universality in Multilingual Text Rewriting. CoRR abs/2107.14749 (2021) - [i42]Jamshidbek Mirzakhalov, Anoop Babu, Duygu Ataman, Sherzod Kariev, Francis M. Tyers, Otabek Abduraufov, Mammad Hajili, Sardana Ivanova, Abror Khaytbaev, Antonio Laverghetta Jr., Behzodbek Moydinboyev, Esra Onal, Shaxnoza Pulatova, Ahsan Wahab, Orhan Firat, Sriram Chellappan:
A Large-Scale Study of Machine Translation in the Turkic Languages. CoRR abs/2109.04593 (2021) - [i41]Jamshidbek Mirzakhalov, Anoop Babu, Aigiz Kunafin, Ahsan Wahab, Behzod Moydinboyev, Sardana Ivanova, Mokhiyakhon Uzokova, Shaxnoza Pulatova, Duygu Ataman, Julia Kreutzer, Francis M. Tyers, Orhan Firat, John Licato, Sriram Chellappan:
Evaluating Multiway Multilingual NMT in the Turkic Languages. CoRR abs/2109.06262 (2021) - [i40]Behrooz Ghorbani, Orhan Firat, Markus Freitag, Ankur Bapna, Maxim Krikun, Xavier Garcia, Ciprian Chelba, Colin Cherry:
Scaling Laws for Neural Machine Translation. CoRR abs/2109.07740 (2021) - [i39]Zirui Wang, Adams Wei Yu, Orhan Firat, Yuan Cao:
Towards Zero-Label Language Learning. CoRR abs/2109.09193 (2021) - [i38]Biao Zhang, Ankur Bapna, Melvin Johnson, Ali Dabirmoghaddam, Naveen Arivazhagan, Orhan Firat:
Multilingual Document-Level Translation Enables Zero-Shot Transfer From Sentences to Documents. CoRR abs/2109.10341 (2021) - [i37]Sneha Kudugunta, Yanping Huang, Ankur Bapna, Maxim Krikun, Dmitry Lepikhin, Minh-Thang Luong, Orhan Firat:
Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference. CoRR abs/2110.03742 (2021) - [i36]Justin Gilmer, Behrooz Ghorbani, Ankush Garg, Sneha Kudugunta, Behnam Neyshabur, David Cardoze, George E. Dahl, Zachary Nado, Orhan Firat:
A Loss Curvature Perspective on Training Instability in Deep Learning. CoRR abs/2110.04369 (2021) - [i35]Nan Du, Yanping Huang, Andrew M. Dai, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat, Barret Zoph, Liam Fedus, Maarten Bosma, Zongwei Zhou, Tao Wang, Yu Emma Wang, Kellie Webster, Marie Pellat, Kevin Robinson, Kathy Meier-Hellstern, Toju Duke, Lucas Dixon, Kun Zhang, Quoc V. Le, Yonghui Wu, Zhifeng Chen, Claire Cui:
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts. CoRR abs/2112.06905 (2021)
- 2020
- [c40]Aditya Siddhant, Melvin Johnson, Henry Tsai, Naveen Ari, Jason Riesa, Ankur Bapna, Orhan Firat, Karthik Raman:
Evaluating the Cross-Lingual Effectiveness of Massively Multilingual Neural Machine Translation. AAAI 2020: 8854-8861 - [c39]Aditya Siddhant, Ankur Bapna, Yuan Cao, Orhan Firat, Mia Xu Chen, Sneha Reddy Kudugunta, Naveen Arivazhagan, Yonghui Wu:
Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation. ACL 2020: 2827-2835 - [c38]Jason Lee, Dustin Tran, Orhan Firat, Kyunghyun Cho:
On the Discrepancy between Density Estimation and Sequence Generation. SPNLP@EMNLP 2020: 84-94 - [c37]Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, Melvin Johnson:
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalisation. ICML 2020: 4411-4421 - [c36]Markus Freitag, Orhan Firat:
Complete Multilingual Neural Machine Translation. WMT@EMNLP 2020: 550-560 - [i34]Ankur Bapna, Naveen Arivazhagan, Orhan Firat:
Controlling Computation versus Quality for Neural Sequence Models. CoRR abs/2002.07106 (2020) - [i33]Jason Lee, Dustin Tran, Orhan Firat, Kyunghyun Cho:
On the Discrepancy between Density Estimation and Sequence Generation. CoRR abs/2002.07233 (2020) - [i32]Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, Melvin Johnson:
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization. CoRR abs/2003.11080 (2020) - [i31]Aditya Siddhant, Ankur Bapna, Yuan Cao, Orhan Firat, Mia Xu Chen, Sneha Reddy Kudugunta, Naveen Arivazhagan, Yonghui Wu:
Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation. CoRR abs/2005.04816 (2020) - [i30]Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, Maxim Krikun, Noam Shazeer, Zhifeng Chen:
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding. CoRR abs/2006.16668 (2020) - [i29]Xavier Garcia, Aditya Siddhant, Orhan Firat, Ankur P. Parikh:
Harnessing Multilinguality in Unsupervised Machine Translation for Rare Languages. CoRR abs/2009.11201 (2020) - [i28]Zirui Wang, Yulia Tsvetkov, Orhan Firat, Yuan Cao:
Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models. CoRR abs/2010.05874 (2020) - [i27]Junjie Hu, Melvin Johnson, Orhan Firat, Aditya Siddhant, Graham Neubig:
Explicit Alignment Objectives for Multilingual Bidirectional Encoders. CoRR abs/2010.07972 (2020) - [i26]Markus Freitag, Orhan Firat:
Complete Multilingual Neural Machine Translation. CoRR abs/2010.10239 (2020) - [i25]Elman Mansimov, Mitchell Stern, Mia Xu Chen, Orhan Firat, Jakob Uszkoreit, Puneet Jain:
Towards End-to-End In-Image Neural Machine Translation. CoRR abs/2010.10648 (2020) - [i24]Mahdis Mahdieh, Mia Xu Chen, Yuan Cao, Orhan Firat:
Rapid Domain Adaptation for Machine Translation with Monolingual Data. CoRR abs/2010.12652 (2020)
2010 – 2019
- 2019
- [c35]Duygu Ataman, Orhan Firat, Mattia Antonino Di Gangi, Marcello Federico, Alexandra Birch:
On the Importance of Word Boundaries in Character-level Neural Machine Translation. NGT@EMNLP-IJCNLP 2019: 187-193 - [c34]Ankur Bapna, Orhan Firat:
Simple, Scalable Adaptation for Neural Machine Translation. EMNLP/IJCNLP (1) 2019: 1538-1548 - [c33]Sneha Reddy Kudugunta, Ankur Bapna, Isaac Caswell, Orhan Firat:
Investigating Multilingual NMT Representations at Scale. EMNLP/IJCNLP (1) 2019: 1565-1575 - [c32]Ankur Bapna, Orhan Firat:
Non-Parametric Adaptation for Neural Machine Translation. NAACL-HLT (1) 2019: 1921-1931 - [c31]Roee Aharoni, Melvin Johnson, Orhan Firat:
Massively Multilingual Neural Machine Translation. NAACL-HLT (1) 2019: 3874-3884 - [c30]Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Dehao Chen, Mia Xu Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V. Le, Yonghui Wu, Zhifeng Chen:
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism. NeurIPS 2019: 103-112 - [c29]Xian Li, Paul Michel, Antonios Anastasopoulos, Yonatan Belinkov, Nadir Durrani, Orhan Firat, Philipp Koehn, Graham Neubig, Juan Miguel Pino, Hassan Sajjad:
Findings of the First Shared Task on Machine Translation Robustness. WMT (2) 2019: 91-102 - [i23]Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia Xu Chen, Ye Jia, Anjuli Kannan, Tara N. Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George F. Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel Bacchiani, Thomas B. Jablin, Robert Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon:
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling. CoRR abs/1902.08295 (2019) - [i22]Ankur Bapna, Orhan Firat:
Non-Parametric Adaptation for Neural Machine Translation. CoRR abs/1903.00058 (2019) - [i21]Roee Aharoni, Melvin Johnson, Orhan Firat:
Massively Multilingual Neural Machine Translation. CoRR abs/1903.00089 (2019) - [i20]