default search action
Yacine Jernite
Person information
Refine list
refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
Books and Theses
- 2018
- [b1]Yacine Jernite:
Learning Representations of Text through Language and Discourse Modeling: From Characters to Sentences. New York University, USA, 2018
Journal Articles
- 2023
- [j4]Denis Kocetkov, Raymond Li, Loubna Ben Allal, Jia Li, Chenghao Mou, Yacine Jernite, Margaret Mitchell, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Dzmitry Bahdanau, Leandro von Werra, Harm de Vries:
The Stack: 3 TB of permissively licensed source code. Trans. Mach. Learn. Res. 2023 (2023) - [j3]Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy V, Jason T. Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Fahmy, Urvashi Bhattacharyya, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries:
StarCoder: may the source be with you! Trans. Mach. Learn. Res. 2023 (2023) - 2022
- [j2]Julia Kreutzer, Isaac Caswell, Lisa Wang, Ahsan Wahab, Daan van Esch, Nasanbayar Ulzii-Orshikh, Allahsera Tapo, Nishant Subramani, Artem Sokolov, Claytone Sikasote, Monang Setyawan, Supheakmungkol Sarin, Sokhar Samb, Benoît Sagot, Clara Rivera, Annette Rios, Isabel Papadimitriou, Salomey Osei, Pedro Javier Ortiz Suárez, Iroro Orife, Kelechi Ogueji, Andre Niyongabo Rubungo, Toan Q. Nguyen, Mathias Müller, André Müller, Shamsuddeen Hassan Muhammad, Nanda Muhammad, Ayanda Mnyakeni, Jamshidbek Mirzakhalov, Tapiwanashe Matangira, Colin Leong, Nze Lawson, Sneha Kudugunta, Yacine Jernite, Mathias Jenny, Orhan Firat, Bonaventure F. P. Dossou, Sakhile Dlamini, Nisansa de Silva, Sakine Çabuk Balli, Stella Biderman, Alessia Battisti, Ahmed Baruwa, Ankur Bapna, Pallavi Baljekar, Israel Abebe Azime, Ayodele Awokoya, Duygu Ataman, Orevaoghene Ahia, Oghenefego Ahia, Sweta Agrawal, Mofetoluwa Adeyemi:
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets. Trans. Assoc. Comput. Linguistics 10: 50-72 (2022) - 2019
- [j1]Nathaniel R. Greenbaum, Yacine Jernite, Yoni Halpern, Shelley Calder, Larry A. Nathanson, David A. Sontag, Steven Horng:
Improving documentation of presenting problems in the emergency department using a domain-specific ontology and machine learning-driven user interfaces. Int. J. Medical Informatics 132 (2019)
Conference and Workshop Papers
- 2024
- [c22]Sasha Luccioni, Yacine Jernite, Emma Strubell:
Power Hungry Processing: Watts Driving the Cost of AI Deployment? FAccT 2024: 85-99 - [c21]Sayash Kapoor, Rishi Bommasani, Kevin Klyman, Shayne Longpre, Ashwin Ramaswami, Peter Cihon, Aspen K. Hopkins, Kevin Bankston, Stella Biderman, Miranda Bogen, Rumman Chowdhury, Alex Engler, Peter Henderson, Yacine Jernite, Seth Lazar, Stefano Maffulli, Alondra Nelson, Joelle Pineau, Aviya Skowron, Dawn Song, Victor Storchan, Daniel Zhang, Daniel E. Ho, Percy Liang, Arvind Narayanan:
Position: On the Societal Impact of Open Foundation Models. ICML 2024 - [c20]Daniel McDuff, Tim Korjakow, Scott Cambo, Jesse Josua Benjamin, Jenny Lee, Yacine Jernite, Carlos Muñoz Ferrandis, Aaron Gokaslan, Alek Tarkowski, Joseph Lindley, A. Feder Cooper, Danish Contractor:
Position: Standardization of Behavioral Use Clauses is Necessary for the Adoption of Responsible Licensing of AI. ICML 2024 - 2023
- [c19]Aleksandra Piktus, Christopher Akiki, Paulo Villegas, Hugo Laurençon, Gérard Dupont, Sasha Luccioni, Yacine Jernite, Anna Rogers:
The ROOTS Search Tool: Data Transparency for LLMs. ACL (demo) 2023: 304-314 - [c18]Chris Chinenye Emezue, Sanchit Gandhi, Lewis Tunstall, Abubakar Abid, Joshua Meyer, Quentin Lhoest, Pete Allen, Patrick von Platen, Douwe Kiela, Yacine Jernite, Julien Chaumond, Merve Noyan, Omar Sanseviero:
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages. AfricaNLP 2023 - [c17]Hanlin Li, Nicholas Vincent, Yacine Jernite, Nick Merrill, Jesse Josua Benjamin, Alek Tarkowski:
Can Licensing Mitigate the Negative Implications of Commercial Web Scraping? CSCW Companion 2023: 553-555 - [c16]Giada Pistilli, Carlos Muñoz Ferrandis, Yacine Jernite, Margaret Mitchell:
Stronger Together: on the Articulation of Ethical Charters, Legal Tools, and Technical Documentation in ML. FAccT 2023: 343-354 - [c15]Sasha Luccioni, Christopher Akiki, Margaret Mitchell, Yacine Jernite:
Stable Bias: Evaluating Societal Representations in Diffusion Models. NeurIPS 2023 - 2022
- [c14]Yacine Jernite, Huu Nguyen, Stella Biderman, Anna Rogers, Maraim Masoud, Valentin Danchev, Samson Tan, Alexandra Sasha Luccioni, Nishant Subramani, Isaac Johnson, Gérard Dupont, Jesse Dodge, Kyle Lo, Zeerak Talat, Dragomir R. Radev, Aaron Gokaslan, Somaieh Nikpoor, Peter Henderson, Rishi Bommasani, Margaret Mitchell:
Data Governance in the Age of Large-Scale Data-Driven Language Technology. FAccT 2022: 2206-2222 - [c13]Hugo Laurençon, Lucile Saulnier, Thomas Wang, Christopher Akiki, Albert Villanova del Moral, Teven Le Scao, Leandro von Werra, Chenghao Mou, Eduardo González Ponferrada, Huu Nguyen, Jörg Frohberg, Mario Sasko, Quentin Lhoest, Angelina McMillan-Major, Gérard Dupont, Stella Biderman, Anna Rogers, Loubna Ben Allal, Francesco De Toni, Giada Pistilli, Olivier Nguyen, Somaieh Nikpoor, Maraim Masoud, Pierre Colombo, Javier de la Rosa, Paulo Villegas, Tristan Thrush, Shayne Longpre, Sebastian Nagel, Leon Weber, Manuel Muñoz, Jian Zhu, Daniel van Strien, Zaid Alyafeai, Khalid Almubarak, Minh Chien Vu, Itziar Gonzalez-Dios, Aitor Soroa, Kyle Lo, Manan Dey, Pedro Ortiz Suarez, Aaron Gokaslan, Shamik Bose, David Ifeoluwa Adelani, Long Phan, Hieu Tran, Ian Yu, Suhas Pai, Jenny Chim, Violette Lepercq, Suzana Ilic, Margaret Mitchell, Alexandra Sasha Luccioni, Yacine Jernite:
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset. NeurIPS 2022 - 2021
- [c12]Quentin Lhoest, Albert Villanova del Moral, Yacine Jernite, Abhishek Thakur, Patrick von Platen, Suraj Patil, Julien Chaumond, Mariama Drame, Julien Plu, Lewis Tunstall, Joe Davison, Mario Sasko, Gunjan Chhablani, Bhavitvya Malik, Simon Brandeis, Teven Le Scao, Victor Sanh, Canwen Xu, Nicolas Patry, Angelina McMillan-Major, Philipp Schmid, Sylvain Gugger, Clément Delangue, Théo Matussière, Lysandre Debut, Stas Bekman, Pierric Cistac, Thibault Goehringer, Victor Mustar, François Lagunas, Alexander M. Rush, Thomas Wolf:
Datasets: A Community Library for Natural Language Processing. EMNLP (Demos) 2021: 175-184 - [c11]Fabio Petroni, Aleksandra Piktus, Angela Fan, Patrick S. H. Lewis, Majid Yazdani, Nicola De Cao, James Thorne, Yacine Jernite, Vladimir Karpukhin, Jean Maillard, Vassilis Plachouras, Tim Rocktäschel, Sebastian Riedel:
KILT: a Benchmark for Knowledge Intensive Language Tasks. NAACL-HLT 2021: 2523-2544 - [c10]Alexander Borzunov, Max Ryabinin, Tim Dettmers, Quentin Lhoest, Lucile Saulnier, Michael Diskin, Yacine Jernite, Thomas Wolf:
Training Transformers Together. NeurIPS (Competition and Demos) 2021: 335-342 - [c9]Michael Diskin, Alexey Bukhtiyarov, Max Ryabinin, Lucile Saulnier, Quentin Lhoest, Anton Sinitsin, Dmitry Popov, Dmitry V. Pyrkin, Maxim Kashirin, Alexander Borzunov, Albert Villanova del Moral, Denis Mazur, Ilia Kobelev, Yacine Jernite, Thomas Wolf, Gennady Pekhimenko:
Distributed Deep Learning In Open Collaborations. NeurIPS 2021: 7879-7897 - 2020
- [c8]Kavya Srinet, Yacine Jernite, Jonathan Gray, Arthur Szlam:
CraftAssist Instruction Parsing: Semantic Parsing for a Voxel-World Assistant. ACL 2020: 4693-4714 - [c7]Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest, Alexander M. Rush:
Transformers: State-of-the-Art Natural Language Processing. EMNLP (Demos) 2020: 38-45 - 2019
- [c6]Angela Fan, Yacine Jernite, Ethan Perez, David Grangier, Jason Weston, Michael Auli:
ELI5: Long Form Question Answering. ACL (1) 2019: 3558-3567 - 2017
- [c5]Yacine Jernite, Edouard Grave, Armand Joulin, Tomás Mikolov:
Variable Computation in Recurrent Neural Networks. ICLR (Poster) 2017 - [c4]Yacine Jernite, Anna Choromanska, David A. Sontag:
Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation. ICML 2017: 1665-1674 - 2016
- [c3]Yoon Kim, Yacine Jernite, David A. Sontag, Alexander M. Rush:
Character-Aware Neural Language Models. AAAI 2016: 2741-2749 - 2015
- [c2]Yacine Jernite, Alexander M. Rush, David A. Sontag:
A Fast Variational Approach for Learning Markov Random Field Language Models. ICML 2015: 2209-2217 - 2013
- [c1]Yacine Jernite, Yonatan Halpern, David A. Sontag:
Discovering Hidden Variables in Noisy-Or Networks using Quartet Tests. NIPS 2013: 2355-2363
Informal and Other Publications
- 2024
- [i41]Daniel McDuff, Tim Korjakow, Scott Cambo, Jesse Josua Benjamin, Jenny Lee, Yacine Jernite, Carlos Muñoz Ferrandis, Aaron Gokaslan, Alek Tarkowski, Joseph Lindley, A. Feder Cooper, Danish Contractor:
On the Standardization of Behavioral Use Clauses and Their Adoption for Responsible Licensing of AI. CoRR abs/2402.05979 (2024) - [i40]Sayash Kapoor, Rishi Bommasani, Kevin Klyman, Shayne Longpre, Ashwin Ramaswami, Peter Cihon, Aspen K. Hopkins, Kevin Bankston, Stella Biderman, Miranda Bogen, Rumman Chowdhury, Alex Engler, Peter Henderson, Yacine Jernite, Seth Lazar, Stefano Maffulli, Alondra Nelson, Joelle Pineau, Aviya Skowron, Dawn Song, Victor Storchan, Daniel Zhang, Daniel E. Ho, Percy Liang, Arvind Narayanan:
On the Societal Impact of Open Foundation Models. CoRR abs/2403.07918 (2024) - [i39]Giada Pistilli, Alina Leidinger, Yacine Jernite, Atoosa Kasirzadeh, Alexandra Sasha Luccioni, Margaret Mitchell:
CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models. CoRR abs/2405.13974 (2024) - [i38]Shayne Longpre, Stella Biderman, Alon Albalak, Hailey Schoelkopf, Daniel McDuff, Sayash Kapoor, Kevin Klyman, Kyle Lo, Gabriel Ilharco, Nay San, Maribeth Rauh, Aviya Skowron, Bertie Vidgen, Laura Weidinger, Arvind Narayanan, Victor Sanh, David Ifeoluwa Adelani, Percy Liang, Rishi Bommasani, Peter Henderson, Sasha Luccioni, Yacine Jernite, Luca Soldaini:
The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources. CoRR abs/2406.16746 (2024) - [i37]Shachar Don-Yehiya, Ben Burtenshaw, Ramón Fernandez Astudillo, Cailean Osborne, Mimansa Jaiswal, Tzu-Sheng Kuo, Wenting Zhao, Idan Shenfeld, Andi Peng, Mikhail Yurochkin, Atoosa Kasirzadeh, Yangsibo Huang, Tatsunori Hashimoto, Yacine Jernite, Daniel Vila-Suero, Omri Abend, Jennifer Ding, Sara Hooker, Hannah Rose Kirk, Leshem Choshen:
The Future of Open Human Feedback. CoRR abs/2408.16961 (2024) - 2023
- [i36]Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Muñoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy-Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, Ian Yu, Paulo Villegas, Marco Zocca, Sourab Mangrulkar, David Lansky, Huu Nguyen, Danish Contractor, Luis Villa, Jia Li, Dzmitry Bahdanau, Yacine Jernite, Sean Hughes, Daniel Fried, Arjun Guha, Harm de Vries, Leandro von Werra:
SantaCoder: don't reach for the stars! CoRR abs/2301.03988 (2023) - [i35]Jennifer Ding, Christopher Akiki, Yacine Jernite, Anne Lee Steele, Temi Popo:
Towards Openness Beyond Open Access: User Journeys through 3 Open AI Collaboratives. CoRR abs/2301.08488 (2023) - [i34]Aleksandra Piktus, Christopher Akiki, Paulo Villegas, Hugo Laurençon, Gérard Dupont, Alexandra Sasha Luccioni, Yacine Jernite, Anna Rogers:
The ROOTS Search Tool: Data Transparency for LLMs. CoRR abs/2302.14035 (2023) - [i33]Hugo Laurençon, Lucile Saulnier, Thomas Wang, Christopher Akiki, Albert Villanova del Moral, Teven Le Scao, Leandro von Werra, Chenghao Mou, Eduardo González Ponferrada, Huu Nguyen, Jörg Frohberg, Mario Sasko, Quentin Lhoest, Angelina McMillan-Major, Gérard Dupont, Stella Biderman, Anna Rogers, Loubna Ben Allal, Francesco De Toni, Giada Pistilli, Olivier Nguyen, Somaieh Nikpoor, Maraim Masoud, Pierre Colombo, Javier de la Rosa, Paulo Villegas, Tristan Thrush, Shayne Longpre, Sebastian Nagel, Leon Weber, Manuel Muñoz, Jian Zhu, Daniel van Strien, Zaid Alyafeai, Khalid Almubarak, Minh Chien Vu, Itziar Gonzalez-Dios, Aitor Soroa, Kyle Lo, Manan Dey, Pedro Ortiz Suarez, Aaron Gokaslan, Shamik Bose, David Ifeoluwa Adelani, Long Phan, Hieu Tran, Ian Yu, Suhas Pai, Jenny Chim, Violette Lepercq, Suzana Ilic, Margaret Mitchell, Sasha Luccioni, Yacine Jernite:
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset. CoRR abs/2303.03915 (2023) - [i32]Alexandra Sasha Luccioni, Christopher Akiki, Margaret Mitchell, Yacine Jernite:
Stable Bias: Analyzing Societal Representations in Diffusion Models. CoRR abs/2303.11408 (2023) - [i31]Chris Chinenye Emezue, Sanchit Gandhi, Lewis Tunstall, Abubakar Abid, Josh Meyer, Quentin Lhoest, Pete Allen, Patrick von Platen, Douwe Kiela, Yacine Jernite, Julien Chaumond, Merve Noyan, Omar Sanseviero:
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages. CoRR abs/2303.12582 (2023) - [i30]Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, Marc Marone, Christopher Akiki, Jia Li, Jenny Chim, Qian Liu, Evgenii Zheltonozhskii, Terry Yue Zhuo, Thomas Wang, Olivier Dehaene, Mishig Davaadorj, Joel Lamy-Poirier, João Monteiro, Oleh Shliazhko, Nicolas Gontier, Nicholas Meade, Armel Zebaze, Ming-Ho Yee, Logesh Kumar Umapathi, Jian Zhu, Benjamin Lipkin, Muhtasham Oblokulov, Zhiruo Wang, Rudra Murthy V, Jason Stillerman, Siva Sankalp Patel, Dmitry Abulkhanov, Marco Zocca, Manan Dey, Zhihan Zhang, Nour Moustafa-Fahmy, Urvashi Bhattacharyya, Wenhao Yu, Swayam Singh, Sasha Luccioni, Paulo Villegas, Maxim Kunakov, Fedor Zhdanov, Manuel Romero, Tony Lee, Nadav Timor, Jennifer Ding, Claire Schlesinger, Hailey Schoelkopf, Jan Ebert, Tri Dao, Mayank Mishra, Alex Gu, Jennifer Robinson, Carolyn Jane Anderson, Brendan Dolan-Gavitt, Danish Contractor, Siva Reddy, Daniel Fried, Dzmitry Bahdanau, Yacine Jernite, Carlos Muñoz Ferrandis, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries:
StarCoder: may the source be with you! CoRR abs/2305.06161 (2023) - [i29]Giada Pistilli, Carlos Muñoz Ferrandis, Yacine Jernite, Margaret Mitchell:
Stronger Together: on the Articulation of Ethical Charters, Legal Tools, and Technical Documentation in ML. CoRR abs/2305.18615 (2023) - [i28]Irene Solaiman, Zeerak Talat, William Agnew, Lama Ahmad, Dylan K. Baker, Su Lin Blodgett, Hal Daumé III, Jesse Dodge, Ellie Evans, Sara Hooker, Yacine Jernite, Alexandra Sasha Luccioni, Alberto Lusoli, Margaret Mitchell, Jessica Newman, Marie-Therese Png, Andrew Strait, Apostol Vassilev:
Evaluating the Social Impact of Generative AI Systems in Systems and Society. CoRR abs/2306.05949 (2023) - [i27]Alexandra Sasha Luccioni, Yacine Jernite, Emma Strubell:
Power Hungry Processing: Watts Driving the Cost of AI Deployment? CoRR abs/2311.16863 (2023) - [i26]Sean Hughes, Harm de Vries, Jennifer Robinson, Carlos Muñoz Ferrandis, Loubna Ben Allal, Leandro von Werra, Jennifer Ding, Sébastien Paquet, Yacine Jernite:
The BigCode Project Governance Card. CoRR abs/2312.03872 (2023) - 2022
- [i25]Angelina McMillan-Major, Zaid Alyafeai, Stella Biderman, Kimbo Chen, Francesco De Toni, Gérard Dupont, Hady Elsahar, Chris Emezue, Alham Fikri Aji, Suzana Ilic, Nurulaqilla Khamis, Colin Leong, Maraim Masoud, Aitor Soroa, Pedro Javier Ortiz Suárez, Zeerak Talat, Daniel van Strien, Yacine Jernite:
Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources. CoRR abs/2201.10066 (2022) - [i24]Yacine Jernite, Huu Nguyen, Stella Biderman, Anna Rogers, Maraim Masoud, Valentin Danchev, Samson Tan, Alexandra Sasha Luccioni, Nishant Subramani, Gérard Dupont, Jesse Dodge, Kyle Lo, Zeerak Talat, Isaac Johnson, Dragomir R. Radev, Somaieh Nikpoor, Jörg Frohberg, Aaron Gokaslan, Peter Henderson, Rishi Bommasani, Margaret Mitchell:
Data Governance in the Age of Large-Scale Data-Driven Language Technology. CoRR abs/2206.03216 (2022) - [i23]Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir R. Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh D. Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Stajner, Sébastien Montella, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin P. AMahidewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou:
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code. CoRR abs/2206.11249 (2022) - [i22]Alexander Borzunov, Max Ryabinin, Tim Dettmers, Quentin Lhoest, Lucile Saulnier, Michael Diskin, Yacine Jernite, Thomas Wolf:
Training Transformers Together. CoRR abs/2207.03481 (2022) - [i21]Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilic, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, Iz Beltagy, Huu Nguyen, Lucile Saulnier, Samson Tan, Pedro Ortiz Suarez, Victor Sanh, Hugo Laurençon, Yacine Jernite, Julien Launay, Margaret Mitchell, Colin Raffel, Aaron Gokaslan, Adi Simhi, Aitor Soroa, Alham Fikri Aji, Amit Alfassy, Anna Rogers, Ariel Kreisberg Nitzav, Canwen Xu, Chenghao Mou, Chris Emezue, Christopher Klamm, Colin Leong, Daniel van Strien, David Ifeoluwa Adelani, et al.:
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. CoRR abs/2211.05100 (2022) - [i20]Denis Kocetkov, Raymond Li, Loubna Ben Allal, Jia Li, Chenghao Mou, Carlos Muñoz Ferrandis, Yacine Jernite, Margaret Mitchell, Sean Hughes, Thomas Wolf, Dzmitry Bahdanau, Leandro von Werra, Harm de Vries:
The Stack: 3 TB of permissively licensed source code. CoRR abs/2211.15533 (2022) - [i19]Christopher Akiki, Giada Pistilli, Margot Mieskes, Matthias Gallé, Thomas Wolf, Suzana Ilic, Yacine Jernite:
BigScience: A Case Study in the Social Construction of a Multilingual Large Language Model. CoRR abs/2212.04960 (2022) - [i18]Margaret Mitchell, Alexandra Sasha Luccioni, Nathan Lambert, Marissa Gerchick, Angelina McMillan-Major, Ezinwanne Ozoani, Nazneen Rajani, Tristan Thrush, Yacine Jernite, Douwe Kiela:
Measuring Data. CoRR abs/2212.05129 (2022) - 2021
- [i17]Sebastian Gehrmann, Tosin P. Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna-Adriana Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ondrej Dusek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Dhruv Kumar, Faisal Ladhak, Aman Madaan, Mounica Maddela, Khyati Mahajan, Saad Mahamood, Bodhisattwa Prasad Majumder, Pedro Henrique Martins, Angelina McMillan-Major, Simon Mille, Emiel van Miltenburg, Moin Nadeem, Shashi Narayan, Vitaly Nikolaev, Rubungo Andre Niyongabo, Salomey Osei, Ankur P. Parikh, Laura Perez-Beltrachini, Niranjan Ramesh Rao, Vikas Raunak, Juan Diego Rodriguez, Sashank Santhanam, João Sedoc, Thibault Sellam, Samira Shaikh, Anastasia Shimorina, Marco Antonio Sobrevilla Cabezudo, Hendrik Strobelt, Nishant Subramani, Wei Xu, Diyi Yang, Akhila Yerukola, Jiawei Zhou:
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics. CoRR abs/2102.01672 (2021) - [i16]Isaac Caswell, Julia Kreutzer, Lisa Wang, Ahsan Wahab, Daan van Esch, Nasanbayar Ulzii-Orshikh, Allahsera Tapo, Nishant Subramani, Artem Sokolov, Claytone Sikasote, Monang Setyawan, Supheakmungkol Sarin, Sokhar Samb, Benoît Sagot, Clara Rivera, Annette Rios, Isabel Papadimitriou, Salomey Osei, Pedro Javier Ortiz Suárez, Iroro Orife, Kelechi Ogueji, Rubungo Andre Niyongabo, Toan Q. Nguyen, Mathias Müller, André Müller, Shamsuddeen Hassan Muhammad, Nanda Muhammad, Ayanda Mnyakeni, Jamshidbek Mirzakhalov, Tapiwanashe Matangira, Colin Leong, Nze Lawson, Sneha Kudugunta, Yacine Jernite, Mathias Jenny, Orhan Firat, Bonaventure F. P. Dossou, Sakhile Dlamini, Nisansa de Silva, Sakine Çabuk Balli, Stella Biderman, Alessia Battisti, Ahmed Baruwa, Ankur Bapna, Pallavi Baljekar, Israel Abebe Azime, Ayodele Awokoya, Duygu Ataman, Orevaoghene Ahia, Oghenefego Ahia, Sweta Agrawal, Mofetoluwa Adeyemi:
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets. AfricaNLP 2021 - [i15]Michael Diskin, Alexey Bukhtiyarov, Max Ryabinin, Lucile Saulnier, Quentin Lhoest, Anton Sinitsin, Dmitry Popov, Dmitry V. Pyrkin, Maxim Kashirin, Alexander Borzunov, Albert Villanova del Moral, Denis Mazur, Ilia Kobelev, Yacine Jernite, Thomas Wolf, Gennady Pekhimenko:
Distributed Deep Learning in Open Collaborations. CoRR abs/2106.10207 (2021) - [i14]Angelina McMillan-Major, Salomey Osei, Juan Diego Rodriguez, Pawan Sasanka Ammanamanchi, Sebastian Gehrmann, Yacine Jernite:
Reusable Templates and Guides For Documenting Datasets and Models for Natural Language Processing and Generation: A Case Study of the HuggingFace and GEM Data and Model Cards. CoRR abs/2108.07374 (2021) - [i13]Quentin Lhoest, Albert Villanova del Moral, Yacine Jernite, Abhishek Thakur, Patrick von Platen, Suraj Patil, Julien Chaumond, Mariama Drame, Julien Plu, Lewis Tunstall, Joe Davison, Mario Sasko, Gunjan Chhablani, Bhavitvya Malik, Simon Brandeis, Teven Le Scao, Victor Sanh, Canwen Xu, Nicolas Patry, Angelina McMillan-Major, Philipp Schmid, Sylvain Gugger, Clement Delangue, Théo Matussière, Lysandre Debut, Stas Bekman, Pierric Cistac, Thibault Goehringer, Victor Mustar, François Lagunas, Alexander M. Rush, Thomas Wolf:
Datasets: A Community Library for Natural Language Processing. CoRR abs/2109.02846 (2021) - 2020
- [i12]Fabio Petroni, Aleksandra Piktus, Angela Fan, Patrick S. H. Lewis, Majid Yazdani, Nicola De Cao, James Thorne, Yacine Jernite, Vassilis Plachouras, Tim Rocktäschel, Sebastian Riedel:
KILT: a Benchmark for Knowledge Intensive Language Tasks. CoRR abs/2009.02252 (2020) - 2019
- [i11]Yacine Jernite, Kavya Srinet, Jonathan Gray, Arthur Szlam:
CraftAssist Instruction Parsing: Semantic Parsing for a Minecraft Assistant. CoRR abs/1905.01978 (2019) - [i10]Jonathan Gray, Kavya Srinet, Yacine Jernite, Haonan Yu, Zhuoyuan Chen, Demi Guo, Siddharth Goyal, C. Lawrence Zitnick, Arthur Szlam:
CraftAssist: A Framework for Dialogue-enabled Interactive Agents. CoRR abs/1907.08584 (2019) - [i9]Angela Fan, Yacine Jernite, Ethan Perez, David Grangier, Jason Weston, Michael Auli:
ELI5: Long Form Question Answering. CoRR abs/1907.09190 (2019) - [i8]Arthur Szlam, Jonathan Gray, Kavya Srinet, Yacine Jernite, Armand Joulin, Gabriel Synnaeve, Douwe Kiela, Haonan Yu, Zhuoyuan Chen, Siddharth Goyal, Demi Guo, Danielle Rothermel, C. Lawrence Zitnick, Jason Weston:
Why Build an Assistant in Minecraft? CoRR abs/1907.09273 (2019) - [i7]Yacine Jernite:
Unsupervised Text Summarization via Mixed Model Back-Translation. CoRR abs/1908.08566 (2019) - [i6]Xinyi Wang, Jason Weston, Michael Auli, Yacine Jernite:
Improving Conditioning in Context-Aware Sequence to Sequence Models. CoRR abs/1911.09728 (2019) - 2017
- [i5]Yacine Jernite, Samuel R. Bowman, David A. Sontag:
Discourse-Based Objectives for Fast Unsupervised Sentence Representation Learning. CoRR abs/1705.00557 (2017) - [i4]Ankit Vani, Yacine Jernite, David A. Sontag:
Grounded Recurrent Neural Networks. CoRR abs/1705.08557 (2017) - 2016
- [i3]Yacine Jernite, Anna Choromanska, David A. Sontag, Yann LeCun:
Simultaneous Learning of Trees and Representations for Extreme Classification, with Application to Language Modeling. CoRR abs/1610.04658 (2016) - [i2]Yacine Jernite, Edouard Grave, Armand Joulin, Tomás Mikolov:
Variable Computation in Recurrent Neural Networks. CoRR abs/1611.06188 (2016) - 2015
- [i1]Yoon Kim, Yacine Jernite, David A. Sontag, Alexander M. Rush:
Character-Aware Neural Language Models. CoRR abs/1508.06615 (2015)
Coauthor Index
aka: Alexandra Sasha Luccioni
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from , , and to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2024-10-07 22:18 CEST by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint