




1st NeurIPS Datasets and Benchmarks 2021
- Joaquin Vanschoren, Sai-Kit Yeung:
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, NeurIPS Datasets and Benchmarks 2021, December 2021, virtual. 2021 - Maria Korosteleva, Sung-Hee Lee:
Generating Datasets of 3D Garments with Sewing Patterns. - Zhiyuan Tang, Dong Wang, Yanguang Xu, Jianwei Sun, Xiaoning Lei, Shuaijiang Zhao, Cheng Wen, Xingjun Tan, Chuandong Xie, Shuran Zhou, Rui Yan, Chenjia Lv, Yang Han, Wei Zou, Xiangang Li:
KeSpeech: An Open Source Speech Dataset of Mandarin and Its Eight Subdialects. - Arjun D. Desai, Andrew M. Schmidt, Elka B. Rubin, Christopher M. Sandino, Marianne S. Black, Valentina Mazzoli, Kathryn J. Stevens, Robert Boutin, Christopher Ré, Garry Gold, Brian A. Hargreaves, Akshay Chaudhari:
SKM-TEA: A Dataset for Accelerated MRI Reconstruction with Dense Image Labels for Quantitative Clinical Evaluation. - Cédric Renggli, Luka Rimanic, Nora Hollenstein, Ce Zhang:
Evaluating Bayes Error Estimators on Real-World Datasets with FeeBee. - Lukasz Borchmann, Michal Pietruszka, Tomasz Stanislawek, Dawid Jurkiewicz, Michal Turski, Karolina Szyndler, Filip Gralinski:
DUE: End-to-End Document Understanding Benchmark. - Yuki M. Asano, Christian Rupprecht, Andrew Zisserman, Andrea Vedaldi:
PASS: An ImageNet replacement for self-supervised pretraining without humans. - Kenneth Peng, Arunesh Mathur, Arvind Narayanan:
Mitigating dataset harms requires stewardship: Lessons from 1000 papers. - Raesetje Sefala, Timnit Gebru, Nyalleng Moorosi, Luzango Mfupe, Richard Klein:
Constructing a Visual Dataset to Study the Effects of Spatial Apartheid in South Africa. - Inioluwa Deborah Raji, Emily Denton, Emily M. Bender, Alex Hanna, Amandalynne Paullada:
AI and the Everything in the Whole Wide World Benchmark. - Michael Laskin, Denis Yarats, Hao Liu, Kimin Lee, Albert Zhan, Kevin Lu, Catherine Cang, Lerrel Pinto, Pieter Abbeel:
URLB: Unsupervised Reinforcement Learning Benchmark. - Dong Huk Park, Samaneh Azadi, Xihui Liu, Trevor Darrell, Anna Rohrbach:
Benchmark for Compositional Text-to-Image Synthesis. - Yanqiao Zhu, Yichen Xu, Qiang Liu, Shu Wu:
An Empirical Study of Graph Contrastive Learning. - Shusheng Xu, Yichen Liu, Xiaoyu Yi, Siyuan Zhou, Huizi Li, Yi Wu:
Native Chinese Reader: A Dataset Towards Native-Level Chinese Machine Reading Comprehension. - Nafise Sadat Moosavi, Andreas Rücklé, Dan Roth, Iryna Gurevych:
SciGen: a Dataset for Reasoning-Aware Text Generation from Scientific Tables. - Malte Lücken, Daniel Burkhardt, Robrecht Cannoodt, Christopher Lance, Aditi Agrawal, Hananeh Aliee, Ann Chen, Louise Deconinck, Angela Detweiler, Alejandro Granados, Shelly Huynh, Laura Isacco, Yang Kim, Dominik Klein, Bony de Kumar, Sunil Kuppasani, Heiko Lickert, Aaron McGeever, Joaquin Melgarejo, Honey Mekonen, Maurizio Morri, Michaela Müller, Norma Neff, Sheryl Paul, Bastian Rieck, Kaylie Schneider, Scott Steelman, Michael Sterr, Daniel Treacy, Alexander Tong, Alexandra-Chloé Villani, Guilin Wang, Jia Yan, Ce Zhang, Angela Pisco, Smita Krishnaswamy, Fabian J. Theis, Jonathan M. Bloom:
A sandbox for prediction and integration of DNA, RNA, and proteins in single cells. - Joy T. Wu, Nkechinyere Agu, Ismini Lourentzou, Arjun Sharma, Joseph Alexander Paguio, Jasper Seth Yao, Edward C. Dee, William Mitchell, Satyananda Kashyap, Andrea Giovannini, Leo Anthony Celi, Mehdi Moradi:
Chest ImaGenome Dataset for Clinical Reasoning. - Jieyu Zhang, Yue Yu, Yinghao Li, Yujing Wang, Yaming Yang, Mao Yang, Alexander Ratner:
WRENCH: A Comprehensive Benchmark for Weak Supervision. - Prithviraj Ammanabrolu, Mark O. Riedl:
Modeling Worlds in Text. - Wenhu Chen, Xinyi Wang, William Yang Wang:
A Dataset for Answering Time-Sensitive Questions. - Jesse Marshall, Ugne Klibaite, Amanda Gellis, Diego Aldarondo, Bence Olveczky, Timothy W. Dunn:
The PAIR-R24M Dataset for Multi-animal 3D Pose Estimation. - Daniel Galvez, Greg Diamos, Juan Torres, Keith Achorn, Juan Felipe Cerón, Anjali Gopi, David Kanter, Max Lam, Mark Mazumder, Vijay Janapa Reddi:
The People's Speech: A Large-Scale Diverse English Speech Recognition Dataset for Commercial Usage. - Sheshera Mysore, Tim O'Gorman, Andrew McCallum, Hamed Zamani:
CSFCube - A Test Collection of Computer Science Research Articles for Faceted Query by Example. - Alex J. Chan, Ioana Bica, Alihan Hüyük, Daniel Jarrett, Mihaela van der Schaar:
The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation. - Charan Reddy, Deepak Sharma, Soroush Mehri, Adriana Romero-Soriano, Samira Shabanian, Sina Honari:
Benchmarking Bias Mitigation Algorithms in Representation Learning through Fairness Metrics. - Chak Hin Bryan Liu, Ângelo Cardoso, Paul Couturier, Emma J. McCoy:
Datasets for Online Controlled Experiments. - Zhiqiu Lin, Jia Shi, Deepak Pathak, Deva Ramanan:
The CLEAR Benchmark: Continual LEArning on Real-World Imagery. - Zhengxuan Wu, Elisa Kreiss, Desmond C. Ong, Christopher Potts:
ReaSCAN: Compositional Reasoning in Language Grounding. - Vincent Dumoulin, Neil Houlsby, Utku Evci, Xiaohua Zhai, Ross Goroshin, Sylvain Gelly, Hugo Larochelle:
A Unified Few-Shot Classification Benchmark to Compare Transfer and Meta Learning Approaches. - Viktor Makoviychuk, Lukasz Wawrzyniak, Yunrong Guo, Michelle Lu, Kier Storey, Miles Macklin, David Hoeller, Nikita Rudin, Arthur Allshire, Ankur Handa, Gavriel State:
Isaac Gym: High Performance GPU Based Physics Simulation For Robot Learning. - Chenyu Yi, Siyuan Yang, Haoliang Li, Yap-Peng Tan, Alex C. Kot:
Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions. - Alicia Curth, David Svensson, James Weatherall, Mihaela van der Schaar:
Really Doing Great at Estimating CATE? A Critical Look at ML Benchmarking Practices in Treatment Effect Estimation. - Christian Dallago, Jody Mou, Kadina E. Johnston, Bruce J. Wittmann, Nicholas Bhattacharya, Samuel Goldman, Ali Madani, Kevin Yang:
FLIP: Benchmark tasks in fitness landscape inference for proteins. - Yuanqi Du, Shiyu Wang, Xiaojie Guo, Hengning Cao, Shujie Hu, Junji Jiang, Aishwarya Varala, Abhinav Angirekula, Liang Zhao:
GraphGT: Machine Learning Datasets for Graph Generation and Transformation. - Mateusz Jurewicz, Leon Derczynski:
PROCAT: Product Catalogue Dataset for Implicit Clustering, Permutation Learning and Structure Prediction. - Boxin Wang, Chejian Xu, Shuohang Wang, Zhe Gan, Yu Cheng, Jianfeng Gao, Ahmed Hassan Awadallah, Bo Li:
Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models. - Nikhil X. Bhattasali, Momchil S. Tomov, Samuel J. Gershman:
CCNLab: A Benchmarking Framework for Computational Cognitive Neuroscience. - Yuta Saito, Shunsuke Aihara, Megumi Matsutani, Yusuke Narita:
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation. - Gaoussou Youssouf Kebe, Padraig Higgins, Patrick Jenkins, Kasra Darvish, Rishabh Sachdeva, Ryan Barron, John Winder, Don Engel, Edward Raff, Francis Ferraro, Cynthia Matuszek:
A Spoken Language Dataset of Descriptions for Speech-Based Grounded Language Learning. - Loren Lugosch, Piyush Papreja, Mirco Ravanelli, Abdelwahab Heba, Titouan Parcollet:
Timers and Such: A Practical Benchmark for Spoken Language Understanding with Numbers. - Santhosh Kumar Ramakrishnan, Aaron Gokaslan, Erik Wijmans, Oleksandr Maksymets, Alexander Clegg, John Turner, Eric Undersander, Wojciech Galuba, Andrew Westbury, Angel X. Chang, Manolis Savva, Yili Zhao, Dhruv Batra:
Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI. - Tudor Mare, Georgian-Emilian Duta, Mariana-Iuliana Georgescu, Adrian Sandru, Bogdan Alexe, Marius Popescu, Radu Tudor Ionescu:
A realistic approach to generate masked faces applied on two novel masked face recognition data sets. - Mingjie Li, Wenjia Cai, Rui Liu, Yuetian Weng, Xiaoyun Zhao, Cong Wang, Xin Chen, Zhong Liu, Caineng Pan, Mengke Li, Yingfeng Zheng, Yizhi Liu, Flora D. Salim, Karin Verspoor, Xiaodan Liang, Xiaojun Chang:
FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark. - Subhabrata Mukherjee, Xiaodong Liu, Guoqing Zheng, Saghar Hosseini, Hao Cheng, Ge Yang, Christopher Meek, Ahmed Hassan Awadallah, Jianfeng Gao:
Few-Shot Learning Evaluation in Natural Language Understanding. - Paul Pu Liang, Yiwei Lyu, Xiang Fan, Zetian Wu, Yun Cheng, Jason Wu, Leslie Chen, Peter Wu, Michelle A. Lee, Yuke Zhu, Ruslan Salakhutdinov, Louis-Philippe Morency:
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning. - Haozhe Sun, Wei-Wei Tu, Isabelle Guyon:
OmniPrint: A Configurable Printed Character Synthesizer. - Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg:
A Toolbox for Construction and Analysis of Speech Datasets. - Dan Hendrycks, Mantas Mazeika, Andy Zou, Sahil Patel, Christine Zhu, Jesus Navarro, Dawn Song, Bo Li, Jacob Steinhardt:
What Would Jiminy Cricket Do? Towards Agents That Behave Morally. - Sudeep Dasari, Jianren Wang, Joyce Hong, Shikhar Bahl, Yixin Lin, Austin S. Wang, Abitha Thankaraj, Karanbir Chahal, Berk Çalli, Saurabh Gupta, David Held, Lerrel Pinto, Deepak Pathak, Vikash Kumar, Abhinav Gupta:
RB2: Robotic Manipulation Benchmarking with a Twist. - Tal Schuster, Ashwin Kalyan, Alex Polozov, Adam Kalai:
Programming Puzzles. - Bernard Koch, Emily Denton, Alex Hanna, Jacob G. Foster:
Reduced, Reused and Recycled: The Life of a Dataset in Machine Learning Research. - Karl Otness, Arvi Gjoka, Joan Bruna, Daniele Panozzo, Benjamin Peherstorfer, Teseo Schneider, Denis Zorin:
An Extensible Benchmark Suite for Learning to Simulate Physical Systems. - Alon Talmor, Ori Yoran, Ronan Le Bras, Chandra Bhagavatula, Yoav Goldberg, Yejin Choi, Jonathan Berant:
CommonsenseQA 2.0: Exposing the Limits of AI through Gamification. - Samriddhi Singla, Ayan Mukhopadhyay, Michael Wilbur, Tina Diao, Vinayak Gajjewar, Ahmed Eldawy, Mykel J. Kochenderfer, Ross D. Shachter, Abhishek Dubey:
WildfireDB: An Open-Source Dataset Connecting Wildfire Occurrence with Relevant Determinants. - Joseph Suarez, Yilun Du, Clare Zhu, Igor Mordatch, Phillip Isola:
The Neural MMO Platform for Massively Multiagent Research. - Benjamin Wilson, William Qi, Tanmay Agarwal, John Lambert, Jagjeet Singh, Siddhesh Khandelwal, Bowen Pan, Ratnesh Kumar, Andrew Hartnett, Jhony Kaesemodel Pontes, Deva Ramanan, Peter Carr, James Hays:
Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting. - Kexin Huang, Tianfan Fu, Wenhao Gao, Yue Zhao, Yusuf Roohani, Jure Leskovec, Connor W. Coley, Cao Xiao, Jimeng Sun, Marinka Zitnik:
Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development. - David Alfonso-Hermelo, Ahmad Rashid, Abbas Ghaddar, Philippe Langlais, Mehdi Rezagholizadeh:
NATURE: Natural Auxiliary Text Utterances for Realistic Spoken Language Evaluation. - Junjue Wang, Zhuo Zheng, Ailong Ma, Xiaoyan Lu, Yanfei Zhong:
LoveDA: A Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation. - Gabriel Tseng, Ivan Zvonkov, Catherine Nakalembe, Hannah Kerner:
CropHarvest: A global dataset for crop-type classification. - Jack Bandy, Nicholas Vincent:
Addressing "Documentation Debt" in Machine Learning: A Retrospective Datasheet for BookCorpus. - Yasumasa Onoe, Michael J. Q. Zhang, Eunsol Choi, Greg Durrett:
CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge. - Mayur Hemani, Abhinav Patel, Tejas Shimpi, Anirudha Ramesh, Balaji Krishnamurthy:
What Ails One-Shot Image Segmentation: A Data Perspective. - Hugo Yèche, Rita Kuznetsova, Marc Zimmermann, Matthias Hüser, Xinrui Lyu, Martin Faltys, Gunnar Rätsch:
HiRID-ICU-Benchmark - A Comprehensive Machine Learning Benchmark on High-resolution ICU Data. - Lukas Kondmann, Aysim Toker, Marc Rußwurm, Andrés Camero, Devis Peressuti, Grega Milcinski, Pierre-Philippe Mathieu, Nicolas Longépé, Timothy Davis, Giovanni Marchisio, Laura Leal-Taixé, Xiaoxiang Zhu:
DENETHOR: The DynamicEarthNET dataset for Harmonized, inter-Operable, analysis-Ready, daily crop monitoring from space. - Bo Wu, Shoubin Yu, Zhenfang Chen, Josh Tenenbaum, Chuang Gan:
STAR: A Benchmark for Situated Reasoning in Real-World Videos. - Nikolaos-Antonios Ypsilantis, Noa Garcia, Guangxing Han, Sarah Ibrahimi, Nanne van Noord, Giorgos Tolias:
The Met Dataset: Instance-level Recognition for Artworks. - Stefan Daniel Dumitrescu, Petru Rebeja, Beáta Lorincz, Mihaela Gaman, Andrei-Marius Avram, Mihai Ilie, Andrei Pruteanu, Adriana Stan, Lorena Rosia, Cristina Iacobescu, Luciana Morogan, George Dima, Gabriel Marchidan, Traian Rebedea, Madalina Chitez, Dani Yogatama, Sebastian Ruder, Radu Tudor Ionescu, Razvan Pascanu, Viorica Patraucean:
LiRo: Benchmark and leaderboard for Romanian language tasks. - Scott Freitas, Yuxiao Dong, Joshua Neil, Duen Horng Chau:
A Large-Scale Database for Graph Representation Learning. - Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Peng Xu, Feijun Jiang, Yuxiang Hu, Chen Shi, Pascale Fung:
BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling. - Jianhua Han, Xiwen Liang, Hang Xu, Kai Chen, Lanqing Hong, Jiageng Mao, Chaoqiang Ye, Wei Zhang, Zhenguo Li, Xiaodan Liang, Chunjing Xu:
SODA10M: A Large-Scale 2D Self/Semi-Supervised Object Detection Dataset for Autonomous Driving. - Nandan Thakur, Nils Reimers, Andreas Rücklé, Abhishek Srivastava, Iryna Gurevych:
BEIR: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models. - Cécile Logé, Emily Ross, David Yaw Amoah Dadey, Saahil Jain, Adriel Saporta, Andrew Y. Ng, Pranav Rajpurkar:
Q-Pain: A Question Answering Dataset to Measure Social Bias in Pain Management. - Ivan Kiskin, Marianne Sinka, Adam D. Cobb, Waqas Rafique, Lawrence Wang, Davide Zilli, Benjamin Gutteridge, Rinita Dam, Theodoros Marinos, Yunpeng Li, Dickson Msaky, Emmanuel Kaindoa, Gerard Killeen, Eva Herreros-Moya, Kathy Willis, Stephen J. Roberts:
HumBugDB: A Large-scale Acoustic Mosquito Dataset. - Nikita Pavlichenko, Ivan Stelmakh, Dmitry Ustalov:
CrowdSpeech and Vox DIY: Benchmark Dataset for Crowdsourced Audio Transcription. - Afshin Dehghan, Gilad Baruch, Zhuoyuan Chen, Yuri Feigin, Peter Fu, Thomas Gebauer, Daniel Kurz, Tal Dimry, Brandon Joffe, Arik Schwartz, Elad Shulman:
ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data. - Jiageng Mao, Minzhe Niu, Chenhan Jiang, Hanxue Liang, Jingheng Chen, Xiaodan Liang, Yamin Li, Chaoqiang Ye, Wei Zhang, Zhenguo Li, Jie Yu, Chunjing Xu, Hang Xu:
One Million Scenes for Autonomous Driving: ONCE Dataset. - Rami Aly, Zhijiang Guo, Michael Sejr Schlichtkrull, James Thorne, Andreas Vlachos, Christos Christodoulopoulos, Oana Cocarascu, Arpit Mittal:
FEVEROUS: Fact Extraction and VERification Over Unstructured and Structured information. - Sarah Wiegreffe, Ana Marasovic:
Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing. - Qinkai Zheng, Xu Zou, Yuxiao Dong, Yukuo Cen, Da Yin, Jiarong Xu, Yang Yang, Jie Tang:
Graph Robustness Benchmark: Benchmarking the Adversarial Robustness of Graph Machine Learning. - Dan Hendrycks, Collin Burns, Anya Chen, Spencer Ball:
CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review. - John Pougué-Biyong, Valentina Semenova, Alexandre Matton, Rachel Han, Aerin Kim, Renaud Lambiotte, Doyne Farmer:
DEBAGREEMENT: A comment-reply dataset for (dis)agreement detection in online debates. - John Lambert, James Hays:
Trust, but Verify: Cross-Modality Fusion for HD Map Change Detection. - Chuang Gan, Jeremy Schwartz, Seth Alter, Damian Mrowca, Martin Schrimpf, James Traer, Julian De Freitas, Jonas Kubilius, Abhishek Bhandwaldar, Nick Haber, Megumi Sano, Kuno Kim, Elias Wang, Michael Lingelbach, Aidan Curtis, Kevin T. Feigelis, Daniel Bear, Dan Gutfreund, David D. Cox, Antonio Torralba, James J. DiCarlo, Josh Tenenbaum, Josh H. McDermott, Dan Yamins:
ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation. - Thomas Liao, Rohan Taori, Deborah Raji, Ludwig Schmidt:
Are We Learning Yet? A Meta Review of Evaluation Failures Across Machine Learning. - Avanika Narayan, Piero Molino, Karan Goel, Willie Neiswanger, Christopher Ré:
Personalized Benchmarking with the Ludwig Benchmarking Toolkit. - Camille Garcin, Alexis Joly, Pierre Bonnet, Antoine Affouard, Jean-Christophe Lombardo, Mathias Chouet, Maximilien Servajean, Titouan Lorieul, Joseph Salmon:
Pl@ntNet-300K: a plant image dataset with high label ambiguity and a long-tailed distribution. - Zihao Wang, Hang Yin, Yangqiu Song:
Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs. - Andreas Aakerberg, Kamal Nasrollahi, Thomas B. Moeslund:
RELLISUR: A Real Low-Light Image Super-Resolution Dataset. - Jennifer J. Sun, Tomomi Karigo, Dipam Chakraborty, Sharada P. Mohanty, Benjamin Wild, Quan Sun, Chen Chen, David J. Anderson, Pietro Perona, Yisong Yue, Ann Kennedy:
The Multi-Agent Behavior Dataset: Mouse Dyadic Social Interactions. - Catherine Ordun, Alexandra N. Cha, Edward Raff, Byron Gaskin, Alex Hanson, Mason Rule, Sanjay Purushotham, James L. Gulley:
Intelligent Sight and Sound: A Chronic Cancer Facial Pain Dataset. - Jiyang Qi, Yan Gao, Yao Hu, Xinggang Wang, Xiaoyu Liu, Xiang Bai, Serge J. Belongie, Alan L. Yuille, Philip H. S. Torr, Song Bai:
Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge. - Thibaut Horel, Lorenzo Masoero, Raj Agrawal, Daria Roithmayr, Trevor Campbell:
The CPD Data Set: Personnel, Use of Force, and Complaints in the Chicago Police Department. - Jaeju An, Jeongho Kim, Hanbeen Lee, Jinbeom Kim, Junhyung Kang, Minha Kim, Saebyeol Shin, Donghee Hong, Simon S. Woo:
VFP290K: A Large-Scale Benchmark Dataset for Vision-based Fallen Person Detection. - Yang Deng, Juncheng Dong, Simiao Ren, Omar Khatib, Mohammadreza Soltani, Vahid Tarokh, Willie Padilla, Jordan M. Malof:
Benchmarking Data-driven Surrogate Simulators for Artificial Electromagnetic Materials. - Megan Stanley, John Bronskill, Krzysztof Maziarz, Hubert Misztela, Jessica Lanini, Marwin H. S. Segler, Nadine Schneider, Marc Brockschmidt:
FS-Mol: A Few-Shot Learning Dataset of Molecules. - Alex Tamkin, Vincent Liu, Rongfei Lu, Daniel Fein, Colin Schultz, Noah D. Goodman:
DABS: a Domain-Agnostic Benchmark for Self-Supervised Learning. - Nan Rosemary Ke, Aniket Didolkar, Sarthak Mittal, Anirudh Goyal, Guillaume Lajoie, Stefan Bauer, Danilo Jimenez Rezende, Michael Mozer, Yoshua Bengio, Chris Pal:
Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning. - Hang Yu, Yufei Xu, Jing Zhang, Wei Zhao, Ziyu Guan, Dacheng Tao:
AP-10K: A Benchmark for Animal Pose Estimation in the Wild. - Michelle Bao, Angela Zhou, Samantha Zottola, Brian Brubach, Sarah Desmarais, Aaron Horowitz, Kristian Lum, Suresh Venkatasubramanian:
It's COMPASlicated: The Messy Relationship between RAI Datasets and Algorithmic Fairness Benchmarks. - Katharina Eggensperger, Philipp Müller, Neeratyoy Mallik, Matthias Feurer, René Sass, Aaron Klein, Noor H. Awad, Marius Lindauer, Frank Hutter:
HPOBench: A Collection of Reproducible Multi-Fidelity Benchmark Problems for HPO. - Christopher Yeh, Chenlin Meng, Sherrie Wang, Anne Driscoll, Erik Rozi, Patrick Liu, Jihyeon Janel Lee, Marshall Burke, David B. Lobell, Stefano Ermon:
SustainBench: Benchmarks for Monitoring the Sustainable Development Goals with Machine Learning. - Mark Weber, Jun Xie, Maxwell D. Collins, Yukun Zhu, Paul Voigtlaender, Hartwig Adam, Bradley Green, Andreas Geiger, Bastian Leibe, Daniel Cremers, Aljosa Osep, Laura Leal-Taixé, Liang-Chieh Chen:
STEP: Segmenting and Tracking Every Pixel. - Felix Pei, Joel Ye, David M. Zoltowski, Anqi Wu, Raeed H. Chowdhury, Hansem Sohn, Joseph E. O'Doherty, Krishna V. Shenoy, Matthew T. Kaufman, Mark M. Churchland, Mehrdad Jazayeri, Lee E. Miller, Jonathan W. Pillow, Il Memming Park, Eva L. Dyer, Chethan Pandarinath:
Neural Latents Benchmark '21: Evaluating latent variable models of neural population activity. - Sungjoon Park, Jihyung Moon, Sungdong Kim, Won-Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Youngsook Song, Tae Hwan Oh, Joohong Lee, Juhyun Oh, Sungwon Lyu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Eunjeong Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho:
KLUE: Korean Language Understanding Evaluation. - Tal Ridnik, Emanuel Ben Baruch, Asaf Noy, Lihi Zelnik:
ImageNet-21K Pretraining for the Masses. - Afshin Sadeghi, Hirra Malik, Diego Collarana, Jens Lehmann:
Relational Pattern Benchmarking on the Knowledge Graph Link Prediction Task. - Ramya Srinivasan, Emily Denton, Jordan Famularo, Negar Rostamzadeh, Fernando Diaz, Beth Coleman:
Artsheets for Art Datasets. - Xingjian Shi, Jonas Mueller, Nick Erickson, Mu Li, Alexander J. Smola:
Benchmarking Multimodal AutoML for Tabular Data with Text Fields.