"A Benchmark Suite for Systematically Evaluating Reasoning Shortcuts."

Samuele Bortolotti et al. (2024)

DOI: 10.5281/ZENODO.11612556

access: open

type: Data or Artifact

metadata version: 2025-03-17