Evaluation
MMLU-Pro (test set)
The evaluation data behind MMLU-Pro. Use it to benchmark new models on expert knowledge.
Size: 12K questions
Format: jsonl
License: MIT
Maintainer: TIGER-Lab
What it’s for
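This is the test split of MMLU-Pro: about 12K questions stored as JSONL. Run a new model over the questions and compare its answers with the reference answers to measure expert-level knowledge.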
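As a rough illustration of benchmarking against a JSONL file like this one, the sketch below reads one JSON object per line and computes plain accuracy. The field name `answer` (a gold answer letter) and the `predict` callable are assumptions made for the example, not a documented schema for this download, so check the actual keys in the file before relying on them.

```python
import json
from typing import Callable, Iterable


def load_jsonl(path: str) -> list[dict]:
    """Read one JSON object per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def accuracy(records: Iterable[dict], predict: Callable[[dict], str]) -> float:
    """Fraction of records where predict(record) equals the gold answer.

    Assumes each record stores its gold answer under the key "answer"
    (hypothetical field name; adjust to match the real file).
    """
    records = list(records)
    if not records:
        return 0.0
    correct = sum(1 for r in records if predict(r) == r["answer"])
    return correct / len(records)
```

A typical call would look like `accuracy(load_jsonl("mmlu_pro_test.jsonl"), my_model_predict)`, where both the file name and `my_model_predict` are placeholders for your own path and model wrapper.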