Inference Index

arXiv Dataset

Full-text snapshot of the arXiv preprint archive. The substrate of every AI-science-assistant startup.

Size: 2.3M papers
Format: parquet
License: CC0 (metadata) + per-paper
Maintainer: arXiv / Cornell

What it's for

A full-text snapshot of the arXiv preprint archive, commonly used as the underlying corpus for AI science-assistant products.