CodeSearchNet
About 6M functions across 6 languages (Go, Java, JavaScript, PHP, Python, Ruby), roughly 2M of them paired with docstrings.
Size: 6M functions
Format: jsonl
License: Apache 2.0
Maintainer: GitHub
What it’s for
Canonical dataset for training and evaluating code search and embedding models. The docstrings serve as natural-language queries paired with the functions they describe.
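Since the data ships as jsonl (one JSON object per line), a reader for it can be quite small. The sketch below shows one way to stream (docstring, code) pairs for a search or embedding pipeline. The field names `docstring` and `code`, the optional gzip handling, and the helper name `iter_pairs` are assumptions for illustration; check them against the files you actually download.

```python
import gzip
import json
from pathlib import Path
from typing import Iterator, Tuple


def iter_pairs(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (docstring, code) pairs from a .jsonl or .jsonl.gz file.

    Assumes each line is a JSON object with "docstring" and "code"
    keys (an assumption about the schema, not a guarantee).
    """
    # Pick a gzip-aware opener only when the file name ends in .gz.
    opener = gzip.open if Path(path).suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            doc = record.get("docstring", "")
            code = record.get("code", "")
            # Skip functions without documentation: they cannot act
            # as a query/result pair for code search.
            if doc and code:
                yield doc, code
```

Streaming line by line keeps memory use flat regardless of file size, which matters for a corpus of this scale. A caller would loop with `for doc, code in iter_pairs(some_path): ...`, where `some_path` is whichever shard is being processed.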