The Inference Index
A family of three benchmark numbers for the AI inference market — cost, capability, and value — computed daily from the same public data feeds that back every page on this site.
II-C: Average best-price input cost across the top‑10 intelligence-ranked models. USD per 1M tokens. The headline number.
II-I: Highest intelligence score available across all scored models. Tracks the capability frontier.
II-V: Average of (intelligence score ÷ blended price) across all scored models. Rising = more capability per dollar.
01 II-C — Cost Index
We rank all models with a known intelligence score by that score (highest first), take the top 10, and compute the unweighted average of their best available input price across all providers. The result is denominated in USD per 1M input tokens.
“Best input price” is the cheapest available non-free input price across all providers listed for that model: CheapTokens, direct provider (Anthropic, OpenAI, Google, etc.), and Venice where applicable.
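The II-C computation described above can be sketched as follows. This is a minimal illustration, not the site's actual code: the field names (`score`, `input_prices`), the sample models, and the choice to skip a top-ranked model that has no paid price are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical sample data. Prices are USD per 1M input tokens;
# 0.0 marks a free tier, which the "best input price" rule excludes.
models = [
    {"name": "model-a", "score": 70, "input_prices": {"direct": 3.00, "CheapTokens": 2.40}},
    {"name": "model-b", "score": 65, "input_prices": {"direct": 1.00, "Venice": 0.0}},
    {"name": "model-c", "score": 50, "input_prices": {"direct": 0.20}},
    {"name": "model-d", "score": None, "input_prices": {"direct": 0.10}},  # unscored
]


def best_input_price(prices):
    """Cheapest non-free input price across providers, or None if every price is free."""
    paid = [p for p in prices.values() if p > 0]
    return min(paid) if paid else None


def cost_index(models, top_n=10):
    """Unweighted average of best input prices over the top_n models by score."""
    scored = [m for m in models if m.get("score") is not None]
    ranked = sorted(scored, key=lambda m: m["score"], reverse=True)[:top_n]
    prices = [best_input_price(m["input_prices"]) for m in ranked]
    # Assumption: a ranked model with no paid price is skipped, not replaced.
    prices = [p for p in prices if p is not None]
    return mean(prices) if prices else None


# With top_n=2 only model-a (2.40) and model-b (1.00) contribute.
print(cost_index(models, top_n=2))
```

The real index uses `top_n=10`; the smaller value here only keeps the sample data short.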
02 II-I — Intelligence Index
Simply the highest intelligence score among all models we track. It answers: “How smart is the smartest model available today?”
Tracked over time to show how the capability frontier evolves. When a new frontier model ships and beats the incumbent, II-I rises.
03 II-V — Value Index
For each scored model we compute iq / blendedPrice, where blendedPrice = (input × 3 + output) / 4, reflecting a typical 3:1 ratio of input to output tokens. We then average that ratio across all scored models. Higher is better: more intelligence per dollar.
II-V = avg(ratio) across all scored models
When prices fall faster than intelligence, II-V rises. A sustained rise in II-V is the clearest signal that the market is improving for buyers.
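A small worked example of the II-V formula, under stated assumptions: the field names (`iq`, `input`, `output`) and the sample numbers are hypothetical, and skipping unscored or zero-priced models is an illustrative choice rather than documented behavior.

```python
from statistics import mean


def blended_price(input_price, output_price):
    """(input x 3 + output) / 4, the 3:1 input/output blend."""
    return (input_price * 3 + output_price) / 4


def value_index(models):
    """Average of iq / blended price across all scored models."""
    ratios = []
    for m in models:
        if m.get("iq") is None:
            continue  # unscored models do not contribute
        price = blended_price(m["input"], m["output"])
        if price <= 0:
            continue  # assumption: free models are skipped to avoid division by zero
        ratios.append(m["iq"] / price)
    return mean(ratios) if ratios else None


before = [
    {"name": "model-a", "iq": 60, "input": 2.0, "output": 6.0},  # blended 3.0, ratio 20
    {"name": "model-b", "iq": 40, "input": 1.0, "output": 1.0},  # blended 1.0, ratio 40
]
# Same scores, but model-a's prices are halved: blended 1.5, ratio 40.
after = [
    {"name": "model-a", "iq": 60, "input": 1.0, "output": 3.0},
    {"name": "model-b", "iq": 40, "input": 1.0, "output": 1.0},
]

print(value_index(before))  # 30.0
print(value_index(after))   # 40.0
```

Intelligence is unchanged between the two samples, so the rise from 30.0 to 40.0 comes entirely from the price cut.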
04 Derived metrics
- →Cost-per-IQ: frontier model's best input price divided by its intelligence score. Smaller is better. An intuitive “unit price of frontier intelligence.”
- →Median input / output: median best-price across every tracked model (not just top-10). Helps spot long-tail moves that the index misses.
- →Biggest drop / rise: per-model 24h % change in best-price. Computed from consecutive daily snapshots.
- →Model basis: cheapest channel price − direct owner price. Negative values indicate channel discount vs direct.
- →Input dispersion: average cross-provider price dispersion (stddev/mean) for models sold by multiple providers.
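The derived metrics listed above reduce to a few one-line formulas. The sketch below is illustrative only: the function names are hypothetical, and using the population standard deviation (`pstdev`) for dispersion is an assumption, since the text says only "stddev/mean".

```python
from statistics import mean, median, pstdev


def cost_per_iq(best_input_price, intelligence_score):
    """Frontier model's best input price divided by its intelligence score."""
    return best_input_price / intelligence_score


def pct_change_24h(today_price, yesterday_price):
    """Percent change between two consecutive daily snapshots."""
    return (today_price - yesterday_price) / yesterday_price * 100


def model_basis(cheapest_channel_price, direct_price):
    """Negative result means the channel undercuts the direct owner price."""
    return cheapest_channel_price - direct_price


def dispersion(prices):
    """stddev / mean for one model's prices across providers."""
    return pstdev(prices) / mean(prices)


def input_dispersion(prices_by_model):
    """Average dispersion over models sold by more than one provider."""
    multi = [p for p in prices_by_model if len(p) > 1]
    return mean([dispersion(p) for p in multi]) if multi else None


# Hypothetical numbers, USD per 1M input tokens.
print(cost_per_iq(3.0, 60))                     # 0.05
print(median([0.2, 1.0, 2.4]))                  # 1.0
print(dispersion([1.0, 3.0]))                   # 0.5
print(input_dispersion([[1.0, 3.0], [2.0]]))    # 0.5, the single-provider model is ignored
```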
05 Data sources
- →Intelligence: Artificial Analysis Intelligence Index — live API with a 6-hour in-memory cache and a static fallback for API-down scenarios.
- →Venice base pricing: Venice API polled every 5 minutes. CheapTokens discount applied on top to show the effective cheapest Venice-route price.
- →Direct provider overrides: Manually verified prices from Anthropic, OpenAI, Google, xAI, DeepSeek, Groq, and Together AI pricing pages. Audited periodically.
- →Daily snapshots: All index values are snapshotted daily at UTC midnight and written immutably. The public historical series begins on 2026-04-16 (first live capture). See daily brief for the running log.
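The intelligence feed's 6-hour in-memory cache with a static fallback could look roughly like the sketch below. Everything here is an assumption for illustration: the class name `ScoreCache`, the injected `fetch` callable standing in for the live API call, and the choice to serve the last good value before the static fallback when the API is down.

```python
import time

CACHE_TTL_SECONDS = 6 * 60 * 60  # 6 hours


class ScoreCache:
    """In-memory TTL cache around a fetch function, with a static fallback."""

    def __init__(self, fetch, fallback, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._fetch = fetch        # hypothetical callable that hits the live API
        self._fallback = fallback  # static scores used when nothing else is available
        self._ttl = ttl
        self._clock = clock        # injectable so the cache can be tested without waiting
        self._value = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        if self._fetched_at is not None and now - self._fetched_at < self._ttl:
            return self._value  # still fresh, no API call
        try:
            self._value = self._fetch()
            self._fetched_at = now
            return self._value
        except Exception:
            # API down. Assumption: prefer the last good value, else the static fallback.
            return self._value if self._value is not None else self._fallback


def failing_fetch():
    raise RuntimeError("API down")


cache = ScoreCache(failing_fetch, fallback={"model-a": 50})
print(cache.get())  # {'model-a': 50}
```

Injecting the clock and the fetch function keeps the cache logic independent of any particular HTTP client.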
06 Neutrality policy
We do not accept payment for rankings, placement, or weighting. Whichever provider offers the lowest price on a model is shown as “best”, full stop. If CheapTokens is the cheapest, it wins. If Anthropic's direct price is cheaper, Anthropic wins. If you spot an error, email us or open an issue.
07 Versioning
This methodology is versioned as v1.1. When the index computation changes materially, the version number increments and historical snapshots are preserved under their original version.
- v1.0 — initial release. Top-10 by AA intelligence, unweighted avg best input.
- v1.1 — added derived metrics (cost-per-IQ, median input/output, daily movers); clarified neutrality; added citation scaffolding and per-model history.
08 Raw data & API
The full daily snapshot is machine-readable JSON:
curl https://inferenceindex.ai/api/snapshot
Cite this methodology
Inference Index (v1.1). "Methodology for II-C, II-I, II-V." Retrieved 2026-04-16. https://inferenceindex.ai/methodology