AnalysisApr 12, 2026 · 4mo ago

DeepSeek V4 quietly reset the cost curve again

At $0.14 input / $0.28 output per million tokens, DeepSeek V4 now matches GPT-5 Mini on quality while costing less than Gemini Flash.

DeepSeek V4 released with benchmarks that would have been frontier-tier six months ago — MMLU-Pro 85.4, HumanEval 91.5, SWE-Bench 54.1 — at pricing that is hard to reconcile with any reasonable GPU utilization model.

At $0.14 per million input tokens and $0.28 output, DeepSeek V4 is cheaper than Gemini Flash and outperforms it on most benchmarks we track. Whether this is sustainable depends on questions we can’t fully answer, but the market-clearing price for "Western frontier quality" has decisively moved.

The near-term implication for builders: if your stack is cost-sensitive and your use case doesn’t require vision, V4 deserves a real evaluation. The farther-term implication for Western labs: margin pressure is real and will not go away.

Byline

Inference Index

DeepSeek V4 quietly reset the cost curve again

More analysis stories

Why we think Llama 4 Scout is the best sub-8B deployment today