
GPT-5 Turbo vs DeepSeek R2

OpenAI vs DeepSeek: specs, benchmarks, and real per-task cost, all on one page.

Verdict

DeepSeek R2 is ~10.9× cheaper than GPT-5 Turbo on a 3:1 input:output token blend. DeepSeek R2 is open-weights; GPT-5 Turbo is proprietary.

OpenAI

OpenAI’s flagship unified model. Handles text, vision, and audio natively. The generalist benchmark champion.

Native multimodal · Strong math and science · Huge ecosystem

DeepSeek

DeepSeek’s reasoning variant. Competes with o4 on math at a fraction of the cost.

Frontier math at open-weights pricing · Long CoT traces

Pricing

Spec                   GPT-5 Turbo    DeepSeek R2
Input / 1M tokens      $6.00          $0.55
Output / 1M tokens     $24.00         $2.19
Context                400K           128K
Max output             32K            64K
License                Proprietary    Open weights
Released               2026-01-21     2026-01-15
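
The ~10.9× figure in the verdict can be reproduced from the input and output prices listed above. This is a minimal sketch, assuming the 3:1 blend means three input tokens for every output token, weighted by price. The function name `blended_price` is illustrative and not part of any API.

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for an input:output token mix."""
    total_parts = input_parts + output_parts
    return (input_parts * input_per_m + output_parts * output_per_m) / total_parts


# Prices in USD per 1M tokens, taken from the pricing rows above.
gpt5_turbo = blended_price(6.00, 24.00)   # (3 * 6.00 + 24.00) / 4
deepseek_r2 = blended_price(0.55, 2.19)   # (3 * 0.55 + 2.19) / 4

# How many times more expensive GPT-5 Turbo is on this blend.
ratio = gpt5_turbo / deepseek_r2

print(f"GPT-5 Turbo blended: ${gpt5_turbo:.2f} / 1M tokens")
print(f"DeepSeek R2 blended: ${deepseek_r2:.2f} / 1M tokens")
print(f"Ratio: {ratio:.1f}x")
```

Under that assumption the blended prices come out to $10.50 and $0.96 per 1M tokens, a ratio of about 10.9. A different input:output mix would shift the ratio slightly, since the two models' output-to-input price multiples are not identical.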

Benchmarks

Benchmark      GPT-5 Turbo    DeepSeek R2
LMSYS Elo      1398.0         —
MMLU Pro       91.2           87.8
HumanEval      93.0           90.1
SWE-Bench      65.8           —
MATH           91.5           94.1
GPQA           68.1           71.2
MMMU           78.8           —
IFEval         88.9           —

Per-task cost

Task                                       GPT-5 Turbo    DeepSeek R2
Summarize a 1-hour meeting transcript      $0.102         $0.0093
Review a 500-line pull request             $0.096         $0.0088
Answer a customer support ticket           $0.037         $0.0034
Extract structured data from a resume      $0.054         $0.0049
Debug a stack trace with context           $0.084         $0.0077

Per-call costs are computed from published token counts for each task. Real-world prompts vary.
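
Each per-task figure is input tokens times the input price plus output tokens times the output price. The sketch below shows that arithmetic. The token counts (15,000 input, 500 output) are assumed purely for illustration, since this page does not list the counts it uses. They reproduce the meeting-transcript row after rounding, but other splits would too. `PRICES_PER_M` and `task_cost` are hypothetical names.

```python
# (input, output) prices in USD per 1M tokens, from the pricing section.
PRICES_PER_M = {
    "GPT-5 Turbo": (6.00, 24.00),
    "DeepSeek R2": (0.55, 2.19),
}


def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one call with the given token counts."""
    input_price, output_price = PRICES_PER_M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical token counts for a meeting-transcript summary (an assumption,
# not the page's published figures).
INPUT_TOKENS = 15_000
OUTPUT_TOKENS = 500

for model in PRICES_PER_M:
    cost = task_cost(model, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{model}: ${cost:.4f}")
```

To estimate your own workload, swap in token counts measured from real prompts and responses.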
