GPT-5 Turbo vs Gemini 2.5 Ultra

OpenAI vs Google DeepMind. Specs, benchmarks, and real per-task cost, all on one page.

Verdict

GPT-5 Turbo leads on LMSYS ELO (1398 vs 1385).

OpenAI

OpenAI’s flagship unified model. Handles text, vision, and audio natively. The generalist benchmark champion.

Native multimodal · Strong math and science · Huge ecosystem

Google DeepMind

2M-token context, native video understanding, and Google’s deepest multimodal stack. The long-context king.

2M context window · Native video · Best multimodal reasoning

Pricing

Spec | GPT-5 Turbo | Gemini 2.5 Ultra
Input / 1M tokens | $6.00 | $7.00
Output / 1M tokens | $24.00 | $21.00
Context | 400K | 2.0M
Max output | 32K | 64K
License | Proprietary | Proprietary
Released | 2026-01-21 | 2026-02-05

Benchmarks

Benchmark | GPT-5 Turbo | Gemini 2.5 Ultra
LMSYS ELO | 1398.0 | 1385.0
MMLU Pro | 91.2 | 90.4
HumanEval | 93.0 | 90.2
SWE-Bench | 65.8 | 58.3
MATH | 91.5 | 92.0
GPQA | 68.1 | n/a
MMMU | 78.8 | 82.1
IFEval | 88.9 | 89.7

Per-task cost

Task | GPT-5 Turbo | Gemini 2.5 Ultra
Summarize a 1-hour meeting transcript | $0.102 | $0.115
Review a 500-line pull request | $0.096 | $0.098
Answer a customer support ticket | $0.037 | $0.038
Extract structured data from a resume | $0.054 | $0.056
Debug a stack trace with context | $0.084 | $0.084

Per-call cost is computed from the published token counts for each task. Real-world prompts vary.
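The per-call figures follow from the per-million-token prices in the Pricing section. The sketch below shows that arithmetic under stated assumptions: the page does not list the token counts behind each task, so the 6,000 input and 2,000 output tokens used here are hypothetical, and the names Pricing and call_cost are illustrative rather than part of any vendor SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD price per one million tokens."""
    input_per_million: float
    output_per_million: float


# Prices copied from the Pricing section of this page.
GPT_5_TURBO = Pricing(input_per_million=6.00, output_per_million=24.00)
GEMINI_25_ULTRA = Pricing(input_per_million=7.00, output_per_million=21.00)


def call_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one call: tokens times price, scaled from per-million."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Hypothetical token counts; the page does not publish the real ones.
tokens_in, tokens_out = 6_000, 2_000

for name, pricing in [("GPT-5 Turbo", GPT_5_TURBO),
                      ("Gemini 2.5 Ultra", GEMINI_25_ULTRA)]:
    print(f"{name}: ${call_cost(pricing, tokens_in, tokens_out):.3f}")
```

With these assumed counts both models come out at $0.084, because the two price lists break even at a 3:1 input-to-output ratio (6i + 24o = 7i + 21o gives i = 3o). Calls that are more input-heavy than that cost less on GPT-5 Turbo, and more output-heavy calls cost less on Gemini 2.5 Ultra.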
