Interactive Tool
Updated March 2026

AI Model Value Map

Which models give you the best quality for the price? This scatter plot maps every major model by cost vs. benchmark quality. Models in the top-left are the best value. Click any model to compare.

[Interactive scatter plot: x-axis is Cost per 1M Tokens (USD), y-axis is LMSYS Elo Rating, with a shaded "Best Value Zone" in the top-left. Controls: Filter, Cost axis, Comparing. Table view columns: #, Model, Provider, Input/1M, Output/1M, Blended/1M, Elo, Context, Value Score. Highlight badges: Best Budget Pick, Best Value (Elo/Cost), Top Quality. A Quick Compare panel sits below the chart.]

How to read this chart

Each dot represents an AI model. The horizontal position shows its cost per 1M tokens, and the vertical position shows its LMSYS Elo rating (a crowd-sourced quality benchmark). The dashed line traces the "efficiency frontier" — the best quality available at each price point.

Models in the top-left corner offer the best value — high quality at low cost. Click any two models to compare them side by side, or switch to table view to sort by any column including the composite Value Score.
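The efficiency frontier described above can be sketched as a simple Pareto scan: sort models by blended cost ascending, then keep each model that scores a strictly higher Elo than every cheaper model. This is a minimal illustration, not the chart's actual implementation, and the model names and numbers below are made-up placeholders, not real pricing or benchmark data.

```python
def efficiency_frontier(models):
    """Return the models on the cost/quality frontier.

    `models` is a list of (name, blended_cost, elo) tuples. A model is on
    the frontier if no cheaper model has an equal or higher Elo.
    """
    frontier = []
    best_elo = float("-inf")
    for name, cost, elo in sorted(models, key=lambda m: m[1]):
        if elo > best_elo:  # strictly better than everything cheaper
            frontier.append((name, cost, elo))
            best_elo = elo
    return frontier


# Illustrative (made-up) data points:
models = [
    ("budget-a", 0.10, 1250),
    ("mid-b",    0.40, 1320),
    ("mid-c",    0.50, 1300),  # dominated: costs more than mid-b, scores lower
    ("flagship", 3.00, 1400),
]
print(efficiency_frontier(models))
# → [('budget-a', 0.1, 1250), ('mid-b', 0.4, 1320), ('flagship', 3.0, 1400)]
```

Note that "mid-c" drops out: a cheaper model already beats its Elo, so it can never be the best pick at its price point.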

Frequently Asked Questions

What is the cheapest AI model with good quality?
As of March 2026, Gemini 2.5 Flash and Grok 4.1 Fast offer the best budget-to-quality ratio, both under $0.40/1M blended tokens while scoring above 1300 Elo. For even cheaper options, Llama 3.3 70B costs $0.10/1M but scores lower on benchmarks. Use the chart above to filter by your budget range.
What does "blended cost" mean?
Blended cost is the average of input and output token prices: (input + output) / 2. This gives a rough estimate of cost for typical workloads with a 50/50 input-to-output ratio. You can switch the chart to show input-only or output-only pricing using the "Cost axis" dropdown.
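In code, the blended-cost formula is a one-liner (the prices in the example are hypothetical, not taken from any real model):

```python
def blended_cost(input_per_1m, output_per_1m):
    """50/50 average of input and output token prices per 1M tokens."""
    return (input_per_1m + output_per_1m) / 2


# Hypothetical model priced at $0.30 in / $0.50 out per 1M tokens:
print(blended_cost(0.30, 0.50))  # → 0.4
```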
What is the LMSYS Elo rating?
LMSYS Chatbot Arena uses an Elo rating system (similar to chess) based on blind, head-to-head comparisons voted on by real users. A higher Elo means the model is preferred more often in direct comparisons. It's one of the most trusted crowd-sourced quality benchmarks for LLMs.
How is the Value Score calculated?
The Value Score is Elo rating divided by blended cost per 1M tokens. A higher score means more quality per dollar. This metric helps identify models that punch above their weight in quality relative to their pricing tier.
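A minimal sketch of that ratio, using hypothetical numbers (a 1320-Elo model at $0.40/1M blended):

```python
def value_score(elo, blended_cost_per_1m):
    """Elo rating per dollar of blended cost; higher = more quality per dollar."""
    return elo / blended_cost_per_1m


print(value_score(1320, 0.40))  # → 3300.0
```

Because cost sits in the denominator, very cheap models dominate this metric, which is why the chart pairs it with the raw Elo axis rather than relying on it alone.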