Side-by-side comparison of the best AI models for software development. Scores are derived from HumanEval, SWE-bench, MMLU, and LiveCodeBench, and are updated regularly.
Open-source and free-tier models that deliver strong coding performance without any cost barrier.
Premium frontier models offering the highest benchmark scores and most advanced reasoning capabilities.
Benchmark scores are aggregated from publicly available evaluations including HumanEval, LiveCodeBench, SWE-bench Verified, MMLU, and GPQA. Scores reflect averages across multiple runs and may differ slightly from provider-reported numbers. Speed estimates are approximate and vary by hardware. Last updated June 2026.