
OpenAI’s o3: AI Benchmark Discrepancy Reveals Gaps in Performance Claims
Published on April 22, 2025
The FrontierMath benchmark from Epoch AI tests generative models on difficult math problems. Find out how OpenAI’s o3 and other AI models performed.
