Math 500 - Search News

Hosted on MSN

AI is actually bad at math, ORCA shows

ORCA benchmark trips up ChatGPT-5, Gemini 2.5 Flash, Claude Sonnet 4.5, Grok 4, and DeepSeek V3.2

In the world of George Orwell's 1984, two and two make five. And large language models are not much ...

