METR couldn't repeat its AI coding study because devs refused to work without AI. Amazon shut down its token leaderboard. Uber blew its AI budget in four months.
Morning Overview on MSN
The newest Anthropic model just took the top spot on the Super-Agent benchmark — the only AI to finish every test case end-to-end and beat OpenAI’s GPT-5.5
Anthropic’s latest AI model has reportedly reached the top of the Super-Agent benchmark, a grueling test of whether an AI system can take a real-world code repository and run it from scratch without ...