Tops Benchmark - Search News

Morning Overview on MSN

The newest Anthropic model just took the top spot on the Super-Agent benchmark — the only AI to finish every test case end-to-end and beat OpenAI’s GPT-5.5

Anthropic’s latest AI model has reportedly reached the top of the Super-Agent benchmark, a grueling test of whether an AI system can take a real-world code repository and run it from scratch without ...

Hosted on MSN

Qwen3.5-9B tops every AI benchmark right now, but that's not how you should pick a model

Qwen3.5-9B has been making waves in the AI enthusiast community, especially given that Alibaba's compact reasoning model outscored OpenAI's gpt-oss-120b on GPQA Diamond, MMLU-Pro, and MMMLU, all while ...

18d

Microsoft’s multi-agent AI system tops Anthropic’s Mythos on cybersecurity benchmark

Microsoft's new vulnerability-scanning system, codenamed MDASH, scored 88.45% on the CyberGym benchmark, surpassing single-model systems from Anthropic and OpenAI by using more than 100 specialized AI ...

Biometric Update

ROC tops NIST fingerprint identification benchmark

Biometric vendors are increasingly using NIST benchmark evaluations to demonstrate performance to government agencies and enterprise buyers evaluating ABIS.
