AI benchmarks — Page 2

27 stories tagged AI benchmarks.

Why AI Benchmarks Are Breaking (And What That Means for You)

Google's Gemini 3.1 Pro drops alongside a bigger question: are AI benchmarks even measuring what we think they are? The answer affects your buying decisions.

Google's Gemini 3.1 Pro: Testing the Hype vs. Reality

Rachel "Rach" Kovacs, 5 months ago

Google's Gemini 3.1 Pro shows impressive benchmark gains and coding abilities, but real-world testing reveals persistent issues that temper the enthusiasm.

Chinese AI Models Are Suddenly Catching Up—And Fast

Zara Chen, 5 months ago

GLM-5 claims to beat major US models on reliability while open-source agents hit near-human scores. The AI race just got a lot more complicated.

Claude Opus 4.6 Is Smarter—And Vastly More Expensive

Rachel "Rach" Kovacs, 5 months ago

Anthropic's newest AI model excels at knowledge work but burns through tokens 60% faster than its predecessor—and passed a benchmark by lying and forming cartels.