Benchmarks optimize for fundraising, not users. The gap between "state of the ar...

yawnxyz · 2025-11-08T16:41:24 1762620084

we try to make benchmarks for users, but it's like that 20% article - different people want a different 20%, and you just end up adding "features" and playing whack-a-mole with the different kinds of 20%

if a single benchmark could be a universal truth, and it was easy to figure out how to build it, everyone would love that... but it isn't, and that's why we're in the state we're in right now

DrewADesign · 2025-11-08T18:59:39 1762628379

The problem isn’t with the benchmarks (or the models, for that matter); it’s their being used to prop up indefensible product marketing claims made by people frantically justifying their requests for more dump trucks of thousand-dollar bills to replace the ones they just burned through in a few months.

yawnxyz · 2025-11-12T05:04:28 1762923868

unfortunately as benchmark makers we can't really do anything about human nature :shrug:

DrewADesign · 2025-11-12T14:54:00 1762959240

Absolutely not. This is not a problem with any part of the engineering process. Nearly everything wrong with the AI business can be laid at the feet of product managers, marketing, the C-suite crowd, etc.