>All of our tests and benchmarks account for repeatability. What does repeatabil...

famouswaffles · 2025-10-27T15:50:08

>What does repeatability have to do with intelligence? If I ask a 6 year old "Is 1+1=2" I don't change my estimation of their intelligence the 400th time they answer correctly.

If your 6-year-old can only answer correctly a few times out of those 400 and you don't change your estimation of their understanding of arithmetic, then I sure hope you are not a teacher.

>What machine is that? All the LLMs I have tried produce neat results on very narrow topics but fail on consistency and generality. Which seems like something you would want in a general intelligence.

No LLM will score 80% on benchmark X today and then 50% on the same benchmark 2 days later. That doesn't happen, so the convoluted setup OP had is meaningless. LLMs do not 'fail' on consistency or generality.