Hacker News

>All of our tests and benchmarks account for repeatability.

What does repeatability have to do with intelligence? If I ask a 6 year old "Is 1+1=2" I don't change my estimation of their intelligence the 400th time they answer correctly.

>The machine in question has no problem replicating its results on whatever test

What machine is that? All the LLMs I have tried produce neat results on very narrow topics but fail on consistency and generality. Which seems like something you would want in a general intelligence.



>What does repeatability have to do with intelligence? If I ask a 6 year old "Is 1+1=2" I don't change my estimation of their intelligence the 400th time they answer correctly.

If your 6 year old can only answer correctly a few times out of that 400 and you don't change your estimation of their understanding of arithmetic, then I sure hope you are not a teacher.

>What machine is that? All the LLMs I have tried produce neat results on very narrow topics but fail on consistency and generality. Which seems like something you would want in a general intelligence.

No LLM will score 80% on benchmark x today, then 50% on the same benchmark 2 days later. That doesn't happen, so the convoluted setup OP had is meaningless. LLMs do not 'fail' on consistency or generality.



