
>We already had a definition of AGI. We hit it.

The Turing test.



As far as I can tell, passing the Turing test has never been the majority-consensus definition of AGI. It seems to me that the Turing test has fundamentally always been about proving a negative: if something fails the Turing test, it's probably not AGI.

For reference, the term AGI post-dates the Turing test by half a century. I also don't personally remember ever hearing the exact term "artificial general intelligence" prior to 2023 or 2024, or at least it wasn't mainstream the way it is today.

If AGI had truly ever been defined by the Turing test, then Cleverbot should've been hailed as AGI when it passed the test in 2011. Even if we did all agree to call it that, we'd still need some other term for what we actually mean when we say "AGI" today. Cleverbot-era chatbots were cute toys, but they weren't capable of doing useful work of any kind.


That’s not accurate. The Turing test was always intended as a benchmark for general intelligence. Turing’s 1950 paper explicitly proposed it as a way to operationalize the question “Can machines think?” not as a parlor trick about conversation but as a proxy for indistinguishability in intellectual behavior. The whole point of the imitation game was to sidestep metaphysical arguments and reduce intelligence to functional equivalence. If a machine could consistently hold its own in unrestricted dialogue, it would demonstrate the breadth, adaptability, and contextual understanding that characterize general intelligence.

The term AGI may have come later, but the concept it represents traces directly back to Turing’s framing. When early AI researchers talked about “strong AI” or “thinking machines,” they were using the same conceptual lineage. The introduction of the acronym doesn’t rewrite that history, it just gave a modern label to an old idea. The Turing test was never meant to detect a “negative” but to give a concrete, falsifiable threshold for when positive claims of general intelligence might be justified.

As for Cleverbot, it never truly passed the test in any rigorous or statistically sound sense. Those 2011 headlines were based on short exchanges with untrained judges and no control group. Passing a genuine Turing test requires sustained coherence, reasoning across domains, and the ability to handle novel input gracefully. Cleverbot couldn't do any of that. It failed the spirit of the test even if it satisfied the letter by tricking a few people.

By contrast, modern large language models can pass the Turing test with flying colors. They can maintain long, open-ended conversations, reason about complex subjects, translate, summarize, and solve problems across many domains. Most human judges would be unable to tell them apart from people in text conversation, not for a few sentences but for hours. Granted, one can often tell ChatGPT is an AI because of its long and overly descriptive replies, but that's a stylistic artifact, not a limitation of intelligence. The remarkable thing is that you can simply instruct it to imitate casual human conversation, and it will do so convincingly, adjusting tone, rhythm, and vocabulary on command. In other words, the test can be passed both intentionally and effortlessly. The Turing test was never obsolete; we finally built systems that can truly meet it.


I can definitely see the case for that. Ultimately, we're going to need vocabulary for all of the following:

* >=GPT-3.5-level intelligence

* AI that replaces an ordinary human for knowledge work

* AI that replaces an ordinary human for all work (given sufficiently capable hardware)

* AI that replaces any human for knowledge work

* AI that replaces any human for all work (given sufficiently capable hardware)

It doesn't really matter to me which of those we call "AGI" as long as we're consistent. One of them may be AGI, but all of them are important milestones.


The Turing test was never a test of thinking: Turing said that thinking was difficult to define and so he decided to "replace the question by another, which is closely related to it" (I disagree with him there) "and is expressed in relatively unambiguous words," i.e. the question of whether a chatbot can fool a text-only observer into thinking it's human.

Clearly, current LLMs have passed the Turing test, as evidenced by the difficulty many schools have in enforcing "do not use LLMs to do your homework" rules. But even Turing didn't say his test was a test of intelligence, just a test "closely related" to intelligence. And if he had seen today's LLMs, I think he would have revised that opinion, because today's LLMs generate text with no underlying fact model, no fundamental understanding of the truth behind the words they're saying (no understanding, even, of the concepts of truth or falsehood). I think today's LLMs have demonstrated that being able to string words together in coherent sentences is not "closely related" to intelligence at all.



