I really appreciate his iconoclasm right now, but every time I engage with his ideas I come away feeling short-changed. I'm always like "there is no such thing as outside the training data". What's inside and what's outside the training data is at least as ill-defined as "what is AGI".
ChatGPT 3 was the first AI that could do 100,000 different things poorly. Before that we only had AIs that could do a few things decently, or very well. So yeah, I'm sticking with "baby AGI" because of the "G".
I don't have an opinion on whether ChatGPT qualifies as AGI. What I'm saying is where one stands on that question has nothing to do with "why it became so popular so fast."
(Also, several machine-learning techniques could do millions of things terribly before LLMs. GPT does them, and other things, less poorly. It's a broadening. But I suppose really any intelligence of any kind can be considered a "baby" AGI.)
The "ChatGPT" web app launched with GPT-3.5 as its underlying model.
The predecessor models, a whole series of them collectively called "GPT-3" but sold via API under names like "davinci" and "ada", were barely noticed outside AI research circles.
GPT-3 was useful, but you had to treat it as a text-completion system, not a chat interface. Your prompt would have been something like:
    Press release
    Subject: President announces imminent asteroid impact, evacuation of Florida
    My fellow Americans,
Because if you didn't put "My fellow Americans," in there, it would then suggest a bunch of other press release subjects.
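The completion trick above can be sketched in a few lines of Python (a minimal illustration; the function name is made up, and the API remark refers to OpenAI's legacy, now-deprecated Completions endpoint rather than the chat API):

```python
# Sketch of a GPT-3-era completion-style prompt. The model just continues
# text, so the prompt must end exactly where you want the continuation
# to begin. (Hypothetical helper, not a real library function.)
def build_completion_prompt() -> str:
    # Ending on "My fellow Americans," cues the model to write the body
    # of the press release. Without that final line, the model tends to
    # continue the pattern it sees instead, i.e. list more subjects.
    return (
        "Press release\n"
        "Subject: President announces imminent asteroid impact, "
        "evacuation of Florida\n"
        "My fellow Americans,"
    )

prompt = build_completion_prompt()
# With the legacy Completions API you would send this string as the
# `prompt` parameter, typically with a stop sequence (e.g. "\n\n") so
# generation halts at the end of the release body.
print(prompt)
```

The key design point is that there is no "user turn" or "assistant turn" anywhere: the interface is a single text continuation, and all the steering comes from how the prompt is shaped.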