Llama 3 8B captures that 'magic' fairly well and runs on a modest gaming PC. You can even run it on an iPhone 15 if you're willing to sacrifice floating point precision. Three years from now I fully expect GPT-4 quality models running locally on an iPhone.
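That "sacrifice floating point precision" trade is quantization: store the weights in fewer bits and rescale on the fly. Here's a minimal numpy sketch of symmetric int8 quantization, just to illustrate the memory/accuracy trade-off (not how any particular inference stack like llama.cpp actually does it; real schemes quantize per-block, not per-tensor):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: int8 weights plus one fp scale."""
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Reconstruct approximate float32 weights at matmul time.
    return q.astype(np.float32) * scale

# A fake 4096x4096 weight matrix standing in for one transformer layer.
w = np.random.randn(4096, 4096).astype(np.float32)
q, scale = quantize_int8(w)

# 4x smaller in memory: 1 byte per weight instead of 4...
assert q.nbytes * 4 == w.nbytes
# ...at the cost of a bounded rounding error (at most half a quantization step).
max_err = np.abs(dequantize(q, scale) - w).max()
assert max_err <= scale / 2 + 1e-6
```

An 8B-parameter model drops from ~16 GB at fp16 to ~4-5 GB at 4-bit, which is what makes phone-class hardware plausible at all.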
Three years is more than twice the time between GPT-4's release and now, and almost twice as long as ChatGPT has existed. At this rate, even if we end up with GPT-4 equivalents runnable on consumer hardware, the top models the big players offer via API will make local LLMs feel useless. For the time being, the incentive to use a service will remain.
It's like a graphic designer being limited to choosing between local MS Paint and Adobe Creative Cloud. Okay, so Llama 3 8B, if it's really as good as you say, graduates to local Paint.NET. Not useless per se, but still not in the same class.
No one knows how it will all shake out. I'm personally skeptical scaling laws will hold beyond GPT-4-sized models. GPT-4 is likely severely undertrained given how much data Facebook is using to train their 8B parameter models. Unless OpenAI has a dramatic new algorithmic discovery or a vast trove of previously unused data, I think GPT-5 and beyond will be modest improvements.
Alternatively, synthetic data might drive the next generation of models, but that approach is largely untested at this point.
I know this isn’t really the point, but Adobe CC hasn’t improved all that much over Adobe CS, which was purely local and perfectly capable. A better analogy might be comparing Encyclopedia Britannica to Wikipedia. The latter is far from perfect, but it's an astounding expansion of accessible human knowledge that represents a full, worldwide paradigm shift in how such information is maintained, distributed, and accessed.
By the same token, those of us who are sufficiently motivated can maintain and use a local copy of Wikipedia…frequently for training LLMs at this point, so I guess the snake has come around, and we’ve settled into a full-on ouroboros of digital media hype. ;-)
That's extremely pessimistic: three years is 200% of how long ChatGPT 3.5 has even existed.
Llama 3 8B is at ChatGPT 3.5 level (which launched 18 months before Llama 3), and it runs on every new iPhone released since October 2022 (19 months before Llama 3). That includes multimodal variants (built outside Facebook).