> the current method for training requires this volume of data
This is one of those things that signals how dumb this technology still is - or maybe how smart humans are compared to machines. A human brain doesn't need anywhere close to this volume of data to produce good output.
I remember talking with friends 30 years ago about how it was inevitable that the brain would eventually be fully implemented as a machine once computing power grew large enough; but it looks like we're still very far from that.
> A human brain doesn't need anywhere close to this volume of data to produce good output.
Maybe not directly, but consider that our brains are the product of millions of years of evolution and aren't a blank slate when we're born. Even though babies can't speak a language at birth, they already have all the neural connections in place to acquire and manipulate language, and require just a few years of "supervised fine tuning" to learn the actual language.
LLMs, on the other hand, start with their weights at random values and need to catch up with those millions of years of evolution first.
Add to this that the brain is constantly processing raw sensory data from the moment it becomes viable, even when the body is "sleeping". It's using orders of magnitude more data than any model in existence at every moment, yet it isn't generally deemed "intelligent" enough until it's around 18 years old.
It’s unlikely that sensory data contributes to cognitive ability in humans. People with sensory impairments, such as blind people, are not less cognitively capable than people without sensory impairments. Think of Helen Keller, who, despite taking in far less sensory information than the average person, was still more intelligent than average.
Without sensory data there cannot be actual cognitive ability, though there may be potential for it. The data doesn't have to be visual; bear in mind we have five senses. When vision is impaired, hearing becomes far more sensitive to compensate. And theoretically, if someone only had use of a single sense, they might still be able to use the data from it to actualize their cognition, but it would take a lot more effort and there would be large gaps in capability. Just as, technically, preprocessed vision* is the primary "sense" of LLMs.
* Preprocessed, since the data is actually 1D streams of characters, not 2D colour points (as with vision models).
sadly, those weights will not be inherited the way a baby inherits its parents' wiring. They'll be cooped up until the company dies, and that data probably dies with it. No wonder LLMs have allegedly hit some stalls already.
> A human brain doesn't need anywhere close to this volume of data to produce good output.
> I remember talking with friends 30 years ago
I'd say you're pretty old. How many years of training did it take for you to start producing good output?
The lesson here is that we're kind of meta-trained: our minds are primed to pick up new things quickly by abstracting them and relating them to things we already know. We work in concepts and mental models rather than text. LLMs are incredibly weak by comparison. They only understand token sequences.
That's the point, I think. It should be possible to require orders of magnitude less data to create an intelligence, and we are far from achieving that (including achieving AGI in the first place, even with those huge amounts of data).
My point is that it took a very large amount of data for a human to be able to "produce good output". Once it had, though, its performance was of a different stratum.
We are unbelievably far from that. Everyone who tells you that we're within 20 years of emulating brains and says stuff like "the human brain only runs at 100 hertz!" has either been conned by a futurist or is in denial of their own mortality.
Absolutely! But the question is whether the next step-change in intelligence is just around the corner (in which case this legal speedbump might spur innovation), or whether the next revolution will take a while.
There's enough money in the market to fund a lot of research into totally novel underlying methods. But if it takes too long, investors and lawmakers will just move to make what already works legal, because it is useful.
> I remember talking with friends 30 years ago about how it was inevitable that the brain would eventually be fully implemented as a machine once computing power grew large enough; but it looks like we're still very far from that.
Why would it be?
"It's inevitable that the Burj Khalifa gets built, once steel production gets high enough."
"It's inevitable that Pegasuses will be bred from horses, as soon as somebody collects enough oats."
Reducing intelligence to the bulk aggregate of brute "calculation power" is... ironically, missing the point of intelligence.