
This is one of the last bastions of anthropocentric thinking. I hope this will change in this century. I believe even plants are capable of communication. Everything that changes over time or space can be a signal. And most organisms can generate or detect signals. Which means they do communicate. The term “language” has traditionally been defined from an anthropocentric perspective, like many other definitions concerning the intellect (consciousness, reasoning, etc.).

That’s like a bird saying planes can’t fly because they don’t flap their wings.

LLMs use human language mainly because they need to communicate with humans. Their inputs and outputs are human language. But in between, they don’t think in human language.



> LLMs use human language mainly because they need to communicate with humans. Their inputs and outputs are human language. But in between, they don’t think in human language.

You seem to fundamentally misunderstand what LLMs are and how they work, honestly. Remove the human language from the model and you end up with nothing. That's the whole issue.

Your comment would only make sense if we had real artificial intelligence, but LLMs are quite literally working by predicting the next token, which works incredibly well as a facsimile of intelligence because there is an incredible amount of written content on the Internet that was written by intelligent people.


True, but a human child is taught a language. He doesn't come with it. It is an important part of how our brains form.


A human child who has not been taught literally anything can see some interesting item, extend a hand to it, touch it, interact with it. All decided by the child. Heck, even my cat can see a new toy, go to it and play with it, without any teaching.

LLMs can't initiate any task on their own, because they lack the thinking/intelligence part.


I'm not sure it's the lack of intelligence so much as that they aren't generally in a "snooze, look for something fun to do, snooze" loop like cats are.


This to me overstretches the definition of teaching. No, a human baby is not "taught" language, it learns it independently by taking cues from its environment. A child absolutely comes with an innate ability to recognize human sound and the capability to reproduce it.

By the time you get to active "teaching", the child has already learned language -- otherwise we'd have a chicken-and-egg problem, since we use language to teach language.


> but LLMs are quite literally working by predicting the next token, which works incredibly well as a facsimile of intelligence because there is an incredible amount of written content on the Internet that was written by intelligent people

An additional facet nobody ever seems to mention:

Human language is structured, and seems to follow similar base rules everywhere.

That is a huge boon to any statistical model trying to approximate it. That's why simpler forms of language generation are even possible. It's also a large part of why LLMs are able to do some code, but regularly fuck up the meaning when you aren't paying attention. The "shape" of code and language is really simple.
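To make "simpler forms of language generation" concrete, here is a minimal sketch of a bigram model: it counts which word follows which, then always emits the most frequent follower. The toy corpus and the names `train_bigram` and `generate` are made up for illustration, and this is nowhere near how an LLM is built; it only shows that regular structure lets even raw counting predict a plausible next token.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, how often each other word follows it."""
    words = text.split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, length=3):
    """Greedily append the most frequent follower of the last word."""
    out = [start]
    for _ in range(length - 1):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # the last word was never seen with a follower
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)

# Hypothetical toy corpus, chosen only for illustration.
corpus = "the cat sat on the mat and the cat sat down"
model = train_bigram(corpus)
print(generate(model, "the", length=3))  # prints "the cat sat"
```

The output follows the "shape" of the corpus without any notion of what a cat or a mat is, which is the same failure mode as code that looks right but means the wrong thing.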


How do we know animal language isn’t structured in similar ways? For example, we now know that “dark” birds are often colorful, just in the UV spectrum that they can see and we can’t. Similarly, there’s evidence dolphin and whale speech may be structured; we just don’t know the base rules. Their speech is modulated at such high frequencies that our computers, until maybe recently, would struggle to even record and process that data in real time (and probably still do).

Just because we don’t understand something doesn’t mean there’s nothing there.

Also, I’m not so sure human language is structured the same way globally. There are languages quite far from each other, and the similarities tend to be grouped by where the languages originated. E.g., Spanish and French might share similar rules, but those similarities are not shared with Hungarian or Chinese. There’s cross-pollination of course, and since language is old and humans all come from a single location, it’s not surprising for there to be some kinds of links. But even a few hundred thousand years of evolution have made the rules diverge significantly.


Transformers are also very powerful for non-language data, for example time series, sequences like DNA, or audio (including audio other than speech and music). Of course the vast amount of human text is key to training a typical LLM, but text is not the only use.


Well, you can explain to a plant in your room that E=mc² in a couple of sentences, but a plant can't explain to you how it experiences the world.

If cows were eating grass and conceptualising what infinity is, what their role in the universe is, how they were born, and what would happen after they are dead... we would see a lot of jumpy cows out there.


This is exactly what I mean by anthropocentric thinking. Plants talk about plant things and cows talk about cow issues. Maybe there are alien cows on some planet with larger brains that can do advanced physics in their moo language. Or some giant network of alien fungi discussing their existential crisis. Maybe ants talk about ant politics by moving their antennae. Maybe they vote and make decisions. Or bees talk about elaborate honey economics by modulating their buzz. Or maybe plants tell bees the best time for picking pollen by changing their colors and smell.

Words, after all, are just arbitrary ink shapes on paper. Or vibrations in air. Not fundamentally different from any other signal. Meaning is added only by the human brain.



