Let me respond with an analogy of my own. Imagine you are a scientist on an alien world. The aliens primarily experience the world through magnetic fields. They live deep in the atmosphere of a hot-Jupiter-like planet, rarely touch anything, and have no eyes. Still, they are intelligent beings, and so they are quickly able to establish communication with you. A computer translates, and you both have to become a bit more familiar with each other's modes of perceiving the world. You could write a whole novel explaining this sort of difference in modes of perception, but my question is if you, the human, can learn to understand what it is to perceive magnetic fields? I think obviously the answer is yes. In fact, if you are to communicate, you'll have to. I think the sort of modal/sense difference your analogy plays on is similar, because for a human to get good at responding, you'd have to start knowing things about the symbols. That knowledge obviously wouldn't be grounded in a way that you could translate back into English. But you might, for example, learn that one word is a type of another, or that some words describe entities that are then referenced later, and, to actually get good at it (which it's not at all clear a human could), even that some entities have hidden state.
This feels related to the idea of the Chinese room. There I think the resolution is that the human following instructions does not understand Chinese but the room, the system of instructions + the human to follow them does. In a similar way obviously an individual neuron doesn't understand anything but brains do.
I guess it just feels like this general argument, that merely seeing things and making predictions that turn out to be right isn't enough to understand it will never go away. We could have a full fledged robot walking around having conversations and I could dispute its ability to really understand. It's just learned to imitate other humans, I'd say. It doesn't really know anything; it's just following a statistical model to decide how to move an arm.
> but my question is if you, the human, can learn to understand what it is to perceive magnetic fields? I think obviously the answer is yes.
I think it's obviously no, because we don't have sensations of magnetic fields. It's the question Thomas Nagel raised about what it's like to be a bat. The aliens can give us their words for conscious magnetic sensations, which we can learn to use, but we won't experience them. We're basically p-zombies when it comes to non-human experiences.
> There I think the resolution is that the human following instructions does not understand Chinese but the room, the system of instructions + the human to follow them does. In a similar way obviously an individual neuron doesn't understand anything but brains do.
Searle's response to the systems objection is that we already know that brains understand Chinese, but we don't know this for the room. I would further say that brains alone don't understand anything; humans understand things as language users embedded in a social and physical world. One can invoke Wittgenstein and language games here.
I agree with you. I really enjoy this idea that understanding and consciousness are emergent properties of a system, and that nothing restricts them to a system of any particular scope. In that light, the current approach most people take on this, picking an arbitrary selection of parts to see if it exhibits those properties, is not right at all.
An ion channel does not have even a tiny speck of consciousness, no matter how you organize them, but our brain does indeed need them to be conscious (and incidentally it relies on a whole lot more "stupid" parts than that: try being conscious without oxygen, or glucose).
I would go as far as making consciousness an emergent property of interaction with the environment: what does it mean to be conscious if nothing is there to confirm that you are indeed a singular consciousness? Is it possible to understand the concept of self if you have no concept of other beings?
> my question is if you, the human, can learn to understand what it is to perceive magnetic fields? I think obviously the answer is yes.
I certainly don't see that as obvious, and I would guess that while you can learn _about_ their perceptual mode, you can't learn what it is like to perceive magnetic fields just through talking about it. I would consider the Mary's Room thought experiment, and Nagel's paper "What Is It Like to Be a Bat?".
I think there's a relationship to the Chinese Room, but I want to be clear. In the original formulation, the person in the room follows a book of pre-provided instructions to produce a response. The LLM and person in the Thai text completion scenario must learn an equivalent set of instructions themselves, and for this I would claim that they are comparable to the human + book combination in the original Chinese Room. The person who learns to complete Thai text doesn't know what they're talking about, but they know more than the person following instructions in the Chinese Room. But clearly they still don't know what a Thai speaker knows.
> I guess it just feels like this general argument, that merely seeing things and making predictions that turn out to be right isn't enough to understand it will never go away. We could have a full fledged robot walking around having conversations and I could dispute its ability to really understand.
No, perhaps the end of my original statement didn't make this clear, but I think AI systems _can_ know things, and knowing is not a binary but part of a range. StabilityAI / DALL-E know quite a bit about the relationship between texts and images, and the structure within images -- but they _don't_ know about bodies, physical reality, etc. A system that has multiple modalities of perception, learns to physically navigate the world, interact with objects, make and execute plans by understanding the likely effects of actions, etc. -- knows and understands a lot. I'm not arguing about a hard limitation of AI; I'm arguing about a limitation of the way our current AIs are built and trained.
My intuition is that the difference between GP's analogy and the Chinese room is in the computing power of the system, in the sense of the Chomsky hierarchy[0] (as opposed to instructions per second).
In the Chinese room, the instructions you're given to manipulate symbols could be Turing-complete programs, and thus capable of processing arbitrary models of reality without you knowing about them. I have no problem accepting the "entire room" as a system understands Chinese.
In contrast, in GP's example, you're learning statistical patterns in a Thai corpus. You'll end up building some mental models of your own just to simplify things[1], but I doubt they'll "carve reality at the joints" - you'll overfit the patterns that reflect regularities of Thai society living and going about its business. This may be enough to bluff your way through an average conversation (much like ChatGPT does this successfully today), but you'll fail whenever the task requires you to use the kind of computational model your interlocutor uses.
Math and logic - the very tasks ChatGPT fails spectacularly at - are prime examples. Correctly understanding the language requires you to be able to interpret text like "two plus two equals" as a specific instance of "<number> <binary-operator> <number>"[2], and then execute it using learned abstract rules. This kind of factoring is closer to what we mean by understanding: you don't rely on surface-level token patterns, but match against higher-level concepts and models - Turing-complete programs - and factor the tokens accordingly.
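To make the "factor, then execute" idea concrete, here is a minimal Python sketch. Everything in it is my own illustration, not something any LLM actually does internally: the `NUMBERS` and `OPERATORS` tables and the `evaluate` function are hypothetical names, and the vocabulary is deliberately tiny.

```python
import operator

# Hypothetical toy vocabulary (an assumption for illustration only).
NUMBERS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
OPERATORS = {"plus": operator.add, "minus": operator.sub, "times": operator.mul}


def evaluate(text):
    """Match '<number> <binary-operator> <number> equals' and execute it."""
    tokens = text.lower().split()
    if len(tokens) != 4 or tokens[3] != "equals":
        raise ValueError(f"unexpected shape: {text!r}")
    left, op, right, _ = tokens
    if left not in NUMBERS or right not in NUMBERS or op not in OPERATORS:
        raise ValueError(f"unknown token in: {text!r}")
    # The answer comes from applying an abstract rule to the factored parts,
    # not from having seen this exact string before.
    return OPERATORS[op](NUMBERS[left], NUMBERS[right])


print(evaluate("two plus two equals"))
```

The point of the sketch is that adding a new number word to `NUMBERS` immediately works with every operator, which is exactly what a table of memorized surface strings would not give you.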
Then again, the Chinese room relies on the Chinese-understanding program being handed to you by some deity, while GP's example talks about building that program organically. The former is useful philosophically, the latter is something we can and do attempt in practice.
To complicate it further, I imagine the person in GP's example could learn the correct higher-level models given enough data, because at the center of it sits a modern, educated human being, capable of generating complex hypotheses[3]. Large Language Models, to my understanding, are not capable of it. They're not designed for it, and I'm not sure if we know a way to approach the problem correctly[4]. LLMs as a class may be Turing-complete, but any particular instance likely isn't.
In the end, it's all getting into fuzzy and uncertain territory for me, because we're hitting the "how the algorithm feels from inside" problem here[5] - the things I consider important to understanding may just be statistical artifacts. And long before LLMs became a thing, I realized that both my internal monologue and the way I talk (and how others seem to speak) are best described as a Markov chain producing strings of thoughts/words that are then quickly evaluated and either discarded or allowed to be grown further.
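For what it's worth, that "Markov chain plus quick evaluation" picture can be sketched in a few lines of Python. This is only a toy under my own assumptions: `build_chain`, `generate`, and the `keep` callback are hypothetical names, and the "evaluation" is just an arbitrary filter standing in for whatever judgment actually happens in a head.

```python
import random
from collections import defaultdict


def build_chain(words):
    """First-order Markov chain: map each word to the words seen after it."""
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, max_len, keep, rng, max_steps=1000):
    """Grow a string word by word, discarding extensions that `keep` rejects."""
    thought = [start]
    for _ in range(max_steps):
        if len(thought) >= max_len:
            break
        options = chain.get(thought[-1])
        if not options:
            break  # nothing ever followed this word in the corpus
        candidate = thought + [rng.choice(options)]
        if keep(candidate):
            thought = candidate  # allowed to grow further
        # otherwise the extension is discarded and another one is proposed
    return thought


corpus = "the cat sat on the mat and the cat ran".split()
chain = build_chain(corpus)
result = generate(chain, "the", 4, lambda c: c[-1] != "mat", random.Random(0))
print(" ".join(result))
```

The generator itself knows nothing beyond which word followed which; all of the apparent sense comes from the filter deciding what survives.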
[1] - On that note, I have a somewhat strong intuitive belief that learning and compression are fundamentally the same thing.
[2] - I'm simplifying a bit for the sake of example, but then again, generalizing too much won't be helpful, because most people only have a procedural understanding of a few of the most common mathematical objects, such as real numbers and addition, instead of a more theoretical understanding of algebra.
[3] - And, of course, exploit the fact that human languages and human societies are very similar to each other.
[4] - Though taking a code-generating LLM and looping it on itself, in order to iteratively self-improve, sounds like a potential starting point. It's effectively genetic programming, but with a twist that your starting point is a large model that already embeds some implicit understanding of reality, by virtue of being trained on text produced by people.
> I have no problem accepting the "entire room" as a system understands Chinese.
> you'll fail whenever the task requires you to use the kind of computational model your interlocutor uses.
I think it's important to distinguish between knowing the language and knowing anything about the stuff being discussed in the language. The top level comment all this is under mentioned knowing what a bag is or what popcorn is. These don't require computational complexity, but do require some other data than just text, and a model that can relate multiple kinds of input.