This was posted from another source yesterday; like similar work, it anthropomorphizes ML models and describes an interesting behaviour, but (because we literally know how LLMs work) nothing related to consciousness or sentience or thought.
> (because we literally know how LLMs work) nothing related to consciousness or sentience or thought.
1. Do we literally know how LLMs work? We know how cars work, and that's why an automotive engineer can tell you what every piece of a car does, what will happen if you modify it, and what it will do in untested scenarios. But if you ask an ML engineer what a particular weight (or neuron, or layer) in an LLM does, what would happen if you fiddled with its values, or what the model will do in an untested scenario, they won't be able to tell you (see the sketch after this list).
2. We don't know how consciousness, sentience, or thought works. So it's not clear how we would confidently say any particular discovery is unrelated to them.
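
A minimal sketch of point 1, assuming PyTorch and Hugging Face transformers are installed and using GPT-2 as a small stand-in model; the layer and weight index perturbed are arbitrary choices for illustration, not anything from the article.

```python
# Perturb a single weight in a small pretrained LLM and compare outputs.
# Nothing about the weight's value tells you in advance what will change.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

prompt = "The capital of France is"
ids = tok(prompt, return_tensors="pt").input_ids

def top_tokens(m, n=5):
    # Top-n next-token predictions for the prompt.
    with torch.no_grad():
        logits = m(ids).logits[0, -1]
    return [tok.decode(int(i)) for i in logits.topk(n).indices]

print("before:", top_tokens(model))

# Nudge one arbitrary weight in one MLP layer. Whether (and how) this
# changes the predictions is something you can only find out by running it.
with torch.no_grad():
    model.transformer.h[6].mlp.c_fc.weight[100, 200] += 5.0

print("after: ", top_tokens(model))
```

Running it tells you whether the top predictions change, but nothing about the weight itself told you beforehand, which is the interpretability gap point 1 is about.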
We don't know how LLMs work. We create them in a process that's sort of like having a rock tumbler that, if you put in watch parts, turns out a fully assembled watch.
It would be very impressive if someone showed you one of those, but if they then told you their theory of how it works, you probably shouldn't believe them.
Down towards the end they actually say it has nothing to do with consciousness. They do say it might lead to models being more transparent and reliable.
My comment from yesterday - the questions might be answered in the current article: https://news.ycombinator.com/item?id=45765026