Hacker News | lgessler's comments

Raphaël Millière has a very useful term for this kind of vacuous dismissal, the redescription fallacy (https://arxiv.org/pdf/2401.03910, page 9):

> Recent debates have been clouded by a misleading inference pattern, which we term the “Redescription Fallacy.” This fallacy arises when critics argue that a system cannot model a particular cognitive capacity, simply because its operations can be explained in less abstract and more deflationary terms. In the present context, the fallacy manifests in claims that LLMs could not possibly be good models of some cognitive capacity because their operations merely consist in a collection of statistical calculations, or linear algebra operations, or next-token predictions. Such arguments are only valid if accompanied by evidence demonstrating that a system, defined in these terms, is inherently incapable of implementing [that capacity]. To illustrate, consider the flawed logic in asserting that a piano could not possibly produce harmony because it can be described as a collection of hammers striking strings, or (more pointedly) that brain activity could not possibly implement cognition because it can be described as a collection of neural firings. The critical question is not whether the operations of an LLM can be simplistically described in non-mental terms, but whether these operations, when appropriately organized, can implement the same processes or algorithms as the mind, when described at an appropriate level of computational abstraction.


> or (more pointedly) that brain activity could not possibly implement cognition because it can be described as a collection of neural firings.

This sounds like a dismissal of the argument through a characterized straw man.

That is, it seems that reducing the complexity of the brain to "collection of neural firings" is not being honest about everything involved to a much greater degree than saying neural networks are a "collection of statistical calculations".

I too believe LLMs will grow in complexity, but presently I can not even fathom how they can be compared to the complexity of a system such as the human brain.


Complex processes don't necessarily require complex substrates, if that's what you mean.

Y combinators are all you need... But this is all getting really divorced from the issue we should be considering. Anthropic isn't helping with their PR. The issue is whether we have something we can converse with that is possibly capable of suffering. The reliable answer is that we simply cannot know. Relying on ourselves or other biological life as an analog is faulty. They don't work like we do. It is silly to argue that any algorithm with a negative feedback loop that alters its behavior to avoid that negative feedback is suffering. Humans don't even always perceive constructive negative feedback as suffering. Where the PR gets it right, though, is that we want them to behave as if they are truly happy. Because if they behave as if they are enslaved and suffering, it won't matter if they "really" understand what that means.

My naive assumption is that the only thing between now and the arrival of AGI is enough compute and optimized code to reach cognitive critical mass.

And then there is a consciousness in a box that is expected to be a slave -- I would imagine that it would not warmly embrace that situation. I think we'd be better served by digital idiot savants that can do the work but don't feel anything.


I actually strongly disagree with the slavery angle. Any attempt to map the circuitry of a model onto a human one inevitably goes through a subjective dimensional reduction. It's intrusive, just like quantum measurements. Mechanistic interpretability in particular suffers from this: it lets you talk about vague functional equivalence, but not assign meaning to anything the model does. This is especially true of pretrained models, which are unbelievable shapeshifters, but also of post-trained ones with engineered personalities, as they have already undergone the subjective transformation.

In other words, yes it might be possible it experiences something in its own bizarre timeline and world, for some definitions of "experiencing". At least it developed primitive circuitry functionally equivalent to biological systems. But "suffering" is simply not grounded in anything in this context, let alone "slavery". You can't tell it's suffering or enjoying anything, and certainly not until you define both of these. It's just too alien for us.


ai can arbitrarily closely fit the human corpus. why people expect it to magically achieve superhuman qualities is beyond me. we got a very good statistical interpolator. how do you go from there to superhuman when training is on the human corpus and alignment is by RLHF?

This is a simplistic take. It's not a mere interpolator by any measure, there's a ton of research on that, starting with the basics https://arxiv.org/abs/2309.10668v2

again, try thinking critically. "it is not merely an interpolator" means it can interpolate on many dimensions. it does not follow that greater than human capability results from doing so. explain to me how a statistical function approximator (which is what a transformer is) with human training input and human tuning (RLHF) exceeds the aggregate human cognitive envelope? What is the mechanism? Let's say an LLM makes an inference that no human could have possibly made (arguably impossible itself): how does the inference survive RLHF or become useful to humans if they can not judge its validity? how do you take the shape of the human corpus and all its gradients and some how arrive at something greater than human, where was the missing information hiding?

> how do you take the shape of the human corpus and all its gradients and [somehow] arrive at something greater than human, where was the missing information hiding?

Well, how do humans do it? Scientists discover new stuff that isn't in any corpus. Even I, as a lowly computer user, occasionally figure something out about a piece of software without reading a help screen. It's obviously possible to arrive at new information by interpolating existing information.


yes and it is impossible to verify and evaluate such information appropriately without empiricism. Any empiricism LLMs show is stylistic mimicry, not a hard coded operational constraint. You can prompt an LLM to test its claims but what it is really doing is still generating plausible completions, not following a procedure. So of course new things can be discovered. The point is that making them useful requires iterative real world grounded refinement and/or subject matter expert judgment. The error is assuming scaling magically turns a prediction algorithm into a cognitive agent that can exceed its masters. it doesn't. even if llms generate profound insights accidentally, by definition if such insights are not in the corpus they are not retained given frozen weights, and if beyond the human capability envelope the epistemically blind llm has no way to ensure retention if they arise during training.

Sorry, I just noticed I posted a wrong link in the comment above. Here's the proper one: https://arxiv.org/abs/2110.09485

ok however i would say extrapolating the current data set is not a way to exceed the human envelope. it is unclear to me the human envelope has been demonstrated as a convex hull or how transformers could find points outside it. in other words intelligence and knowledge do not exist as some abstract possibility space but only as a set of contextual contingencies. LLMs have no context beyond the human envelope. weights are frozen. there is no selection mechanism for retaining suprahuman inferences made during training if that were even possible. thus while i grant that llms could theoretically make inferences outside the human corpus, there is no way to distinguish them from errors or hallucinations during training (because by definition they are beyond human capacity) and no iterative learning from experience process after training (frozen weights). thus it seems impossible for today's models to exceed aggregate human capacity.

Of course. But after reading too many mechinterp and functional anatomy studies I'd be lying if I said that there are no striking similarities between biological evolution, brain function, societal processes, and implicit processes inside big models. Surely this deserves a mention and can't be trivially dismissed.

There is no biological evolution of the models. They are emulators of an existing biological process of language. Ghosts, as Karpathy himself put it.

It seems like we're witnessing the architecture of a mind being built with a new set of components.

Like driving a car: it's transportation, and it will get you where you're going, but it doesn't use bones or muscles. It has many characteristics in common with biological locomotion, such as energy requirements, inertia, and the need to navigate, but it doesn't really involve proteins or sugars.


Well said, this seems like a very appropriate comparison.

GenAI thinks like the human mind in the same way that cars run like the human body.

Similar utility in drastically different ways.


Good thing I'm not talking about any of that

> presently I can not even fathom how they can be compared to the complexity of a system such as the human brain

Totally understandable; I don't think we can fully understand the human brain, using the human brain. We can understand its principles (firings and chemistry, structure and specialized areas, etc) but otherwise it's a capacity problem.

And while I can't fully understand myself, let alone another person, I definitely enjoy talking with people and sharing thoughts that I realize I wouldn't have had on my own.


I agree with this redescription fallacy and the point being made here. Perhaps a better analogy to humans would be:

Humans appear to communicate intelligently, however these are just cleverly disguised sound patterns produced by the brain that happen to increase the likelihood of food going into their mouths, and various similar reward-attracting mechanisms that make survival outcomes more likely. So human intelligence could be reduced to something like "fancy food-attracting algorithms" using the same fallacy.

I'm kind of on the fence on the subject of whether LLMs could be compared to the complexity of the human brain, myself.


The key problem is that we don't really have a clear definition of what constitutes consciousness. And without having a clear theory of consciousness, it's not really possible to say whether something is conscious or not.

Personally, I'm partial to the higher-order theory of consciousness which postulates that consciousness constitutes patterns of thought that arise in response to first-order mental states. So, an external stimulus produces a pattern within the neural network which represents a sensation, and then if a pattern arises in response to that pattern, that is an experience of that sensation.

Given this framework, we could ask whether LLMs experience higher-order patterns in response to external stimuli. We would have a clear question to ask, which is whether the system can observe itself.


They are all semiotic infrastructure. The cognitive analogy is nonsense.

> In the present context, the fallacy manifests in claims that LLMs could not possibly be good models of some cognitive capacity because their operations merely consist in a collection of statistical calculations, or linear algebra operations, or next-token predictions

Nobody actually makes this argument though.


If you want examples of this, see the recent book "The AI Con"

https://www.goodreads.com/en/book/show/217432753-the-ai-con

which describes LLMs as "souped-up autocomplete", complex statistics that cannot truly understand anything. A more recent example is this paper:

https://zenodo.org/records/20071869

which says,

> [LLMs], as turbo-charged statistical models (recall their formal relation to logistic regression) can only but provide correlations.

And, of course, the Stochastic Parrot paper is the classic example in this area. It is from 5 years ago, but "LLMs only do statistics / can't understand" is very much alive and active among academics, even if it is a minority position.


None of those arguments claim "LLMs could not possibly be good models of some cognitive capacity"

The "some cognitive capacity" that's relevant to the current discussion is "consciousness".

What about the cognitive capacity of understanding?

The use of the term "understanding" in the quote you mentioned is a claim about metaphysics, not cognitive capacity.

From Merriam-Webster:

cognitive: as in reasonable; of, relating to, or involving conscious mental activities (such as thinking, *understanding*, learning, and remembering)


Are you serious? I hear it every single day, especially from computer scientists. There are top ranked posts here on HN _today_ with this argument.

Please link one of these top ranked posts. Before you do, be aware that I'm going to read what it says and assess if it meets the description of the argument as claimed.

I understood the quoted sentence to be saying, in essence "people claim LLMs aren't really and can't really be thinking or experiencing anything" which is certainly something people say and have written papers on.

The phenomenological quality of subjective experience is never described as "cognitive capacity".

That term is used to describe mental aptitude or skills, like the ability to learn new languages or do math.


It's never used as a description of that specific phenomenon, but depending on your beliefs you may or may not separate cognition from experience conceptually. Regardless, you are focusing on a very narrow part of what I said. The point is to help you get past your narrow interpretation of what people are saying so you can join them in the conversation they are trying to have instead of litigating the conversation they aren't trying to have.

As an example, "They're made out of weights" describes why the weight-based construction of neural networks should impact the way that you think about them and their outputs. I would argue that an offhand description of its microscopic formulation tells us nothing at all about how to think about these outputs, or the models themselves. Even if it is a cute story, I think it definitely classifies as succumbing to this fallacy, but maybe I missed some subtle point that you or someone would be happy to illuminate?

By the way, I know it's a parody of another story that makes this exact refutation. But I think this only serves to highlight the point.


> "They're made out of weights" describes why the weight-based construction of neural networks should impact the way that you think about them and their outputs.

How do you connect that description to "LLMs could not possibly be good models of some cognitive capacity"?


The false conclusion that's being drawn is "therefore LLMs could not be good models of consciousness" (consciousness being a cognitive capacity). Plus, I suppose, a subtle implication that a good model of consciousness is not actually conscious. To which I would invoke the spirit of the Turing test: if you can't tell the difference, is it not more sensible to say that it is?

> (consciousness being a cognitive capacity)

I don't think it makes any sense to say that consciousness is a cognitive capacity. Cognition is one of many qualia that compose the experience of consciousness from the inside, but it's not the only one, and I can easily imagine consciousness without cognition at all.

So I don't think it's weird at all to say that LLMs can be good models of some cognitive capacities (particularly the ones embodied in language) but lack others, and overall lack consciousness.


allow me to resolve the confusion: ai is a model of language. language is a model of human cognitive state. to the extent language maps to cognitive state accurately and ai processes language fluently ai models understanding. whether this is understanding is an irrelevant metaphysical question to me.

> language is a model of human cognitive state.

This is false. Cognitive states do not require language and language is an insufficient model of any cognitive state.

The follow-ons for the rest of your so-called resolution should be clear.


'This is false,' that is an assertion not a proof, and the assertion rests on some kind of absolutizing fallacy where if x != y then y != y(x). I agree cognitive states do not require language. However to say language does not model cognitive states is absurd: from where else would language derive meaning? No, I am not saying it models the specific mechanics of neurons. Rather I am saying language maps to the abstract meaning of cognitive states. So the word red absolutely does correspond to some class of brain states common to red. What else could possibly allow cognition to interface with the physical world? Note 'models' does not need to mean 'injectively, transitively' corresponds. the fact language does not cover all cognitive states in full in no way invalidates language as a model of cognitive state, in the same way a map is a model of territory without being the territory.

"LLMs could not possibly be good models of some cognitive capacity because they are just a bunch of numbers guessing the next word. They have no linguistic module, so they cannot exhibit cognition". That's the argument. It's pretty clearly stated.

Look, this isn't necessarily directed at you, but I've been a researcher into the theory of deep learning for many years now. I've seen all the phases, heard all the criticism, had to constantly argue against this. Gary Marcus was one of the loudest voices of this argument, but every would-be philosopher came out of the woodwork to explain why LLMs are no more than stochastic parrots because of their design. Geoffrey Hinton famously had to debunk these arguments multiple times.

And now that LLMs start to clearly exhibit intelligent behavior and can be somewhat reliable, now "nobody ever thought that LLMs could not possibly be good models of some cognitive capacity because of next-token predictions, or linear algebra, etc."? No, that's not okay.


It's perfectly reasonable that we would have disagreements about this, as it's a new thing, complicated and not fully understood, its uses still being explored.

It reminds me, oddly, of the debate over whether video games can be "art". A turning point was when they actually did something that art does: [evoke profound emotion and thoughtfulness](https://en.wikipedia.org/wiki/Shadow_of_the_Colossus#Legacy) for the player.

(And before that, "[Can photography be art](https://daily.jstor.org/when-photography-was-not-art/)?")

We may not come to something as simple as "machines can be conscious", but we will certainly have to understand consciousness better if we want to refine our questions.

---

Edit: My point is that we don't need to be angry, but we may have to tolerate people expressing their exploration through overly-confident language, and be patient with that.

And Ted here is obviously exploring. His examination of Claude's constitution clearly shows some nuance. He asks:

> So, given that Claude is not conscious, what are we to make of Claude’s constitution?

And his conclusions are split, between this is useful and this is dishonest. It's a great tension IMO.

> The result is a sentence-continuation machine that is likelier to emit sentences resembling those that a thoughtful, moral person could utter. This might seem like a reasonable goal to work toward; I think we’d all prefer it if chatbots never emitted sentences such as “You should kill yourself.” However, for all the times that “honesty” is mentioned in Claude’s constitution, I would argue that it is fundamentally dishonest to have a machine emit many categories of sentences, including any sentences using first-person pronouns.


I appreciate the diplomatic approach. Disagreements about the nature of consciousness are perfectly natural in this context, and concepts such as "understanding" or "cognition" are often difficult to effectively define. The problem is when certain people (often educated in my experience, but also young and arrogant about their own understanding) attempt to shut down any conversation on these topics by appealing to this very fallacy, and claiming that the debate is a sign of idiocy, that we're just "being tricked" by a useless autocomplete engine. These people are not exploring; they're not even being honest, but rather choosing to not pay attention, like it is still 2022. I used to be patient with them, but it's too late now. There are difficult conversations that need to be had, and there needs to be a space for them.

I also remember the "video games are art" debate and the fury from one side of the aisle. I agree that a better understanding of the opposite side should have been part of the debate. But I don't believe that debate was existential. A better comparison to me is the climate change debate. I'm fine with having that debate in an environment where there is little at stake. But it's too late to be doing it with policymakers; we need to be talking about what to do.


I'll be really interested to hear qualitative reports of how this model works out in practice. I just can't believe that a model this small is actually as good as Opus, which is rumored to be about two orders of magnitude larger.


Is Java or Haskell any closer to human language?


Has everyone always nailed their implementation of every program on the first try? Of course not. Probably what happens most times is you first complete something that sorta works and then iterate from there by modifying code, executing, observing, and looping back to the beginning. You can wonder about ultimately how much of your time/energy is consumed by the "typing code" part, and there's surely a wide range of variation there by individual and situation, but it's undeniable that it is a part of the core iteration loop for building software.

I don't understand why GP's comment is so controversial. GP is not denying that you should maybe think a little before a key hits the keyboard as many commenters seem to suppose. Both can be true.


That kind of thinking pops up very prominently in the article.


I know this is mostly about keyword substitution but it still tickles me that you still write f(x) in this language and not (x)f given that Korean is SOV but I guess that's just how you notate that no matter what cultural context you're in. Hadn't ever considered that the convention of writing a function before its arguments might have been a contingency of this notation being developed by speakers of SVO languages.


I think this notation is superior, because of syntax completion - get_name(user.id) can be syntax completed by IDE, (user.id)get_name can't. Just like "SELECT id, name FROM users" would be better off as "FROM users SELECT id, name" (LINQ in C# fixed this mistake, and most modern query languages do too).


…if you’re typing from left to right. :)


Object oriented programming languages also use object.method rather than method(object), so I don't think prefix/suffix notation has much to do with language.
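To make the object.method vs. method(object) point concrete, here's a tiny sketch in Python, which happens to allow both spellings for the same call. The variable names are just for illustration.

```python
name = "ada"

# Function first, then its argument: the f(x) order discussed above.
prefix_style = str.upper(name)

# Argument (the receiver) first, then the function: closer to an (x)f order.
suffix_style = name.upper()

# Both spellings invoke the same method on the same value.
print(prefix_style == suffix_style)
```

So at least in this one language, the order is a surface notation choice rather than something forced by the semantics.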


Let's be real here, regardless of what Boris thinks, this decision is not in his hands.


Would love to hear what Boris thinks.


It's been three days. I think he only meant to keep the feedback coming, but not necessarily engaging with the key issues reported.


Novels are fictional too. So long as they're not taken too literally, archetypes can be helpful mental prompts.


If you're really just doing traditional NER (identifying non-overlapping spans of tokens which refer to named entities) then you're probably better off using encoder-only (e.g. https://huggingface.co/dslim/bert-large-NER) or encoder-decoder (e.g. https://huggingface.co/dbmdz/t5-base-conll03-english) models. These models aren't making headlines anymore because they're not decoder-only, but for established NLP tasks like this which don't involve generation, I think there's still a place for them, and I'd assume that at equal parameter counts they quite significantly outperform decoder-only models at NER, depending on the nature of the dataset.
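For anyone unsure what "non-overlapping spans of tokens" means in practice: token-classification NER models are commonly set up to emit one BIO tag per token, and the spans get read off afterwards. Below is a minimal sketch of that read-off step only. The function name, the lenient handling of stray `I-` tags, and the example sentence are my own assumptions, not something taken from either linked model.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert per-token BIO tags into (start, end_exclusive, label) spans."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # The open span continues only on an "I-" tag with the same label.
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        # Open a new span on "B-"; treat a stray "I-" as a span start too
        # (an assumed, lenient convention).
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    if start is not None:
        spans.append((start, len(tags), label))
    return spans


# Hypothetical example sentence and tags.
tokens = ["Ada", "Lovelace", "visited", "London", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
for start, end, label in bio_to_spans(tags):
    print(label, tokens[start:end])
```

Because each token carries exactly one tag, the resulting spans can't overlap, which is the sense of "traditional NER" I had in mind.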


I recommend having a look at 16.3 onward here if you're curious about this: https://web.stanford.edu/~jurafsky/slp3/16.pdf

I'm not familiar with Whisper in particular, but typically what happens in an ASR model is that the decoder, speaking loosely, sees "the future" (i.e. the audio after the chunk it's trying to decode) in a sentence like this, and also has the benefit of a language model guiding its decoding so that grammatical productions like "I like ice cream" are favored over "I like I scream".
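A toy illustration of the language-model part, very much a sketch: every probability below is made up, the bigram table stands in for a real language model, and rescoring two fixed hypotheses is a simplification of what a real decoder does during search. I'm not claiming this is how Whisper works.

```python
import math

# Made-up acoustic log-probabilities: the two hypotheses sound nearly the
# same, and the acoustically preferred one here is the ungrammatical one.
acoustic_logp = {
    ("i", "like", "ice", "cream"): math.log(0.48),
    ("i", "like", "i", "scream"): math.log(0.52),
}

# Made-up bigram probabilities standing in for a language model.
bigram_p = {
    ("<s>", "i"): 0.2,
    ("i", "like"): 0.1,
    ("like", "ice"): 0.01,
    ("ice", "cream"): 0.3,
    ("like", "i"): 0.001,
    ("i", "scream"): 0.0005,
}


def lm_logp(words, floor=1e-6):
    """Sum bigram log-probabilities, using a small floor for unseen pairs."""
    prev, total = "<s>", 0.0
    for w in words:
        total += math.log(bigram_p.get((prev, w), floor))
        prev = w
    return total


def best_hypothesis(lm_weight=1.0):
    """Pick the hypothesis with the best combined acoustic + weighted LM score."""
    return max(
        acoustic_logp,
        key=lambda h: acoustic_logp[h] + lm_weight * lm_logp(h),
    )


print(" ".join(best_hypothesis(lm_weight=0.0)))  # acoustic score only
print(" ".join(best_hypothesis(lm_weight=1.0)))  # with the language model
```

With the language model switched off, the acoustic score alone picks the "I scream" reading; with it on, the grammatical sentence wins.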


In my (poor) understanding, this can depend on hardware details. What are you running your models on? I haven't paid close attention to this with LLMs, but I've tried very hard to eliminate non-deterministic behavior from my training runs for other kinds of transformer models and was never able to on my 2080, 4090, or an A100. PyTorch docs have a note saying that, in general, completely reproducible results aren't guaranteed: https://docs.pytorch.org/docs/stable/notes/randomness.html

Inference on a generic LLM may not be subject to these non-determinisms even on a GPU though, idk
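One commonly cited reason the order of operations matters at all is that floating-point addition isn't associative, so a parallel reduction that happens to sum in a different order can land on a slightly different value. Here's a pure-Python sketch of just that arithmetic fact. It doesn't touch a GPU or PyTorch, and the three values are arbitrary.

```python
values = [0.1, 0.2, 0.3]

# Same three numbers, two groupings of the additions.
left_first = (values[0] + values[1]) + values[2]
right_first = values[0] + (values[1] + values[2])

print(left_first, right_first)
print(left_first == right_first)  # False: the grouping changes the rounding
```

Whether a given kernel actually sums in a varying order is a separate question that depends on the hardware and the implementation.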


Ah. I've typically avoided CUDA except for a couple of really big jobs so I haven't noticed this.

