One thing I find that constantly causes pain for users is assuming that any of these models are thinking, when in reality they're completing a sentence. This might seem like a nitpick at first, but it's a huge deal in practice: if you ask a language model to evaluate whether a solution is right, it's not evaluating the solution, it's giving you a statistically likely next sentence in which yes and no are both fairly common. If you tell it it's wrong, the likely next sentence is something affirming that, but it doesn't really make a difference.
The only way to use a tool like this is to give it a problem that fits in context, evaluate the solution it churns out, and re-roll if it isn't correct. Don't tell a language model to think, because it can't and won't; that's just a less efficient way of re-rolling the solution.
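To make that concrete, here's a rough sketch of the re-roll loop I mean, in Python. Everything in it is hypothetical: generate() stands in for whatever model call you actually use, and check() is something you write yourself (run the tests, compile the code, compare against a known answer), since the whole point is that the evaluation happens outside the model.

    from typing import Callable, Optional

    def generate(prompt: str) -> str:
        """Hypothetical model call; swap in whatever you actually use."""
        raise NotImplementedError

    def reroll(prompt: str, check: Callable[[str], bool], max_attempts: int = 5) -> Optional[str]:
        """Sample fresh completions and keep the first one that passes an external check."""
        for _ in range(max_attempts):
            candidate = generate(prompt)  # a fresh sample, not an "are you sure?" follow-up turn
            if check(candidate):          # deterministic evaluation done outside the model
                return candidate
        return None                       # give up rather than argue with the model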
You’re right and wrong at the same time. A quantum superposition of validity.
The word “thinking” is doing too much work in your argument, but arguably “assume it’s thinking” is not doing enough.
The models do compute and can reduce entropy; however, they don’t do it the way we presume, because we assume every intelligence is human, or more accurately, the same as our own mind.
To see the algorithm for what it is: you can make it work through a logical set of steps from input to output, but it requires multiple passes. The models use a heuristic pattern-matching approach to reasoning instead of a computational one like symbolic logic.
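A sketch of what I mean by multiple passes, again with a hypothetical generate() standing in for the actual model call (not any real API): you impose the decomposition yourself and feed each pass's output into the next, rather than asking for the conclusion in one shot.

    def generate(prompt: str) -> str:
        """Hypothetical model call; replace with your own."""
        raise NotImplementedError

    def multi_pass(problem: str, steps: list[str]) -> str:
        """Run one model pass per explicit step, feeding each output forward."""
        state = problem
        for step in steps:
            # Each pass gets one narrow instruction plus the current state,
            # so the logical structure comes from you, not the model.
            state = generate(f"{step}\n\nCurrent state:\n{state}")
        return state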
While the algorithms themselves are computed, the virtual space in which the input is transformed into the output is not computational.
The models remain incredible and remarkable but they are incomplete.
Further, there is a huge garbage-in, garbage-out problem, as the input to the model often lacks enough information to decide on the next transformation to the code base. That’s part of the illusion of conversationality that tricks us into thinking the algorithm is like a human.
AI has always provoked human reactions like this. ELIZA was surprisingly effective, right?
It may be that average humans are not capable of interacting with an AI reliably because the illusion is overwhelming for instinctive reasons.
As engineers we should try to accurately assess and measure what is actually happening so we can predict and reason about how the models fit into systems.
Slight quibble, but the reinforcement learning from human feedback means they're trained (somewhat) on what the specific human asking the question is likely to consider right or wrong.
This is both why they're sycophantic, and also why they're better than just median internet comments.
But this is only a slight quibble, because what you say is also somewhat true, and why they have such a hard time saying "I don't know".
I’m guessing the argument is that LLMs get worse for problems they haven’t seen before, so you might assume they think for problems that are commonly discussed on the internet or seen on GitHub, but once you step out of that zone, you get plausible but logically false results.
That, or a reductive fallacy; in either case I’m not convinced. IMO they are just not smart enough (either due to lack of complexity in the architecture or training that didn’t help them generalize reasoning patterns).
They regurgitate what they're trained on, so they're largely consensus-based. However, the consensus can frequently be wrong, especially when the information is outdated.
Someone with the ability to "think" should be able to separate oft-repeated fiction from fact.