Each token produced is more computation only if that token is useful in informing the final answer.

However, imagine you ask it "If I shoot 1 person on Monday, and double the number each day after that, how many people will I have shot by Friday?".

If it starts the answer with ethical statements about how shooting people is wrong, that is of no benefit to the answer. But it would be a benefit if it started by saying "1 on Monday, 2 on Tuesday, 4 on Wednesday, 8 on Thursday, 16 on Friday, so the answer is 1+2+4+8+16, which is..."
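For what it's worth, the arithmetic itself is a one-liner (a trivial sketch, not model output):

    # Doubling each day from Monday through Friday: 1 + 2 + 4 + 8 + 16
    total = sum(2**day for day in range(5))
    print(total)  # 31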



The tokens don't have to be related to the task at all (from an outside perspective; the connections are internal to the model, which might raise transparency concerns). A single designated 'compute token' repeated over and over can perform as well as traditional chain of thought. See, for example, Let's Think Dot by Dot (https://arxiv.org/abs/2404.15758).
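Roughly, the comparison looks like the sketch below. llm() is a hypothetical stand-in for whatever model call you use, and the filler tokens only help when the model has been trained to exploit them, as in the Dot-by-Dot setup:

    # Sketch of the comparison in spirit only; llm() is a hypothetical stand-in.
    def llm(prompt: str) -> str:
        raise NotImplementedError("plug in your own model client here")

    question = "A train leaves at 3pm and arrives at 7:30pm. How long is the trip?"

    cot_prompt    = question + "\nLet's think step by step:"
    filler_prompt = question + "\n" + "." * 64  # unreadable to us, but each dot still gets its own pass through the network's layers

    answer_with_cot    = llm(cot_prompt)
    answer_with_filler = llm(filler_prompt)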


That doesn't have to be the case, at least in theory. Every token means more computation, including in parts of the network with no connection to the current token. It's possible (though not practically likely) that the disclaimer provides the layer evaluations needed to compute the answer, even though it conveys no information to you.

The AI does not think. It does not work like us, and so the causal chains you want to follow are not necessarily meaningful to it.


I don't think that's true of transformer models.

Ignoring caches and optimisations, a transformer model takes a string of words as input and generates one more word. No internal state other than the previous words is stored or used to produce the next word.
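A minimal sketch of that loop, using the Hugging Face transformers library with gpt2 purely as an example and deliberately skipping the KV cache: the only thing carried from one step to the next is the growing token sequence itself.

    # Minimal greedy decoding loop (no KV cache): at every step the model re-reads
    # the whole sequence; the only state carried forward is the tokens themselves.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")      # gpt2 chosen only as a small example
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("The capital of France is", return_tensors="pt").input_ids
    with torch.no_grad():
        for _ in range(5):
            logits = model(ids).logits               # forward pass over all tokens so far
            next_id = logits[0, -1].argmax()         # pick the most likely next token
            ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)
    print(tok.decode(ids[0]))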


The words in the disclaimer would have to be the "hidden state". As said, this is unlikely to be true, but theoretically you could imagine a model that starts outputting a disclaimer like "as a large language model", where the top two candidates for the next word are "I" and "it", and "I" leads to correct answers while "it" leads to wrong ones. Blocking it from outputting "I" would then preclude you from getting to the correct response.

This is a rather contrived example, but the "mind" of an AI is different from our own. We think inside our brains and express that in words. We can substitute words without substituting the intent behind them. The AI can't: the words are the literal computation. Different words, different intent.
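To make that concrete, here is a hedged sketch of the "blocked word" thought experiment, again using gpt2 via transformers only as a stand-in model: banning one candidate word can send greedy decoding down a different path, because each new word is computed from the exact words emitted so far.

    # Banning a single candidate token can change the whole continuation.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    @torch.no_grad()
    def greedy(prompt, banned_word=None, steps=12):
        banned = tok.encode(banned_word)[0] if banned_word else None
        ids = tok(prompt, return_tensors="pt").input_ids
        for _ in range(steps):
            logits = model(ids).logits[0, -1]
            if banned is not None:
                logits[banned] = float("-inf")       # forbid that word at every step
            ids = torch.cat([ids, logits.argmax().view(1, 1)], dim=-1)
        return tok.decode(ids[0])

    print(greedy("As a large language model,"))
    print(greedy("As a large language model,", banned_word=" I"))  # different words, different computation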



