I ran a casino and wrote a bot framework that, with a user's permission, attempted to clone their betting strategy from their hand history (mainly how they bet as a ratio of the pot in similar blind-odds situations, relative to the aggressiveness of the players before and after them), and I let the players play against their own bots. It was fun to watch. Oftentimes the players would lose against their bot versions for a while, but ultimately the bot tended to go on tilt, because it couldn't moderate for aggressive behavior around it.
None of that was deterministic, and the hardest part was writing efficient Monte Carlo simulations that could weight each situation and average out a betting strategy close to the one in the player's hand history, while throwing in randomness in a band consistent with the player's own randomness in a given situation.
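A minimal sketch of that idea, assuming a hypothetical hand-history format (the field names `situation` and `bet_to_pot` and the function `simulate_bet` are my inventions, not the original framework): average the player's pot-ratio bets in similar situations, then draw with noise scaled to the player's own observed variance.

```python
import random

def simulate_bet(history, situation, trials=10_000):
    """Monte Carlo sketch (hypothetical): estimate a bet-to-pot
    ratio by averaging random draws centered on the player's
    historical mean, with spread matching the player's own
    variability in that situation."""
    similar = [h for h in history if h["situation"] == situation]
    if not similar:
        return 0.0
    ratios = [h["bet_to_pot"] for h in similar]
    mean = sum(ratios) / len(ratios)
    # spread of the player's own behavior in this situation
    var = sum((r - mean) ** 2 for r in ratios) / len(ratios)
    std = var ** 0.5
    # draw within a band consistent with the player's randomness
    draws = [random.gauss(mean, std) for _ in range(trials)]
    return sum(draws) / len(draws)
```

A single draw (rather than the averaged estimate) would be what the bot actually bets, which is what keeps its play non-deterministic.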
And none of it needed to touch on game theory. If it had, it would've been much better. LLMs would have no hope of conceptualizing any of that.
As in, did they use cameras? Image recognition? Manual record keeping? Thought it was pretty obvious that I was asking for more detail. Perhaps OP meant they ran an online casino and not an actual casino.
It's not. The LLM itself only calculates the probabilities of the next token. Assuming no race conditions in the implementation, this is completely deterministic. The popular LLM inference engine llama.cpp is deterministic. It's the job of the sampler to actually select a token using those probabilities. It can introduce pseudo-randomness if configured to, and in most cases it is configured that way, but there's no requirement to do so, e.g. it could instead always pick the most probable token.
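To illustrate the split this comment describes: the model produces a score per token, and the sampler decides how to pick. A minimal sampler sketch (not any particular engine's API; the function name and logit values are illustrative):

```python
import math, random

def sample_token(logits, temperature=1.0):
    """Minimal sampler sketch: the model supplies logits (one per
    token); the sampler, not the model, chooses the token."""
    if temperature == 0:
        # greedy decoding: always the most probable token,
        # so the output is fully deterministic
        return max(range(len(logits)), key=lambda i: logits[i])
    # softmax over temperature-scaled logits
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # pseudo-random draw from the resulting distribution
    return random.choices(range(len(logits)), weights=probs)[0]
```

With `temperature == 0` the random number generator is never consulted at all, which is the sense in which the randomness is a configuration of the sampler rather than a property of the model.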
This is a poor conceptualization of how LLMs work. No implementations of models you’re talking to today are just raw autoregressive predictors, taking the most likely next token. Most are presented with a variety of potential options and choose from the most likely set. A repeated hand and flop would not be played exactly the same in many cases (but a 27o would have a higher likelihood of being played the same way).
>No implementations of models you’re talking to today are just raw autoregressive predictors, taking the most likely next token.
Set the temperature to zero and that's exactly what you get. The point is the randomness is something applied externally, not a "core concept" for the LLM.
>Set the temperature to zero and that's exactly what you get.
In some NN implementations, randomness is actually pretty important to keep the gradients from getting stuck at local minima/maxima. Is that true for LLMs, or is it not something that applies at all?
I'm not sure, hence the question. AFAIK temperature only comes into play at inference time once the distribution is known, but I don't know if there are other places where random numbers are involved.
E.g., you tend to randomly shuffle your corpus to train on. If you use dropout (https://en.wikipedia.org/wiki/Dilution_(neural_networks)) you use randomness. You might also randomly perturb your training data. There are lots of other sources of randomness you might want to try.
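As a concrete example of one of those training-time sources of randomness, here is a sketch of inverted dropout (the standard formulation; the function name and values are illustrative):

```python
import random

def dropout(activations, p=0.5):
    """Inverted-dropout sketch: during training, randomly zero
    each activation with probability p, scaling the survivors by
    1/(1-p) so the expected value is unchanged. At inference
    time this function is simply skipped."""
    keep = 1.0 - p
    return [a / keep if random.random() < keep else 0.0
            for a in activations]
```

Note this randomness lives entirely in training; it has no bearing on whether inference is deterministic.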
The number of problems where people choose a temperature of 0 is negligible, though. The reason I chose the wording “implementations of models you’re talking to today” was because in reality this is almost never where people land, and certainly not what any popular commercial surfaces are using (Claude Code, any LLM chat interface).
And regardless, turning this into a system that has some notion of strategic consistency or contextual steering seems like a remarkably easy problem. Treating it as one API call in, one deterministic and constrained choice out is wrong.