Some good prompt-reply interactions are probably fed back into subsequent training runs, so they're still stateful, or have memory in a way; there's just a long delay.
State is a function of accumulated past. That does not mean that having some past written down makes you stateful. A stateful thing has to incorporate the ongoing changes.
Both of your examples are stateful systems when viewed from the outside, given a suitable choice of timeframe; the latter is just how purely functional systems represent state. Theoretically they can simulate each other, and the endpoint you use to access Actor will still reference the latest Actor. The only reason you're calling them different is that you insist on a specific timeframe that excludes one of them from counting as stateful, and I'm pointing out that that isn't strictly necessary.
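To make that concrete, here's a rough sketch in Python. The names `MutableActor`, `receive`, and `Endpoint` are made up for illustration and the message-counting behaviour is an assumption, not something taken from either of your examples. One version mutates itself in place; the other is an immutable snapshot plus a pure function, with an endpoint that always points at the latest snapshot.

```python
from dataclasses import dataclass, replace


class MutableActor:
    """Stateful in the usual sense: updates itself in place."""

    def __init__(self):
        self.seen = 0

    def receive(self, msg):
        self.seen += 1
        return self.seen


@dataclass(frozen=True)
class Actor:
    """Immutable snapshot; it never changes after creation."""

    seen: int = 0


def receive(actor, msg):
    """Pure function: returns (new Actor, reply) and leaves the old Actor untouched."""
    nxt = replace(actor, seen=actor.seen + 1)
    return nxt, nxt.seen


class Endpoint:
    """Hypothetical access point that always references the latest Actor snapshot."""

    def __init__(self):
        self.current = Actor()

    def send(self, msg):
        self.current, reply = receive(self.current, msg)
        return reply
```

A caller sending messages to `Endpoint.send` gets the same replies as one calling `MutableActor.receive`, even though no individual `Actor` value ever changes. Whether you call the second one "stateful" depends on whether you look at a single snapshot or at the endpoint over time.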