I've experimented with a very similar flow back in the days of GPT-3.5. It resulted in really fun story lines, but I found that the player choices benefited from some steering: for each plot point I generated three choices, where [1] progressed the story, [2] stalled the plot, and [3] resulted in a plot twist. Just these additions to the choice prompts increased playability by a lot, and prevented plot stagnation and "feedback loops".
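A minimal sketch of that kind of steered choice generation (the prompt wording and the build_choice_prompt helper here are my own illustration, not the original poster's code):

    # Hypothetical sketch: steer each set of player choices toward
    # [1] progress, [2] stalling, [3] a plot twist.
    ROLES = {
        1: "a choice that clearly progresses the main plot",
        2: "a choice that stalls the plot (a detour or delay)",
        3: "a choice that triggers an unexpected plot twist",
    }

    def build_choice_prompt(plot_point: str) -> str:
        """Build the steering prompt sent to the model for one plot point."""
        lines = [
            f"Current plot point: {plot_point}",
            "Write exactly three numbered choices for the player:",
        ]
        for n, role in ROLES.items():
            lines.append(f"[{n}] {role}")
        return "\n".join(lines)

    print(build_choice_prompt("The party reaches the abandoned lighthouse."))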
I think the intersection of games and LLMs is the way to solve alignment. Imagine a 3D UI for an LLM; that way we can democratize interpretability research.
Basically, train another model that takes the LLM, converts it into 3D shapes, and puts them in some 3D world that is understandable to humans.
Simpler example: represent an LLM as a green field with objects, where humans are the only agents:
You stand near a monkey, see a chewing mouth nearby, and go there (your prompt now is "monkey chews"); close by you see an arrow pointing at a banana, farther away an arrow points at an apple, and very far away at the horizon an arrow points at a tire (monkeys rarely chew tires).
So things close by are more likely tokens, things far away are less likely, and you see all of them at once (maybe you're on top of a hill to see farther). This way we can make a form of static "place AI", where humans are the only agents.
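As a toy version of that distance metaphor (my own sketch with made-up probabilities, not a real interpretability tool): place each candidate next token at a distance proportional to its surprisal, so likely tokens end up close and unlikely ones near the horizon.

    import math

    # Toy mapping from next-token probabilities to distances in the "field".
    # The probabilities here are invented for illustration.
    next_token_probs = {"banana": 0.55, "apple": 0.30, "leaves": 0.14, "tire": 0.01}

    def distance(p: float, scale: float = 10.0) -> float:
        """More likely tokens sit closer: distance grows with surprisal (-log p)."""
        return scale * -math.log(p)

    for token, p in sorted(next_token_probs.items(), key=lambda kv: -kv[1]):
        print(f"{token:>7}: {distance(p):6.1f} meters away")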
We'll have millions of gamers' eyeballs on the internals of the multimodal LLMs. They'll probably find many stolen things there and some "monsters", unknown unknowns. It will probably inspire them to learn the lower-level "machine code" of LLMs, the same way graphical user interfaces made computers widespread. Plus we'll tap into the game dev ecosystem and its tools to start solving alignment together. I don't see downsides.
It's in Ren'Py, so the states are all hard-coded; the LLM powers interactions with NPCs, including branching based on whether you can convince them. So it's rather complementary to this project.
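A rough sketch of how that convince-based branching could look (the ask_llm helper below is a stubbed stand-in for whatever model call the game actually uses, not its real code):

    # Hypothetical convince check for a hard-coded scene graph:
    # the LLM only judges the player's argument; the states stay scripted.
    def ask_llm(prompt: str) -> str:
        """Stand-in for the real model call (local or hosted LLM)."""
        return "yes"  # stubbed so the sketch runs without a model

    def guard_is_convinced(player_argument: str) -> bool:
        verdict = ask_llm(
            "You are a stubborn castle guard. The player says:\n"
            f"{player_argument}\n"
            "Answer only 'yes' or 'no': are you convinced to let them pass?"
        )
        return verdict.strip().lower().startswith("yes")

    # Branch between two hard-coded states based on the verdict.
    next_label = "gate_open" if guard_is_convinced("The king sent me, here is his seal.") else "gate_refused"
    print(next_label)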
The dynamic features presented are interesting; at the same time, the lack of consistency you get with LLMs makes me prefer using them for local tasks and letting a human make the global decisions.
Other interesting games (that are readily playable):
The group behind AI Dungeon released the Wayfarer model in 12B and 70B variants. The former is small enough to run on local hardware. 10/10 would get eaten by a giant sloth again.
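If you want to try the 12B variant locally, something like this should work (the Hugging Face repo id is from memory, so double-check it; a 12B model also needs a fairly large GPU or quantization):

    # Hedged sketch: load the smaller Wayfarer variant with transformers.
    # Repo id assumed to be "LatitudeGames/Wayfarer-12B"; verify before running.
    import torch
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="LatitudeGames/Wayfarer-12B",
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )

    out = generator(
        "You wake up at the bottom of a ravine, and something large is snoring nearby.",
        max_new_tokens=120,
    )
    print(out[0]["generated_text"])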
This is really cool! When you get a restate instance, does that just handle the durable execution journal / orchestration, or can you run code in the instance as well?
Hey igal from restate here :)
Indeed, the Restate server handles the durable execution part, while the user code runs in a separate process. We don't host the code ourselves at the moment, but it's really easy to deploy it wherever you wish, even on AWS Lambda!
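For anyone curious what that split looks like in practice, here is a rough sketch of a service with the Python SDK, written from memory (so check the current restate-sdk docs for the exact API surface): the Restate server keeps the durable execution journal and invokes the handlers over HTTP, while this process (or a Lambda) only serves the handler code.

    # Hedged sketch (API shape from memory; verify against the restate-sdk docs).
    import restate
    from restate import Service, Context

    storyteller = Service("storyteller")

    @storyteller.handler()
    async def next_scene(ctx: Context, choice: str) -> str:
        # ctx.run journals the side effect, so a retry replays the recorded
        # result instead of redoing the work.
        scene = await ctx.run("generate scene", lambda: f"Scene after choosing: {choice}")
        return scene

    # Serve as an ASGI app (e.g. with hypercorn), or package it for AWS Lambda;
    # the Restate server is pointed at wherever this endpoint is deployed.
    app = restate.app(services=[storyteller])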
This is a very thoughtful and well-engineered solution that I think unfortunately removes one of the main advantages of choose your own adventures with an LLM: that you can just tell it you don't like its choices and want to do something completely different.
If the author is here: you should remove any mentions of "ch00se your 0wn adventure" and also its four-letter acronym. The trademark owner goes after it very aggressively. Use "user-generated adventures" or something similar.
It looks like "Chooseco" owns it. Unfortunately, the best bet at having it thrown out was their lawsuit against Netflix, but Netflix chose to settle.
> Seems like something that shouldn’t be trademarkable
It's not copyrightable, but it's exactly the kind of thing that would be trademarkable. It's just that it's definitely genericized at this point. I'm not sure many people associate it with a brand rather than a genre.