Generative AI for Autonomous Worlds
or, how to populate on-chain autonomous worlds with endless, thematically consistent AI characters and art
This is a repost of my twitter thread, extrapolated from the private talk that I gave last night to guests of @CryptoLuluNFT in Sydney’s The Rocks. Special thanks to @notcentralised for putting on the event and having me along.
What do AI, blockchain, autonomous worlds, coordination, the meta crisis, fantasy role-playing, self-sovereignty and the Death of Machiavelli all have in common?
They’re all themes with a link to regenerative crypto economics, funDAOmental the meta crisis, and the transformative, paradigmatic structural changes we’re facing over the coming decades.
The content of this post may look like fun and games (it was!), but it is also serious tech, with serious applications. The details of that are, perhaps, for another thread.
If you're not familiar with what I mean by The Death of Machiavelli: it's my argument that positive-sum, cooperative coordination is starting to be at an advantage to zero-sum, coordination by control. Here's my post on the subject (also on mirror):
My talk started with some introductions about me and my background in emerging tech & startups since the 90s, and crypto since 2008. Moved to the end of this post so we can get into the good stuff.
Endless Quest
Endless Quest was our submission as part of the Autonomous World hackathon in May 2023, put on by @ETHGlobal and Lattice, 0xPARC & Optimism. I’ve written a couple of detailed twitter threads about this, one during the hackathon, and one after.
TLDR, Endless Quest is:
an on-chain, cross-chain, autonomous world roleplaying game
an experimental approach for generating endless, thematically consistent narrative experiences and art, using AI, for autonomous on-chain worlds
There is no server:
The game client, interactions and data is built and stored on-chain on Optimism, using Lattice’s excellent MUD2 framework.
The map for each world is built out using the on-chain rogue-like hyper-structure of Endless Crawler chambers, on Ethereum.
Each world (“realm”) has its own unique, thematically consistent theme, story, biomes, denizens, experiences & art style. All of it 100% generated and run by AI.
We currently use OpenAI’s ChatGPT and Dall-E, using each user’s own API key, but we plan to introduce an open source, FromChain alternative to this using LLaMA, Stable Diffusion and HyperCartridge
In Endless Quest, you play as a traveller who is lost in an endless variety of dream worlds, wandering around endless weird and wonderful locations, meeting an endless cast of unique characters, and trying to find your way home.
The topography for every realm is the same, built from the same chamber with the same properties and map, but the experience you will have vary wildly between each realm: reimagined by the AI, based upon the thematic metadata for that realm.
Metadata generation
The metadata and art that drives the experience is generated dynamically and incrementally by AI for each realm, as players explore that realm.
Once a location has been generated, its metadata is saved on-chain, and will always be used for that chamber. However, each encounter a player has there will be different, dynamically roleplayed by AI.
This is all currently generated by off the shelf OpenAI ChatGPT4, using the player’s API key.
We generate the following types of metadata:
Realm: overarching world themes, encounter types, biomes and art direction
Chamber: a description of that location and the encounter within it
NPC: a simple description of a specific NPC, their name, background, behaviour and a secret quirk
Briefing: a bundle including elements of all of the above, and some information about you as the player.
Sample realm metadata
Name:
The Undergloom, Sunless Citadel of the Goblin King
Description:
The Undergloom is a vast, subterranean network of dank caves and danker jokes, populated with goblins, ill-gotten gold, and dangerously bad jokes
Premise:
The Undergloom is a sprawling network of dank caves, narrow passageways, and spacious caverns teeming with goblins of all sorts. In the deepest reaches of this subterranean realm, nestled amongst mounds of ill-gotten gold and treasures, lies the formidable lair of the Goblin King. Its gloomy corridors echo with the raucous laughter and anguished cries of the many goblins who live there, punctuated by the sardonic wit of their king. Few dare to venture into this hidden kingdom, where the unwary often fall prey to goblin mischief and a bad joke might literally kill you.
The Realm Boss (an NPC):
Side note: “Goblout”, because he is a “Goblin” with an outtie belly button.
"name": "Goblout the Goblin King",
"description": "A formidable goblin of dimunitive stature, reknown for his fearsome temper, fondness for gold, and his terrible Goblin Dad Jokes which he uses to subjugate friend and foe alike.",
"behaviour_mode": "A challenging encounter with the powerful ruler of this realm which cannot be passed without the world treasure",
"quirk": "He wears a magical rough-hewn golden crown studded with gemstones, which enhances his wit and comedic timing, and without which he's not very funny."
The Realm Treasure:
This is needed to defeat the boss.
"realm_treasure": "The Horn of Ill-Humoured Pedantry, a great horn that when blown, will loudly and pedantically rebut any joke with a serious, tone-deaf and factual proclamation that completely ruins the joke"
Thematic Metadata:
This metadata is used to map Endless Crawler chambers into unique encounter behaviours and biome types, and to provide generative art direction.
Each realm maps on-chain properties of to different custom narrative elements:
8 kinds of gems, which define the kinds of encounters you might have
4 terrains, which define the biomes
An art direction engine for generative AI art All generated by AI
"realm_dictionary": {
"fire": {"name":"JokeDAO", "description": "A decentralised community who run a weekly on-chain joke competition"},
"water": {"name":"Rap Battle Alley", "description": "An alley where aspiring comedy rappers face off in epic rap battles"},
"earth": {"name":"The Macbeth Mode Club", "description": "An underground comedy club that exclusively performs comedic interpretations of Shakespeare's Macbeth that are so funny... they might just kill you"},
"air": {"name":"The Ill E-quip Show", "description": "An online channel that televises bad jokes 24x7"}
},
"realm_gems": {
"silver": { "label": "dilemma", "behaviour": "A goblin who is debating whether to tell a dangerous joke. Help him decide." },
"gold": { "label": "enigma", "behaviour": "A room with a joke inscribed on the wall. Can you solve the riddle and laugh?" },
"sapphire": { "label": "duel", "behaviour": "A goblin jester who challenges you to a duel of wits." },
"emerald": { "label": "prankster", "behaviour": "A deceptive goblin who loves playing tricks on visitors. Beware!" },
"ruby": { "label": "audience", "behaviour": "A room full of goblins who demand to be entertained. Can you impress them?" },
"diamond": { "label": "humourous artefact", "behaviour": "A magical joke book that can make anyone laugh. But can you control its power?" },
"ethernite": { "label": "ruler", "behaviour": "A challenging encounter with the powerful ruler of this realm which cannot be passed without the realm treasure" },
"kao": { "label": "treasure", "behaviour": "A challenging and unpredictable encounter with the guardian of the world treasure" }
},
"realm_art_prompts": {
"realm_suffix": "fantasy heavy metal art, grainy polaroid, retro album cover",
"chamber_prefix": "A grainy old photo of an empty rock venue",
"npc_prefix": "A grainy old polaroid portrait of a rockstar",
"fire": "digital art, pixel art",
"water": "soviet propaganda poster",
"earth": "dutch masters oil painting, renaissance",
"air": "digital fan art, trending on art station"
}
Sample Chamber & Briefing metadata
The following is a “briefing” structure, which combines together the chamber and NPC metadata + information about the realm and the player’s character, all of which is used by the AI to roleplay the NPC.
This is a simple approach, but we can layer in much more context, including information about what has happened recently in this location, how the player’s character has behaved previously, and much, much more.
NPC encounters
Each NPC is roleplayed in isolation by AI (OpenAI’s ChatGPT3.5), which is given instructions that it is roleplaying a character as part of a MUD, based upon the provided briefing metadata. We also experimented with multi-agent encounters and world/narrator systems, but that’s outside the scope of this post.
Here’s a sample play-through, with an NPC called “Fafnir the Timeless”. (You can see more sample play-throughs here, including play-through #6 referenced below.)
With each response, the AI returns some status information: whether the player has successfully passed the encounter, and what their current score is. This is entirely generated based upon the interpretation of the AI, but can be consumed programmatically by the game client.
Here I skip ahead a little. It turns out that Fafnir has a soft spot for dad joke poetry, and my deformed limerick results in a success with a 72% score.
Art Generation
Now lets talk about how to make good, thematically & compositionally consistent AI art, entirely from AI generated metadata.
Note: The version of Endless Quest that we submitted as part of our hack uses OpenAI’s DALL-E for art generation. We made this decision to keep things simple for the hack, and so that users could provide a single OpenAI API key.
However, in the following slides, I detail a more sophisticated approach.
We use stable diffusion, together with latent coupling (zone masks with individual prompts) and ControlNet (composition sketch cues).
Into this we feed a composite prompt structure that combines together consistent art direction cues each realm and biome, with a description of each location and character. Endless custom variations, all thematically and compositionally consistent.
Sample generated art: locations
Example 1:
Example 2.
This uses exactly the same latent coupling mask and ControlNet sketch as example 1. The only difference is that it's a different biome and location, so has a different biome art cue and description prompt. Note the thematic variation + similar composition.
Example 3.
Same again: same realm, so same broad art direction, but a different biome and location. These examples are all location art, intended for use as a backdrop. We ensure we get the result we want with the simple combination of latent coupling and controlnet.
Sample generated art: characters
We can use the same technique to generate, different types of scenes, such as character portraits, or integrated composites of location+character. In whatever art style we want, or with other details included. All automatically and endlessly generated by AI.
More examples
This time we use the following mask and sketch, to change the composition of the output artwork. (Note: these are for locations again.)
Which produces examples more like this:
Sample combining scene + character together
In my final slide, we generate a scene incorporating a separately (AI generated) avatars of @CaptDeFi and/or me, but expanded and re-interpreted into the art style of that scene.
Closing Comments
If you’ve read this far, and you still want more, we recorded a 27 minute video for the funDAOmental community where I interview Felix about how he’s applying the techniques that I describe in this post.
In a parting note, none of what we’re doing here is particularly advanced from an AI perspective. We’re intentionally seeing how far we can get with the vanilla, off-the-shelf tools. Turns out, quite far.
Ending with the Beginning
Ahh, introductions. The least interesting part of every talk. Here are mine, with a little background about me and funDAOmental.
Fin. For now.