Live-ish illustration for your role-playing campaign

Project description

TTRPG live_illustrate

ASR + LLM + Diffusion = ???

This project:

Uses Whisper to transcribe live audio of a tabletop RPG session
Uses GPT-3.5 to extract a description of the current setting from the transcript
Uses DALL-E to draw the setting
Uses Flask & HTMX to display a new image every few minutes
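
The steps above amount to a transcribe, summarize, draw cycle. The sketch below is a hypothetical outline, not the project's actual code: `illustrate_once` and the `fake_*` stage functions are made-up names, and the stand-in stages replace Whisper, GPT-3.5 and DALL-E so the structure can run without a microphone or an API key.

```python
from typing import Callable, List


def illustrate_once(
    audio_chunks: List[bytes],
    transcribe: Callable[[bytes], str],
    summarize: Callable[[str], str],
    draw: Callable[[str], str],
) -> str:
    """Run one transcribe -> summarize -> draw cycle and return an image reference."""
    # Step 1: speech-to-text on each recorded chunk (Whisper in the real tool).
    transcript = " ".join(transcribe(chunk) for chunk in audio_chunks)
    # Step 2: condense the transcript into a description of the setting (GPT-3.5).
    description = summarize(transcript)
    # Step 3: turn the description into an image (DALL-E); here it is just a string.
    return draw(description)


# Stand-in stages so the sketch runs offline; these are NOT the real models.
def fake_transcribe(chunk: bytes) -> str:
    return chunk.decode("utf-8")


def fake_summarize(transcript: str) -> str:
    return "Setting: " + transcript[:40]


def fake_draw(description: str) -> str:
    return f"<image of {description!r}>"


if __name__ == "__main__":
    chunks = [b"The party sails", b"through a swamp."]
    print(illustrate_once(chunks, fake_transcribe, fake_summarize, fake_draw))
```

Passing the stages in as callables keeps the loop itself independent of any one model, which is convenient for trying it out offline.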

And like most AI projects, it simultaneously works better and worse than one might expect. The images generated are never a perfect rendition of what's going on, but are almost too good to be just ambient background flavor.

Demo Reel

Some scenes from our party's first trial session:

The party enjoys dinner together on the deck of the Daydream. No one's quite sure where the other ship came from, but it looks nice.

The party sails the Daydream through a narrow canal in a swamp, searching for the hidden pirate city of Siren's Cove. Perhaps they should ask the barrel people for directions.

The party eavesdrops on a red-haired gnome and a halfling in a Siren's Cove tavern who are plotting to steal a competitor's shipping manifest. Pay no attention to the faces of the other patrons.

The party seeks further gossip at a luxe brothel called The Rich Dagger, guarded by a Goliath bouncer and famed for its perplexing architecture.

Installation

I recommend installing in a virtual environment.

# From PyPI:
pip install live_illustrate

# Or locally:
git clone git@github.com:ehennenfent/live_illustrate.git
cd live_illustrate
pip install -e .

Whisper will be much faster if you use a CUDA-enabled PyTorch build. I recommend installing this manually afterwards.

pip install --index-url https://download.pytorch.org/whl/cu118 torch torchvision torchaudio  # https://pytorch.org/get-started/locally/
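
One quick way to confirm that the CUDA build is the one actually in use is to ask PyTorch directly. This is a minimal sketch; `describe_torch_device` is a hypothetical helper name and is not part of live_illustrate.

```python
def describe_torch_device() -> str:
    """Report whether PyTorch is installed and whether it can see a CUDA GPU."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "torch not installed"
    # torch.cuda.is_available() is False for CPU-only builds or missing drivers.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(describe_torch_device())
```

If this prints `cpu` after installing the CUDA wheel, the CPU-only build is probably still the one being imported in your virtual environment.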

You'll need an OpenAI API key, exposed via environment variable or in the .env file, like so: OPENAI_API_KEY=<my_secret_api_key>.
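
To illustrate the lookup order described above (environment variable first, then the .env file), here is a minimal standard-library sketch. It is an assumption about how such a lookup could work, not the project's implementation, which may well rely on a dedicated dotenv library; `load_api_key` is a hypothetical name.

```python
import os
from pathlib import Path
from typing import Optional


def load_api_key(env_path: str = ".env", name: str = "OPENAI_API_KEY") -> Optional[str]:
    """Return the key from the environment, falling back to a simple .env file."""
    value = os.environ.get(name)
    if value:
        return value
    path = Path(env_path)
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines that are not KEY=VALUE pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            # Drop surrounding whitespace and optional quotes around the value.
            return val.strip().strip("'\"")
    return None


if __name__ == "__main__":
    print("key found" if load_api_key() else "no key configured")
```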

With the default settings, it costs about $1/hour to run. You can lower the cost by reducing the size of the generated images, or increasing the interval between them.
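
As a rough guide to how the interval affects cost, the arithmetic can be sketched as below. The per-image figure is an assumption back-derived from the numbers in this README (about $1/hour at the default 7.5 minute interval, so 8 images per hour); it lumps in the GPT-3.5 summarization cost and is not an official OpenAI price.

```python
def images_per_hour(wait_minutes: float) -> float:
    """Number of images drawn in an hour for a given interval."""
    return 60.0 / wait_minutes


def hourly_cost(wait_minutes: float, cost_per_image: float) -> float:
    """Estimated cost per hour, assuming cost scales with the number of images."""
    return images_per_hour(wait_minutes) * cost_per_image


# ASSUMPTION: $1/hour spread over 8 images/hour, not a published price.
ASSUMED_COST_PER_IMAGE = 0.125

if __name__ == "__main__":
    for wait in (7.5, 15.0, 30.0):
        cost = hourly_cost(wait, ASSUMED_COST_PER_IMAGE)
        print(f"{wait:>4} min between images -> ${cost:.2f}/hour")
```

Under this assumption, doubling the interval halves the hourly cost.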

Running

Once installed, run the illustrate command line tool, which will automatically start recording with your default microphone.

A few words about the most important command line options:

--wait_minutes: This controls how frequently the tool draws an image, which directly translates into how expensive it is to run. The default of 7.5 minutes seems to work well for our campaign.
--max_context: Each interval, the tool looks back at the transcript and collects up to max_context tokens to send to GPT-3.5. It will get as close as possible, so some of these tokens may come from before the previous image was generated. GPT can be a bit slow about summarizing large amounts of text, so be careful about making this too large. The default of 2000 tokens seems to correspond very roughly to about ten minutes of conversation from one of our sessions, but YMMV.
--persistence_of_memory: When summarizing long conversations, the LLM can seem to get "stuck" on the first setting described. This argument controls what fraction of the previous context is retained each time an image is generated. The default setting of 0.2 may lead to some discontinuity if your party is in one place for a long time.
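
The interaction between max_context and persistence_of_memory can be sketched as follows. This is a simplified illustration under stated assumptions, not the tool's actual logic: tokens are approximated by whitespace-separated words rather than a real tokenizer, retention is applied per transcript line, and `count_tokens`, `select_context` and `retain_after_image` are hypothetical names.

```python
from typing import List


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def select_context(lines: List[str], max_context: int) -> List[str]:
    """Walk backwards from the newest line, keeping lines while they fit the budget."""
    selected: List[str] = []
    used = 0
    for line in reversed(lines):
        cost = count_tokens(line)
        if used + cost > max_context:
            break
        selected.append(line)
        used += cost
    selected.reverse()  # restore chronological order
    return selected


def retain_after_image(lines: List[str], persistence_of_memory: float) -> List[str]:
    """Keep only the newest fraction of the transcript once an image has been drawn."""
    keep = int(len(lines) * persistence_of_memory)
    return lines[len(lines) - keep:] if keep > 0 else []


if __name__ == "__main__":
    transcript = [
        "we board the Daydream",
        "the canal narrows ahead",
        "barrel people wave at us",
        "we dock at Siren's Cove",
        "the tavern is crowded tonight",
    ]
    print(select_context(transcript, max_context=10))
    print(retain_after_image(transcript, persistence_of_memory=0.2))
```

A smaller persistence_of_memory discards more of the old transcript, which helps the summary move on from an earlier scene at the price of less continuity.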

Release history

0.2.0

Jan 27, 2024

0.1.0

Dec 16, 2023

Download files

Source Distribution

live_illustrate-0.1.0.tar.gz (15.9 kB)

Uploaded Dec 16, 2023 Source

Built Distribution

live_illustrate-0.1.0-py3-none-any.whl (5.0 kB)

Uploaded Dec 16, 2023 Python 3

File details

Details for the file live_illustrate-0.1.0.tar.gz.

File metadata

Download URL: live_illustrate-0.1.0.tar.gz
Upload date: Dec 16, 2023
Size: 15.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for live_illustrate-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`e1ca893148541b5ea0c84aaf7bdabf841a79501e4117aa8fa7c1ef99fb3c52ab`
MD5	`852ca5a154c0e72d3c6f20da937ac751`
BLAKE2b-256	`2baefeddec81380d9d4d74b3efb2a357b31a5dffa9e02ab20c1f203777c0271f`

File details

Details for the file live_illustrate-0.1.0-py3-none-any.whl.

File metadata

Download URL: live_illustrate-0.1.0-py3-none-any.whl
Upload date: Dec 16, 2023
Size: 5.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for live_illustrate-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2d2be58d906b927005982d3ac8ad0adb08d7afd700fcb88f3727bff1c4a019df`
MD5	`fe2df98ddf3de703777c811f15d887cf`
BLAKE2b-256	`889ee22c09f1b546636123d83bd3f2cfb7bb205c987b38152e02e6a102049a16`
