Multimodal AI Story Teller, built with Stable Diffusion, GPT, and neural text-to-speech
StoryTeller
A multimodal AI story teller, built with Stable Diffusion, GPT, and neural text-to-speech (TTS).
Given a prompt as the opening line of a story, GPT writes the rest of the plot, Stable Diffusion draws an image for each sentence, and a TTS model narrates each line. The result is a fully animated video of a short story, complete with audio and visuals.
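The flow described above can be sketched in a few lines of plain Python. This is only an illustration of the write, paint, and speak steps, not the package's actual implementation: the names `Scene`, `split_sentences`, `tell_story`, and the `fake_*` functions are hypothetical, and the stand-ins return placeholder strings rather than calling any model.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scene:
    """One sentence of the story with its generated image and narration."""
    sentence: str
    image: str  # placeholder for a generated image
    audio: str  # placeholder for a narrated audio clip


def split_sentences(text: str) -> List[str]:
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def tell_story(
    prompt: str,
    write: Callable[[str], str],
    paint: Callable[[str], str],
    speak: Callable[[str], str],
) -> List[Scene]:
    """Write a story from the prompt, then illustrate and narrate each sentence."""
    story = write(prompt)
    return [Scene(s, paint(s), speak(s)) for s in split_sentences(story)]


# Hypothetical stand-ins so the sketch runs without downloading any model.
def fake_write(prompt: str) -> str:
    return prompt + " A dragon woke up. It flew away!"


def fake_paint(sentence: str) -> str:
    return f"image<{sentence}>"


def fake_speak(sentence: str) -> str:
    return f"audio<{sentence}>"


scenes = tell_story("Once upon a time.", fake_write, fake_paint, fake_speak)
for scene in scenes:
    print(scene)
```

Passing the three stages in as callables keeps the sketch independent of any particular model. A real pipeline would also need a final step that stitches the images and audio into a video, which is omitted here.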
Installation
PyPI
Story Teller is available on PyPI.
$ pip install storyteller-core
Source
- Clone the repository.
$ git clone https://github.com/jaketae/storyteller.git
$ cd storyteller
- Install dependencies.
$ pip install .
Note: For Apple M1/M2 users, mecab-python3 is not available. You need to install mecab before running pip install. You can do this with Homebrew via brew install mecab. For more information, refer to this issue.
- (Optional) To develop locally, install dev dependencies and install pre-commit hooks. This will automatically trigger linting and code quality checks before each commit.
$ pip install -e ".[dev]"
$ pre-commit install
Quickstart
The quickest way to run a demo is through the CLI. Simply type
$ storyteller
The final video will be saved as /out/out.mp4, alongside other intermediate images, audio files, and subtitles.
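To see what a run produced, a small helper can group the files in the output directory by kind. This is a minimal sketch, not part of the package: the `inventory` function and the `KINDS` table are hypothetical, and only the .mp4 extension comes from the text above; the image, audio, and subtitle extensions are assumptions that may not match the actual output.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

# Only ".mp4" is taken from the README; the other extensions are assumptions.
KINDS = {".mp4": "video", ".png": "image", ".wav": "audio", ".srt": "subtitle"}


def inventory(output_dir: str) -> Dict[str, List[str]]:
    """Group the file names in output_dir by kind, based on file extension."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(Path(output_dir).iterdir()):
        if path.is_file():
            kind = KINDS.get(path.suffix.lower(), "other")
            groups[kind].append(path.name)
    return dict(groups)
```

Call it with whatever directory the run wrote to. Files with an unrecognized extension end up under "other", which makes a wrong guess in `KINDS` easy to spot.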
To override the defaults with custom parameters, set the CLI flags as needed.
$ storyteller --help
usage: storyteller [-h] [--writer_prompt WRITER_PROMPT]
                   [--painter_prompt_prefix PAINTER_PROMPT_PREFIX] [--num_images NUM_IMAGES]
                   [--output_dir OUTPUT_DIR] [--seed SEED] [--max_new_tokens MAX_NEW_TOKENS]
                   [--writer WRITER] [--painter PAINTER] [--speaker SPEAKER]
                   [--writer_device WRITER_DEVICE] [--painter_device PAINTER_DEVICE]

optional arguments:
  -h, --help            show this help message and exit
  --writer_prompt WRITER_PROMPT
  --painter_prompt_prefix PAINTER_PROMPT_PREFIX
  --num_images NUM_IMAGES
  --output_dir OUTPUT_DIR
  --seed SEED
  --max_new_tokens MAX_NEW_TOKENS
  --writer WRITER
  --painter PAINTER
  --speaker SPEAKER
  --writer_device WRITER_DEVICE
  --painter_device PAINTER_DEVICE
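When calling the CLI from a script, it can help to build the argument list in Python and reject misspelled flags before launching anything. The sketch below assumes nothing beyond the flag names in the help text above: `build_command` is a hypothetical helper, and every value passed to it is purely illustrative, not a documented default.

```python
import shlex
from typing import Dict, List

# Flag names copied from the `storyteller --help` output.
KNOWN_FLAGS = {
    "writer_prompt", "painter_prompt_prefix", "num_images", "output_dir",
    "seed", "max_new_tokens", "writer", "painter", "speaker",
    "writer_device", "painter_device",
}


def build_command(options: Dict[str, object]) -> List[str]:
    """Turn {"seed": 42} into ["storyteller", "--seed", "42"]."""
    command = ["storyteller"]
    for name, value in options.items():
        if name not in KNOWN_FLAGS:
            raise ValueError(f"unknown flag: --{name}")
        command += [f"--{name}", str(value)]
    return command


# Illustrative values only; they are not the tool's defaults.
command = build_command({
    "writer_prompt": "Once upon a time, unicorns roamed the Earth.",
    "num_images": 5,
    "output_dir": "out",
    "seed": 42,
})
print(shlex.join(command))  # shell-quoted form, handy for logging
```

The sketch only prints the command. To run it, pass the list to `subprocess.run(command, check=True)`; passing a list rather than a single string avoids shell quoting problems with prompts that contain spaces or punctuation.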
Usage
For more advanced use cases, you can also directly interface with Story Teller in Python code.
- Load the model with defaults.
from storyteller import StoryTeller
story_teller = StoryTeller.from_default()
story_teller.generate(...)
- Alternatively, configure the model with custom settings.
from storyteller import StoryTeller, StoryTellerConfig
config = StoryTellerConfig(
    writer="gpt2-large",
    painter="CompVis/stable-diffusion-v1-4",
    max_new_tokens=100,
)
story_teller = StoryTeller(config)
story_teller.generate(...)
License
Released under the MIT License.