Video Killed The Radio Star

Requirements

FAQ

What is this?

TL;DR: An automated music video maker that works from an mp3 or a YouTube URL.

How does this animation technique work?

For each text prompt you provide, the notebook will...

  1. Generate an image based on that text prompt (using Stable Diffusion)
  2. Use the generated image as the init_image, together with the same text prompt, to generate variations of the first image. This produces a sequence of very similar images based on the original text prompt
  3. Images are then intelligently reordered to find the smoothest animation sequence of those frames
  4. This image sequence is then repeated to pad out the animation duration as needed
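
Steps 3 and 4 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the notebook's actual implementation: it swaps in a simple greedy nearest-neighbor ordering on raw pixel distance for whatever reordering the notebook really does, and the function names `order_frames_greedy` and `pad_sequence` are hypothetical.

```python
import numpy as np


def order_frames_greedy(frames):
    """Order frames so that each frame is followed by its most similar unused frame.

    Assumption: similarity is plain Euclidean distance between pixel arrays.
    All frames must have the same shape. Returns a list of indices into `frames`.
    """
    if len(frames) == 0:
        return []
    # Flatten every frame into one row so distances are simple vector norms.
    flat = np.stack([np.asarray(f, dtype=np.float64).ravel() for f in frames])
    order = [0]  # start from the first generated image
    remaining = list(range(1, len(flat)))
    while remaining:
        # Distance from the most recently placed frame to every unused frame.
        dists = np.linalg.norm(flat[remaining] - flat[order[-1]], axis=1)
        order.append(remaining.pop(int(np.argmin(dists))))
    return order


def pad_sequence(order, n_frames_needed):
    """Repeat the ordered sequence until it covers `n_frames_needed` frames."""
    if not order:
        raise ValueError("cannot pad an empty sequence")
    reps = -(-n_frames_needed // len(order))  # ceiling division
    return (order * reps)[:n_frames_needed]
```

A greedy ordering is cheap and easy to reason about, but it does not guarantee the globally smoothest sequence. It is used here only to make the idea concrete.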

The technique demonstrated in this notebook was inspired by a video created by Ben Gillin.

How are lyrics transcribed?

This notebook uses OpenAI's recently released 'whisper' model for automatic speech recognition. OpenAI was kind enough to offer several different sizes of this model, each with its own pros and cons. This notebook uses the largest whisper model to transcribe the actual lyrics and the smallest model to perform the lyric segmentation. Neither of these models is perfect, but the results so far seem pretty decent.
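
For readers who want to try this outside the notebook, here is a minimal sketch. It assumes the `openai-whisper` package is installed and that "large" and "tiny" are the model sizes meant by "largest" and "smallest". The helper names `transcribe` and `segment_timings` are hypothetical, and the notebook's own calls may differ.

```python
def transcribe(audio_path, model_name="large"):
    """Run whisper on an audio file and return its result dictionary."""
    # Imported lazily so the helper below can be used without whisper installed.
    import whisper  # pip install openai-whisper

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)


def segment_timings(result):
    """Reduce a whisper result to a list of {start, end, text} entries."""
    return [
        {
            "start": float(seg["start"]),
            "end": float(seg["end"]),
            "text": seg["text"].strip(),
        }
        for seg in result.get("segments", [])
    ]
```

With this sketch, `transcribe(path, "large")["text"]` would give the lyric text, and `segment_timings(transcribe(path, "tiny"))` would give the timing of each segment.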

The first draft of this notebook relied on subtitles from YouTube videos to determine timing, which was then aligned with user-provided lyrics. YouTube's automated captions are powerful and I'll update the notebook shortly to leverage those again, but for the time being we're just using whisper for everything and not referencing user-provided captions at all.

Something didn't work quite right in the transcription process. How do I fix the timing or the actual lyrics?

The notebook is divided into several steps. Between each step, a "storyboard" file is updated. If you want to make modifications, you can edit this file directly and those edits should be reflected when you next load the file. Depending on what you changed and what step you run next, your changes may be ignored or even overwritten. Still playing with different solutions here.
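
As a rough illustration of this edit-between-steps workflow, the sketch below fixes a lyric and shifts timings in a storyboard, then keeps a backup before saving in case a later step overwrites the file. Everything about the layout is an assumption made for the example: the real storyboard file may use a different format and different field names, and the `start`, `end`, and `text` keys, the JSON encoding, and all function names here are hypothetical.

```python
import json
import shutil
from pathlib import Path


def shift_scene_timings(scenes, offset):
    """Return a copy of the scenes with start/end times shifted by `offset` seconds."""
    return [
        {**s, "start": max(0.0, s["start"] + offset), "end": max(0.0, s["end"] + offset)}
        for s in scenes
    ]


def replace_lyric(scenes, index, text):
    """Return a copy of the scenes with the lyric at `index` replaced."""
    return [{**s, "text": text} if i == index else dict(s) for i, s in enumerate(scenes)]


def save_with_backup(path, scenes):
    """Write the scenes as JSON (hypothetical format), keeping a .bak copy of the old file."""
    path = Path(path)
    if path.exists():
        shutil.copy(path, path.with_name(path.name + ".bak"))
    path.write_text(json.dumps(scenes, indent=2))
```

Keeping a backup is a cheap safeguard given that some steps may ignore or overwrite manual edits.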

Can I provide my own images to 'bring to life' and associate with certain lyrics/sequences?

Yes, you can! As described above, you just need to modify the storyboard. I will describe this functionality in greater detail after the implementation stabilizes a bit more.

This gave me an idea and I'd like to use just a part of your process here. What's the best way to reuse just some of the machinery you've developed here?

Most of the functionality in this notebook has been offloaded to a library I published to PyPI called vktrs. I strongly encourage you to import anything you need from there rather than cutting and pasting functions into a notebook. Similarly, if you have ideas for improvements, please don't hesitate to submit a PR!

Dev notes

Installing the unreleased package in Colab:

!pip install --upgrade setuptools build
!git clone --branch hf https://github.com/dmarx/video-killed-the-radio-star/
!cd video-killed-the-radio-star && python -m build && python -m pip install ".[api,hf]"
