
Transcribe (Whisper) and translate (GPT) voice into LRC files.


Open-Lyrics

Open-Lyrics is a Python library that transcribes voice files using faster-whisper, and translates/polishes the resulting text into .lrc files in the desired language using OpenAI-GPT.
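
For readers unfamiliar with the output format: an .lrc file is plain text in which each line starts with a `[mm:ss.xx]` time tag followed by the lyric or subtitle text. The sketch below shows that general layout only. The function names are hypothetical and are not part of openlrc, and the exact text openlrc writes may differ.

```python
def format_lrc_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an LRC tag such as [01:05.50]."""
    # Work in integer centiseconds to avoid floating-point rounding surprises.
    total_cs = round(seconds * 100)
    minutes, rem_cs = divmod(total_cs, 6000)
    secs, cs = divmod(rem_cs, 100)
    return f"[{minutes:02d}:{secs:02d}.{cs:02d}]"


def to_lrc(segments: list[tuple[float, str]]) -> str:
    """Join (start_time, text) pairs into LRC text, one tagged line each."""
    return "\n".join(
        f"{format_lrc_timestamp(start)}{text}" for start, text in segments
    )


print(to_lrc([(0.0, "Hello"), (65.5, "World")]))
```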

Installation

  1. Install CUDA and cuDNN first, following https://opennmt.net/CTranslate2/installation.html, to enable faster-whisper.

  2. Add your OpenAI API key to the environment variable OPENAI_API_KEY.

  3. Install PyTorch:

    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
    
  4. Install whisperx:

    pip install git+https://github.com/m-bain/whisperx.git
    
  5. This project can be installed from PyPI:

    pip install openlrc
    

    or install directly from GitHub:

    pip install git+https://github.com/zh-plus/Open-Lyrics
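
After installation, it can help to confirm that the API key from step 2 is visible to Python before starting a long transcription. This is a minimal sketch using only the standard library. The helper name `has_openai_key` is hypothetical and is not part of openlrc.

```python
import os


def has_openai_key(env=None) -> bool:
    """Return True if OPENAI_API_KEY is set to a non-empty value."""
    # Accept an explicit mapping so the check can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not has_openai_key():
    print("OPENAI_API_KEY is not set; export it before running the translation step.")
```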
    

Usage

from openlrc import LRCer

lrcer = LRCer()

# Single file
lrcer.run('./data/test.mp3', target_lang='zh-cn')  # Generates the translated ./data/test.lrc using the default translation prompt.

# Multiple files
lrcer.run(['./data/test1.mp3', './data/test2.mp3'], target_lang='zh-cn')
# Note: files are transcribed sequentially, but translation runs concurrently for each file.

# Paths can include video files
lrcer.run(['./data/test_audio.mp3', './data/test_video.mp4'], target_lang='zh-cn')

# Use context.yaml to improve translation
lrcer.run('./data/test.mp3', target_lang='zh-cn', context_path='./data/context.yaml')
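
Since `run` accepts a list of paths, a whole folder can be processed in one call. The sketch below collects media files from a directory and hands them to an `LRCer` instance. The helper names and the extension list are assumptions made for this example, not part of openlrc, and the formats openlrc actually accepts may differ.

```python
from pathlib import Path

# Hypothetical extension list for this sketch; not taken from openlrc.
MEDIA_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".mp4", ".mkv"}


def find_media_files(directory, extensions=MEDIA_EXTENSIONS):
    """Return sorted paths (as strings) of media files directly inside directory."""
    return sorted(
        str(p)
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )


def run_on_directory(lrcer, directory, target_lang="zh-cn"):
    """Pass every media file in directory to lrcer.run in a single call."""
    files = find_media_files(directory)
    if files:  # skip the call entirely when nothing matched
        lrcer.run(files, target_lang=target_lang)
    return files
```

With openlrc installed, this would be called as `run_on_directory(LRCer(), './data')`.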

Context

Use any available context to improve translation quality. Save it as context.yaml in the same directory as your audio file.

background: "This is a multi-line background.
This is a basic example."
audio_type: Movie
synopsis_map: {
  movie_name1 (without extension): "This
  is a multi-line synopsis for movie1.",
  movie_name2 (without extension): "This
  is a multi-line synopsis for movie2.",
  movie_name3 (without extension): "This is a single-line synopsis for movie 3.",
}
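
A context file like the one above can also be generated from Python, which avoids typing the "name without extension" keys by hand. This is a minimal sketch using only the standard library. `build_context_yaml` is a hypothetical helper, not part of openlrc, and it assumes the file is read with a standard YAML loader, so it writes `synopsis_map` in block style with JSON-style quoted strings rather than the brace style shown above.

```python
import json
from pathlib import Path


def build_context_yaml(background, audio_type, synopses):
    """Render the context fields as YAML text.

    synopses maps a media file path to its synopsis; the file name without
    its extension becomes the key under synopsis_map.
    """

    def quote(value):
        # json.dumps yields a double-quoted string, which YAML also accepts.
        return json.dumps(value, ensure_ascii=False)

    lines = [
        f"background: {quote(background)}",
        f"audio_type: {quote(audio_type)}",
    ]
    if not synopses:
        lines.append("synopsis_map: {}")
    else:
        lines.append("synopsis_map:")
        for media_path, synopsis in synopses.items():
            key = Path(media_path).stem  # 'movie1.mp4' -> 'movie1'
            lines.append(f"  {quote(key)}: {quote(synopsis)}")
    return "\n".join(lines) + "\n"


text = build_context_yaml(
    "This is a basic example.",
    "Movie",
    {"./data/movie1.mp4": "A single-line synopsis for movie 1."},
)
print(text)
```

To save the result next to the audio, something like `Path('./data/context.yaml').write_text(text, encoding='utf-8')` would do.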

Todo

  • [Efficiency] Batched translate/polish for GPT request (enable contextual ability).
  • [Efficiency] Concurrent support for GPT request.
  • [Efficiency & Transcription Quality] Use whisperx for transcription.
  • [Translation Quality] Make the translation prompt more robust according to https://github.com/openai/openai-cookbook.
  • [Usability] Automatically fix JSON encoder errors using GPT.
  • [Efficiency] Asynchronously perform transcription and translation for multiple audio inputs.
  • [Quality] Improve batched translation/polish prompt according to gpt-subtrans.
  • [Usability] Input video support.
  • [Usability] Multiple output format support.
  • [Quality] Use multilingual language model to assess translation quality.
  • [Quality] Speech enhancement for input audio.
  • [Efficiency] Add Azure OpenAI Service support.
  • [Usability] Add local LLM support.
  • [Usability] Multiple translation engine (Microsoft, DeepL, Google, etc.) support.
  • [Others] Add transcribed examples.
    • Song
    • Podcast
    • Audiobook

Credits
