An audiobook sound effect generator that transforms SRT files into immersive audio experiences. It parses SRT files, uses ChatGPT to create sound effect prompts, generates sounds via the ElevenLabs API, and syncs the audio on an MP3 timeline.

SRT2SoundFX

This project generates sound effects for audiobooks, videos, and recordings from SRT subtitle files. It parses the SRT file, sends the subtitle elements to ChatGPT to obtain sound effect prompts, merges the prompts back into the SRT elements, generates the sounds with the ElevenLabs API, and places the resulting audio on an MP3 timeline.
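To make the first step of that pipeline concrete, here is a minimal sketch of parsing SRT text into elements with numeric start and end times. It is an illustration only, not the package's own parser; the names `parse_srt` and `to_seconds` are hypothetical, and the field names simply mirror the sample output shown later in this document.

```python
import re

# Matches an SRT timing line such as "00:00:30,300 --> 00:00:33,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert timestamp parts (as strings) to seconds as a float."""
    total_ms = (
        (int(hours) * 3600 + int(minutes) * 60 + int(seconds)) * 1000 + int(millis)
    )
    return total_ms / 1000


def parse_srt(text):
    """Parse SRT text into a list of dicts with id, start, end, and text."""
    elements = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if len(lines) < 2 or not lines[0].isdigit():
            continue  # skip malformed cues instead of failing
        match = TIMING.search(lines[1])
        if not match:
            continue
        parts = match.groups()
        elements.append({
            "id": int(lines[0]),
            "start": to_seconds(*parts[:4]),
            "end": to_seconds(*parts[4:]),
            "text": " ".join(lines[2:]),
        })
    return elements


sample = """10
00:00:30,300 --> 00:00:33,000
The silhouettes of ships loomed on the horizon.

35
00:01:56,500 --> 00:01:58,200
We'll board them immediately.
"""

elements = parse_srt(sample)
for element in elements:
    print(element)
```

Working in whole milliseconds before dividing keeps the conversion free of accumulated floating-point error.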

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

  • Python 3.7+
  • OpenAI API key (or an Azure OpenAI API key and endpoint)
  • ElevenLabs API key
  • ffmpeg (optional, for adding sounds to the audio file)
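Because ffmpeg is optional, it can be useful to check for it before asking the package to add sounds to an audio file. The following is a small sketch using only the standard library; the helper name `find_ffmpeg` is hypothetical and not part of srt2soundfx.

```python
import shutil


def find_ffmpeg(executable="ffmpeg"):
    """Return the full path to the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which(executable)


ffmpeg_path = find_ffmpeg()
if ffmpeg_path is None:
    print("ffmpeg not found: sounds can be generated, but not added to the audio file.")
else:
    print(f"ffmpeg found at {ffmpeg_path}")
```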

Installing

Install the package using pip:

pip install srt2soundfx

Using the Package in Your Code

Here is an example of how to use the Srt2SoundFX class in your code:

from srt2soundfx.main import Srt2SoundFX

# Initialize the Srt2SoundFX class with your API keys.
# For prompt generation, provide either the OpenAI key
# or the Azure OpenAI key together with its endpoint.
srt2soundfx = Srt2SoundFX(
    elevenlabs_api_key='ELEVENLABS_API_KEY',
    openai_api_key='OPENAI_API_KEY',  # Only if you are using OpenAI
    azure_openai_api_key='AZURE_OPENAI_API_KEY',  # Only if you are using Azure OpenAI
    azure_openai_endpoint='AZURE_OPENAI_ENDPOINT'  # Only if you are using Azure OpenAI
)
# Define paths
srt_path = "/path/to/your/audio.srt"
save_dir = "/path/to/save/directory"
project_name = "your_project_name"
audio_path = "/path/to/your/audio.mp3"

# Process the audio
result = srt2soundfx.generate_sounds(srt_path, save_dir, project_name, audio_path)

# To generate only the sounds, without placing them in the audio, omit audio_path
sounds = srt2soundfx.generate_sounds(srt_path, save_dir, project_name)
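Hard-coding keys in source files is easy to get wrong, so one option is to read them from environment variables and pass only the set that applies. The sketch below assumes the keyword arguments shown in the example above; the helper `build_client_kwargs` and the environment variable names are assumptions made for illustration, not part of the package.

```python
import os


def build_client_kwargs(env=None):
    """Collect Srt2SoundFX keyword arguments from environment variables.

    The variable names used here are assumptions for this sketch.
    """
    env = os.environ if env is None else env
    kwargs = {"elevenlabs_api_key": env["ELEVENLABS_API_KEY"]}

    if env.get("AZURE_OPENAI_API_KEY") and env.get("AZURE_OPENAI_ENDPOINT"):
        # Azure OpenAI needs both the key and the endpoint.
        kwargs["azure_openai_api_key"] = env["AZURE_OPENAI_API_KEY"]
        kwargs["azure_openai_endpoint"] = env["AZURE_OPENAI_ENDPOINT"]
    elif env.get("OPENAI_API_KEY"):
        kwargs["openai_api_key"] = env["OPENAI_API_KEY"]
    else:
        raise KeyError(
            "Set OPENAI_API_KEY, or both AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT"
        )
    return kwargs


# Demonstration with a fake environment so the sketch runs without real keys.
demo_env = {"ELEVENLABS_API_KEY": "el-key", "OPENAI_API_KEY": "oa-key"}
demo_kwargs = build_client_kwargs(demo_env)
print(sorted(demo_kwargs))

# With the package installed and real keys exported:
# from srt2soundfx.main import Srt2SoundFX
# srt2soundfx = Srt2SoundFX(**build_client_kwargs())
```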

Supported Languages

Srt2SoundFX supports every language that ChatGPT supports, including English, Spanish, French, German, Polish, Italian, Portuguese, Chinese, Japanese, Korean, Russian, Hindi, and Arabic. You can process SRT files in any of these languages, and the sound effect prompts are generated accordingly. Make sure your SRT file is well formed and saved in an encoding that preserves the characters of its language (UTF-8 is a safe choice).

Example Usage

Here is a complete example:

from srt2soundfx.main import Srt2SoundFX

# Initialize the Srt2SoundFX class
srt2soundfx = Srt2SoundFX(
    elevenlabs_api_key='ELEVENLABS_API_KEY',
    openai_api_key='OPENAI_API_KEY'
)

# Define paths
srt_path = "resources/audiobook.srt"
save_dir = "resources"
project_name = "audiobook"
audiobook_path = "resources/audiobook.mp3"

# Process the audio
result = srt2soundfx.generate_sounds(srt_path, save_dir, project_name, audiobook_path)

# If you only want the sounds without placing them in the audio
sounds = srt2soundfx.generate_sounds(srt_path, save_dir, project_name)

Example Output

Output of sounds Variable

When you process only an SRT file, generate_sounds returns a list of sound effects. Each entry holds the subtitle id, its start and end times, the subtitle text, the prompt used to generate the sound, a duration, and the path to the generated audio file:

[
    {
        "id": 10,
        "start": 30.3,
        "end": 33.0,
        "text": "The silhouettes of ships loomed on the horizon.",
        "prompt": "A high-quality sound of distant ships at sea, creating an atmosphere of adventure.",
        "duration": 12,
        "audio_path": "/resources/audiobook_10.mp3"
    },
    {
        "id": 35,
        "start": 116.5,
        "end": 118.2,
        "text": "We'll board them immediately.",
        "prompt": "A high-quality sound of swords clashing, symbolizing a naval battle or abordage.",
        "duration": 5,
        "audio_path": "/resources/audiobook_35.mp3"
    }
]
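In the sample above, each `duration` (12 and 5) is longer than its subtitle window (`end - start`), which suggests, although this document does not say so, that `duration` is the length of the generated effect in seconds. Under that assumption, the sketch below works out when each effect would finish and reports neighbouring effects that would overlap. The helper names are hypothetical and not part of the package.

```python
# Sample entries shaped like the list above (only the fields this sketch needs).
sounds = [
    {"id": 10, "start": 30.3, "end": 33.0, "duration": 12},
    {"id": 35, "start": 116.5, "end": 118.2, "duration": 5},
]


def effect_windows(sounds):
    """Return one window per effect, sorted by start time.

    Assumes `duration` is the effect length in seconds.
    """
    ordered = sorted(sounds, key=lambda sound: sound["start"])
    return [
        {
            "id": sound["id"],
            "starts_at": sound["start"],
            "ends_at": sound["start"] + sound["duration"],
        }
        for sound in ordered
    ]


def overlapping_pairs(windows):
    """Return (id, next_id) pairs where an effect is still playing when the next starts."""
    return [
        (current["id"], following["id"])
        for current, following in zip(windows, windows[1:])
        if current["ends_at"] > following["starts_at"]
    ]


windows = effect_windows(sounds)
for window in windows:
    print(window)
print("overlaps:", overlapping_pairs(windows))
```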

Output of result Variable

If you process an audio file along with the SRT file, the result variable contains paths to the final audio files:

{
    "effects": "/app/resources/final_audiobook_with_effects.mp3",
    "final_audio": "/app/resources/effects.mp3"
}

  • effects: Path to the audiobook with the sound effects added.
  • final_audio: Path to the standalone audio effects timeline.
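This document does not describe how the package places effects on the timeline. As a rough illustration of the idea only, the sketch below builds an ffmpeg command that delays each effect to its start time with the `adelay` filter and mixes everything over the original audio with `amix`. The function `build_mix_command` and the file names are hypothetical, and the command is only constructed here, not executed.

```python
def build_mix_command(audio_path, sounds, output_path):
    """Build an ffmpeg argument list that overlays each effect at its start time."""
    if not sounds:
        raise ValueError("at least one sound is required")

    cmd = ["ffmpeg", "-y", "-i", audio_path]
    filters = []
    labels = ["[0:a]"]  # input 0 is the original audio

    for index, sound in enumerate(sounds, start=1):
        cmd += ["-i", sound["audio_path"]]
        delay_ms = int(round(sound["start"] * 1000))
        # adelay takes one delay per channel, in milliseconds.
        filters.append(f"[{index}:a]adelay={delay_ms}|{delay_ms}[fx{index}]")
        labels.append(f"[fx{index}]")

    mix_inputs = "".join(labels)
    # duration=first keeps the output as long as the original audio.
    filters.append(f"{mix_inputs}amix=inputs={len(labels)}:duration=first[out]")

    cmd += ["-filter_complex", ";".join(filters), "-map", "[out]", output_path]
    return cmd


command = build_mix_command(
    "audiobook.mp3",
    [
        {"start": 30.3, "audio_path": "audiobook_10.mp3"},
        {"start": 116.5, "audio_path": "audiobook_35.mp3"},
    ],
    "audiobook_with_effects.mp3",
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine with ffmpeg installed. Note that `amix` scales its inputs by default, so the mixed levels may need adjusting.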

Developing

First, clone the repository:

git clone https://github.com/stefaner1/SRT2SoundFX.git

Then, navigate to the project folder:

cd SRT2SoundFX

Set Up Environment Variables

Copy the .env_example file and rename it to .env:

cp .env_example .env

Open the .env file, set your API keys, and save it. These variables are loaded automatically during development.

Next, start the project with Docker Compose:

docker-compose up

Running the Tests

Navigate to the tests directory:

cd tests

Then, run the tests:

python -m unittest discover

License

This project is licensed under the MIT License.
