🎙️ SonicScribe - Transcribe & Translate audio/video files using Whisper and GPT models.

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

🎙️ SonicScribe

A powerful CLI tool and Python module for transcribing and translating audio/video files using OpenAI's Whisper and GPT models.

Features

🎬 Extract audio from various video and audio formats
🔤 Transcribe audio using OpenAI's Whisper API
🌐 Translate transcriptions to English or other languages
📝 Generate transcript and subtitle files (SRT)
🗣️ Support for bilingual subtitles
📊 Smart handling of large files by automatic chunking
⚡ Interactive language selection with auto-detection support
🛠️ Modular design for use as a Python library
📈 Progress indicators and detailed logging

Installation

Prerequisites

Python 3.8 or higher
OpenAI API key

Setup

Clone the repository:

```
git clone https://github.com/tuklu/SonicScribe.git
cd SonicScribe/SonicScribe
```

Create a virtual environment and activate it:

```
python -m venv venv

# On Windows
venv\Scripts\activate

# On macOS/Linux
source venv/bin/activate
```

Install the required packages:
```
pip install -r requirements.txt
```
Create a .env file in the root directory with your OpenAI API key:
```
OPENAI_API_KEY=your_api_key_here
```
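
For readers curious how a `.env` file like the one above ends up in the process environment, here is a minimal standard-library sketch. It is an illustration only: SonicScribe may well use a dedicated package for this, and the helper name `load_env_file` is an assumption, not part of the project.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ.

    Hypothetical helper for illustration. Variables already set in the
    environment are left untouched, so an exported key wins over the file.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return {}

    loaded = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # drop optional surrounding quotes
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded
```

Calling `load_env_file()` from the project root and then reading `os.environ.get("OPENAI_API_KEY")` is a quick way to confirm the key is visible to Python.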

Usage

Main Transcription and Translation

The main script provides full functionality for processing audio/video files:

```
python main.py --input "path/to/your/video.mp4" --translate --output-dir "output/folder"
```

Options

--input: Path to input audio/video file (required)
--translate: Enable translation to English (optional)
--language: Specify the language of the input file (e.g., en, fr, es). If not provided, auto-detection will be used.
--output-dir: Directory to save output files (default: output/transcripts)
--whisper-model: Model to use for transcription (default: whisper-1)
--gpt-model: Model to use for translation (default: gpt-4o-mini)
--chunk-size: Size of chunks in MB for large files (default: 20)
--verbose or -v: Enable verbose logging
--bilingual: Create bilingual subtitles with both original and translated text
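
The options above map naturally onto `argparse`. The following is a sketch that mirrors the documented flags and defaults; it is not the project's actual parser, and the function name `build_parser` and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented main.py options (illustrative)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--input", required=True,
                        help="Path to input audio/video file")
    parser.add_argument("--translate", action="store_true",
                        help="Enable translation to English")
    parser.add_argument("--language", default=None,
                        help="Language code of the input (e.g. en, fr, es)")
    parser.add_argument("--output-dir", default="output/transcripts",
                        help="Directory to save output files")
    parser.add_argument("--whisper-model", default="whisper-1",
                        help="Model to use for transcription")
    parser.add_argument("--gpt-model", default="gpt-4o-mini",
                        help="Model to use for translation")
    parser.add_argument("--chunk-size", type=int, default=20,
                        help="Chunk size in MB for large files")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Enable verbose logging")
    parser.add_argument("--bilingual", action="store_true",
                        help="Create bilingual subtitles")
    return parser
```

With this sketch, `build_parser().parse_args(["--input", "lecture.mp4"])` yields the documented defaults, for example `output_dir == "output/transcripts"` and `chunk_size == 20`.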

Interactive Language Selection

When processing files, SonicScribe provides an interactive language selection feature:

You can select a language from a predefined list using arrow keys.
You can manually input a language code by typing /.
If no language is selected, SonicScribe will auto-detect the language using GPT, with a warning about potential additional API costs.
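
The fallback order described above can be sketched as a small function. The precedence shown here (a `--language` value first, then the interactive choice, then auto-detection) is an assumption for illustration, and `resolve_language` is a hypothetical name.

```python
def resolve_language(cli_language=None, selected=None):
    """Return (language_code, needs_auto_detection).

    Assumed precedence: explicit --language value, then the interactive
    selection, then auto-detection when neither is given.
    """
    for candidate in (cli_language, selected):
        if candidate and candidate.strip():
            return candidate.strip().lower(), False
    # Nothing chosen: the caller would fall back to auto-detection,
    # which the README warns may incur additional API costs.
    return None, True
```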

Standalone SRT Translation

If you already have an SRT file and just want to translate it:

```
python translate_srt.py --input "path/to/your/subtitles.srt" --bilingual
```

Options

--input: Path to input SRT file (required)
--output: Path to output SRT file (default: the input file name with _english appended, e.g. movie_english.srt for movie.srt)
--model: GPT model to use for translation (default: gpt-4o-mini)
--bilingual: Create bilingual SRT with original and translated text
--language: Specify the language of the input subtitles. If not provided, auto-detection will be used.

Examples

Basic Transcription

```
python main.py --input "lecture.mp4"
```

This will:

Extract audio from lecture.mp4
Transcribe the audio using Whisper API
Save a transcript and SRT file to the default output directory

Transcription with Translation

```
python main.py --input "foreign_movie.mp4" --translate --gpt-model "gpt-4o"
```

This will:

Extract audio from foreign_movie.mp4
Transcribe the audio using Whisper API
Translate the transcription to English using GPT-4o
Save all output files to the default directory

Bilingual Subtitles

```
python main.py --input "interview.mp3" --translate --bilingual
```

This will:

Extract audio from interview.mp3
Transcribe the audio using Whisper API
Translate the transcription to English
Create a bilingual SRT file with both original and translated text
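
To make the bilingual output concrete, here is a sketch of how timed segments could be rendered as SRT cues with two text lines each. The segment tuple layout, the function names, and the choice to put the original line above the translation are assumptions; SonicScribe's actual layout may differ.

```python
def format_timestamp(seconds):
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_bilingual_srt(segments):
    """Render (start, end, original, translated) tuples as SRT text.

    Each cue carries the original line followed by its translation
    (ordering assumed for illustration).
    """
    blocks = []
    for index, (start, end, original, translated) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{original}\n"
            f"{translated}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```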

Translating Existing Subtitles

```
python translate_srt.py --input "movie.srt" --bilingual --model "gpt-4o-mini"
```

This will:

Read the existing SRT file
Translate the subtitles to English using GPT-4o-mini
Create a bilingual SRT with both original and translated text
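
The three steps above can be sketched with a small SRT parser and a pluggable translation step. This is an illustration under stated assumptions: `parse_srt` and `render_bilingual` are hypothetical names, and `translate` is a stand-in callable where the real tool would call a GPT model.

```python
import re


def parse_srt(text):
    """Split SRT text into cues with index, timing line, and text."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # need an index, a timing line, and some text
        cues.append({
            "index": int(lines[0].strip()),
            "timing": lines[1].strip(),
            "text": "\n".join(lines[2:]),
        })
    return cues


def render_bilingual(cues, translate):
    """Rebuild SRT text with each cue's translation under the original.

    `translate` is any callable str -> str (placeholder for the model call).
    """
    blocks = []
    for cue in cues:
        translated = translate(cue["text"])
        blocks.append(
            f'{cue["index"]}\n{cue["timing"]}\n{cue["text"]}\n{translated}\n'
        )
    return "\n".join(blocks)
```

Passing a trivial callable such as `str.upper` as `translate` is enough to check the round trip before wiring in a real translation call.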

Output Files

SonicScribe generates several types of output files:

{filename}_transcribed.txt: Plain text transcript
{filename}.srt: SRT subtitle file with timestamps
{filename}_bilingual.srt: Optional bilingual SRT file (when using --bilingual)
{filename}_english.srt: Translated SRT file (when using translate_srt.py)
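
The naming scheme above can be expressed with `pathlib`. This sketch only derives the expected paths for a `main.py` run; the function name and the dictionary keys are assumptions for illustration.

```python
from pathlib import Path


def expected_outputs(input_file, output_dir="output/transcripts", bilingual=False):
    """Return the output paths implied by the documented naming scheme."""
    stem = Path(input_file).stem  # file name without directory or extension
    out = Path(output_dir)
    outputs = {
        "transcript": out / f"{stem}_transcribed.txt",
        "subtitles": out / f"{stem}.srt",
    }
    if bilingual:
        outputs["bilingual"] = out / f"{stem}_bilingual.srt"
    return outputs
```

For `interview.mp3` with `bilingual=True`, this gives `interview_transcribed.txt`, `interview.srt`, and `interview_bilingual.srt` under the output directory.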

Handling Large Files

SonicScribe automatically handles large audio files:

Files smaller than 25MB are processed directly through the Whisper API.
Larger files are split into chunks, processed separately, and then recombined.
The --chunk-size parameter controls the size of these chunks (default: 20MB).
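
The size arithmetic behind this behaviour can be sketched as follows. How SonicScribe actually cuts the audio is not described here, so this only models the sizes involved; `plan_chunks` and the constant name are hypothetical.

```python
import math

DIRECT_LIMIT_MB = 25  # threshold stated above for direct processing


def plan_chunks(file_size_mb, chunk_size_mb=20):
    """Return the list of chunk sizes (in MB) for a file of the given size."""
    if chunk_size_mb <= 0:
        raise ValueError("chunk_size_mb must be positive")
    if file_size_mb < DIRECT_LIMIT_MB:
        return [file_size_mb]  # small enough to send in one request
    count = math.ceil(file_size_mb / chunk_size_mb)
    # All chunks are full-sized except possibly the last one.
    last = file_size_mb - chunk_size_mb * (count - 1)
    return [chunk_size_mb] * (count - 1) + [last]
```

Under these assumptions a 50 MB file with the default chunk size is planned as chunks of 20, 20, and 10 MB, while a 10 MB file is sent as a single piece.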

Troubleshooting

API Key Issues

If you encounter "API key not found" errors:

Ensure your .env file exists in the project root directory.
Verify that your API key is correct and active.
Try setting the API key directly in your environment.

File Format Problems

If SonicScribe fails to process your file:

Verify the file exists and is not corrupted.
Check that the format is supported (mp4, mkv, mov, mp3, wav, etc.).
Try converting the file to a more standard format like MP4 or WAV.
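
The first two checks above are easy to script before starting a long run. In this sketch the extension set contains only the formats named in this section (the full supported list is not given), and `check_input_file` is a hypothetical helper.

```python
from pathlib import Path

# Only the formats named in this README; the real list may be longer.
KNOWN_EXTENSIONS = {".mp4", ".mkv", ".mov", ".mp3", ".wav"}


def check_input_file(path):
    """Return 'ok' or a short reason why the file may fail to process."""
    file_path = Path(path)
    if not file_path.is_file():
        return "missing"
    if file_path.stat().st_size == 0:
        return "empty"
    if file_path.suffix.lower() not in KNOWN_EXTENSIONS:
        return "unlisted-format"
    return "ok"
```

A result of `unlisted-format` does not prove the file is unsupported; it only suggests trying a conversion to MP4 or WAV first.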

Memory Issues with Large Files

If you encounter memory errors with very large files:

Try reducing the --chunk-size parameter.
Ensure your system has sufficient free memory.
Consider pre-splitting very large files manually.

Limitations

OpenAI API rate limits may affect processing speed.
Transcription quality depends on audio clarity.
Translation quality varies by language and content complexity.
Processing very large files (multiple hours) can take significant time.
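
When rate limits are the bottleneck, a common mitigation is to retry with exponential backoff. The sketch below is generic and not taken from SonicScribe: `call_with_backoff` is a hypothetical helper, and real code should catch only the API client's rate-limit error rather than every exception.

```python
import time


def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with delays of base_delay * 2**attempt on failure.

    The `sleep` parameter is injectable so the behaviour can be tested
    without actually waiting.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:  # illustration only; narrow this in real code
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
```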

Logging

SonicScribe logs all operations to the logs directory. If you encounter issues, check the log files for detailed information. Use the --verbose flag for more detailed logging.
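
A setup along these lines would produce the behaviour described: everything goes to a file in the logs directory, and the verbose flag raises console detail. This is a sketch, not the project's code; the logger name, the file name `sonicscribe.log`, and the format string are assumptions.

```python
import logging
from pathlib import Path


def configure_logging(verbose=False, log_dir="logs"):
    """Log everything to a file; show DEBUG on the console only if verbose."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger("sonicscribe_example")  # hypothetical name
    logger.setLevel(logging.DEBUG)

    # Remove handlers from any previous call so messages are not duplicated.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")

    file_handler = logging.FileHandler(
        Path(log_dir) / "sonicscribe.log", encoding="utf-8"
    )
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(formatter)

    console_handler = logging.StreamHandler()
    console_handler.setLevel(logging.DEBUG if verbose else logging.INFO)
    console_handler.setFormatter(formatter)

    logger.addHandler(file_handler)
    logger.addHandler(console_handler)
    return logger
```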

License

MIT License (OSI Approved), as declared in the package metadata.

Support

For issues, questions, or contributions, please create an issue on the GitHub repository.

Release history

1.0.3 (Apr 18, 2025)
1.0.0 (Apr 18, 2025), this version

Download files

Source Distribution

sonicscribe-1.0.0.tar.gz (16.5 kB view details)

Uploaded Apr 18, 2025 Source

Built Distribution

sonicscribe-1.0.0-py3-none-any.whl (17.9 kB view details)

Uploaded Apr 18, 2025 Python 3

File details

Details for the file sonicscribe-1.0.0.tar.gz.

File metadata

Download URL: sonicscribe-1.0.0.tar.gz
Upload date: Apr 18, 2025
Size: 16.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for sonicscribe-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`3e78b3c3c44e0ffada2fc8443e89a2153bf72fb25e8c630525758a0b9bcb92d2`
MD5	`1ffca9ad4221a221fc08039dd03461b6`
BLAKE2b-256	`c6535348a4528b0a574e7103c4a8303e2d6e74174b87d3f5418a0e48bba96e29`

File details

Details for the file sonicscribe-1.0.0-py3-none-any.whl.

File metadata

Download URL: sonicscribe-1.0.0-py3-none-any.whl
Upload date: Apr 18, 2025
Size: 17.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for sonicscribe-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`882879133b469914cc44e3326e14640b140f2b0ed6d4f14fc6b7cfd1aa68c972`
MD5	`ef08f0b498bdf9b81b8dbd98ad3a183e`
BLAKE2b-256	`7e834c58fff5ab67f9fe324f315fd16fc3afb48bb9c9c19274b253c85989fa55`
