A powerful, local-first library and CLI for video transcription and subtitle generation using Whisper.


Auto-Subs

Effortless Subtitle Generation from Whisper Transcriptions.

A powerful, local-first library and CLI for generating subtitles with precise, word-level accuracy.


Auto-Subs turns raw, word-level transcription data into formatted subtitle files. It serves both developers integrating subtitle generation into an application and content creators who need subtitles quickly.

Key Features

  • 🎯 Intelligent Word Segmentation: Automatically splits word-level transcriptions into perfectly timed subtitle lines based on character limits and natural punctuation breaks.
  • ⚙️ Simple & Powerful API: Use it as a library with a clean, dictionary-based input that requires no complex objects, or as a feature-rich command-line tool.
  • 🛡️ Robust Validation: Automatically handles common data issues, such as inverted timestamps (start > end), so that imperfect data does not halt processing. Malformed input that cannot be handled raises a ValueError.
  • 📄 Multiple Formats: Generate subtitles in the most popular formats: SRT, ASS, and plain TXT.
  • ✅ High Quality & Tested: Strictly typed with Mypy, linted with Ruff, and rigorously tested to ensure reliability.
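
The segmentation idea described in the first feature can be sketched in a few lines. This is an illustrative approximation, not the library's implementation: the helper names segment_words and srt_timestamp, the set of sentence-ending characters, and the exact break rule are all assumptions made for this example.

```python
from typing import Dict, List, Union

Word = Dict[str, Union[str, float]]  # {"word": str, "start": float, "end": float}

# Assumption: these characters count as "natural punctuation breaks".
SENTENCE_END = (".", "?", "!")


def segment_words(words: List[Word], max_chars: int = 35) -> List[Word]:
    """Group timed words into subtitle lines (hypothetical helper).

    A line is closed when adding the next word would exceed max_chars,
    or right after a word that ends a sentence.
    """
    lines: List[Word] = []
    current: List[Word] = []

    def flush() -> None:
        # Emit the buffered words as one line spanning first start to last end.
        if current:
            lines.append({
                "start": current[0]["start"],
                "end": current[-1]["end"],
                "text": " ".join(str(w["word"]).strip() for w in current),
            })
            current.clear()

    for w in words:
        token = str(w["word"]).strip()
        length = len(" ".join(str(x["word"]).strip() for x in current))
        # Close the line first if this word (plus a space) would not fit.
        if current and length + 1 + len(token) > max_chars:
            flush()
        current.append(w)
        if token.endswith(SENTENCE_END):
            flush()
    flush()
    return lines


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"
```

Each returned line carries the start of its first word and the end of its last word, which is what keeps line timing tied to word-level timestamps.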

Installation

pip install auto-subs

Quickstart

As a Command-Line Tool (CLI)

The CLI is the fastest way to generate a subtitle file from a Whisper-compatible JSON file.

# Generate an SRT file with default settings
auto-subs generate path/to/transcription.json

# Generate a styled ASS file with a custom character limit
auto-subs generate input.json -f ass -o styled.ass --max-chars 42

CLI Options:

  • --output, -o: Specify the output file path. (Defaults to the input filename with a new extension)
  • --format, -f: Choose the output format (srt, ass, txt). (Defaults to srt)
  • --max-chars: Set the maximum characters per subtitle line. (Defaults to 35)
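
When driving the CLI from a script, the options above can be assembled programmatically. The sketch below uses only the documented subcommand and flags; the helper names default_output_path and build_command are hypothetical, and the default-path rule (swapping the input suffix for the format name) is an assumption about what "a new extension" means.

```python
from pathlib import Path
from typing import List, Optional


def default_output_path(input_path: str, fmt: str = "srt") -> Path:
    # Assumption: the default output keeps the input name and swaps the suffix.
    return Path(input_path).with_suffix(f".{fmt}")


def build_command(
    input_path: str,
    fmt: str = "srt",
    max_chars: int = 35,
    output: Optional[str] = None,
) -> List[str]:
    """Build an argument list for `auto-subs generate` (hypothetical helper)."""
    out = Path(output) if output else default_output_path(input_path, fmt)
    return [
        "auto-subs", "generate", str(input_path),
        "-f", fmt,
        "-o", str(out),
        "--max-chars", str(max_chars),
    ]


cmd = build_command("input.json", fmt="ass", max_chars=42, output="styled.ass")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True), which avoids shell quoting problems with paths that contain spaces.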

As a Python Library

Integrate auto-subs directly into your application for full control.

import json
from auto_subs import generate

# 1. Load your Whisper-compatible transcription data (as a dict)
with open("path/to/transcription.json", "r", encoding="utf-8") as f:
    transcription_data = json.load(f)

try:
    # 2. Generate SRT content with a 40-character limit per line
    srt_content = generate(transcription_data, "srt", max_chars=40)

    # 3. Save the content to a file
    with open("output.srt", "w", encoding="utf-8") as f:
        f.write(srt_content)

    print("Successfully generated subtitles!")

except ValueError as e:
    # Handle validation errors for malformed input data
    print(f"Error: {e}")
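
For testing without a file on disk, the dictionary can also be built in memory. The example below follows the shape that Whisper produces when word timestamps are enabled (a top-level "segments" list whose entries carry a "words" list); which of these fields auto-subs actually requires is an assumption here, so treat it as a starting point rather than a schema.

```python
import json

# A minimal transcription in Whisper's word-timestamp shape.
# Assumption: auto-subs reads "segments" -> "words" -> word/start/end.
transcription_data = {
    "text": " Hello world. This is a test.",
    "language": "en",
    "segments": [
        {
            "id": 0,
            "start": 0.0,
            "end": 1.8,
            "text": " Hello world. This is a test.",
            "words": [
                {"word": " Hello", "start": 0.0, "end": 0.4},
                {"word": " world.", "start": 0.4, "end": 0.9},
                {"word": " This", "start": 1.0, "end": 1.2},
                {"word": " is", "start": 1.2, "end": 1.3},
                {"word": " a", "start": 1.3, "end": 1.4},
                {"word": " test.", "start": 1.4, "end": 1.8},
            ],
        }
    ],
}

# Round-trip through JSON to confirm the structure matches what
# json.load() would return for a saved transcription file.
restored = json.loads(json.dumps(transcription_data))
word_count = sum(len(seg["words"]) for seg in restored["segments"])
print(word_count)

# The dict could then be passed to generate() exactly as in the example above,
# e.g. generate(transcription_data, "srt", max_chars=40).
```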

API Design: Simplicity First

The public API of auto-subs is designed to be as simple as possible. All functions, like auto_subs.generate(), accept a standard Python dictionary (dict).

This approach was chosen intentionally to:

  • Reduce Friction: You can directly use the JSON output from Whisper after loading it into a dictionary, without needing to import and instantiate our internal Pydantic models.
  • Decouple Your Code: Your project doesn't need to depend on our internal data structures, making your code more resilient to future updates.

While the input is a simple dictionary, auto-subs performs robust internal validation to ensure the data is well-formed, giving you the best of both worlds: a simple API and the safety of strong data validation.
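
To make the "dictionary in, validated data inside" idea concrete, here is a minimal sketch in plain Python. It is not the library's code: auto-subs uses internal Pydantic models, whereas this example uses a hypothetical normalize_words function, and repairing inverted timestamps by swapping them is an assumption about how such data might be handled.

```python
from typing import Any, Dict, List


def normalize_words(data: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Validate a Whisper-style dict and return a flat list of timed words.

    Hypothetical helper: raises ValueError on malformed input and
    repairs inverted timestamps instead of rejecting them.
    """
    if not isinstance(data, dict) or not isinstance(data.get("segments"), list):
        raise ValueError("expected a dict with a 'segments' list")

    words: List[Dict[str, Any]] = []
    for seg in data["segments"]:
        if not isinstance(seg, dict):
            raise ValueError(f"malformed segment: {seg!r}")
        for w in seg.get("words", []):
            try:
                text = str(w["word"])
                start = float(w["start"])
                end = float(w["end"])
            except (KeyError, TypeError, ValueError) as exc:
                raise ValueError(f"malformed word entry: {w!r}") from exc
            if start > end:
                # Assumption: an inverted pair is repaired by swapping.
                start, end = end, start
            words.append({"word": text, "start": start, "end": end})
    return words
```

Callers only ever see a dict going in and a ValueError coming out, so their code stays independent of whatever structures the validation uses internally.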

Contributing

Contributions are welcome! If you find a bug or have a feature request, please open an issue. If you'd like to contribute code, please open a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.
