PyCaptions, parser and converter for captions formats

PyCaptions

PyCaptions is a caption reading/writing library.

Why LGPL-3.0? This is just to ensure that the library's source code always stays under the same licence and cannot be closed-sourced. The conditions of this licence apply only to the library itself and its modifications. We recommend contributing to the project if you make modifications, unless they are drastic and specific to your case.


Read the Wiki


Installation

  • PIP
    pip install --upgrade pycaptions
    
  • Source
    git clone https://github.com/adfreelife/PyCaptions.git
    cd PyCaptions
    python setup.py install
    

Supported Formats

*Limited functionality

Future plans

Examples

Generic from file name

from pycaptions import Captions

with Captions("tests/test.en.srt") as captions:
    captions.saveVTT("test")

Generic from file stream

from pycaptions import Captions

with open("tests/test.en.srt", encoding="UTF-8") as f:
    captions = Captions(f) # or captions = Captions()
                           # captions.read(f)
    captions.saveVTT("test")

Generic from string

from pycaptions import Captions

srt = """1
00:00:00,500 --> 00:00:02,000
This is a test file
"""
captions = Captions(srt) # or captions = Captions()
                         # captions.detect(srt)
captions.saveVTT("test")

Specific reader

Specific readers have the same functions as the generic one, except that you pick the format yourself:

from pycaptions import SubRip, detectSRT

with open("tests/test.en.srt", encoding="UTF-8") as f:
    if detectSRT(f): # or SubRip.detect(f)
        captions = SubRip().read(f)
        captions.saveVTT("test")

Multilingual

from pycaptions import Captions

# if the format supports multiple languages
with Captions("tests/test.ttml") as captions:
    # first line will be in english, second one in spanish
    captions.saveSRT("test", ["en", "es"], lines=1) # recommended to specify lines=1
    
# if you have multiple files and want to make a multilingual one
with Captions("tests/test.en.srt") as captions:
    with Captions("tests/test.es.srt") as captions2:
        # only subtitle text and comments (if the format supports them) are added
        captions += captions2
    # first line will be in english, second one in spanish
    captions.save("test", ["en", "es"], lines=1) # recommended to specify lines=1

Combine files

from pycaptions import Captions

with Captions("tests/test.en.srt") as captions:
    captions.joinFile("tests/test.en.srt", add_end_time=True)
    captions.save("test")

Changelog

v0.7.0

Release date: 2024-02-06

Changes:

  • Added CLI support (e.g. pycaptions "path/to/file/file.srt" -f vtt)
  • Added autoformat for all values of lines
  • Added function CaptionsFormat.getLanguagesAndFilename
  • Added function CaptionsFormat.getFilename
  • Added MicroTime.fromMicrotime, which creates a MicroTime from a list
  • Added MicroTime.toMicrotime, which returns a MicroTime as a list
  • Added MicroTime.fromAnyFormat, which returns a MicroTime from the provided format (case insensitive)
  • MicroTime.fromSUBTime and MicroTime.toSUBTime now support framerate as a string
  • Captions.save output_format is now case insensitive
  • Improved MicroDVD style conversion
  • Internal restructure for faster development
  • An invalid style argument now results in style=None
  • Added style_options for changing the style globally (defaults: style="full", lines=-1). This affects how the style is parsed (e.g. setting style_options.style=None and then passing the argument style="full" will not convert any style, due to optimizations for faster conversion)
  • Hyphens at the end of lines (e.g. "Some-
    thing") will be removed if lines > -1
  • Styling is now split into StyleFormat and Styling(StyleFormat)
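
The hyphen rule in the list above can be pictured with a small standalone sketch. This is not PyCaptions' own implementation: `join_hyphenated` is a hypothetical helper, and it assumes that a hyphen placed directly before a line break, between two word characters, marks a word that continues on the next line.

```python
import re


def join_hyphenated(text: str) -> str:
    """Join words that were split by a hyphen at the end of a line.

    Hypothetical helper for illustration only; not part of PyCaptions.
    """
    # Match a hyphen plus line break that sits between two word characters,
    # and drop both so the word becomes whole again.
    return re.sub(r"(?<=\w)-\r?\n(?=\w)", "", text)


print(join_hyphenated("Some-\nthing"))  # Something
```

Hyphens inside a line (e.g. "well-known") are left alone, because the pattern only matches a hyphen that is immediately followed by a line break.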

Fixes:

  • Fixed "lxml is not installed" error
  • Fixed Styling.getTTML converting invalid css properties into ttml properties. To-do: add value checks for these properties.
  • Fixed CaptionsFormat.getLanguagesFromFilename getting languages from directory path (e.g. \path.to.file\file.en.srt -> ["to", "en"])
  • Fixed width and height not being saved to json
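
To illustrate the directory-path fix listed above, here is a minimal sketch of reading language codes from the file name only. `languages_from_filename` is a hypothetical helper, not the library's `CaptionsFormat.getLanguagesFromFilename`, and it assumes that every dot-separated part between the base name and the extension is a language code, without validating it.

```python
from pathlib import PureWindowsPath


def languages_from_filename(path: str) -> list:
    """Return the dot-separated parts between the base name and the extension.

    Hypothetical helper for illustration only; not part of PyCaptions.
    """
    # PureWindowsPath accepts both "\\" and "/" as separators on any OS,
    # so dots in directory names never reach the split below.
    name = PureWindowsPath(path).name
    parts = name.split(".")
    # Drop the first part (base name) and the last part (extension).
    return parts[1:-1] if len(parts) > 2 else []


print(languages_from_filename(r"\path.to.file\file.en.srt"))  # ['en']
```

Splitting only the final path component is what keeps "to" from the directory name `path.to.file` out of the result.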

Read past changes here.
