Library for conversion of common timed text formats

Development Status
- 5 - Production/Stable
Environment
- Console
Intended Audience
- Developers
License
- OSI Approved :: BSD License
Programming Language
Topic
- Multimedia
- Software Development :: Build Tools

Project description

ttconv (Timed Text Conversion)

  $$\     $$\                                             
  $$ |    $$ |                                            
$$$$$$\ $$$$$$\    $$$$$$$\  $$$$$$\  $$$$$$$\ $$\    $$\ 
\_$$  _|\_$$  _|  $$  _____|$$  __$$\ $$  __$$\\$$\  $$  |
  $$ |    $$ |    $$ /      $$ /  $$ |$$ |  $$ |\$$\$$  / 
  $$ |$$\ $$ |$$\ $$ |      $$ |  $$ |$$ |  $$ | \$$$  /  
  \$$$$  |\$$$$  |\$$$$$$$\ \$$$$$$  |$$ |  $$ |  \$  /   
   \____/  \____/  \_______| \______/ \__|  \__|   \_/

Introduction

ttconv is a library and command line application written in pure Python for converting between timed text formats used in the presentation of captions, subtitles, karaoke, etc.

TTML / IMSC ---                         --- IMSC / TTML
                \                     /
SCC / CEA 608 ----- Canonical Model -------- WebVTT
                /                     \
EBU STL -------                         --- SRT
              /
SRT ---------
            /
WebVTT ----

ttconv works by mapping the input document, whatever its format, into an internal canonical model, which is then mapped to the format of the output document. The canonical model closely follows the TTML 2 data model, as constrained by the IMSC 1.1 Text Profile specification.

Online demo

https://ttconv.sandflow.com/

Format support

ttconv currently supports the following input and output formats. Additional input and output formats are planned, and suggestions/contributions are welcome.

Input Formats

- TTML / IMSC
- SCC / CEA 608
- EBU STL
- SRT
- WebVTT

Output Formats

- IMSC / TTML
- WebVTT
- SRT

Quick start

To install the latest version of ttconv, including pre-releases:

pip install --pre ttconv

To convert an SCC file to TTML:

tt convert -i <input .scc file> -o <output .ttml file>

Documentation

Command line

tt convert [-h] -i INPUT -o OUTPUT [--itype ITYPE] [--otype OTYPE] [--config CONFIG] [--config_file CONFIG_FILE]

--itype: TTML | SCC | STL | SRT (extrapolated from the filename, if omitted)
--otype: TTML | SRT | VTT (extrapolated from the filename, if omitted)
--config and --config_file: JSON dictionaries with the following members:
- "general": JSON object: General configuration options (see below)
- "imsc_writer": JSON object: IMSC Writer configuration options (see below)
- "stl_reader": JSON object: STL Reader configuration options (see below)
- "vtt_writer": JSON object: WebVTT Writer configuration options (see below)
- "srt_writer": JSON object: SRT Writer configuration options (see below)
- "scc_reader": JSON object: SCC Reader configuration options (see below)
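
The filename-based inference mentioned for --itype and --otype can be pictured with the short Python sketch below. It is only an illustration: the extension table and the guess_type helper are hypothetical, and the set of extensions that ttconv actually recognizes may differ.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension table, for illustration only.
EXTENSION_TO_TYPE = {
    ".ttml": "TTML",
    ".scc": "SCC",
    ".stl": "STL",
    ".srt": "SRT",
    ".vtt": "VTT",
}


def guess_type(filename: str, explicit: Optional[str] = None) -> str:
    """Return the explicit type if given, otherwise guess it from the extension."""
    if explicit is not None:
        return explicit.upper()
    suffix = Path(filename).suffix.lower()
    try:
        return EXTENSION_TO_TYPE[suffix]
    except KeyError:
        raise ValueError(
            f"cannot infer a format from {filename!r}; pass the type explicitly"
        ) from None
```

When the extension is ambiguous or missing, passing --itype or --otype explicitly avoids relying on any guess.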

Example:

tt convert -i <.scc file> -o <.ttml file> --itype SCC --otype TTML --config '{"general": {"progress_bar":false, "log_level":"WARN"}}'
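
When the command is launched from a script, hand-writing the JSON passed to --config is error-prone (Python's False versus JSON's false, nested quoting). A minimal sketch, assuming the tt executable is on the PATH; the build_convert_command helper and the file names in.scc and out.ttml are placeholders introduced here, not part of ttconv:

```python
import json
import shlex


def build_convert_command(input_path, output_path, config=None, itype=None, otype=None):
    """Assemble the argument list for `tt convert` from the documented flags."""
    args = ["tt", "convert", "-i", input_path, "-o", output_path]
    if itype:
        args += ["--itype", itype]
    if otype:
        args += ["--otype", otype]
    if config:
        # json.dumps turns Python values into valid JSON (false, double quotes).
        args += ["--config", json.dumps(config)]
    return args


config = {"general": {"progress_bar": False, "log_level": "WARN"}}
command = build_convert_command("in.scc", "out.ttml", config, itype="SCC", otype="TTML")

# Quote each argument so the printed line can be pasted into a POSIX shell.
print(" ".join(shlex.quote(arg) for arg in command))

# To actually run the conversion: subprocess.run(command, check=True)
```

Passing the arguments as a list to subprocess.run avoids shell quoting altogether, which is why the JSON is kept as a single list element.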

General configuration

progress_bar

"progress_bar": true | false

A progress bar is displayed if progress_bar is true and log_level is "INFO".

Default: true

log_level

"log_level": "INFO" | "WARN" | "ERROR"

Logging verbosity

Default: "INFO"

document_lang

"document_lang": <RFC 5646 language tag>

Overrides the top-level language of the input document.

Example: "document_lang": "es-419"

Default: None

IMSC Writer configuration

time_format

"time_format": "frames" | "clock_time" | "clock_time_with_frames"

Specifies whether TTML time expressions are written in frames (f), as HH:MM:SS.mmm, or as HH:MM:SS:FF.

Default: "frames" if "fps" is specified, "clock_time" otherwise

fps

"fps": "<num>/<denom>"

Specifies the ttp:frameRate and ttp:frameRateMultiplier of the output document.

Required when time_format is frames or clock_time_with_frames. No effect otherwise.

Example:

--config '{"general": {"progress_bar":false, "log_level":"WARN"}, "imsc_writer": {"time_format":"clock_time_with_frames", "fps": "25/1"}}'
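
To make the three time_format values concrete, the sketch below renders one offset in each style. It is an illustration written for this document, not the IMSC writer's code: format_time is a hypothetical helper, rounding is to the nearest frame or millisecond, and the HH:MM:SS:FF branch only handles integer frame rates.

```python
from fractions import Fraction
from typing import Optional


def format_time(seconds: Fraction, time_format: str, fps: Optional[Fraction] = None) -> str:
    """Render an offset in one of the three time_format styles (illustration only)."""
    if time_format == "clock_time":
        total_ms = round(seconds * 1000)
        hh, rem = divmod(total_ms, 3_600_000)
        mm, rem = divmod(rem, 60_000)
        ss, ms = divmod(rem, 1000)
        return f"{hh:02d}:{mm:02d}:{ss:02d}.{ms:03d}"

    if fps is None:
        raise ValueError("fps is required for frame-based time formats")

    total_frames = round(seconds * fps)
    if time_format == "frames":
        return f"{total_frames}f"

    if time_format == "clock_time_with_frames":
        # Simplification: only integer frame rates such as 25/1 are handled.
        if fps.denominator != 1:
            raise ValueError("this sketch only handles integer frame rates")
        total_seconds, ff = divmod(total_frames, fps.numerator)
        hh, rem = divmod(total_seconds, 3600)
        mm, ss = divmod(rem, 60)
        return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"

    raise ValueError(f"unknown time_format: {time_format!r}")


fps = Fraction("25/1")        # same "<num>/<denom>" syntax as the fps option
offset = Fraction("3661.52")  # 1 h, 1 min, 1.52 s

print(format_time(offset, "frames", fps))                  # 91538f
print(format_time(offset, "clock_time"))                   # 01:01:01.520
print(format_time(offset, "clock_time_with_frames", fps))  # 01:01:01:13
```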

STL Reader configuration

disable_fill_line_gap

"disable_fill_line_gap" : true | false

true means that the STL reader does not fill gaps between lines

Default: false

disable_line_padding

"disable_line_padding" : true | false

true means that the STL reader does not add padding at the beginning/end of lines

Default: false

program_start_tc

"program_start_tc" : "TCP" | "HH:MM:SS:FF"

Specifies a starting offset, either the TCP field of the GSI block or a user-specified timecode

Default: "00:00:00:00"

font_stack

"font_stack" : [<font-families>](https://www.w3.org/TR/ttml2/#style-value-font-families)

Overrides the font stack

Default: "Verdana, Arial, Tiresias, sansSerif"

max_row_count

"max_row_count" : "MNR" | integer

Specifies a maximum number of rows for open subtitles, either the MNR field of the GSI block or a user-specified value

Default: 23

SRT Writer configuration

text_formatting

"text_formatting" : true | false

false means that the SRT writer does not output any text formatting tags

Default: true

VTT Writer configuration

line_position

"line_position" : true | false

true means that the VTT writer outputs line and line alignment cue settings

Default: false

text_align

"text_align" : true | false

true means that the VTT writer outputs text alignment cue settings

Default: false

cue_id

"cue_id" : true | false

true means that the VTT writer outputs cue identifiers

Default: true

SCC Reader configuration

text_align

"text_align" : "auto" | "left" | "center" | "right"

Specifies the text alignment. "auto" means the reader will use heuristics to determine text alignment.

Default: "auto"

Library

The overall architecture of the library is as follows:

- Reader modules validate and convert input files into instances of the canonical model (see ttconv.imsc.reader.to_model() for example);
- Filter modules transform instances of the canonical data model, e.g. all text styling and positioning might be removed from an instance of the canonical model to match the limited capabilities of downstream devices; and
- Writer modules convert instances of the canonical data model into output files.
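
The reader, filter, and writer stages can be chained from Python roughly as follows. This is a hedged sketch rather than reference usage: only ttconv.imsc.reader.to_model() is named above, while the assumptions that it accepts a parsed ElementTree document, and that ttconv.srt.writer.from_model() exists and returns the SRT document as a string, should be checked against the unit tests before relying on them. The ttml_to_srt helper itself is hypothetical.

```python
import xml.etree.ElementTree as et


def ttml_to_srt(ttml_path: str, srt_path: str) -> None:
    """Read a TTML/IMSC file into the canonical model and write it out as SRT."""
    # Imported lazily so that this sketch can be loaded without ttconv installed.
    import ttconv.imsc.reader as imsc_reader
    import ttconv.srt.writer as srt_writer  # assumed module and function names

    # Reader stage: XML document -> canonical model.
    model = imsc_reader.to_model(et.parse(ttml_path))

    # Filter stage: any transformation of `model` would be applied here.

    # Writer stage: canonical model -> SRT text (assumed to be returned as str).
    srt_text = srt_writer.from_model(model)
    with open(srt_path, "w", encoding="utf-8") as srt_file:
        srt_file.write(srt_text)
```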

Processing shared across multiple reader and writer modules is factored out in common modules whenever possible. For example, several output formats require an instance of the canonical data model to be transformed into a sequence of discrete temporal snapshots – a process called ISD generation.
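
The idea of discrete temporal snapshots can be illustrated without any ttconv code. In the sketch below, which is a simplified stand-in and not the library's ISD implementation, the timeline is cut at every begin and end time, and each resulting slice lists the cues visible during it. The Cue type and the snapshots function are hypothetical names.

```python
from typing import List, NamedTuple, Tuple


class Cue(NamedTuple):
    begin: float
    end: float  # exclusive
    text: str


def snapshots(cues: List[Cue]) -> List[Tuple[float, float, List[str]]]:
    """Split the timeline at every begin/end and list the cues active in each slice."""
    boundaries = sorted({t for cue in cues for t in (cue.begin, cue.end)})
    result = []
    for start, stop in zip(boundaries, boundaries[1:]):
        active = [cue.text for cue in cues if cue.begin <= start and stop <= cue.end]
        if active:  # skip slices during which nothing is displayed
            result.append((start, stop, active))
    return result


cues = [Cue(0.0, 4.0, "Hello"), Cue(2.0, 6.0, "World")]
for start, stop, active in snapshots(cues):
    print(start, stop, active)
```

Two overlapping cues therefore yield three snapshots: the first cue alone, both together, then the second cue alone. Formats such as SRT, which describe a sequence of self-contained cues, need this kind of flattening.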

The library uses the Python logging module to report non-fatal events.

Unit tests illustrate the use of the library, e.g. ReaderWriterTest.test_imsc_1_test_suite at src/test/python/test_imsc_writer.py.

Detailed documentation including reference documents is under doc.

Dependencies

Runtime

python >= 3.7

Development

The project uses pipenv to manage dependencies.

Development

Setup

Local

- Run pipenv install --dev
- Set the PYTHONPATH environment variable to src/main/python, e.g. export PYTHONPATH=src/main/python
- pipenv run can then be used

Docker

docker build --rm -f Dockerfile -t ttconv:latest .
docker run -it --rm ttconv:latest bash

Example

From the root directory of the project:

mkdir build
pipenv install --dev
export PYTHONPATH=src/main/python
python src/main/python/ttconv/tt.py convert -i src/test/resources/scc/mix-rows-roll-up.scc -o build/mix-rows-roll-up.ttml

Code coverage

Unit test code coverage is provided by the script at scripts/coverage.sh

Continuous integration

Overview

Automated testing is provided by the script at scripts/ci.sh

Local

Run PYTHONPATH=src/main/python ./scripts/ci.sh

GitHub actions

See .github/workflows/main.yml

Docker

Run docker run -it --rm ttconv:latest /bin/sh scripts/ci.sh

Release history

1.1.0

Mar 14, 2024

1.1.0rc1 pre-release

Feb 29, 2024

This version

1.0.8

Nov 27, 2023

1.0.8rc1 pre-release

Nov 20, 2023

1.0.8a2 pre-release

Sep 2, 2023

1.0.8a1 pre-release

Sep 1, 2023

1.0.7

Aug 12, 2023

1.0.6

Jun 3, 2023

1.0.6b1 pre-release

Jan 3, 2023

1.0.5

Mar 5, 2022

1.0.5rc1 pre-release

Feb 27, 2022

1.0.4

Oct 16, 2021

1.0.4rc1 pre-release

Oct 13, 2021

1.0.4.dev5 pre-release

Oct 12, 2021

1.0.4.dev4 pre-release

Oct 12, 2021

1.0.4.dev3 pre-release

Oct 12, 2021

1.0.4.dev2 pre-release

Oct 11, 2021

1.0.4.dev1 pre-release

Oct 11, 2021

1.0.3

Sep 21, 2021

1.0.3rc1 pre-release

Sep 14, 2021

1.0.3b3 pre-release

Sep 10, 2021

1.0.3b2 pre-release

Sep 3, 2021

1.0.3b1 pre-release

Aug 19, 2021

1.0.2

Aug 7, 2021

1.0.2rc1 pre-release

Jul 23, 2021

1.0.1

May 26, 2021

1.0.1rc2 pre-release

May 14, 2021

1.0.1rc1 pre-release

Apr 19, 2021

1.0.0

Jan 28, 2021

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ttconv-1.0.8.tar.gz (101.0 kB)

Uploaded Nov 27, 2023 Source

Built Distribution

ttconv-1.0.8-py3-none-any.whl (152.4 kB)

Uploaded Nov 27, 2023 Python 3

Hashes for ttconv-1.0.8.tar.gz
Algorithm	Hash digest
SHA256	`9dc7e68b0c80f8c1cd4c2eff624ccef7399953cf8620630ce5f89727c5d529e4`
MD5	`1616d37cae657438e3f7e3992e5ef4c1`
BLAKE2b-256	`cca03708520c508da0656caf77ac75a74cc839cb47e1d556f9e52fd00debf7ef`

Hashes for ttconv-1.0.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`340ca3413703f689764c764f56c92c7a647c18ecafb7aab857de49c3a49cfac2`
MD5	`c9a42d870b962a11bd6b79f4e6caee43`
BLAKE2b-256	`7a341989b7b5bc6e4342c6bbc127bfc176fd929170a1acef01a6d5e8bf1bc95d`
