pysub-parser

Utility to extract the contents of a subtitle file.

Supported types:

For more information: http://write.flossmanuals.net/video-subtitling/file-formats

Usage

The parse function accepts the following parameters; only path is required:

  • path: location of the subtitle file.
  • subtype: one of the supported file types; by default, the file extension is used.
  • encoding: encoding of the file, utf-8 by default.
  • **kwargs: optional parameters.
    • fps: framerate (only used by sub files), 23.976 by default.

from pysubparser import parser

subtitles = parser.parse('./files/space-jam.srt')

for subtitle in subtitles:
    print(subtitle)

Output:

0 > [BALL BOUNCING]
1 > Michael?
2 > What are you doing out here, son? It's after midnight.
3 > MICHAEL: Couldn't sleep, Pops.
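
The subtype default and the fps parameter described above can be illustrated with a small standalone sketch. It does not use pysub-parser and is not its implementation: infer_subtype, frame_to_offset and parse_microdvd_line are hypothetical helpers. It assumes that sub files follow the MicroDVD convention of frame-based cues such as {100}{150}Hello, which is why a frame rate is needed to recover timestamps.

```python
import re
from datetime import timedelta
from pathlib import Path

# Hypothetical helpers for illustration only; not part of pysub-parser.

def infer_subtype(path: str) -> str:
    """Derive a subtype from the file extension, e.g. 'movie.srt' -> 'srt'."""
    return Path(path).suffix.lstrip(".").lower()

def frame_to_offset(frame: int, fps: float = 23.976) -> timedelta:
    """Convert a frame number to an offset from the start of the video."""
    return timedelta(seconds=frame / fps)

# Assumed MicroDVD cue layout: {start_frame}{end_frame}text
MICRODVD_LINE = re.compile(r"\{(\d+)\}\{(\d+)\}(.*)")

def parse_microdvd_line(line: str, fps: float = 23.976):
    """Return (start, end, text) for one frame-based cue."""
    match = MICRODVD_LINE.match(line)
    if match is None:
        raise ValueError(f"not a frame-based cue: {line!r}")
    start_frame, end_frame, text = match.groups()
    return (
        frame_to_offset(int(start_frame), fps),
        frame_to_offset(int(end_frame), fps),
        text,
    )

print(infer_subtype("./files/space-jam.srt"))
print(parse_microdvd_line("{100}{150}Hello", fps=25.0))
```

The same cue maps to different timestamps at different frame rates, so passing the wrong fps for a sub file shifts every subtitle.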

Subtitle Class

Each line of dialogue is represented by a Subtitle object with the following properties:

  • index: position in the file.
  • start: timestamp of the start of the dialogue.
  • end: timestamp of the end of the dialogue.
  • text: dialogue contents.

for subtitle in subtitles:
    print(f'{subtitle.start} > {subtitle.end}')
    print(subtitle.text)
    print()

Output:

00:00:36.328000 > 00:00:38.329000
[BALL BOUNCING]

00:01:03.814000 > 00:01:05.189000
Michael?

00:01:08.402000 > 00:01:11.404000
What are you doing out here, son? It's after midnight.

00:01:11.572000 > 00:01:13.072000
MICHAEL: Couldn't sleep, Pops.
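
The start and end values can be used to work out how long a line stays on screen. The sketch below assumes they are datetime.time values, which is only inferred from the printed format above (00:00:36.328000) and is not confirmed here. SubtitleStandIn is a hypothetical stand-in so the example runs without the library.

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timedelta

@dataclass
class SubtitleStandIn:
    """Hypothetical stand-in with the four documented properties."""
    index: int
    start: time  # assumption: datetime.time, inferred from the printed output
    end: time
    text: str

def duration(subtitle) -> timedelta:
    """Time on screen; time objects cannot be subtracted directly,
    so both are anchored to the same arbitrary date first."""
    anchor = date.min
    return datetime.combine(anchor, subtitle.end) - datetime.combine(anchor, subtitle.start)

first = SubtitleStandIn(0, time(0, 0, 36, 328000), time(0, 0, 38, 329000), "[BALL BOUNCING]")
print(duration(first).total_seconds())
```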

Cleaners

Currently, 4 cleaners are provided:

  • ascii translates every Unicode character to its ASCII equivalent.
  • brackets removes brackets and anything enclosed in them (e.g., [BALL BOUNCING]).
  • formatting removes formatting tags such as <i> and </i>.
  • lower_case converts all text to lowercase.

from pysubparser.cleaners import ascii, brackets, formatting, lower_case

subtitles = brackets.clean(
    lower_case.clean(
        subtitles
    )
)

for subtitle in subtitles:
    print(subtitle)

Output:

0 > 
1 > michael?
2 > what are you doing out here, son? it's after midnight.
3 > michael: couldn't sleep, pops.
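
To make the effect of each cleaner concrete, here is a rough standalone approximation that works on plain strings. It is a sketch of the behaviour described above, not the library's code: all function names are hypothetical, and the ASCII step uses Unicode decomposition, which may differ from what the ascii cleaner actually does.

```python
import re
import unicodedata

# Hypothetical approximations of the four cleaners, operating on plain strings.

def strip_brackets(text: str) -> str:
    """Remove square brackets and everything between them."""
    return re.sub(r"\[[^\]]*\]", "", text).strip()

def strip_formatting(text: str) -> str:
    """Remove tags such as <i> and </i>."""
    return re.sub(r"</?[^>]+>", "", text)

def to_lower(text: str) -> str:
    """Convert the text to lowercase."""
    return text.lower()

def to_ascii(text: str) -> str:
    """Decompose accented characters and drop whatever is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

def clean_all(lines, *cleaners):
    """Lazily apply each cleaner, in order, to every line."""
    for line in lines:
        for cleaner in cleaners:
            line = cleaner(line)
        yield line

lines = ["[BALL BOUNCING]", "<i>Michael?</i>", "MICHAEL: Couldn't sleep, Pops."]
for cleaned in clean_all(lines, strip_brackets, strip_formatting, to_lower):
    print(repr(cleaned))
```

A line that consists only of a bracketed annotation ends up empty, which matches the empty first line in the output above.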

Writers

Given a list of Subtitle objects and a path, the writer outputs those subtitles in srt format.
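
For reference, an srt cue consists of a sequence number, a line with the start and end times separated by -->, the text, and a blank line. The sketch below builds one such cue by hand. It is not the library's writer: srt_timestamp and srt_block are hypothetical helpers, and the timestamps are assumed to be datetime.time values.

```python
from datetime import time

# Hypothetical helpers showing the srt cue layout; not the library's writer.

def srt_timestamp(t: time) -> str:
    """Format as HH:MM:SS,mmm (srt uses a comma before the milliseconds)."""
    return f"{t.hour:02d}:{t.minute:02d}:{t.second:02d},{t.microsecond // 1000:03d}"

def srt_block(number: int, start: time, end: time, text: str) -> str:
    """Build one cue: sequence number, time range, then the text."""
    return f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"

print(srt_block(1, time(0, 0, 36, 328000), time(0, 0, 38, 329000), "[BALL BOUNCING]"))
```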
