pysub-parser
Utility to extract the contents of a subtitle file.
Supported types:
- ass: Advanced SubStation Alpha
- ssa: SubStation Alpha
- srt: SubRip
- sub: MicroDVD
- txt: Sub Viewer
For more information: http://write.flossmanuals.net/video-subtitling/file-formats
Usage
The parse function accepts the following parameters:
- path: location of the subtitle file.
- subtype: one of the supported file types; by default, the file extension is used.
- encoding: encoding of the file, utf-8 by default.
- **kwargs: optional parameters.
  - fps: framerate (only used by sub files), 23.976 by default.
from pysubparser import parser
subtitles = parser.parse('./files/space-jam.srt')
for subtitle in subtitles:
    print(subtitle)
Output:
0 > [BALL BOUNCING]
1 > Michael?
2 > What are you doing out here, son? It's after midnight.
3 > MICHAEL: Couldn't sleep, Pops.
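The subtype and fps parameters described above can be illustrated with a small standalone sketch. It does not call pysub-parser and is not its implementation: resolve_subtype and frame_to_timedelta are hypothetical helpers, written under the assumption that the extension fallback is a plain suffix lookup and that MicroDVD frame numbers are converted to time by dividing by the framerate.

```python
import os
from datetime import timedelta

# Extensions listed under "Supported types" in this README.
SUPPORTED_TYPES = {"ass", "ssa", "srt", "sub", "txt"}


def resolve_subtype(path, subtype=None):
    """Hypothetical helper: use the file extension when no subtype is given."""
    if subtype is None:
        subtype = os.path.splitext(path)[1].lstrip(".").lower()
    if subtype not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported subtitle type: {subtype!r}")
    return subtype


def frame_to_timedelta(frame, fps=23.976):
    """Hypothetical helper: elapsed time for a MicroDVD frame number."""
    return timedelta(seconds=frame / fps)


print(resolve_subtype("./files/space-jam.srt"))   # srt
print(resolve_subtype("./files/movie.bin", "sub"))  # sub (explicit subtype wins)
print(frame_to_timedelta(48, fps=24))             # 0:00:02
```

The sketch also shows why fps only matters for sub files: MicroDVD stores frame numbers rather than timestamps, so the same file maps to different times at different framerates.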
Subtitle Class
Each line of dialogue is represented by a Subtitle object with the following properties:
- index: position in the file.
- start: timestamp of the start of the dialogue.
- end: timestamp of the end of the dialogue.
- text: dialogue contents.
for subtitle in subtitles:
    print(f'{subtitle.start} > {subtitle.end}')
    print(subtitle.text)
    print()
Output:
00:00:36.328000 > 00:00:38.329000
[BALL BOUNCING]
00:01:03.814000 > 00:01:05.189000
Michael?
00:01:08.402000 > 00:01:11.404000
What are you doing out here, son? It's after midnight.
00:01:11.572000 > 00:01:13.072000
MICHAEL: Couldn't sleep, Pops.
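The timestamps in the output above print the way datetime.time values do, so the following sketch assumes that start and end are datetime.time objects. That is an inference from the printed format, not something this README states. Under that assumption, the on-screen duration of a subtitle could be computed with a hypothetical helper like this:

```python
from datetime import date, datetime, time, timedelta


def duration(start: time, end: time) -> timedelta:
    """Hypothetical helper: time between two datetime.time values.

    datetime.time does not support subtraction, so both values are
    anchored to the same arbitrary date first.
    """
    anchor = date.min
    return datetime.combine(anchor, end) - datetime.combine(anchor, start)


# Values taken from the first subtitle in the output above.
start = time(0, 0, 36, 328000)
end = time(0, 0, 38, 329000)

print(start)                   # 00:00:36.328000
print(duration(start, end))    # 0:00:02.001000
```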
Cleaners
Currently, four cleaners are provided:
- ascii translates every Unicode character to its ASCII equivalent.
- brackets removes square brackets and anything between them (e.g., [BALL BOUNCING]).
- formatting removes formatting tags like <i> and </i>.
- lower_case converts all text to lower case.
from pysubparser.cleaners import ascii, brackets, formatting, lower_case
subtitles = brackets.clean(
    lower_case.clean(
        subtitles
    )
)
for subtitle in subtitles:
    print(subtitle)
Output:
0 >
1 > michael?
2 > what are you doing out here, son? it's after midnight.
3 > michael: couldn't sleep, pops.
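To make the effect of each cleaner concrete, here is a standalone sketch that applies comparable transformations to a plain string. These functions are hypothetical stand-ins built on the standard library, not the code inside pysubparser.cleaners, and the library's actual rules (especially for ASCII conversion) may differ.

```python
import re
import unicodedata


def strip_brackets(text):
    """Remove square brackets and whatever they enclose."""
    return re.sub(r"\[[^\]]*\]", "", text).strip()


def strip_formatting(text):
    """Remove simple markup tags such as <i> and </i>."""
    return re.sub(r"</?[A-Za-z][^>]*>", "", text)


def to_ascii(text):
    """Decompose accented characters and keep only the ASCII part.

    Characters with no ASCII decomposition are dropped, not translated.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def to_lower(text):
    """Lower-case the whole string."""
    return text.lower()


line = "<i>[BALL BOUNCING] Café, Pops.</i>"
for cleaner in (strip_formatting, strip_brackets, to_ascii, to_lower):
    line = cleaner(line)

print(line)  # cafe, pops.
```

Chaining the functions in a loop mirrors the nested clean calls shown earlier: each step takes the previous step's result.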
Writers
Given a list of Subtitle objects and a path, the writer outputs those subtitles in SRT format.
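For reference, an SRT entry consists of a counter, a line with start and end times separated by " --> " (with a comma before the milliseconds), the text, and a blank line. The sketch below formats one entry by hand. It does not use the library's writer: format_timestamp and format_srt_block are hypothetical helpers, and the use of datetime.time for timestamps is an assumption.

```python
from datetime import time


def format_timestamp(value: time) -> str:
    """Render a time as HH:MM:SS,mmm, the SRT timestamp layout."""
    millis = value.microsecond // 1000
    return f"{value.strftime('%H:%M:%S')},{millis:03d}"


def format_srt_block(number, start, end, text):
    """Hypothetical helper: one SRT entry; entries are joined by blank lines."""
    return (
        f"{number}\n"
        f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
        f"{text}\n"
    )


block = format_srt_block(
    1,  # SRT counters conventionally start at 1
    time(0, 0, 36, 328000),
    time(0, 0, 38, 329000),
    "[BALL BOUNCING]",
)
print(block)
```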