Utility to extract the contents of a subtitle file
pysub-parser
Utility to extract the contents of a subtitle file.
Supported types:
- ass: Advanced SubStation Alpha
- ssa: SubStation Alpha
- srt: SubRip
- sub: MicroDVD
- txt: Sub Viewer
For more information: http://write.flossmanuals.net/video-subtitling/file-formats
Usage
The parse method accepts the following parameters (only path is required):
- path: location of the subtitle file.
- subtype: one of the supported file types; by default, the file extension is used.
- encoding: encoding of the file; utf-8 by default.
- **kwargs: optional parameters.
  - fps: framerate (only used by sub files); 23.976 by default.
from pysubparser import parser

subtitles = parser.parse('./files/space-jam.srt')

for subtitle in subtitles:
    print(subtitle)
Output:
0 > [BALL BOUNCING]
1 > Michael?
2 > What are you doing out here, son? It's after midnight.
3 > MICHAEL: Couldn't sleep, Pops.
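MicroDVD sub files index each cue by frame number rather than by clock time, which is why a framerate is needed to recover timestamps. The sketch below illustrates that conversion, together with an extension-based fallback for subtype, using only the standard library. Both guess_subtype and frames_to_time are hypothetical helpers written for illustration; they are not functions from pysubparser, and the library's internals may differ.

```python
import os
from datetime import datetime, timedelta

DEFAULT_FPS = 23.976  # same default value documented for the fps parameter


def guess_subtype(path):
    """Hypothetical helper: derive the subtitle type from the file extension."""
    return os.path.splitext(path)[1].lstrip('.').lower()


def frames_to_time(frame, fps=DEFAULT_FPS):
    """Hypothetical helper: convert a frame number to a time of day."""
    seconds = frame / fps
    return (datetime.min + timedelta(seconds=seconds)).time()


print(guess_subtype('./files/space-jam.srt'))  # srt
print(frames_to_time(250, fps=25))             # 00:00:10
```

A wrong fps does not raise an error in a conversion like this; it only shifts every timestamp, so it is worth checking against the video when the sub file comes from an unknown source.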
Subtitle Class
Each line of dialogue is represented by a Subtitle object with the following properties:
- index: position in the file.
- start: timestamp of the start of the dialog.
- end: timestamp of the end of the dialog.
- text: dialog contents.
for subtitle in subtitles:
    print(f'{subtitle.start} > {subtitle.end}')
    print(subtitle.text)
    print()
Output:
00:00:36.328000 > 00:00:38.329000
[BALL BOUNCING]
00:01:03.814000 > 00:01:05.189000
Michael?
00:01:08.402000 > 00:01:11.404000
What are you doing out here, son? It's after midnight.
00:01:11.572000 > 00:01:13.072000
MICHAEL: Couldn't sleep, Pops.
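To make the shape of these objects concrete, here is a minimal stand-in with the same four properties. The printed timestamps above resemble Python's datetime.time formatting, so this sketch assumes that type; SubtitleSketch and parse_srt_timestamp are hypothetical names for illustration, not the library's actual class or parser.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass
class SubtitleSketch:
    """Illustrative stand-in with the four documented properties."""
    index: int
    start: time
    end: time
    text: str

    def __str__(self):
        # Mirrors the "index > text" shape seen in the printed output.
        return f'{self.index} > {self.text}'


def parse_srt_timestamp(value):
    """Hypothetical helper: turn an SRT timestamp like 00:00:36,328 into a time."""
    return datetime.strptime(value, '%H:%M:%S,%f').time()


subtitle = SubtitleSketch(
    index=0,
    start=parse_srt_timestamp('00:00:36,328'),
    end=parse_srt_timestamp('00:00:38,329'),
    text='[BALL BOUNCING]',
)

print(subtitle)                              # 0 > [BALL BOUNCING]
print(f'{subtitle.start} > {subtitle.end}')  # 00:00:36.328000 > 00:00:38.329000
```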
Cleaners
Currently, 4 cleaners are provided:
- ascii will translate every unicode character to its ascii equivalent.
- brackets will remove anything between them (e.g., [BALL BOUNCING]).
- formatting will remove formatting keys like <i> and </i>.
- lower_case will lower case all text.
from pysubparser.cleaners import ascii, brackets, formatting, lower_case

subtitles = brackets.clean(
    lower_case.clean(
        subtitles
    )
)

for subtitle in subtitles:
    print(subtitle)

Output:
0 >
1 > michael?
2 > what are you doing out here, son? it's after midnight.
3 > michael: couldn't sleep, pops.
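For readers curious what each cleaner amounts to, the following standard-library sketch approximates the four transformations on plain strings. These functions are assumptions made for illustration, not the library's code: the real brackets cleaner may handle other bracket styles, and the real ascii cleaner may transliterate more characters than the accent stripping shown here.

```python
import re
import unicodedata


def to_ascii(text):
    # Decompose accented characters, then drop anything outside ASCII.
    decomposed = unicodedata.normalize('NFKD', text)
    return decomposed.encode('ascii', 'ignore').decode('ascii')


def strip_brackets(text):
    # Remove [ ... ] spans, then trim leftover whitespace.
    return re.sub(r'\[[^\]]*\]', '', text).strip()


def strip_formatting(text):
    # Remove simple markup tags such as <i> and </i>.
    return re.sub(r'</?[^>]+>', '', text)


def lower_case(text):
    return text.lower()


lines = ['[BALL BOUNCING]', '<i>Michael?</i>', "MICHAEL: Couldn't sleep, Pops."]
for line in lines:
    print(repr(lower_case(strip_formatting(strip_brackets(line)))))
```

Each function takes and returns a string, so they compose in any order, much like the nested clean calls above.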
Writers
Given any list of Subtitle objects and a path, a writer outputs those subtitles in srt format.
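As a reminder of what the srt layout looks like, the sketch below renders cues as numbered blocks, each with a "start --> end" line (comma before the milliseconds) followed by the text and a blank line. It uses only the standard library; format_srt_time and to_srt are hypothetical helpers, and this is not the library's writer API.

```python
from datetime import time


def format_srt_time(value):
    # SRT timestamps use a comma before the milliseconds.
    millis = value.microsecond // 1000
    return f"{value.strftime('%H:%M:%S')},{millis:03d}"


def to_srt(cues):
    """Hypothetical helper: render (start, end, text) tuples as SRT text."""
    blocks = []
    # SRT cue numbers conventionally start at 1.
    for number, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f'{number}\n'
            f'{format_srt_time(start)} --> {format_srt_time(end)}\n'
            f'{text}\n'
        )
    # Joining with a newline leaves one blank line between blocks.
    return '\n'.join(blocks)


cues = [
    (time(0, 0, 36, 328000), time(0, 0, 38, 329000), '[BALL BOUNCING]'),
    (time(0, 1, 3, 814000), time(0, 1, 5, 189000), 'Michael?'),
]
print(to_srt(cues))
```

The resulting string could then be saved to the target path with an ordinary open(path, 'w', encoding='utf-8') call.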