
webvtt-to-json


Convert WebVTT to JSON, optionally removing duplicate lines

Installation

Install this tool using pip:

pip install webvtt-to-json

Usage

To output JSON for a WebVTT file:

webvtt-to-json subtitles.vtt

This writes the JSON to standard output. Use -o filename to write it to a file instead.

Subtitles can often include duplicate lines. Add -d or --dedupe to attempt to remove those duplicates from the output:

webvtt-to-json --dedupe subtitles.vtt
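To make the idea of duplicate lines concrete, here is a minimal sketch that drops a line when it repeats the previously kept line. It is an illustration only and is not the algorithm webvtt-to-json uses; the function name `drop_repeated_lines` and the tag-free sample cues are assumptions made up for this example, and the tool's own --dedupe output may differ (for instance in which timestamps it keeps).

```python
def drop_repeated_lines(cues):
    """Keep only lines that differ from the previously kept line.

    Illustrative sketch only: NOT the tool's --dedupe implementation.
    """
    result = []
    previous = None
    for cue in cues:
        kept = []
        for line in cue["lines"]:
            text = line.strip()
            # Skip blank lines and immediate repeats of the last kept line.
            if text and text != previous:
                kept.append(text)
                previous = text
        if kept:
            result.append({"start": cue["start"], "end": cue["end"], "lines": kept})
    return result


# Hypothetical input: two cues that repeat the same caption text.
sample_cues = [
    {
        "start": "00:00:00.000",
        "end": "00:00:01.829",
        "lines": [" ", "my career in side projects and open"],
    },
    {
        "start": "00:00:01.829",
        "end": "00:00:01.839",
        "lines": ["my career in side projects and open"],
    },
]

deduped = drop_repeated_lines(sample_cues)
print(len(deduped))
```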

Use -s or --single to output single "line" keys instead of a "lines" array.

You can also use:

python -m webvtt_to_json ...
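If you want to call the tool from a Python script, one option is to run the module form shown above as a subprocess and parse what it prints. This is a sketch under assumptions: the helper names `build_command` and `convert` are hypothetical, and it assumes the package is installed in the same environment as the running interpreter.

```python
import json
import subprocess
import sys


def build_command(vtt_path, dedupe=False, single=False):
    """Build the argument list for `python -m webvtt_to_json`."""
    command = [sys.executable, "-m", "webvtt_to_json"]
    if dedupe:
        command.append("--dedupe")
    if single:
        command.append("--single")
    command.append(vtt_path)
    return command


def convert(vtt_path, dedupe=False, single=False):
    """Run the tool and parse the JSON it writes to standard output."""
    result = subprocess.run(
        build_command(vtt_path, dedupe, single),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the tool exits non-zero
    )
    return json.loads(result.stdout)
```

Calling `convert("subtitles.vtt", dedupe=True)` would then return the parsed list of cue dictionaries, provided the file exists and the tool is installed.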

Output

Default output (no options):

[
    {
        "start": "00:00:00.000",
        "end": "00:00:01.829",
        "lines": [
            " ",
            "my<00:00:00.160><c> career</c><00:00:00.480><c> in</c><00:00:00.640><c> side</c><00:00:00.880><c> projects</c><00:00:01.280><c> and</c><00:00:01.520><c> open</c>"
        ]
    }
]
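The default output keeps the inline timing and `<c>` tags from the WebVTT file. If you only need the plain text, a regular expression can strip them after the fact. This is a rough sketch rather than anything the tool provides; `strip_cue_tags` and `cleaned_lines` are hypothetical helper names, and the pattern simply removes anything between angle brackets.

```python
import re

# Matches any <...> tag, such as <00:00:00.160>, <c> or </c>.
TAG_PATTERN = re.compile(r"<[^>]+>")


def strip_cue_tags(text):
    """Remove inline WebVTT tags and surrounding whitespace."""
    return TAG_PATTERN.sub("", text).strip()


def cleaned_lines(cue):
    """Return the cue's lines without tags, dropping lines left empty."""
    stripped = (strip_cue_tags(line) for line in cue["lines"])
    return [line for line in stripped if line]


cue = {
    "start": "00:00:00.000",
    "end": "00:00:01.829",
    "lines": [
        " ",
        "my<00:00:00.160><c> career</c><00:00:00.480><c> in</c>"
        "<00:00:00.640><c> side</c><00:00:00.880><c> projects</c>"
        "<00:00:01.280><c> and</c><00:00:01.520><c> open</c>",
    ],
}

print(cleaned_lines(cue))  # ['my career in side projects and open']
```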

--dedupe output:

[
    {
        "start": "00:00:01.829",
        "end": "00:00:01.839",
        "lines": ["my career in side projects and open"]
    }
]
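The "start" and "end" values are strings, so arithmetic on them needs a conversion step first. The sketch below turns a timestamp into integer milliseconds and computes a cue's duration. The helper name `timestamp_to_ms` is hypothetical, and the code assumes timestamps shaped like the ones shown above (HH:MM:SS.mmm, with the hours part optional).

```python
def timestamp_to_ms(timestamp):
    """Convert "HH:MM:SS.mmm" (or "MM:SS.mmm") to integer milliseconds."""
    clock, _, fraction = timestamp.partition(".")
    parts = [int(part) for part in clock.split(":")]
    # Pad missing leading fields (for example, no hours) with zeros.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    millis = int(fraction.ljust(3, "0")[:3]) if fraction else 0
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


cue = {
    "start": "00:00:01.829",
    "end": "00:00:01.839",
    "lines": ["my career in side projects and open"],
}

duration_ms = timestamp_to_ms(cue["end"]) - timestamp_to_ms(cue["start"])
print(duration_ms)  # 10
```

Working in integer milliseconds avoids the small rounding errors that floating-point seconds would introduce.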

--dedupe --single output:

[
    {
        "start": "00:00:01.829",
        "end": "00:00:01.839",
        "line": "my career in side projects and open"
    }
]
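Because the key differs between the two shapes ("lines" as an array, "line" as a string), code that reads the JSON may want to accept either. The sketch below builds a plain-text transcript from a list of cues in either shape; `cue_text` and `transcript` are hypothetical helper names, not part of the tool.

```python
import json


def cue_text(cue):
    """Return a cue's text whether it has a "line" string or a "lines" array."""
    if "line" in cue:
        return cue["line"]
    return " ".join(line.strip() for line in cue["lines"] if line.strip())


def transcript(cues):
    """Join the text of every cue, one cue per line."""
    return "\n".join(cue_text(cue) for cue in cues)


single_output = json.loads("""
[
    {
        "start": "00:00:01.829",
        "end": "00:00:01.839",
        "line": "my career in side projects and open"
    }
]
""")

print(transcript(single_output))  # my career in side projects and open
```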

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd webvtt-to-json
python -m venv venv
source venv/bin/activate

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
