tpro processes transcripts from speech-to-text services and outputs to various formats.
tpro
Transcript Processing! tpro takes JSON-formatted transcripts produced by various speech-to-text services and converts them to various standardized formats.
Installation and Usage
Non-pip Requirement: Stanford NER JAR
- download and unzip this
- put these files in /usr/local/bin/:
- stanford-ner.jar
- classifiers/english.all.3class.distsim.crf.ser.gz
- you might have to update Java on Linux
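Before running tpro, it can help to confirm the two Stanford NER files are where the steps above put them. The following is a minimal sketch, not part of tpro itself: the helper name `missing_ner_files` is hypothetical, and it assumes tpro looks for the files at exactly the paths listed above.

```python
from pathlib import Path

# File names copied from the installation steps; the default directory is the
# one this README names. Whether tpro checks any other location is unknown.
REQUIRED_NER_FILES = (
    "stanford-ner.jar",
    "classifiers/english.all.3class.distsim.crf.ser.gz",
)


def missing_ner_files(base_dir="/usr/local/bin"):
    """Return the required Stanford NER files that are absent from base_dir."""
    base = Path(base_dir)
    return [name for name in REQUIRED_NER_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_ner_files()
    if missing:
        print("Missing Stanford NER files:", ", ".join(missing))
    else:
        print("All Stanford NER files found.")
```

An empty list means both files are in place; anything else names what still needs to be copied.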
Pip
$ pip install tpro
Usage
$ tpro --help
Usage: tpro [OPTIONS] TRANSCRIPT_DATA_PATH OUTPUT_PATH
[amazon|gentle|speechmatics|google] [universal|vo]
Options:
-p, --print-output pretty print the transcript, breaks pipeability
--language-code TEXT specify language, defaults to en-US.
--help Show this message and exit.
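If you want to call tpro from a script rather than a shell, the usage text above maps onto an argument list fairly directly. This is a sketch under assumptions: the function names `build_tpro_command` and `run_tpro` and the parameter names `service` and `output_format` are made up for illustration; only the flags and the positional order come from the help output.

```python
import subprocess


def build_tpro_command(input_path, output_path, service, output_format,
                       language_code=None, print_output=False):
    """Assemble an argv list for the tpro CLI, following the help text."""
    cmd = ["tpro"]
    if print_output:
        cmd.append("--print-output")
    if language_code is not None:
        cmd += ["--language-code", language_code]
    # Positional order from the usage line:
    # TRANSCRIPT_DATA_PATH OUTPUT_PATH <service> <output format>
    cmd += [input_path, output_path, service, output_format]
    return cmd


def run_tpro(*args, **kwargs):
    """Run tpro and raise CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_tpro_command(*args, **kwargs), check=True)


if __name__ == "__main__":
    # Only prints the command; call run_tpro(...) once tpro is installed.
    print(" ".join(build_tpro_command(
        "transcript.json", "converted_transcript.json", "amazon", "universal",
        language_code="en-US")))
```

The sketch passes the service and format names through unchanged. The help text lists `universal` and `vo` as format names while the example in this README passes `universal_transcript`, so check `tpro --help` on your installed version for the accepted values.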
Example
$ cat transcript.json
{ "job": {
"lang": "en",
"user_id": 2152310,
"name": "recording.mp4",
"duration": 7,
"created_at": "Mon Nov 12 14:57:06 2018",
"id": 9871364
},
"speakers": [
{
"duration": "6.87",
"confidence": null,
"name": "M2",
"time": "5.98"
}
],
"words": [
{
"duration": "0.13",
"confidence": "0.670",
"name": "Hello",
"time": "5.98"
},
{
"duration": "0.45",
"confidence": "1.000",
"name": "there",
"time": "6.14"
}
]
}
$ tpro transcript.json converted_transcript.json speechmatics universal_transcript
[
{
"start": 5.98,
"end": 6.11,
"confidence": 0.67,
"word": "Hello",
"always_capitalized": false,
"punc_after": false,
"punc_before": false
},
{
"start": 6.14,
"end": 6.59,
"confidence": 1.0,
"word": "there",
"always_capitalized": false,
"punc_after": false,
"punc_before": false
}
]
☝☝☝ There's your transcript, which was saved to converted_transcript.json.
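To make the mapping in the example concrete, here is a rough Python sketch of how the Speechmatics fields line up with the universal transcript fields. It is inferred only from the sample input and output above and is not tpro's actual implementation. The function name is hypothetical, and the three boolean fields are filled with `False` as placeholders because the example does not show how tpro derives them.

```python
def speechmatics_words_to_universal(transcript):
    """Map Speechmatics-style word entries to universal-transcript-style dicts.

    Assumption drawn from the example: end = time + duration, rounded to
    two decimals, and string numbers become floats.
    """
    words = []
    for entry in transcript["words"]:
        start = float(entry["time"])
        end = round(start + float(entry["duration"]), 2)
        confidence = entry.get("confidence")
        words.append({
            "start": start,
            "end": end,
            "confidence": float(confidence) if confidence is not None else None,
            "word": entry["name"],
            # Placeholders: how tpro computes these is not shown in the example.
            "always_capitalized": False,
            "punc_after": False,
            "punc_before": False,
        })
    return words


if __name__ == "__main__":
    sample = {"words": [
        {"duration": "0.13", "confidence": "0.670", "name": "Hello", "time": "5.98"},
        {"duration": "0.45", "confidence": "1.000", "name": "there", "time": "6.14"},
    ]}
    for word in speechmatics_words_to_universal(sample):
        print(word)
```

Rounding the end time keeps floating-point sums such as 5.98 + 0.13 from picking up trailing digits.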
STT Services
Planned
Output Formats
- Universal Transcript (JSON)
- viraloverlay (JSON)
Planned
- Word (.doc, .docx)
- text files
- SRT (subtitles)
- Draft.js JSON