tpro processes transcripts from speech-to-text services and outputs to various formats.

tpro

Transcript Processing! tpro takes JSON-formatted transcripts produced by speech-to-text services (amazon, gentle, speechmatics) and converts them to standardized formats (universal_transcript, viral_overlay).

Installation and Usage

Non-pip Requirement: Stanford NER JAR

  • download and unzip the Stanford NER package
  • put these files in /usr/local/bin/:
    • stanford-ner.jar
    • classifiers/english.all.3class.distsim.crf.ser.gz
  • you might have to update Java on Linux
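
As a quick sanity check before running tpro, the file placement listed above can be verified with a short script. This is only a sketch: `missing_ner_files` and `REQUIRED_NER_FILES` are hypothetical names and not part of tpro, and the sketch assumes the two files sit under /usr/local/bin/ exactly as listed.

```python
from pathlib import Path

# Relative paths copied from the list above; the helper itself is hypothetical
# and is not shipped with tpro.
REQUIRED_NER_FILES = (
    "stanford-ner.jar",
    "classifiers/english.all.3class.distsim.crf.ser.gz",
)


def missing_ner_files(base_dir="/usr/local/bin"):
    """Return the required Stanford NER files that are absent under base_dir."""
    base = Path(base_dir)
    return [name for name in REQUIRED_NER_FILES if not (base / name).is_file()]
```

Calling `missing_ner_files()` with no arguments checks /usr/local/bin/ and returns an empty list when both files are in place.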

Pip

$ pip install tpro

Usage

$ tpro --help

Usage: tpro [OPTIONS] JSON_PATH_OR_DATA [amazon|gentle|speechmatics]
        [universal_transcript|viral_overlay]

Options:
  -s, --save TEXT  save to file
  --help           Show this message and exit.
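
To call the command from a script, the argument order in the help text above can be assembled programmatically. The following is a rough sketch, not part of tpro: `build_tpro_command` and `run_tpro` are hypothetical helpers, and the sketch always passes the service and output format explicitly even though the help text shows them in brackets.

```python
import subprocess

# Choices copied from the usage line above.
SERVICES = ("amazon", "gentle", "speechmatics")
OUTPUT_FORMATS = ("universal_transcript", "viral_overlay")


def build_tpro_command(json_path_or_data, service, output_format, save_path=None):
    """Build the argv list for a tpro call (hypothetical helper, not tpro's API)."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {output_format!r}")
    cmd = ["tpro"]
    if save_path is not None:
        # -s / --save takes the name of the file to write.
        cmd += ["--save", save_path]
    cmd += [json_path_or_data, service, output_format]
    return cmd


def run_tpro(*args, **kwargs):
    """Run tpro and return whatever it prints; requires tpro to be installed."""
    result = subprocess.run(
        build_tpro_command(*args, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For instance, `build_tpro_command("recording.json", "speechmatics", "universal_transcript", save_path="out.json")` yields the argument list for a call that saves its result to out.json; the file names here are placeholders.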

Example

$ tpro '{

    "job": {
      "lang": "en",
      "user_id": 2152310,
      "name": "recording.mp4",
      "duration": 7,
      "created_at": "Mon Nov 12 14:57:06 2018",
      "id": 9871364
    },
    "speakers": [
      {
        "duration": "6.87",
        "confidence": null,
        "name": "M2",
        "time": "5.98"
      }
    ],
    "words": [
      {
        "duration": "0.13",
        "confidence": "0.670",
        "name": "Hello",
        "time": "5.98"
      },
      {
        "duration": "0.45",
        "confidence": "1.000",
        "name": "there",
        "time": "6.14"
      }
  ]

}' speechmatics universal_transcript

[
    {
        "start": 5.98,
        "end": 6.11,
        "confidence": 0.67,
        "word": "Hello",
        "always_capitalized": false,
        "punc_after": false,
        "punc_before": false
    },
    {
        "start": 6.14,
        "end": 6.59,
        "confidence": 1.0,
        "word": "there",
        "always_capitalized": false,
        "punc_after": false,
        "punc_before": false
    }
]

$
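
The field mapping visible in the example above (time to start, time plus duration to end, name to word) can be sketched in a few lines of Python. This is an illustration inferred from the sample input and output only, not tpro's actual implementation: `speechmatics_to_universal` is a hypothetical name, rounding to two decimals is an assumption, and the always_capitalized, punc_after, and punc_before fields are left out because the example does not show how they are derived.

```python
def speechmatics_to_universal(data):
    """Map Speechmatics-style word entries to start/end/confidence/word dicts.

    Hypothetical helper inferred from the example; not part of tpro.
    """
    converted = []
    for entry in data["words"]:
        start = float(entry["time"])
        duration = float(entry["duration"])
        converted.append({
            "start": start,
            # Rounding to 2 decimals is an assumption that avoids float noise.
            "end": round(start + duration, 2),
            "confidence": float(entry["confidence"]),
            "word": entry["name"],
        })
    return converted


sample = {
    "words": [
        {"duration": "0.13", "confidence": "0.670", "name": "Hello", "time": "5.98"},
        {"duration": "0.45", "confidence": "1.000", "name": "there", "time": "6.14"},
    ]
}
result = speechmatics_to_universal(sample)
```

With the sample above, the start, end, confidence, and word values in `result` match those in the example output.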

STT Services

Planned

Output Formats

Planned

  • Word (.doc, .docx)
  • text files
  • SRT (subtitles)
