tpro processes transcripts from speech-to-text services and outputs to various formats.

tpro

Transcript Processing! tpro takes JSON-formatted transcripts produced by speech-to-text services and converts them to standardized formats.

Installation and Usage

Non-pip Requirement: Stanford NER JAR

  • download and unzip the Stanford NER distribution
  • put these files in /usr/local/bin/:
    • stanford-ner.jar
    • classifiers/english.all.3class.distsim.crf.ser.gz
  • you might have to update Java on Linux
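
The checklist above can be verified with a short script before running tpro. This is a minimal sketch, not part of tpro: the helper names `missing_ner_files` and `java_available` are hypothetical, and it only checks that the two files named above exist and that a `java` executable is on the PATH. It does not check the Java version.

```python
import shutil
from pathlib import Path

# The two files listed above, relative to /usr/local/bin/.
REQUIRED_NER_FILES = (
    "stanford-ner.jar",
    "classifiers/english.all.3class.distsim.crf.ser.gz",
)


def missing_ner_files(base_dir="/usr/local/bin"):
    """Return the required Stanford NER paths that are absent under base_dir.

    Hypothetical helper for illustration; tpro does not ship it.
    """
    base = Path(base_dir)
    return [
        str(base / rel)
        for rel in REQUIRED_NER_FILES
        if not (base / rel).is_file()
    ]


def java_available():
    """Report whether a `java` executable can be found on the PATH."""
    return shutil.which("java") is not None


if __name__ == "__main__":
    for path in missing_ner_files():
        print("missing:", path)
    if not java_available():
        print("java not found on PATH")
```

An empty list from `missing_ner_files()` means both files are where the instructions above expect them.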

Pip

$ pip install tpro

Usage

$ tpro --help

Usage: tpro [OPTIONS] JSON_PATH_OR_DATA [amazon|gentle|speechmatics]
        [universal_transcript|viral_overlay]

Options:
  -s, --save TEXT  save to file
  --help           Show this message and exit.
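
For scripted use, the command line shown in the help text can be assembled and run from Python. The sketch below is an illustration under stated assumptions: `build_tpro_command` and `run_tpro` are hypothetical helpers, not part of tpro, and the sketch assumes that `--save` takes a file path and that both the service and the output format are passed explicitly.

```python
import subprocess

# Choices copied from the usage line in the help text.
SERVICES = ("amazon", "gentle", "speechmatics")
OUTPUT_FORMATS = ("universal_transcript", "viral_overlay")


def build_tpro_command(json_path_or_data, service, output_format, save_path=None):
    """Build the argv list for a tpro call (hypothetical helper)."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {output_format!r}")
    cmd = ["tpro"]
    if save_path is not None:
        # Assumption: the TEXT argument of --save is the output file path.
        cmd += ["--save", save_path]
    cmd += [json_path_or_data, service, output_format]
    return cmd


def run_tpro(json_path_or_data, service, output_format, save_path=None):
    """Run tpro and return whatever it writes to stdout.

    Raises subprocess.CalledProcessError if tpro exits with a non-zero status.
    """
    cmd = build_tpro_command(json_path_or_data, service, output_format, save_path)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the first argument is inline JSON data instead of a path.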

Example

$ tpro '{

    "job": {
      "lang": "en",
      "user_id": 2152310,
      "name": "recording.mp4",
      "duration": 7,
      "created_at": "Mon Nov 12 14:57:06 2018",
      "id": 9871364
    },
    "speakers": [
      {
        "duration": "6.87",
        "confidence": null,
        "name": "M2",
        "time": "5.98"
      }
    ],
    "words": [
      {
        "duration": "0.13",
        "confidence": "0.670",
        "name": "Hello",
        "time": "5.98"
      },
      {
        "duration": "0.45",
        "confidence": "1.000",
        "name": "there",
        "time": "6.14"
      }
    ]

}' speechmatics universal_transcript

[
    {
        "start": 5.98,
        "end": 6.11,
        "confidence": 0.67,
        "word": "Hello",
        "always_capitalized": false,
        "punc_after": false,
        "punc_before": false
    },
    {
        "start": 6.14,
        "end": 6.59,
        "confidence": 1.0,
        "word": "there",
        "always_capitalized": false,
        "punc_after": false,
        "punc_before": false
    }
]

$
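
The field mapping visible in the example above can be sketched in a few lines of Python. This is inferred from the sample input and output only and is not tpro's actual implementation: the function name is hypothetical, rounding the end time to two decimals is an assumption, and the three boolean fields are filled with the placeholder value False because the example does not show how tpro derives them.

```python
def speechmatics_words_to_universal(words):
    """Map Speechmatics-style word entries to the universal transcript shape.

    Hypothetical sketch based on the example above:
      start      <- time
      end        <- time + duration (rounded to 2 decimals, an assumption)
      confidence <- confidence, parsed as a float
      word       <- name
    """
    converted = []
    for entry in words:
        start = float(entry["time"])
        end = round(start + float(entry["duration"]), 2)
        converted.append({
            "start": start,
            "end": end,
            "confidence": float(entry["confidence"]),
            "word": entry["name"],
            # Placeholders: the example only ever shows false for these.
            "always_capitalized": False,
            "punc_after": False,
            "punc_before": False,
        })
    return converted


if __name__ == "__main__":
    sample = [
        {"duration": "0.13", "confidence": "0.670", "name": "Hello", "time": "5.98"},
        {"duration": "0.45", "confidence": "1.000", "name": "there", "time": "6.14"},
    ]
    for word in speechmatics_words_to_universal(sample):
        print(word)
```

Note that the input stores numbers as strings, while the output uses JSON numbers, so each value is parsed with `float` before use.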

STT Services

Planned

Output Formats

Planned

  • Word (.doc, .docx)
  • text files
  • SRT (subtitles)
