
tpro processes transcripts from speech-to-text services and outputs to various formats.


tpro

Transcript Processing! tpro takes JSON-formatted transcripts produced by various speech-to-text services and converts them to standardized formats.

Installation and Usage

Non-pip Requirement: Stanford NER JAR

  • download and unzip the Stanford NER package
  • put these files in /usr/local/bin/:
    • stanford-ner.jar
    • classifiers/english.all.3class.distsim.crf.ser.gz
  • you might have to update Java on Linux
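
To confirm the two files listed above are in place before running tpro, a small check like the following may help. This is only a sketch: the function name `missing_ner_files` is hypothetical and not part of tpro, and it assumes the files keep the relative paths shown in the list.

```python
from pathlib import Path

# File names are taken from the requirement list above.
REQUIRED_NER_FILES = (
    "stanford-ner.jar",
    "classifiers/english.all.3class.distsim.crf.ser.gz",
)


def missing_ner_files(base_dir="/usr/local/bin"):
    """Return the required Stanford NER files that are absent from base_dir.

    Hypothetical helper for illustration; tpro itself does not provide it.
    """
    base = Path(base_dir)
    return [name for name in REQUIRED_NER_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_ner_files()
    if missing:
        print("Missing Stanford NER files:", ", ".join(missing))
    else:
        print("All Stanford NER files found.")
```

An empty list means both files were found in the given directory.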

Pip

$ pip install tpro

Usage

$ tpro --help

Usage: tpro [OPTIONS] JSON_PATH_OR_DATA [amazon|gentle|speechmatics]
        [universal_transcript|viral_overlay]

Options:
  -s, --save TEXT  save to file
  --help           Show this message and exit.
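
When calling tpro from a script, the argument order in the help text above can be assembled programmatically. The sketch below is an assumption-laden illustration: `build_tpro_command` is a hypothetical helper, not part of tpro, and the allowed values are copied from the usage line. It only builds the argument list and does not run anything.

```python
# Allowed values copied from the usage line above.
SERVICES = ("amazon", "gentle", "speechmatics")
OUTPUT_FORMATS = ("universal_transcript", "viral_overlay")


def build_tpro_command(json_path_or_data, service, output_format, save_path=None):
    """Build the argv list for a tpro call (hypothetical helper, not part of tpro)."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {output_format!r}")

    command = ["tpro"]
    if save_path is not None:
        # -s / --save writes the result to a file
        command += ["--save", save_path]
    command += [json_path_or_data, service, output_format]
    return command
```

The resulting list can be passed to `subprocess.run(command, capture_output=True, text=True, check=True)` once tpro is installed.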

Example

$ tpro '{

    "job": {
      "lang": "en",
      "user_id": 2152310,
      "name": "recording.mp4",
      "duration": 7,
      "created_at": "Mon Nov 12 14:57:06 2018",
      "id": 9871364
    },
    "speakers": [
      {
        "duration": "6.87",
        "confidence": null,
        "name": "M2",
        "time": "5.98"
      }
    ],
    "words": [
      {
        "duration": "0.13",
        "confidence": "0.670",
        "name": "Hello",
        "time": "5.98"
      },
      {
        "duration": "0.45",
        "confidence": "1.000",
        "name": "there",
        "time": "6.14"
      }
  ]

}' speechmatics universal_transcript

[
    {
        "start": 5.98,
        "end": 6.11,
        "confidence": 0.67,
        "word": "Hello",
        "always_capitalized": false,
        "punc_after": false,
        "punc_before": false
    },
    {
        "start": 6.14,
        "end": 6.59,
        "confidence": 1.0,
        "word": "there",
        "always_capitalized": false,
        "punc_after": false,
        "punc_before": false
    }
]

$
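
The field mapping visible in the example above can be sketched in Python. This is inferred only from the sample input and output, not taken from tpro's source: `speechmatics_to_universal` is a hypothetical name, `end` is assumed to be `time + duration` rounded to two decimals, and the three boolean fields are set to `False` as placeholders because the example does not show how they are derived.

```python
def speechmatics_to_universal(transcript):
    """Map Speechmatics-style words to the universal_transcript shape.

    Illustrative sketch based on the example above, not tpro's implementation.
    """
    words = []
    for item in transcript["words"]:
        start = float(item["time"])
        # Assumption: end time is start plus duration, rounded to 2 decimals
        end = round(start + float(item["duration"]), 2)
        words.append({
            "start": start,
            "end": end,
            "confidence": float(item["confidence"]),
            "word": item["name"],
            # Placeholders: the example shows False for both words
            "always_capitalized": False,
            "punc_after": False,
            "punc_before": False,
        })
    return words


if __name__ == "__main__":
    import json

    sample = {
        "words": [
            {"duration": "0.13", "confidence": "0.670", "name": "Hello", "time": "5.98"},
            {"duration": "0.45", "confidence": "1.000", "name": "there", "time": "6.14"},
        ]
    }
    print(json.dumps(speechmatics_to_universal(sample), indent=4))
```

Note that Speechmatics delivers numbers as strings, so each value is converted with `float` before use.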

STT Services

Planned

Output Formats

Planned

  • Word (.doc, .docx)
  • text files
  • SRT (subtitles)
