SpeechKitty

SpeechKitty is a wrapper for two ASR services, Yandex SpeechKit and whisperX (based on OpenAI's Whisper), intended to transcribe audio recordings asynchronously.

NOTE

This is a very early version of the package. It works well in my case with Asterisk recordings, but it has not been tested in other use cases or with other recordings, so you may want to wait for version 0.2 before trying it.

Key features:

  1. Scans a directory recursively for wav files.
  2. Applies a regex mask to include and exclude certain files.
  3. Skips files that have already been transcribed.
  4. Handles all intermediate work, such as converting audio files and uploading them to object storage.
  5. Transcribes the audio and puts the resulting json and html files in the directory next to the audio files.
  6. Can obfuscate the html files' names using a hash.
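
The first three features can be pictured with a small sketch. This is not the package's actual code: the function name find_pending_wavs, its parameters, and the rule that a transcript is a json file with the same stem next to the audio are all assumptions made for illustration.

```python
import re
from pathlib import Path
from typing import Iterator, Optional


def find_pending_wavs(
    root: Path,
    include: Optional[str] = None,
    exclude: Optional[str] = None,
) -> Iterator[Path]:
    """Yield wav files under root that still need a transcript (hypothetical helper)."""
    include_re = re.compile(include) if include else None
    exclude_re = re.compile(exclude) if exclude else None
    # rglob walks the directory tree recursively
    for wav in sorted(root.rglob("*.wav")):
        # Match the masks against the path relative to root, with "/" separators
        rel = wav.relative_to(root).as_posix()
        if include_re and not include_re.search(rel):
            continue
        if exclude_re and exclude_re.search(rel):
            continue
        # Assumption: "already transcribed" means a same-named .json sits next to the audio
        if wav.with_suffix(".json").exists():
            continue
        yield wav
```

For example, list(find_pending_wavs(Path("/mnt/Records"), exclude=r"^archive/")) would return the untranscribed wav files outside a hypothetical archive subdirectory.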

Usage

You can use it as a Python package or as a Docker container.

Prerequisites

-OR-

Python Package

  1. Install the required ffmpeg library.

  2. Create a venv (preferably) and install the package:

pip install speechkitty

  3. Download the scripts from the sample directory on the project page.

  4. Fill in the credentials in .env.

  5. Start transcribing a directory (/mnt/Records in the example below):

python transcribe_directory.py /mnt/Records
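
Since the steps above depend on ffmpeg being installed, it can help to confirm that the ffmpeg executable is on PATH before starting a long job. The check below is only a sketch and is not part of the package; the helper name ffmpeg_available is an assumption.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable can be found on PATH (hypothetical helper)."""
    # shutil.which returns the full path of the executable, or None if it is not found
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found")
    else:
        print("ffmpeg not found; install it before transcribing")
```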

Docker Container

  1. Install Docker.

  2. Download the project's code from the project page on GitHub.

  3. Put the credentials into the .env file.

  4. Build the Docker image. To do so, open the project directory in a terminal and run:

docker build -t speechkitty .

Building the image may take a while. After it finishes:

  5. Run the container. Assuming your records are in /mnt/Records and/or its subdirectories, the terminal's current directory is the project directory, and the .env file is in the sample directory, the command looks like this:

docker run -i --rm --env-file sample/.env -v /mnt/Records:/mnt/Records \
speechkitty /bin/bash -c "python sample/transcribe_directory.py /mnt/Records"

Alternatively, use the shell script:

source sample/transcribe_directory.sh /mnt/Records

To name the html files after a hash of the audio file names, pass the hash function as a second parameter:

source sample/transcribe_directory.sh /mnt/Records md5

This can be useful if the records directory is published by a web server (with directory listing disabled, of course) and you don't want to reveal the audio file names, so that the files cannot be downloaded via a direct link. In that case, you can put something like SELECT CONCAT(TO_HEX(MD5(recordingfile)), ".html") AS transcript into a DB view to get the names of the html files.
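
The same mapping as the SQL expression above can be sketched in Python, for instance to look up a transcript from application code. The helper name hashed_html_name is an assumption. So is the hash input: the sketch hashes the audio file name as a UTF-8 string, exactly as it would be stored in the recordingfile column, and the package itself may hash something different (for example, the name without its extension).

```python
import hashlib


def hashed_html_name(audio_name: str, algorithm: str = "md5") -> str:
    """Return '<hex digest of audio_name>.html' (hypothetical helper)."""
    # Assumption: the hash input is the audio file name encoded as UTF-8
    digest = hashlib.new(algorithm, audio_name.encode("utf-8")).hexdigest()
    return f"{digest}.html"
```

The digest is lowercase hexadecimal, so hashed_html_name("call-0001.wav") returns a 32-character MD5 digest followed by ".html"; the file name here is made up for the example.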

The transcription job may take a while. A good sign that it is working is the appearance of new json and html files in the records directory.
