A wrapper for the Yandex SpeechKit API to asynchronously transcribe audio recordings.
SpeechKitty
SpeechKitty is a wrapper for the Yandex SpeechKit API to asynchronously transcribe audio recordings.
NOTE
This is a very early version of the package. It works well in my case with Asterisk recordings, but it has not been tested in other use cases or with other recordings, so you may want to wait for version 0.2 before trying it.
Key features:
- Scans a directory recursively for wav files.
- Applies regex masks to include and exclude certain files.
- Skips files that are already transcribed.
- Does all the intermediate work, such as converting audio files and uploading them to object storage.
- Transcribes the audio and puts json and html files into the directory next to the audio files.
- Can obfuscate the html file names using a hash.
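To make the first three features concrete, here is a minimal sketch of how such a scan could work. It is not the package's actual code: the function name `find_pending_wavs`, its parameters, and the assumption that a transcript is a `.json` file with the same stem as the audio file are all illustrative.

```python
import re
from pathlib import Path
from typing import Iterator, Optional


def find_pending_wavs(
    root: Path,
    include: Optional[str] = None,
    exclude: Optional[str] = None,
) -> Iterator[Path]:
    """Yield .wav files under root that still appear to need transcribing."""
    include_re = re.compile(include) if include else None
    exclude_re = re.compile(exclude) if exclude else None
    # rglob walks the directory tree recursively.
    for wav in sorted(root.rglob("*.wav")):
        name = wav.name
        if include_re and not include_re.search(name):
            continue  # does not match the include mask
        if exclude_re and exclude_re.search(name):
            continue  # matches the exclude mask
        # Assumption: a transcript is a .json file with the same stem.
        if wav.with_suffix(".json").exists():
            continue  # looks already transcribed
        yield wav
```

For example, `list(find_pending_wavs(Path("/mnt/Records"), exclude=r"^out-"))` would list the untranscribed files whose names do not start with `out-`.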
Usage
You can use it as a Python package or as a Docker container.
Prerequisites
- Yandex Cloud account.
- Bucket at Object Storage.
- Static access key for Object Storage.
- API key for SpeechKit.
Python Package
- Install the required ffmpeg library.
- Create a venv (preferably) and install the package:
pip install speechkitty
- Download the scripts from the sample directory on the project page:
  - credentials-example.ini (rename it to credentials.ini)
  - transcribe_directory.py
- Fill in your credentials in credentials.ini.
- Start transcribing a directory (/mnt/Records in the example below):
export $(grep = credentials.ini | xargs)
python transcribe_directory.py /mnt/Records
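The `export $(grep = credentials.ini | xargs)` line puts every `KEY=VALUE` line of credentials.ini into the environment. If you would rather do this from Python, a rough equivalent might look like the sketch below. The helper name `load_credentials` is hypothetical, and unlike the shell one-liner it also skips comment lines.

```python
import os
from pathlib import Path
from typing import Dict


def load_credentials(path: Path) -> Dict[str, str]:
    """Parse KEY=VALUE lines, ignoring comments and lines without '='."""
    creds: Dict[str, str] = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        # Section headers, blank lines and comments are skipped.
        if "=" not in line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        creds[key.strip()] = value.strip()
    return creds


# Hypothetical usage before calling the transcription code:
#   os.environ.update(load_credentials(Path("credentials.ini")))
```

Splitting on the first `=` only keeps values that themselves contain `=` (as some keys do) intact.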
Docker Container
- Install Docker.
- Download the project's code from the project page on GitHub.
- Retrieve your credentials from Yandex Cloud and put them into the credentials.ini file.
- Build the Docker image. To do that, open the project directory in a terminal and run:
docker build -t speechkitty .
Building the image may take a while. After it finishes:
- Run the container. Assuming your recordings are in /mnt/Records and/or its subdirectories, the terminal's current directory is the project directory, and the credentials.ini file is in the sample directory, the command looks like this:
grep = sample/credentials.ini > sample/credentials.txt
docker run -i --rm --env-file sample/credentials.txt -v /mnt/Records:/mnt/Records \
speechkitty /bin/bash -c "python sample/transcribe_directory.py /mnt/Records"
Alternatively, you can use the shell script:
source sample/transcribe_directory.sh /mnt/Records
To name the html files after a hash of the audio file names, pass the hash function as a second parameter, like this:
source sample/transcribe_directory.sh /mnt/Records md5
This can be useful if the records directory is published through a web server (with directory listing disabled, of course) and you don't want to reveal the audio file names, so that the audio files cannot be downloaded via a direct link. In that case, you can put something like the following into a DB view to get the names of the html files:
SELECT CONCAT(TO_HEX(MD5(recordingfile)), ".html") AS transcript
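For reference, the same name could be computed in Python roughly as follows. This is only a sketch: the function name `hashed_html_name` is hypothetical, and it assumes the hash is taken over the file name string exactly as stored in `recordingfile`. Whether the package hashes the base name, the full path, or the name without its extension is not stated here, so check the output against a real transcript before relying on it.

```python
import hashlib


def hashed_html_name(audio_name: str, algorithm: str = "md5") -> str:
    """Return '<hex digest of audio_name>.html' for the given hash algorithm."""
    # Assumption: the name is hashed as UTF-8 bytes, unchanged.
    digest = hashlib.new(algorithm, audio_name.encode("utf-8")).hexdigest()
    return f"{digest}.html"
```

With `md5` the result is 32 hex characters followed by `.html`, which matches the shape produced by the SQL expression above.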
The transcription job may take a while. A good sign that it is working is the appearance of new json and html files in the records directory.
File details
Details for the file speechkitty-0.1.5.tar.gz.
File metadata
- Download URL: speechkitty-0.1.5.tar.gz
- Upload date:
- Size: 1.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7726639a389146094f99e92ff68389d10c984dc3cc21a1cc5b95a3870e9b6543
MD5 | 9717d559a786d88e2e46ee0a3633720d
BLAKE2b-256 | b7a5afc7df1bc20ca1ac173a6b8107df178dec217830104bcd8ded809efa5637
File details
Details for the file speechkitty-0.1.5-py3-none-any.whl.
File metadata
- Download URL: speechkitty-0.1.5-py3-none-any.whl
- Upload date:
- Size: 7.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 11cf25ef6d8929fa0660591c3f49ac3eb8fbb7e1120017634692c1af03e3f8c7
MD5 | 6cccaf255eafcdda846f47074ebd7b79
BLAKE2b-256 | 82ff8bcfa70e4417e6cdcbb1460595db02dd04797b363dc0357118e4579819ef