Mozilla's DeepSpeech transcriber in a pip installable package.

Project description

pydeepspeech

Why you need this

Mozilla's DeepSpeech can't process long voice samples. pydeepspeech works around this by "chunking" the input audio into separate WAV files that are then processed individually. The WAV files are cut along periods of detected silence, controlled by the aggressive parameter.
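
To make the chunking idea concrete, here is a minimal sketch of cutting audio along silence. It is not pydeepspeech's actual implementation: the function name split_on_silence, the peak-amplitude test, and the frame_len, threshold and min_silence_frames parameters are all assumptions made for illustration. The threshold plays a role loosely similar to the aggressive parameter, but how pydeepspeech maps that parameter internally is not documented here.

```python
from typing import List, Tuple


def split_on_silence(
    samples: List[int],
    frame_len: int = 160,         # hypothetical: samples per analysis frame
    threshold: int = 500,         # hypothetical: peak amplitude that counts as sound
    min_silence_frames: int = 3,  # hypothetical: quiet frames needed to cut
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of each non-silent chunk."""
    chunks: List[Tuple[int, int]] = []
    chunk_start = None    # index where the current chunk began, if any
    last_voiced_end = 0   # end index of the most recent loud frame
    silent_run = 0        # consecutive quiet frames seen inside a chunk

    for pos in range(0, len(samples), frame_len):
        frame = samples[pos:pos + frame_len]
        loud = max(abs(s) for s in frame) >= threshold
        if loud:
            if chunk_start is None:
                chunk_start = pos
            last_voiced_end = pos + len(frame)
            silent_run = 0
        elif chunk_start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                # Enough silence: close the chunk at the last loud frame.
                chunks.append((chunk_start, last_voiced_end))
                chunk_start = None
                silent_run = 0

    if chunk_start is not None:
        chunks.append((chunk_start, last_voiced_end))
    return chunks
```

Each returned pair could then be written out as its own WAV file and transcribed separately. Short pauses (fewer than min_silence_frames quiet frames) stay inside a chunk, so words are less likely to be cut in half.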

Beyond this, pydeepspeech is also much simpler to install than Mozilla's DeepSpeech, because the data models it requires are automatically downloaded and installed on first use.

Quick start

Console API:

$ pip install pydeepspeech
$ pydeepspeech --wav_file <WAVE_FILE> --aggressive 1 --out_file <TEXT_FILE>

-or-

$ pip install pydeepspeech
$ pydeepspeech --wav_file <WAVE_FILE> --out_file <TEXT_FILE> --model_dir <MY_PBMM_AND_SCORER_FILES>

-or-

$ pip install pydeepspeech
$ pydeepspeech_installmodels --pbmm <PBMM_FILE_OR_URL> --scorer <SCORER_FILE_OR_URL>
$ pydeepspeech --wav_file <WAVE_FILE> --out_file <TEXT_FILE>

Or in Python:

from pydeepspeech.transcribe import transcribe
transcribe(...)
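
The arguments of transcribe are not spelled out here, so the following sketch takes a different route: it drives the console command from Python using only the flags shown in the quick start (--wav_file, --out_file, --aggressive, --model_dir). The helper names build_command and transcribe_via_cli are hypothetical, and reading the output file as UTF-8 text is an assumption.

```python
import subprocess
from typing import List, Optional


def build_command(
    wav_file: str,
    out_file: str,
    aggressive: Optional[int] = None,
    model_dir: Optional[str] = None,
) -> List[str]:
    """Build the pydeepspeech command line from the documented flags."""
    cmd = ["pydeepspeech", "--wav_file", wav_file, "--out_file", out_file]
    if aggressive is not None:
        cmd += ["--aggressive", str(aggressive)]
    if model_dir is not None:
        cmd += ["--model_dir", model_dir]
    return cmd


def transcribe_via_cli(wav_file: str, out_file: str, **options) -> str:
    """Run the console tool and return the contents of the output file."""
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_command(wav_file, out_file, **options), check=True)
    # Assumption: the tool writes plain UTF-8 text to out_file.
    with open(out_file, encoding="utf-8") as handle:
        return handle.read()
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces.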

Optional: Create a virtual Python environment

Download the helper script and set up the virtual environment:

# Download
curl -X GET https://raw.githubusercontent.com/zackees/make_venv/main/make_venv.py -o make_env.py
python make_env.py  # Make the environment
source activate.sh  # Enter environment
pip install pydeepspeech  # Install into the environment

To get back into the environment later, run source activate.sh (on Windows, you must be using git-bash).

Testing

Testing and linting are simple: just run tox.

$ pip install tox
$ tox

Download files

Download the file for your platform.

Source Distribution

pydeepspeech-1.1.7.tar.gz (11.1 kB)

Uploaded Source

Built Distribution

pydeepspeech-1.1.7-py2.py3-none-any.whl (15.3 kB)

Uploaded Python 2, Python 3

File details

Details for the file pydeepspeech-1.1.7.tar.gz.

File metadata

  • Download URL: pydeepspeech-1.1.7.tar.gz
  • Upload date:
  • Size: 11.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.9.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.0

File hashes

Hashes for pydeepspeech-1.1.7.tar.gz

Algorithm    Hash digest
SHA256       1b56517d032791714358e9d73a926886e22f748202ea67798b9f4fccf332be2c
MD5          e35d02201ac0752a9806d1246a7a5aff
BLAKE2b-256  f15476c4a7880a27b83f38573c4c75c9c636d41d4e25785ff5e3d655677ad0c4

File details

Details for the file pydeepspeech-1.1.7-py2.py3-none-any.whl.

File metadata

  • Download URL: pydeepspeech-1.1.7-py2.py3-none-any.whl
  • Upload date:
  • Size: 15.3 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.9.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.0

File hashes

Hashes for pydeepspeech-1.1.7-py2.py3-none-any.whl

Algorithm    Hash digest
SHA256       22597a259cf4faef6e9104e3780ba8923e5b03debfed46c52956f7f95336cc59
MD5          be0da214074e7b39d691d0dae0c6044d
BLAKE2b-256  7dff8dfb90c3b4c87d78deefd4530605393d7d2fb159ffa84f51adfca4cb783b
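
If you download one of these files by hand, its SHA256 digest can be checked against the value listed above. The sketch below uses only the Python standard library; the function names sha256_of_file and matches_published_hash are hypothetical and not part of pydeepspeech.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_published_hash("pydeepspeech-1.1.7.tar.gz", "<SHA256 from the table>") should return True for an intact download of the source distribution.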
