A library for running inference on a DeepSpeech model
Project description
DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu’s Deep Speech research paper. Project DeepSpeech uses Google’s TensorFlow to make the implementation easier.
To install and use deepspeech all you have to do is:
# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-venv/
source $HOME/tmp/deepspeech-venv/bin/activate
# Install DeepSpeech
pip3 install deepspeech
# Download pre-trained English model and extract
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.5.1/deepspeech-0.5.1-models.tar.gz
tar xvf deepspeech-0.5.1-models.tar.gz
# Download example audio files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.5.1/audio-0.5.1.tar.gz
tar xvf audio-0.5.1.tar.gz
# Transcribe an audio file
deepspeech --model deepspeech-0.5.1-models/output_graph.pbmm --lm deepspeech-0.5.1-models/lm.binary --trie deepspeech-0.5.1-models/trie --audio audio/2830-3980-0043.wav
A pre-trained English model is available and can be downloaded using the instructions above. Currently, the Python client supports only 16-bit, 16 kHz, mono-channel WAVE audio files. A package with example audio files is available for download from the release notes.
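Because the client accepts only 16-bit, 16 kHz, mono WAVE input, it can help to check a file before transcribing it. The following is a minimal sketch that uses only Python's standard `wave` module; the function name `check_wav_format` is hypothetical and is not part of the deepspeech package.

```python
import wave


def check_wav_format(path):
    """Return a list of problems found in a WAVE file.

    An empty list means the file is 16-bit, 16 kHz, mono.
    Note: this helper is illustrative and not part of deepspeech.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
        frame_rate = wav.getframerate()    # samples per second

    if channels != 1:
        problems.append(f"expected 1 channel, found {channels}")
    if sample_width != 2:  # 2 bytes per sample is 16-bit audio
        problems.append(f"expected 16-bit samples, found {8 * sample_width}-bit")
    if frame_rate != 16000:
        problems.append(f"expected 16000 Hz, found {frame_rate} Hz")
    return problems
```

A file that fails the check would need to be converted with an audio tool of your choice before being passed to `deepspeech --audio`.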
Quicker inference can be performed using a supported NVIDIA GPU on Linux. See the release notes to find which GPUs are supported. To run deepspeech on a GPU, install the GPU-specific package:
# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
source $HOME/tmp/deepspeech-gpu-venv/bin/activate
# Install DeepSpeech CUDA enabled package
pip3 install deepspeech-gpu
# Transcribe an audio file.
deepspeech --model deepspeech-0.5.1-models/output_graph.pbmm --lm deepspeech-0.5.1-models/lm.binary --trie deepspeech-0.5.1-models/trie --audio audio/2830-3980-0043.wav
Please ensure you have the required CUDA dependencies.
See the output of deepspeech -h for more information on the use of deepspeech. If you experience problems running deepspeech, please check the required runtime dependencies.
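The command line shown earlier can also be driven from a script. The sketch below builds the same `deepspeech` invocation with the `--model`, `--lm`, `--trie` and `--audio` flags from the instructions above and runs it with `subprocess`. The helper names are hypothetical, and it is an assumption that the transcript is written to standard output; check `deepspeech -h` for the options available in your installed version.

```python
import subprocess


def build_deepspeech_command(model_dir, audio_path):
    """Build the argument list for the deepspeech command-line client.

    Uses the file names from the 0.5.1 model archive shown in the
    instructions (output_graph.pbmm, lm.binary, trie).
    """
    return [
        "deepspeech",
        "--model", f"{model_dir}/output_graph.pbmm",
        "--lm", f"{model_dir}/lm.binary",
        "--trie", f"{model_dir}/trie",
        "--audio", audio_path,
    ]


def transcribe(model_dir, audio_path):
    """Run the client and return whatever it prints on stdout.

    Assumption: the transcript is printed on stdout. A non-zero
    exit status raises subprocess.CalledProcessError.
    """
    result = subprocess.run(
        build_deepspeech_command(model_dir, audio_path),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()
```

For example, `transcribe("deepspeech-0.5.1-models", "audio/2830-3980-0043.wav")` would run the same command as the transcription step in the instructions, provided the virtualenv containing `deepspeech` is active.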
Download files
Built Distributions
Hashes for deepspeech_gpu-0.6.0a15-cp38-cp38-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 85f8355ec395760d5c47471b6132bf904b1bdfc94e1ba56497ebc24ebc070a60
MD5 | f538a39dca73fa916b0bbea361b270c3
BLAKE2b-256 | 4b6db2892cc0586b2951d08d2db8069ffe9adb1e76dce17f2449c887b24a75dd
Hashes for deepspeech_gpu-0.6.0a15-cp38-cp38-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 99efab044cc5e369a071194d3b5acc38b76ec49c82c23fd06f5007118cccd9aa
MD5 | 27a98a93a34939a36654de17650a3f45
BLAKE2b-256 | 58d6d3115e2479686817179d28c390c59ae87e079799f0148f6f0c93b16a6baf
Hashes for deepspeech_gpu-0.6.0a15-cp37-cp37m-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | a57d2f482dc6eb3b23d28128ab350b0a93238cf7119c035bb6d0a50c629fcfb5
MD5 | 75a5ca3056c217ad15d583e2447e2100
BLAKE2b-256 | b66c43a0a48a10193f8540e4c0efe873147d67eacd9b77d4ba30ba2f80577528
Hashes for deepspeech_gpu-0.6.0a15-cp37-cp37m-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | cdf9b32a6918b2db5237d0539abe01ce223d775a8cce74e78bc7df491da13d07
MD5 | 8e75788bfda22766773c6871749dde75
BLAKE2b-256 | 48577594a95c3d8222f277e81ef06e8734eeb321cb4d2c2bc02cece0feaa8f6b
Hashes for deepspeech_gpu-0.6.0a15-cp36-cp36m-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 342e4cc83626e78cb6441a49ae338ae5b56c11591c657bc4f5927876d7c53315
MD5 | 6eb7f6f6c9560410f37801a33ed691b7
BLAKE2b-256 | bd1b23775b05e558767d9c49f8b9e885cbf11cadd86f917df930b5fb896eafbb
Hashes for deepspeech_gpu-0.6.0a15-cp36-cp36m-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 8ae4ede351d81becd815ad62207b054c9fb21e72720e19157cb7c2bf7ad8d532
MD5 | 0f753a7449f08b1cabed83f90d59d3b1
BLAKE2b-256 | a0aa5353685eefa5ff3f532e3229ac73d339bba7074c003ebc0ef93c23a8e3de
Hashes for deepspeech_gpu-0.6.0a15-cp35-cp35m-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 3603a82b63ded4284f0d8b713976502700be3174c8b24b58abddbf0f0b2524c6
MD5 | 299c52f054237d7aaffb92ee2ad23fda
BLAKE2b-256 | 3e417e879669c2e6c638e57d79ec989fb1e7ac1eaae168927dc367a2003b33e5
Hashes for deepspeech_gpu-0.6.0a15-cp35-cp35m-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 4dc58e70207d4a1f6a0d0935aca1faa7faeee010c75f50a93cf47d2b0d6294c7
MD5 | 1076a2b31e1d70efb5f7fb4b8b32cb67
BLAKE2b-256 | eca0613798fb38f7fecf1cd1a77bc485cb14a2296124ed63d4d72da17d2a25a1
Hashes for deepspeech_gpu-0.6.0a15-cp34-cp34m-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 4f3afcd62634edd28c7fc884813f107cf039b727a3ad82d8cb8e16cb728ad4ec
MD5 | 58318246aac9c420bba84be4fd36d342
BLAKE2b-256 | 98f760fa98c1caf51b0e8c77a33ce16df17c1c5f7a75374de8dd51388ed9a43b
Hashes for deepspeech_gpu-0.6.0a15-cp27-cp27mu-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 7f2123632f74e548a0a026aaf4305b842624a91306ba1f83e499d6eae77b968c
MD5 | 30ebd22a42381df56bf9e40d1e57d4da
BLAKE2b-256 | 042cffbcb2015e767de6d5caf0608cd19a411eaef8539d1b213d979484c52589
Hashes for deepspeech_gpu-0.6.0a15-cp27-cp27m-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 95e777a05484113620c2c916bb700baaae93831e812cfda9fbddecb9fe17606e
MD5 | e4c6f9ac2c613eee09a3fdda64937cb4
BLAKE2b-256 | 773fb1682075d0b8d1c84c4991dd721722b2adb3f724fe9e2e7fad6060156754
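The digests listed above can be used to confirm that a downloaded wheel is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_digest` are hypothetical and are not provided by deepspeech or pip.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks
    so that large wheels are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published value,
    ignoring surrounding whitespace and letter case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

To use it, pass the path of the downloaded wheel and the SHA256 value from the table that matches its file name; a result of `False` suggests the download is corrupt or is a different file.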