Server for Mozilla DeepSpeech
Key Features
This is an HTTP server that can be used to test the Mozilla DeepSpeech project. You need an environment with DeepSpeech and a model to run this server.
Installation
You first need to install DeepSpeech. Depending on your system, you can use the CPU package:
pip3 install deepspeech
Or the GPU package:
pip3 install deepspeech-gpu
Then you can install the deepspeech server:
python3 setup.py install
The server is also available on pypi, so you can install it with pip:
pip3 install deepspeech-server
Note that Python 3.5 is the minimum version required to run the server.
Starting the server
deepspeech-server --config config.json
You can use DeepSpeech without training a model yourself. Pre-trained models are provided by Mozilla on the releases page of the project (see the assets section of the release notes):
https://github.com/mozilla/DeepSpeech/releases
Once you have downloaded a pre-trained model, you can untar it and directly use the sample configuration file:
cp config.sample.json config.json
deepspeech-server --config config.json
Server configuration
The configuration is done with a JSON file, provided with the “--config” argument. Its structure is as follows:
{
    "deepspeech": {
        "model": "models/output_graph.pb",
        "lm": "models/lm.binary",
        "trie": "models/trie",
        "features": {
            "beam_width": 500,
            "lm_alpha": 0.75,
            "lm_beta": 1.85
        }
    },
    "server": {
        "http": {
            "host": "0.0.0.0",
            "port": 8080,
            "request_max_size": 1048576
        }
    },
    "log": {
        "level": [
            { "logger": "deepspeech_server", "level": "DEBUG" }
        ]
    }
}
The configuration file contains several sections and sub-sections.
deepspeech section configuration
Section “deepspeech” contains the configuration of the DeepSpeech engine:
- model is the protobuf model file that was generated by DeepSpeech.
- lm is the language model.
- trie is the trie file.
- features contains the feature settings that were used to train the model. This field can be set to null to keep the default settings.
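For example, to keep the engine defaults, the features block can simply be set to null (the model paths shown are the ones from the sample configuration):

```json
"deepspeech": {
    "model": "models/output_graph.pb",
    "lm": "models/lm.binary",
    "trie": "models/trie",
    "features": null
}
```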
server section configuration
Section “server” contains the configuration of the access part, with one subsection per protocol:
http section configuration
- request_max_size (default value: 1048576, i.e. 1 MiB) is the maximum payload size allowed by the server. Requests with a payload larger than this threshold are rejected with a “413: Request Entity Too Large” error.
- host (default value: “0.0.0.0”) is the listen address of the HTTP server.
- port (default value: 8080) is the listening port of the HTTP server.
log section configuration
The log section can be used to set the log levels of the server. This section contains a list of log entries. Each entry contains the name of a logger and its level, both following the conventions of the Python logging module.
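Following the Python logging convention, an empty logger name refers to the root logger, so a catch-all entry can be combined with a more verbose one for the server itself. For instance:

```json
"log": {
    "level": [
        { "logger": "", "level": "WARNING" },
        { "logger": "deepspeech_server", "level": "DEBUG" }
    ]
}
```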
Using the server
Inference on the model is done via HTTP POST requests, for example with the following curl command:
curl -X POST --data-binary @testfile.wav http://localhost:8080/stt
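For clients written in Python, the same request can be issued with the standard library alone. The transcribe helper below is an illustrative sketch, not part of the server package; the URL assumes the default host and port from the sample configuration:

```python
# Minimal client sketch for the /stt endpoint, using only the
# Python standard library.
from urllib.request import Request, urlopen


def transcribe(wav_path, url="http://localhost:8080/stt"):
    """POST the raw WAV bytes to the server and return the transcript."""
    with open(wav_path, "rb") as f:
        audio = f.read()
    req = Request(url, data=audio)
    with urlopen(req) as response:
        return response.read().decode("utf-8")
```

Keep in mind that files larger than request_max_size will be rejected with a 413 error, so long recordings may need to be split before being sent.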
Project details
Download files
Source Distribution
File details
Details for the file deepspeech-server-2.0.0.tar.gz
File metadata
- Download URL: deepspeech-server-2.0.0.tar.gz
- Upload date:
- Size: 5.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.15.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/45.1.0 requests-toolbelt/0.9.1 tqdm/4.42.0 CPython/3.5.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 452495225de8e244a4edc1052349edca8db60735f88cd3872ccf3226c801deb6
MD5 | da03391dacd26d8ec26ffd3f0e82cbc9
BLAKE2b-256 | 77faa7286f58214d696f708d1ba7ca2954b25b86f37fa221fe7e981869fde48b