sonosco·PyPI

Framework for training deep automatic speech recognition models.

# Sonosco

Sonosco (from Lat. sonus - sound and nōscō - I know, recognize) is a library for training and deploying deep speech recognition models.

The goal of this project is to enable fast, repeatable and structured training of deep automatic speech recognition (ASR) models, and to provide a transcription server (REST API & frontend) for trying out the trained models.
Additionally, we provide interfaces to ROS so that Sonosco can be used with the anthropomimetic robot Roboy.

Installation

Via pip

The easiest way to use Sonosco's functionality is via pip:

pip install sonosco

Note: Sonosco requires Python 3.6 or higher.

For reliability, we recommend installing Sonosco into an isolated environment created with a tool such as virtualenv or conda.

For developers or trying out the transcription server

Clone the repository and install dependencies:

# Clone the repo and cd inside it
git clone https://github.com/Roboy/sonosco.git && cd sonosco

# Create a virtual python environment to not pollute the global setup
python -m venv venv

# Activate the virtual environment
source venv/bin/activate

# Install normal requirements
pip install -r requirements.txt

# Link your local sonosco clone into your virtual environment
pip install -e .

Now you can check out some of the Getting Started tutorials to train a model or use the transcription server.

Quick Start

Dockerized inference server

Get hold of our new, fully trained models from the latest release! Try out the LAS model for the best performance. Then pass the folder containing the model to the runner script as shown below.

The Docker image is available on Docker Hub as `yuriyarabskyy/sonosco-inference:1.0`. Run `cd server && ./run.sh yuriyarabskyy/sonosco-inference:1.0` to pull the image and start the server, or build your own image with the following commands.

cd server

# Build the docker image
./build.sh

# Run the built image
./run.sh sonosco_server

You can also specify the path to your own models by running `./run.sh <image_name> <path/to/models>`.

Open http://localhost:5000 in Chrome. You should be able to add models for performing transcription by clicking on the plus button. Once the models are added, record your own voice by clicking on the record button. You can replay and transcribe with the corresponding buttons.
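
Before opening the browser, it can help to confirm that the container is actually answering on port 5000. The following is only a sketch using the Python standard library: `wait_for_server` is a hypothetical helper, not part of Sonosco, and it requests nothing but the root URL mentioned above, since the REST endpoints themselves are not described here.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url="http://localhost:5000", timeout=30.0, interval=1.0):
    """Poll `url` until it returns any HTTP response or `timeout` seconds pass.

    Hypothetical helper for illustration; the default URL is the one from
    the text above, everything else is an assumption.
    """
    # Talk to the server directly, ignoring any proxy set in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, just not with a 2xx status code.
            return True
        except (urllib.error.URLError, OSError):
            pass  # connection refused or timed out: not reachable yet
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    if wait_for_server():
        print("Server is up, open http://localhost:5000 in Chrome.")
    else:
        print("Server did not respond; check the output of ./run.sh.")
```

The function only reports whether something responds at the address; it does not verify that any models were loaded.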

You can get pretrained models from the release tab in this repository.

High Level Design

The project is split into four interrelated parts:

For data processing, scripts are provided to download and preprocess some publicly available speech recognition datasets. Additionally, we provide scripts and functions to create manifest files (i.e. catalog files) for your own data and to merge existing manifest files into one.
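
To make the idea of a manifest concrete, here is a minimal sketch of creating and merging such catalog files. It does not use Sonosco's own scripts, and the layout is an assumption: a CSV file with one `audio_path,transcript_path` row per sample, where each audio file has a transcript of the same name next to it. The function names `create_manifest` and `merge_manifests` are hypothetical.

```python
import csv
from pathlib import Path


def create_manifest(data_dir, manifest_path, audio_ext=".wav", text_ext=".txt"):
    """Write one CSV row per audio file that has a same-named transcript.

    Assumed layout (not confirmed by Sonosco): audio_path,transcript_path
    """
    rows = []
    for audio in sorted(Path(data_dir).rglob(f"*{audio_ext}")):
        transcript = audio.with_suffix(text_ext)
        if transcript.exists():  # skip audio files without a transcript
            rows.append((str(audio), str(transcript)))
    with open(manifest_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return len(rows)


def merge_manifests(manifest_paths, merged_path):
    """Concatenate manifests, dropping duplicate rows while keeping order."""
    seen = set()
    merged = []
    for path in manifest_paths:
        with open(path, newline="") as f:
            for row in csv.reader(f):
                key = tuple(row)
                if row and key not in seen:
                    seen.add(key)
                    merged.append(row)
    with open(merged_path, "w", newline="") as f:
        csv.writer(f).writerows(merged)
    return len(merged)
```

Both functions return the number of rows written, which gives a quick sanity check on how many samples ended up in the catalog.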

These data, or rather the manifest files, can then be used to train and evaluate an ASR model. We provide some ASR model architectures, such as LAS, TDS and DeepSpeech2, but you can also design and train your own PyTorch models.
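
Evaluating an ASR model usually means comparing its transcripts against reference transcripts. A common metric is the word error rate (WER): the word-level edit distance divided by the number of reference words. The sketch below is a plain-Python illustration of that metric, not Sonosco's evaluation code, and `word_error_rate` is a hypothetical name.

```python
def word_error_rate(reference, hypothesis):
    """Return the word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of four reference words is missing from the hypothesis.
    print(word_error_rate("turn on the light", "turn on light"))  # 0.25
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.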

The trained model can then be used in a transcription server, which consists of a REST API and a simple Vue.js frontend. The frontend transcribes voice recorded with a microphone and lets you compare the results with those of other models (which can be downloaded from our GitHub repository).

Furthermore, we provide example code showing how to use different ASR models with ROS, and especially with the Roboy ROS interfaces (i.e. topics & messages).

Release history

1.0.1 (this version): Sep 29, 2019
1.0.0: Sep 29, 2019
0.1.1: Sep 1, 2019

Download files

Source Distribution: sonosco-1.0.1.tar.gz (57.8 kB), uploaded Sep 29, 2019
Built Distribution: sonosco-1.0.1-py3-none-any.whl (130.3 kB), uploaded Sep 29, 2019, Python 3

File details

Details for the file sonosco-1.0.1.tar.gz.

File metadata

Download URL: sonosco-1.0.1.tar.gz
Upload date: Sep 29, 2019
Size: 57.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.12.4 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.26.0 CPython/3.5.2

File hashes

Hashes for sonosco-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`7a4e6563e18a77d93bd436e6b3afdaca9c48bebc72d3b5016255b7b78b156451`
MD5	`fe751df5666739a9d025a99f8aa78c7d`
BLAKE2b-256	`a7f62216ec693c880176d5316628095c88dcff21e03364572261cde4c575414c`

File details

Details for the file sonosco-1.0.1-py3-none-any.whl.

File metadata

Download URL: sonosco-1.0.1-py3-none-any.whl
Upload date: Sep 29, 2019
Size: 130.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.12.4 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.26.0 CPython/3.5.2

File hashes

Hashes for sonosco-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`dbcae42073dc2fa2eb48f2e05f8f0a2262a57773536b9cf0356c91d49ec764e7`
MD5	`0bfcba21f9abefa368251a2617aff7d9`
BLAKE2b-256	`7a354afbf9b5443d11253cbdc90d70bd0f5982b00705e14553cbf6aa6a9b293d`
