Skip to main content
This is a pre-production deployment of Warehouse. Changes made here affect the production instance of PyPI (pypi.python.org).
Help us improve Python packaging - Donate today!

A collection of basic python modules for spoken natural language processing

Project Description

py-nltools
==========

+-----------------------------------------------------------------------------------------------+
| nltools |
| +-----------+ +-----------+ +------------+ |
| | tokenizer | | phonetics | | threadpool | |
| +-----------+ +-----------+ +------------+ |
| |
| +-----------+ +-----------+ +-----------+ +-----------+ +-----------+ |
| | tts | | asr | | vad | | g2p | | audio | |
| +-----------+ +-----------+ +-----------+ +-----------+ +-----------+ |
| | | | | | |
+-----------------------------------------------------------------------------------------------+
| | | | |
+--------+---------+ +------+----+ | | |
| | | | | | | |
v v v v v v v v
+------+ +--------+ +------+ +-------+ +-----------+ +--------+ +----------+ +------------+
| mary | | eSpeak | | pico | | kaldi | | cmusphinx | | webrtc | | sequitur | | pulseaudio |
+------+ +--------+ +------+ +-------+ +-----------+ +--------+ +----------+ +------------+

A collection of abstraction layers and support functions that form the
natural language processing foundation of the Zamia AI project:

- `phonetics`: translation functions between various phonetic
alphabets (IPA, X-SAMPA, X-ARPABET, …)

- `tts`: abstraction layer towards using eSpeak NG, MaryTTS, SVOX Pico
TTS or a remote TTS server and sequitur g2p

- `asr`: abstraction layer towards using kaldi-asr and pocketsphinx,
models can be found here: <http://goofy.zamia.org/voxforge/>

- `sequiturclient`: g2p using sequitur

- `pulseplayer`: audio playback through pulseaudio

- `pulserecorder`: audio recording through pulseaudio

- `tokenizer`: english and german word tokenizer aimed at spoken
language applications

- `threadpool`: simple thread pool implementation

- `vad`: Voice Activity Detection finite state machine based on webrtc
VAD

I plan to add modules as I need them in the Zamia AI projects. Some
modules like `phonetics` and `tokenizer` have some overlap with larger
projects like NLTK or spaCy - my modules tend to be more hands-on and
simple minded than these and therefore are in no way meant to replace
them.

Requirements
------------

**Note**: probably incomplete.

- Python 2.7

- for TTS one or more of:

- MaryTTS, py-marytts

- espeak-ng, py-espeak-ng

- SVOX Pico TTS, py-picotts

- for ASR one or more of:

- kaldi-asr 5.1, py-kaldi-asr

- pocketsphinx

- sequitur

- pulseaudio

- webrtc

License
-------

My own code is LGPLv3 licensed unless otherwise noted in the script’s
copyright headers.

Some scripts and files are based on works of others, in those cases it
is my intention to keep the original license intact. Please make sure to
check the copyright headers inside for more information.

Author
------

Guenter Bartsch \<<guenter@zamia.org>\>


Release History

This version
History Node

0.1.3

History Node

0.1.2

Download Files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Filename, Size & Hash SHA256 Hash Help File Type Python Version Upload Date
py_nltools-0.1.3-py2.py3-none-any.whl
(32.0 kB) Copy SHA256 Hash SHA256
Wheel py2.py3 Feb 12, 2018
py-nltools-0.1.3.tar.gz
(23.7 kB) Copy SHA256 Hash SHA256
Source None Feb 12, 2018

Supported By

Elastic Elastic Search Pingdom Pingdom Monitoring Dyn Dyn DNS Sentry Sentry Error Logging CloudAMQP CloudAMQP RabbitMQ Heroku Heroku PaaS Kabu Creative Kabu Creative UX & Design Fastly Fastly CDN DigiCert DigiCert EV Certificate Google Google Cloud Servers DreamHost DreamHost Log Hosting