A collection of basic Python modules for spoken natural language processing
py-nltools
----------
A collection of abstraction layers and support functions that form the natural
language processing foundation of the Zamia AI project:
* `phonetics`: translation functions between various phonetic alphabets (IPA, X-SAMPA, X-ARPABET, ...)
* `tts`: abstraction layer for eSpeak NG, MaryTTS, SVOX Pico TTS or a remote TTS server, plus sequitur g2p
* `asr`: abstraction layer for kaldi-asr; models are available at http://www.zamia-speech.org
* `sequiturclient`: g2p using sequitur
* `pulseplayer`: audio playback through pulseaudio
* `pulserecorder`: audio recording through pulseaudio
* `tokenizer`: English, French and German word tokenizers aimed at spoken language applications
* `threadpool`: simple thread pool implementation
* `vad`: Voice Activity Detection finite state machine based on webrtc VAD
* `macro_engine`: Simple macro engine aimed at generating natural language expansions
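To illustrate the idea behind the `vad` entry, here is a minimal sketch of a finite state machine that turns per-frame speech/non-speech decisions into utterance segments. It is not the nltools implementation: the class name, the thresholds and the `segment_frames` helper are hypothetical, and the per-frame flags are assumed to come from a frame-level detector such as webrtc VAD.

```python
# Illustrative sketch only; names and thresholds are assumptions,
# not the nltools.vad API.
IDLE, SPEECH = "idle", "speech"


class SimpleVADStateMachine(object):
    """Turn per-frame speech flags into (start, end) frame segments."""

    def __init__(self, start_frames=3, end_frames=5):
        self.start_frames = start_frames  # voiced frames needed to enter SPEECH
        self.end_frames = end_frames      # unvoiced frames needed to leave SPEECH
        self.state = IDLE
        self.run = 0                      # consecutive frames contradicting the state
        self.frame_no = 0
        self.segment_start = None

    def process(self, is_speech):
        """Feed one frame decision; return (start, end) when an utterance ends."""
        finished = None
        if self.state == IDLE:
            self.run = self.run + 1 if is_speech else 0
            if self.run >= self.start_frames:
                self.state = SPEECH
                self.segment_start = self.frame_no - self.run + 1
                self.run = 0
        else:
            self.run = self.run + 1 if not is_speech else 0
            if self.run >= self.end_frames:
                self.state = IDLE
                # end index is exclusive: the first frame of the closing silence
                finished = (self.segment_start, self.frame_no - self.run + 1)
                self.run = 0
        self.frame_no += 1
        return finished


def segment_frames(flags, start_frames=3, end_frames=5):
    """Run the state machine over a sequence of flags and collect segments."""
    fsm = SimpleVADStateMachine(start_frames, end_frames)
    segments = []
    for flag in flags:
        result = fsm.process(flag)
        if result is not None:
            segments.append(result)
    return segments
```

Requiring several consecutive frames before switching state keeps single misclassified frames from splitting an utterance or triggering a false start.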
I plan to add modules as I need them in the Zamia AI projects. Some modules, like `phonetics` and `tokenizer`,
overlap with larger projects such as NLTK or spaCy. My modules tend to be more hands-on and simple-minded
than these and are in no way meant to replace them.
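As a rough idea of what "simple-minded" can mean for a spoken-language tokenizer, the sketch below lowercases the input and drops punctuation, since neither is audible in speech. The function `simple_tokenize` is a hypothetical helper written for this example; it is not the `nltools.tokenizer` API and covers far less than the real module.

```python
import re

# Hypothetical helper for illustration; not part of nltools.
# Matches runs of letters (optionally joined by apostrophes) or runs of digits.
_TOKEN_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*|\d+", re.UNICODE)


def simple_tokenize(text):
    """Return lowercase word and number tokens, discarding punctuation."""
    return _TOKEN_RE.findall(text.lower())
```

For example, `simple_tokenize("Hello, world! It's 5 o'clock.")` yields `['hello', 'world', "it's", '5', "o'clock"]`.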
ifndef::imagesdir[:imagesdir: images]
ifndef::env-github[]
[ditaa,"highlevel"]
....
+---------------------------------------------------------------------------------------------+
|                                           nltools                                           |
|      +-----------+       +-----------+    +------------+   +--------------+                 |
|      | tokenizer |       | phonetics |    | threadpool |   | macro_engine |                 |
|      +-----------+       +-----------+    +------------+   +--------------+                 |
|                                                                                             |
|      +-----------+       +-----------+    +-----------+    +-----------+    +-----------+   |
|      |    tts    |       |    asr    |    |    vad    |    |    g2p    |    |   audio   |   |
|      +-----------+       +-----------+    +-----------+    +-----------+    +-----------+   |
|            |                   |                |                |                |         |
+---------------------------------------------------------------------------------------------+
             |                   |                |                |                |
    +--------+---------+         |                |                |                |
    |        |         |         |                |                |                |
    v        v         v         v                v                v                v
+------+ +--------+ +------+ +-------+        +--------+      +----------+    +------------+
| mary | | eSpeak | | pico | | kaldi |        | webrtc |      | sequitur |    | pulseaudio |
+------+ +--------+ +------+ +-------+        +--------+      +----------+    +------------+
....
endif::env-github[]
ifdef::env-github[]
image::highlevel.png[Highlevel Diagram]
endif::env-github[]
Requirements
~~~~~~~~~~~~
*Note*: probably incomplete.
* Python 2.7
* for TTS one or more of:
- MaryTTS, py-marytts
- espeak-ng, py-espeak-ng
- SVOX Pico TTS, py-picotts
* for ASR
- kaldi-asr 5.4.248, py-kaldi-asr
* sequitur
* pulseaudio
* webrtc
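Because most of these dependencies are optional, it can help to check which backends are importable before choosing an engine. The following is a minimal sketch using only the standard library. The import names in `OPTIONAL_BACKENDS` are assumptions about how the py-marytts, py-espeak-ng, py-picotts and py-kaldi-asr packages are imported and should be verified against each package; `available_backends` is a hypothetical helper, not part of nltools.

```python
import importlib

# Assumed import names for the optional backends; verify against each package.
OPTIONAL_BACKENDS = {
    "MaryTTS": "marytts",
    "eSpeak NG": "espeakng",
    "SVOX Pico TTS": "picotts",
    "Kaldi ASR": "kaldiasr",
}


def available_backends(candidates=None):
    """Map each backend label to True if its module can be imported."""
    if candidates is None:
        candidates = OPTIONAL_BACKENDS
    found = {}
    for label, module_name in candidates.items():
        try:
            importlib.import_module(module_name)
            found[label] = True
        except ImportError:
            found[label] = False
    return found
```

Calling `available_backends()` returns a dictionary with one boolean per backend, which a script could use to pick the first available TTS engine.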
License
~~~~~~~
My own code is Apache-2.0 licensed unless otherwise noted in a script's copyright
headers.
Some scripts and files are based on the work of others; in those cases I intend
to keep the original license intact. Please check the copyright headers in each
file for more information.
Authors
~~~~~~~
Guenter Bartsch <guenter@zamia.org>
Paul Guyot <pguyot@kallisys.net>