pychatl |travis| |coveralls| |pypi| |license|
=============================================

.. |travis| image:: https://travis-ci.org/atlassistant/pychatl.svg?branch=master
    :target: https://travis-ci.org/atlassistant/pychatl

.. |coveralls| image:: https://coveralls.io/repos/github/atlassistant/pychatl/badge.svg?branch=master
    :target: https://coveralls.io/github/atlassistant/pychatl?branch=master

.. |pypi| image:: https://badge.fury.io/py/pychatl.svg
    :target: https://badge.fury.io/py/pychatl

.. |license| image:: https://img.shields.io/badge/License-GPL%20v3-blue.svg
    :target: https://www.gnu.org/licenses/gpl-3.0

Tiny DSL to generate training datasets for NLU engines. Based on the JavaScript implementation of `chatl <https://github.com/atlassistant/chatl>`_.

Installation
------------
pip
~~~
.. code-block:: bash

    $ pip install pychatl

source
~~~~~~
.. code-block:: bash

    $ git clone https://github.com/atlassistant/pychatl.git
    $ cd pychatl
    $ python setup.py install

or

.. code-block:: bash

    $ pip install -e .

Usage
-----
From the terminal
~~~~~~~~~~~~~~~~~
.. code-block:: bash

    $ pychatl .\example\forecast.dsl .\example\lights.dsl -a snips -o '{ \"language\": \"en\" }'

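
Quoting the JSON passed to ``-o`` differs from one shell to another. As a minimal sketch, assuming the ``pychatl`` command is on your ``PATH`` and accepts the arguments shown above, the option string can be produced with ``json.dumps`` and passed as its own argument, which avoids manual escaping. The ``build_command`` helper is hypothetical and not part of pychatl; it only assembles the argument list.

```python
import json
import shlex


def build_command(files, adapter="snips", options=None):
    """Assemble the argument list for the pychatl CLI shown above.

    `build_command` is a hypothetical helper; only the `-a` and `-o`
    flags come from the README.
    """
    argv = ["pychatl", *files, "-a", adapter]
    if options:
        # json.dumps emits valid JSON, so no hand-written quote escaping.
        argv += ["-o", json.dumps(options)]
    return argv


argv = build_command(
    ["example/forecast.dsl", "example/lights.dsl"],
    options={"language": "en"},
)

# Show the command as it could be pasted into a POSIX shell.
print(shlex.join(argv))

# To actually run it (requires pychatl to be installed):
#   import subprocess
#   subprocess.run(argv, check=True)
```

Passing the list straight to ``subprocess.run`` bypasses the shell entirely, so the same code should behave the same way on Windows and on POSIX systems.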
From the code
~~~~~~~~~~~~~
.. code-block:: python

    from pychatl import parse

    result = parse("""
    # pychatl is really easy to understand.
    #
    # You can define:
    #   - Intents
    #   - Entities (with or without variants)
    #   - Synonyms
    #   - Comments (only at the top level)

    # Inside an intent, you write training data.
    # Training data can refer to one or more entities and/or synonyms; generators
    # use them to produce all possible permutations and training samples.

    %[my_intent]
      ~[greet] some training data @[date]
      another training data that uses an @[entity] at @[date#with_variant]

    ~[greet]
      hi
      hello

    # Entities contain the available samples and can refer to a synonym.

    @[entity]
      some value
      other value
      ~[a synonym]

    # Synonyms contain only raw values.

    ~[a synonym]
      possible synonym
      another one

    # Entities and intents can define arbitrary properties that are made available
    # to generators.
    # For snips, `snips:type`, `extensible` and `strictness` are used, for example.

    @[date](snips:type=snips/datetime)
      tomorrow
      today

    # Variants are only used to generate training samples with specific values that
    # should map to the same entity name, here `date`. Props are merged with the root entity.

    @[date#with_variant]
      the end of the day
      nine o clock
      twenty past five
    """)

    # You now have a parsed dataset, which you can post-process for a specific NLU engine.
    from pychatl.postprocess import snips

    snips_dataset = snips(result)  # Or give options with `snips(result, language='en')`

    # The dataset is now ready to be fitted with snips-nlu!

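
To make the idea of "all possible permutations" more concrete, here is a conceptual sketch that expands one training sentence by substituting every referenced synonym and entity value. It is not pychatl's actual generator, which may treat entities differently from synonyms. The ``expand`` function and the ``SYNONYMS`` and ``ENTITIES`` tables are illustrative names only.

```python
import itertools
import re

# Toy stand-ins for the `~[greet]` synonym and `@[date]` entity defined above.
SYNONYMS = {"greet": ["hi", "hello"]}
ENTITIES = {"date": ["tomorrow", "today"]}

# Matches `~[name]` or `@[name]` and captures the sigil and the name.
REFERENCE = re.compile(r"([~@])\[([^\]]+)\]")


def expand(sentence):
    """Return every sentence obtained by substituting each reference."""
    # With two capture groups, re.split yields:
    #   text, sigil, name, text, sigil, name, ..., text
    parts = REFERENCE.split(sentence)
    choices = []
    for i in range(0, len(parts), 3):
        choices.append([parts[i]])  # literal text has a single option
        if i + 2 < len(parts):
            sigil, name = parts[i + 1], parts[i + 2]
            table = SYNONYMS if sigil == "~" else ENTITIES
            choices.append(table[name])
    # The Cartesian product picks one option per slot, in order.
    return ["".join(combo) for combo in itertools.product(*choices)]


samples = expand("~[greet] some training data @[date]")
for sample in samples:
    print(sample)
```

With two greetings and two dates, the sketch yields four sentences. The number of samples grows multiplicatively with each added reference, which is worth keeping in mind when writing sentences with many references.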
Adapters
--------
For now, only the `snips adapter <https://github.com/snipsco/snips-nlu>`_ has been implemented. The table below lists each adapter and the properties it supports:

+-----------------+----------------------+
| adapter         | snips                |
+=================+======================+
| type (1)        | ✔️ with `snips:type` |
+-----------------+----------------------+
| extensible (2)  | ✔️                   |
+-----------------+----------------------+
| strictness (3)  | ✔️                   |
+-----------------+----------------------+

1. Specific type of the entity to use (such as datetime, temperature and so on)
2. Are values outside of training samples allowed?
3. Parser threshold
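
As a rough illustration of what these three properties correspond to on the snips-nlu side, the sketch below builds an entity entry in the JSON dataset format that snips-nlu accepts. The field names ``automatically_extensible`` and ``matching_strictness`` come from that format. The ``to_snips_entity`` helper, the ``room`` entity and the exact mapping from pychatl properties are assumptions made for this example, not pychatl's own code.

```python
def to_snips_entity(name, values, props):
    """Build a snips-nlu style entity entry from pychatl-like properties.

    Hypothetical helper: the property-to-field mapping is an assumption.
    """
    # (1) A `snips:type` property points at a builtin entity, which the
    # snips-nlu dataset declares as an empty object under the builtin's name.
    builtin = props.get("snips:type")
    if builtin:
        return {builtin: {}}

    return {
        name: {
            "data": [{"value": value, "synonyms": []} for value in values],
            "use_synonyms": True,
            # (2) Property values are assumed to arrive as strings from the DSL.
            "automatically_extensible": props.get("extensible", "true") == "true",
            # (3) Parser threshold, between 0 and 1.
            "matching_strictness": float(props.get("strictness", "1")),
        }
    }


builtin_entry = to_snips_entity(
    "date", ["tomorrow", "today"], {"snips:type": "snips/datetime"}
)
custom_entry = to_snips_entity(
    "room", ["kitchen"], {"extensible": "false", "strictness": "0.8"}
)
print(builtin_entry)
print(custom_entry)
```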
Testing
-------
.. code-block:: bash

    $ pip install -e .[test]
    $ python -m nose --with-doctest -v --with-coverage --cover-package=pychatl