Kidx NLU: a natural language parser for bots
# Kidx NLU
[![Join the forum at https://forum.rasa.com](https://img.shields.io/badge/forum-join%20discussions-brightgreen.svg)](https://forum.rasa.com/?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
[![PyPI version](https://badge.fury.io/py/rasa-nlu.svg)](https://badge.fury.io/py/rasa-nlu)
[![Supported Python Versions](https://img.shields.io/pypi/pyversions/rasa_nlu.svg)](https://pypi.python.org/pypi/rasa_nlu)
[![Build Status](https://travis-ci.com/RasaHQ/rasa_nlu.svg?branch=master)](https://travis-ci.com/RasaHQ/rasa_nlu)
[![Coverage Status](https://coveralls.io/repos/github/RasaHQ/rasa_nlu/badge.svg?branch=master)](https://coveralls.io/github/RasaHQ/rasa_nlu?branch=master)
[![Documentation Status](https://img.shields.io/badge/docs-stable-brightgreen.svg)](https://rasa.com/docs/nlu/)
[![FOSSA Status](https://app.fossa.io/api/projects/git%2Bgithub.com%2FRasaHQ%2Frasa_nlu.svg?type=shield)](https://app.fossa.io/projects/git%2Bgithub.com%2FRasaHQ%2Frasa_nlu?ref=badge_shield)
<img align="right" height="244" src="https://www.rasa.com/assets/img/sara/sara-open-source-lg.png">
Kidx NLU (Natural Language Understanding) is a tool for understanding what is being said in short pieces of text.
For example, taking a short message like:
> *"I'm looking for a Mexican restaurant in the center of town"*
And returning structured data like:
```
intent: search_restaurant
entities:
- cuisine : Mexican
- location : center
```
Kidx NLU is primarily used to build chatbots and voice apps, where this is called intent classification and entity extraction.
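As a rough illustration, the structured result above could be handled in Python as a plain dictionary. The field names used here (`text`, `intent`, `entities`, `entity`, `value`) simply mirror the example and are assumptions for this sketch, not a guaranteed response schema.

```python
# Hypothetical parse result mirroring the example above; the exact
# response schema of the server is an assumption here.
parsed = {
    "text": "I'm looking for a Mexican restaurant in the center of town",
    "intent": "search_restaurant",
    "entities": [
        {"entity": "cuisine", "value": "Mexican"},
        {"entity": "location", "value": "center"},
    ],
}


def entities_by_name(result):
    """Collapse the entity list into a {name: value} mapping."""
    return {item["entity"]: item["value"] for item in result["entities"]}


slots = entities_by_name(parsed)
print(parsed["intent"], slots)
```

A bot can then branch on the intent name and read the slots it needs from the mapping.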
To use Rasa, *you have to provide some training data*.
That is, a set of messages which you've already labelled with their intents and entities.
Rasa then uses machine learning to pick up patterns and generalise to unseen sentences.
You can think of Rasa NLU as a set of high level APIs for building your own language parser using existing NLP and ML libraries.
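To make "labelled training data" more concrete, here is a minimal sketch of how one labelled message might be assembled in Python. The helper `label_example` is hypothetical, and the keys (`text`, `intent`, `entities`, `start`, `end`, `value`, `entity`) are an assumption modelled on the training data format described in the documentation; check that documentation for the authoritative layout.

```python
import json


def label_example(text, intent, entity_values):
    """Build one labelled example; entity_values maps entity name -> substring."""
    entities = []
    for name, value in entity_values.items():
        start = text.find(value)
        if start == -1:
            raise ValueError(f"{value!r} does not occur in {text!r}")
        # Character offsets let the trainer locate the entity in the text.
        entities.append(
            {"start": start, "end": start + len(value), "value": value, "entity": name}
        )
    return {"text": text, "intent": intent, "entities": entities}


example = label_example(
    "I'm looking for a Mexican restaurant in the center of town",
    "search_restaurant",
    {"cuisine": "Mexican", "location": "center"},
)
print(json.dumps(example, indent=2))
```

Computing the offsets from the text, rather than typing them by hand, avoids off-by-one mistakes in the labels.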
If you are new to Rasa NLU and want to create a bot, you should start with the [**tutorial**](https://rasa.com/docs/nlu/quickstart/).
- **What does Rasa NLU do? 🤔** [Read About the Kidx Stack](http://kidx.ai/products/kidx-stack/)
- **I'd like to read the detailed docs 🤓** [Read The Docs](https://rasa.com/docs/nlu/)
- **I'm ready to install Rasa NLU! 🚀** [Installation](https://rasa.com/docs/nlu/installation/)
- **I have a question ❓** [Rasa Community Forum](https://forum.rasa.com)
- **I would like to contribute 🤗** [How to contribute](#how-to-contribute)
### Important Note
The current GitHub master version no longer supports Python 2.7 (nor
will the next major release). If you want to use Kidx NLU with Python
2.7, please install the most recent version from PyPI (0.14).
# Quick Install
For the full installation instructions, please head over to the documentation: [Installation](https://rasa.com/docs/nlu/installation/)
**Via Docker Image**
From docker hub:
```
```
(for more docker installation options see [Advanced Docker Installation](#advanced-docker))
For developers, build the image from source:
```
docker build -f docker/{test_dockerfile} . -t {YOUR_VERSION_TAG}
```
Start the NLU service:
```
docker run --name {YOUR_DOCKER_NAME} -p 5000:5000 -itd {YOUR_VERSION_TAG}
```
**Via Python Library**
From pypi:
```
pip install kidx_nlu
python -m kidx_nlu.server &
```
(for more python installation options see [Advanced Python Installation](#advanced-python))
### Basic test
The command below works with either of the installation methods above.
```
curl 'http://localhost:5000/parse?q=hello'
```
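The same request can be made from Python with the standard library. This is a minimal sketch: `build_parse_url` and `parse_message` are hypothetical helper names, the host and port are taken from the command above, and the assumption that the server answers with a JSON body is not confirmed here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_parse_url(query, project=None, base_url="http://localhost:5000"):
    """Return the /parse URL for a message, optionally targeting a project."""
    params = {"q": query}
    if project is not None:
        params["project"] = project
    return f"{base_url}/parse?{urlencode(params)}"


def parse_message(query, project=None):
    """Send the request and decode the body as JSON (needs a running server)."""
    with urlopen(build_parse_url(query, project)) as response:
        return json.load(response)


print(build_parse_url("hello"))
```

Using `urlencode` keeps messages with spaces or punctuation safe to pass in the query string.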
# Example use
### Get the Server Status
```
curl 'http://localhost:5000/status'
```
### Check the Server Version
```
curl 'http://localhost:5000/version'
```
### Training New Models
[Examples](http://git.mykidx.com/nlp/kidx_nlu/tree/master/data/examples/kidx)
and [Documentation](https://rasa.com/docs/nlu/dataformat/) of the training data
format are provided. As a quick start, run one of the commands below to train
a new model.
#### JSON format
```
curl 'http://git.mykidx.com/nlp/kidx_nlu/raw/master/sample_configs/config_train_server_json.yml' | \
curl --request POST --header 'content-type: application/x-yml' --data-binary @- --url 'localhost:5000/train?project=test_model'
```
This will train a simple keyword-based model (not usable for anything but this demo). For better
pipelines, consult the documentation.
#### Markdown format
```
wget 'http://git.mykidx.com/nlp/kidx_nlu/raw/master/sample_configs/config_train_server_md.yml'
curl --request POST --header 'content-type: application/x-yml' --data-binary @config_train_server_md.yml --url 'localhost:5000/train?project=test_model'
```
The above command does the following:
1. It fetches some of the example data from the repo
2. It `POST`s that data to the `/train` endpoint and names the project `test_model` via the `project=test_model` query parameter
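The two steps above can be sketched in Python with the standard library. The helper names `build_train_request` and `train` are hypothetical, and the one-line YAML string in the demo is only a placeholder, not a working training configuration.

```python
from urllib.request import Request, urlopen


def build_train_request(config_yaml, project, base_url="http://localhost:5000"):
    """Prepare a POST to /train carrying the YAML config as the request body."""
    return Request(
        url=f"{base_url}/train?project={project}",
        data=config_yaml.encode("utf-8"),
        headers={"content-type": "application/x-yml"},
        method="POST",
    )


def train(config_path, project):
    """Read a local config file and submit it (needs a running server)."""
    with open(config_path, encoding="utf-8") as handle:
        request = build_train_request(handle.read(), project)
    with urlopen(request) as response:
        return response.read().decode("utf-8")


# Placeholder body only; a real config comes from the sample_configs files.
request = build_train_request("language: en\n", "test_model")
print(request.get_method(), request.full_url)
```

Separating request construction from sending makes it easy to inspect the URL and headers before anything goes over the network.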
### Parsing New Requests
Make sure the training command above has finished before running the command below. You can check progress with the `/status` endpoint shown earlier.
```
curl 'http://localhost:5000/parse?q=hello&project=test_model'
```
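Waiting for training to finish can be automated with a small polling loop. This is a sketch under stated assumptions: the keys `current_training_processes` and `available_projects` are guesses about the `/status` payload and should be checked against a real response, and `wait_until_ready` is a hypothetical helper. The status source is passed in as a callable so the loop can be tried without a server.

```python
import time


def wait_until_ready(fetch_status, project, timeout=60.0, interval=1.0):
    """Poll until `project` is listed and no training is running.

    `fetch_status` is any callable returning the decoded /status JSON.
    The keys read below are assumptions about that payload.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        idle = status.get("current_training_processes", 0) == 0
        if idle and project in status.get("available_projects", {}):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated responses standing in for a real server.
responses = iter([
    {"current_training_processes": 1, "available_projects": {}},
    {"current_training_processes": 0, "available_projects": {"test_model": {}}},
])
ready = wait_until_ready(lambda: next(responses), "test_model", interval=0.0)
print(ready)
```

Against a live server, `fetch_status` would be a function that requests `http://localhost:5000/status` and decodes the body.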
# FAQ
### Who is it for?
The intended audience is mainly __people developing bots__, starting from scratch or looking for a drop-in replacement for [wit](https://wit.ai), [LUIS](https://www.luis.ai), or [Dialogflow](https://dialogflow.com). The setup process is designed to be as simple as possible. Kidx NLU is written in Python, but you can use it from any language through an [HTTP API](https://rasa.com/docs/nlu/http/). If your project is written in Python you can [simply import the relevant classes](https://rasa.com/docs/nlu/python/). If you're currently using wit/LUIS/Dialogflow, you just:
1. Download your app data from wit, LUIS, or Dialogflow and feed it into Kidx NLU
2. Run Kidx NLU on your machine and switch the URL of your wit/LUIS API calls to `localhost:5000/parse`.
### Why should I use Kidx NLU?
* You don't have to hand over your data to FB/MSFT/GOOG
* You don't have to make an `https` call to parse every message.
* You can tune models to work well on your particular use case.
These points are laid out in more detail in a
[blog post](https://blog.rasa.com/put-on-your-robot-costume-and-be-the-minimum-viable-bot-yourself/).
Rasa is a set of tools for building more advanced bots, developed by
the company [KidxAI](https://kidx.ai). Kidx NLU is the natural language
understanding module, and the first component to be open-sourced.
### What languages does it support?
The `supervised_embeddings` pipeline works in any language.
If you want to use pre-trained word embeddings, there are models available for
many languages. See details [here](https://rasa.com/docs/nlu/languages/).
### How to contribute
We are very happy to receive and merge your contributions. There is some more information about the style of the code and docs in the [documentation](https://rasa.com/docs/contributing/).
In general the process is rather simple:
1. create an issue describing the feature you want to work on (or have a look at issues with the label [help wanted](https://github.com/RasaHQ/kidx_nlu/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22))
2. write your code, tests and documentation
3. create a pull request describing your changes
Your pull request will be reviewed by a maintainer, who might get back to you about any necessary changes or questions. You will also be asked to sign the [Contributor License Agreement](https://cla-assistant.io/RasaHQ/rasa_nlu).
# Advanced installation
### Advanced Python
From github:
```
git clone git@git.mykidx.com:nlp/kidx_nlu.git
cd kidx_nlu
pip install -r requirements.txt
pip install -e .
```
For local development make sure you install the development requirements:
```
pip install -r alt_requirements/requirements_dev.txt
pip install -e .
```
To test the installation, start the server and run the command from [Basic test](#basic-test). This runs a very simple default model; you need to [train your own model](https://rasa.com/docs/nlu/quickstart/) to do something useful.
### Advanced Docker
Before you start, ensure you have the latest version of Docker Engine on your machine. You can check whether Docker is installed by typing `docker -v` in your terminal.
To see all available builds go to the [Rasa docker hub](https://hub.docker.com/r/rasa/rasa_nlu/), but to get up and going the quickest just run:
```
docker run -p 5000:5000 rasa/kidx_nlu:latest-full
```
There are also three volumes, which you may want to map: `/app/projects`, `/app/logs`, and `/app/data`. It is also possible to override the config file used by the server by mapping a new config file to the volume `/app/config.json`. For complete docker usage instructions go to the official [docker hub readme](https://hub.docker.com/r/rasa/rasa_nlu/).
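As an illustration of the volume mappings, the following sketch assembles a `docker run` command in Python. The helper `docker_run_command` and the host paths under `/srv/nlu` are hypothetical; only the container paths and the image name come from the text above.

```python
import shlex


def docker_run_command(image, host_dirs, port=5000):
    """Assemble a `docker run` command that maps host directories to volumes.

    `host_dirs` maps a container path (e.g. /app/projects) to a host path.
    """
    args = ["docker", "run", "-p", f"{port}:{port}"]
    for container_path, host_path in host_dirs.items():
        # Docker's -v flag takes HOST_PATH:CONTAINER_PATH.
        args += ["-v", f"{host_path}:{container_path}"]
    args.append(image)
    return args


command = docker_run_command(
    "rasa/kidx_nlu:latest-full",
    {
        "/app/projects": "/srv/nlu/projects",  # hypothetical host paths
        "/app/logs": "/srv/nlu/logs",
        "/app/data": "/srv/nlu/data",
    },
)
print(shlex.join(command))
```

The printed line can be pasted into a terminal, or the list can be passed to `subprocess.run` directly.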
To test, run the command below after the container has started. For more information on using the HTTP API, see [here](https://rasa.com/docs/nlu/http/#endpoints).
```
curl 'http://localhost:5000/parse?q=hello'
```
### Docker Cloud
Warning! Setting up Docker Cloud is quite involved. This method isn't recommended unless you've already configured Docker Cloud Nodes (or swarms).
[![Deploy to Docker Cloud](https://files.cloud.docker.com/images/deploy-to-dockercloud.svg)](https://cloud.docker.com/stack/deploy/?repo=https://github.com/RasaHQ/rasa_nlu/tree/master/docker)
### Install Pretrained Models for Spacy & Mitie
In order to use the Spacy or Mitie backends make sure you have one of their pretrained models installed.
```
python -m spacy download en
```
To download the MITIE model, run the commands below and place the model in a location that you can
reference in your configuration during model training:
```
wget https://github.com/mit-nlp/MITIE/releases/download/v0.4/MITIE-models-v0.2.tar.bz2
tar jxf MITIE-models-v0.2.tar.bz2
```
If you want to run the tests, you need to copy the model into the Rasa folder:
```
cp MITIE-models/english/total_word_feature_extractor.dat KIDX_NLU_ROOT/data/
```
Where `KIDX_NLU_ROOT` points to your Rasa installation directory.
# Development Internals
### Steps to release a new version
Releasing a new version is quite simple, as the packages are built and distributed by Travis. The following steps are needed to release a new version:
1. update [kidx_nlu/version.py](http://git.mykidx.com:8888/nlp/kidx_nlu/blob/master/rasa_nlu/version.py) to reflect the correct version number
2. edit the [CHANGELOG.rst](http://git.mykidx.com:8888/nlp/kidx_nlu/blob/master/CHANGELOG.rst), create a new section for the release (e.g. by moving the items from the collected master section) and create a new master logging section
3. edit the [migration guide](http://git.mykidx.com:8888/nlp/kidx_nlu/blob/master/docs/migrations.rst) to provide assistance for users updating to the new version
4. commit all the above changes and tag a new release, e.g. using
```
git tag -f 0.7.0 -m "Some helpful line describing the release"
git push origin 0.7.0
```
travis will build this tag and push a package to [pypi](https://pypi.python.org/pypi/kidx_nlu)
5. only if it is a **major release**, a new branch should be created pointing to the same commit as the tag to allow for future minor patches, e.g.
```
git checkout -b 0.7.x
git push origin 0.7.x
```
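The tagging and branching steps above can be sketched as a small Python helper that only builds the command lists. The function names are hypothetical, and the `MAJOR.MINOR.x` branch naming is inferred from the `0.7.0` and `0.7.x` examples rather than from a documented rule.

```python
import re


def maintenance_branch(tag):
    """Map a release tag such as '0.7.0' to its patch branch name '0.7.x'."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.\d+", tag)
    if match is None:
        raise ValueError(f"unexpected tag format: {tag!r}")
    return f"{match.group(1)}.{match.group(2)}.x"


def release_commands(tag, message, major=False):
    """List the git commands from the steps above for a given tag."""
    commands = [
        ["git", "tag", "-f", tag, "-m", message],
        ["git", "push", "origin", tag],
    ]
    if major:
        # Major releases also get a branch for future minor patches.
        branch = maintenance_branch(tag)
        commands += [
            ["git", "checkout", "-b", branch],
            ["git", "push", "origin", branch],
        ]
    return commands


for command in release_commands("0.7.0", "Some helpful line", major=True):
    print(" ".join(command))
```

Each list can be handed to `subprocess.run` once the version file, changelog, and migration guide have been committed.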
### Running and changing the unit tests
To run or change the unit tests, first build the development image and start a container from it:
```
docker build -f docker/Dockerfile_dev . -t kidx_nlu_test:0.0.1a2
docker run --name kidx_nlu_test -v "$PWD":/app -it kidx_nlu_test:0.0.1a2 bash
```
After the container starts, run the following commands inside it:
```
pip install -e . --no-cache-dir -i https://mirrors.aliyun.com/pypi/simple/
make lint
make test
```
Check the output: there should be no failures, and 100% of the tests should pass.
## License
Licensed under the Apache License, Version 2.0. Copyright 2019
Rasa Technologies GmbH. [Copy of the license](LICENSE.txt).
A list of the Licenses of the dependencies of the project can be found at
the bottom of the
[Libraries Summary](https://libraries.io/github/RasaHQ/rasa_nlu).