A Python CLI for Ruth NLP
Ruth Natural Language Understanding
Welcome to the RUTH NLU documentation. RUTH is an open-source Natural Language Understanding (NLU) framework developed by puretalk.ai. It is a Python module that lets you parse natural language sentences and extract structured information from them.
RUTH is a CLI-based tool for training and testing NLU models.
Installation
Quick installation
$ pip install ruth-python
Installation from source
$ git clone https://github.com/prakashr7d/Research-implementation-NLU-engine.git
$ cd Research-implementation-NLU-engine
$ python setup.py install
Using the Makefile (for Linux and macOS users)
A Makefile contains a set of directives used by the make build automation tool to generate executables and other non-source files from a program's source files.
$ git clone https://github.com/prakashr7d/Research-implementation-NLU-engine.git
$ cd Research-implementation-NLU-engine
For Ubuntu:
$ make bootstrap
For macOS:
$ make bootstrap-mac
Then, to install the package, run:
$ make install
PyTorch installation with GPU support
$ pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
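After installing, you can check whether PyTorch can actually see a GPU before training. A minimal sketch, independent of RUTH itself (assumes only that PyTorch may or may not be installed):

```python
def detect_device() -> str:
    """Return "cuda" if PyTorch is installed and a GPU is visible, else "cpu"."""
    try:
        import torch  # optional dependency; may not be installed
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

print(f"Training will run on: {detect_device()}")
```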
Documentation
Getting Started
The main objective of this library is to extract information by parsing sentences written in natural language. To get started with RUTH, follow the steps below. Run the following command to build an initial project with example data and a default pipeline file.
$ mkdir project_name
$ ruth init
Output
The project will be initialized with the following structure:
.
├── data
│   └── example.yml
└── pipeline.yml
The project is created with example data and a default pipeline.
CLI
RUTH provides a command-line interface to train and test models. To get started with the CLI, run:
$ ruth --help
which prints a usage summary:
usage: ruth [-h] [-v] {train,test} ...
Training
To train a model, run:
$ ruth train -p path/to/pipeline.yaml -d path/to/dataset.yml
Parameters
-p, --pipeline   path to the pipeline file
-d, --data       path to the dataset file
Saving Trained models
Once training finishes, the model is saved in a directory named models in the current working directory.
Dataset format
RUTH uses a YAML file to store the training data; the file should follow this syntax:
Example:
version: "0.1"
nlu:
  - intent: ham
    examples: |
      - WHO ARE YOU SEEING?
      - Great! I hope you like your man well endowed. I am <#> inches
      - Didn't you get hep b immunisation in nigeria.
      - Fair enough, anything going on?
      - Yeah hopefully, if tyler can't do it I could maybe ask around a bit
  - intent: spam
    examples: |
      - Did you hear about the new Divorce Barbie? It comes with all of Ken's stuff!
      - I plane to give on this month end.
      - Wah lucky man Then can save money Hee
      - Finished class where are you.
      - K..k:)where are you?how did you performed?
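A file in this format can be loaded with any YAML parser. As an illustration only (this is not RUTH's internal loader), here is a stdlib-only sketch that flattens such a file into (text, intent) pairs suitable for training a classifier:

```python
def load_examples(raw: str) -> list:
    """Flatten a RUTH-style NLU file into (text, intent) pairs.

    A stdlib-only sketch for illustration; a real loader would use a YAML parser.
    """
    pairs, intent = [], None
    for line in raw.splitlines():
        stripped = line.strip()
        if stripped.startswith("- intent:"):
            # remember which intent the following examples belong to
            intent = stripped.split(":", 1)[1].strip()
        elif stripped.startswith("- ") and intent is not None:
            pairs.append((stripped[2:].strip(), intent))
    return pairs

data = """\
version: "0.1"
nlu:
  - intent: ham
    examples: |
      - Fair enough, anything going on?
  - intent: spam
    examples: |
      - Wah lucky man Then can save money Hee
"""
print(load_examples(data))
```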
The pipeline.yml file defines the pipeline and its components. Example of a basic pipeline for a Support Vector Machine (SVM) intent classifier with a CountVectorizer-based featurizer:
task:
  pipeline:
    - name: 'WhiteSpaceTokenizer'
    - name: 'CountVectorFeaturizer'
    - name: 'SVMClassifier'
Example of a pipeline using Hugging Face transformer models:
task:
  pipeline:
    - name: 'HFTokenizer'
      model_name: 'bert-base-uncased'
    - name: 'HFClassifier'
      model_name: 'bert-base-uncased'
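A config like this is typically turned into live components by looking each name up in a registry and instantiating it. The sketch below is hypothetical (the registry and component class are illustrative, not RUTH's actual API), but shows the general mechanism:

```python
class WhiteSpaceTokenizer:
    """Illustrative stand-in for a pipeline component."""

    def process(self, text: str) -> list:
        return text.split()

# map component names from the YAML config to classes
REGISTRY = {"WhiteSpaceTokenizer": WhiteSpaceTokenizer}

def build_pipeline(config: list) -> list:
    """Instantiate each component named in the pipeline config, in order."""
    return [REGISTRY[step["name"]]() for step in config]

pipeline = build_pipeline([{"name": "WhiteSpaceTokenizer"}])
print(pipeline[0].process("book a flight"))  # ['book', 'a', 'flight']
```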
Parsing
To parse text with a trained model, run:
$ ruth parse -m path/to/model_dir -t "I want to book a flight from delhi to mumbai"
Parameters
-m, --model_path  path to the model directory (optional)
-t, --text        text message (required)
If no model path is provided, the parse command uses the latest model in the models directory by default.
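"Latest model" selection can be pictured as picking the most recently modified entry in the models directory. A hedged sketch (RUTH's actual lookup may differ):

```python
from pathlib import Path
from typing import Optional

def latest_model(models_dir: str = "models") -> Optional[Path]:
    """Return the most recently modified entry in models_dir, or None if empty."""
    entries = list(Path(models_dir).glob("*"))
    return max(entries, key=lambda p: p.stat().st_mtime, default=None)
```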
Testing
To evaluate model performance, run:
$ ruth evaluate -p path/to/pipeline-basic.yml -d path/to/dataset
Parameters
-p, --pipeline       pipeline file
-d, --data           dataset file
-o, --output_folder  folder to save the result as a PNG file (optional)
-m, --model_path     path to the model directory (optional)
If no model path is provided, the evaluate command uses the latest model in the models directory. If no output folder is provided, the result is saved in a results folder in the current working directory.
Deployment
RUTH uses FastAPI to serve the model as a REST API. To deploy a model, run:
$ ruth deploy -m path/to/model_dir
Parameters
-m, --model_path  path to the model directory (required)
-p, --port        port number (optional)
-h, --host        host name (optional)
Output
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://localhost:5500 (Press CTRL+C to quit)
API
Once the model is deployed, you can call the following endpoint to parse text:
POST /parse
{
  "text": "I want to book a flight from delhi to mumbai"
}
Output
{
  "text": "hello ruth!",
  "intent_ranking": [
    {
      "name": "greet",
      "accuracy": 0.9843385815620422
    },
    {
      "name": "how_are_you",
      "accuracy": 0.0017248070798814297
    },
    {
      "name": "voice_mail",
      "accuracy": 0.0008955258526839316
    }
  ],
  "intent": {
    "name": "greet",
    "accuracy": 0.9843385815620422
  }
}
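From a client, the endpoint can be called with the standard library alone. The helper below builds the POST request and picks the top-ranked intent from the response; the URL and port are assumptions based on the deploy output above, and the endpoint shape follows the example response:

```python
import json
from urllib import request

def parse_text(text: str, url: str = "http://localhost:5500/parse") -> dict:
    """POST text to a deployed RUTH server and return the decoded JSON response."""
    req = request.Request(
        url,
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

def top_intent(response: dict) -> str:
    """Pick the highest-accuracy intent name from an intent_ranking list."""
    return max(response["intent_ranking"], key=lambda r: r["accuracy"])["name"]
```

For example, `top_intent(parse_text("hello ruth!"))` would return "greet" for the response shown above.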
Connect with us
Developed by Puretalk © 2022