
A Python CLI for Ruth NLP

Project description

Ruth Natural Language Understanding

Welcome to the RUTH NLU documentation. RUTH is an open-source Natural Language Understanding (NLU) framework developed by puretalk.ai. It is a Python module that parses natural language sentences and extracts information from them.

RUTH is a CLI-based tool that can be used to train and test models.

Installation

Quick installation

$ pip install ruth-python

Installation from source

$ git clone https://github.com/prakashr7d/Research-implementation-NLU-engine.git
$ cd Research-implementation-NLU-engine
$ python setup.py install

Using Makefile (for Linux & macOS users)

A Makefile contains a set of directives used by the make build-automation tool to generate executables and other non-source files from a program's source files.

$ git clone https://github.com/prakashr7d/Research-implementation-NLU-engine.git
$ cd Research-implementation-NLU-engine

For Ubuntu:

$ make bootstrap

For macOS:

$ make bootstrap-mac

Finally, to install the package, run the following command:

$ make install

PyTorch installation with GPU support

$ pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116

Documentation

Getting Started

The main objective of this library is to extract information by parsing sentences written in natural language. To get started with RUTH, run the following commands to build an initial project with example data and a default pipeline file.

$ mkdir project_name
$ cd project_name
$ ruth init

Output

The project will be initialized with the following structure:

.
├── data
│    └── example.yml
└── pipeline.yml

The project is created with example data and a default pipeline.

CLI

RUTH has a CLI interface to train and test the model. To get started with it, run the following command:

$ ruth --help

which prints the available subcommands:

usage: ruth [-h] [-v] {train,test} ...

Training

To train the model, run the following command

$ ruth train -p path/to/pipeline.yml \
    -d path/to/dataset.yml

Parameters

-p, --pipeline  Path to the pipeline file
-d, --data      Path to the dataset file

Saving Trained models

Once training is finished, the model is saved in a directory named models in the current working directory.

Dataset format

RUTH uses a YAML file to store the training data. The file should have the following syntax:

example

version: "0.1"
nlu:
- intent: ham
  examples: |
    - WHO ARE YOU SEEING?
    - Great! I hope you like your man well endowed. I am  <#>  inches
    - Didn't you get hep b immunisation in nigeria.
    - Fair enough, anything going on?
    - Yeah hopefully, if tyler can't do it I could maybe ask around a bit
- intent: spam
  examples: |
    - Did you hear about the new Divorce Barbie? It comes with all of Ken's stuff!
    - I plane to give on this month end.
    - Wah lucky man Then can save money Hee
    - Finished class where are you.
    - K..k:)where are you?how did you performed?
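The nlu block above maps each intent to a multi-line examples literal. As a rough illustration (standard library only; this is not RUTH's actual loader, which would use a full YAML parser), the format can be flattened into (text, intent) pairs:

```python
# Sketch: flattening the documented NLU YAML format into (text, intent)
# pairs, using simple line scanning instead of a YAML library.
def load_nlu_examples(yaml_text):
    pairs = []
    intent = None
    for raw in yaml_text.splitlines():
        line = raw.strip()
        if line.startswith("- intent:"):
            # A new intent block starts; remember its name.
            intent = line.split(":", 1)[1].strip()
        elif line.startswith("- ") and intent is not None:
            # A bullet inside the examples literal belongs to that intent.
            pairs.append((line[2:].strip(), intent))
    return pairs

sample = """\
version: "0.1"
nlu:
- intent: greet
  examples: |
    - hello there
    - hi ruth
- intent: bye
  examples: |
    - see you later
"""

print(load_nlu_examples(sample))
# -> [('hello there', 'greet'), ('hi ruth', 'greet'), ('see you later', 'bye')]
```

A flat list like this is the usual input shape for featurizers and classifiers downstream.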
Pipeline

RUTH is a pipeline-based NLU engine with three main components:

- Tokenizer
- Featurizer
- Intent Classifier

The pipeline file defines the pipeline and its components. Example of a pipeline-basic.yml file for a Support Vector Machine (SVM) based intent classifier with a CountVectorizer-based featurizer:

task:
pipeline:
  - name: 'WhiteSpaceTokenizer'
  - name: 'CountVectorFeaturizer'
  - name: 'SVMClassifier'

For a transformer-based pipeline using Hugging Face models:

task:
pipeline:
  - name: 'HFTokenizer'
    model_name: 'bert-base-uncased'
  - name: 'HFClassifier'
    model_name: 'bert-base-uncased'

Parsing

To parse the text, run the following command

$ ruth parse -m path/to/model_dir \
    -t "I want to book a flight from delhi to mumbai"

Parameters

-m, --model_path  Model directory (optional)
-t, --text        Text message (required)

If a model path is not provided, the parse command uses the latest model in the models directory by default.
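The "latest model" lookup could work like the following sketch; the directory layout and selection rule (most recently modified subdirectory of models) are assumptions, not RUTH's actual code:

```python
# Hypothetical sketch of default-model selection: pick the most recently
# modified subdirectory of the models folder.
from pathlib import Path

def latest_model(models_dir="models"):
    candidates = [p for p in Path(models_dir).iterdir() if p.is_dir()]
    if not candidates:
        raise FileNotFoundError(f"no trained models found in {models_dir!r}")
    # st_mtime is the last-modification timestamp; newest wins.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Selecting by modification time rather than by name means the most recent training run wins even if model directories are not named in sortable order.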

Testing

To test the model performance, run the following command

$ ruth evaluate -p path/to/pipeline-basic.yml \
    -d path/to/dataset

Parameters

-p, --pipeline       Pipeline file
-d, --data           Dataset file
-o, --output_folder  Folder to save the result as a PNG file (optional)
-m, --model_path     Model directory (optional)

If a model path is not provided, the evaluate command uses the latest model in the models directory by default. If an output folder is not provided, results are saved in a results folder in the current working directory.

Deployment

RUTH uses FastAPI to deploy the model as a REST API. To deploy a trained model, run the following command:

$ ruth deploy -m path/to/model_dir

Parameters

-m, --model_path  Model directory (required)
-p, --port        Port number (optional)
-h, --host        Host name (optional)

Output

INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://localhost:5500 (Press CTRL+C to quit)

API

Once the model is deployed, you can use the following endpoint to parse text:

POST /parse
{
    "text": "I want to book a flight from delhi to mumbai"
}

Example output (here for the input "hello ruth!"):

{
    "text": "hello ruth!",
    "intent_ranking": [
        {
            "name": "greet",
            "accuracy": 0.9843385815620422
        },
        {
            "name": "how_are_you",
            "accuracy": 0.0017248070798814297
        },
        {
            "name": "voice_mail",
            "accuracy": 0.0008955258526839316
        }
    ],
    "intent": {
        "name": "greet",
        "accuracy": 0.9843385815620422
    }
}
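A client can pick the winning intent out of intent_ranking; a minimal standard-library sketch using the documented response shape (the HTTP call itself is omitted, only the JSON structure above is assumed):

```python
# Sketch: parsing a /parse response body. The JSON mirrors the documented
# example response for the input "hello ruth!".
import json

response_body = """
{
    "text": "hello ruth!",
    "intent_ranking": [
        {"name": "greet", "accuracy": 0.9843385815620422},
        {"name": "how_are_you", "accuracy": 0.0017248070798814297},
        {"name": "voice_mail", "accuracy": 0.0008955258526839316}
    ],
    "intent": {"name": "greet", "accuracy": 0.9843385815620422}
}
"""

parsed = json.loads(response_body)
# The top-level "intent" field already holds the winner, but ranking
# entries can also be compared directly by their accuracy score.
top = max(parsed["intent_ranking"], key=lambda r: r["accuracy"])
print(top["name"])  # -> greet
```

Scanning intent_ranking is useful when you want a fallback behaviour, e.g. rejecting the prediction when the best accuracy is below a confidence threshold.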

Connect us with

Puretalk | LinkedIn · Puretalk | Twitter · Sanjaypranav



Developed by Puretalk © 2022

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ruth-nlu-0.0.3.tar.gz (28.8 kB)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

ruth_nlu-0.0.3-py3.8.egg (100.2 kB)

Uploaded Egg

ruth_nlu-0.0.3-py3-none-any.whl (38.9 kB)

Uploaded Python 3

File details

Details for the file ruth-nlu-0.0.3.tar.gz.

File metadata

  • Download URL: ruth-nlu-0.0.3.tar.gz
  • Upload date:
  • Size: 28.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for ruth-nlu-0.0.3.tar.gz
Algorithm Hash digest
SHA256 7b224e0b29261ea25a8f77198a9d69c032c27d4d4593cc4f9f437ea69c3048ff
MD5 2010286a62dc540dbfd3524b0d106dd2
BLAKE2b-256 e6a9a60c39a8d83464246a23fdb064cec7bd9b9f228cbf31af430aad63876e6e

See more details on using hashes here.
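A downloaded distribution can be checked against the published digests with the standard library alone; this is a generic sketch, not a PyPI-specific tool:

```python
# Sketch: verifying a downloaded file against a published SHA256 digest.
import hashlib

def sha256_of(path, chunk_size=8192):
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large archives never load fully into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare against the digest listed above, e.g.:
# sha256_of("ruth-nlu-0.0.3.tar.gz") ==
#     "7b224e0b29261ea25a8f77198a9d69c032c27d4d4593cc4f9f437ea69c3048ff"
```

pip can enforce the same check automatically via hash-pinning in a requirements file (the --hash option).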

File details

Details for the file ruth_nlu-0.0.3-py3.8.egg.

File metadata

  • Download URL: ruth_nlu-0.0.3-py3.8.egg
  • Upload date:
  • Size: 100.2 kB
  • Tags: Egg
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.3 readme-renderer/37.2 requests/2.28.1 requests-toolbelt/0.9.1 urllib3/1.26.14 tqdm/4.64.1 importlib-metadata/4.12.0 keyring/21.8.0 rfc3986/2.0.0 colorama/0.4.5 CPython/3.8.0

File hashes

Hashes for ruth_nlu-0.0.3-py3.8.egg
Algorithm Hash digest
SHA256 9bf59a893a9dd57223abcce1653161256765c4868954ae987f21f0af51a3ff1b
MD5 6bd58e21e6def3aa81434c5726b49404
BLAKE2b-256 fc68643261d5e0e3efa1956ac1a77a7e3729133f46d5f1f39e7de9b6e37d12a2

See more details on using hashes here.

File details

Details for the file ruth_nlu-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: ruth_nlu-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 38.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for ruth_nlu-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 9ef2d7c6fea5768eb30bb5abb8d610e29073679126ed0c56d8e1dfa3f643f889
MD5 68d1ec6d9579260ab861805eb6e4a917
BLAKE2b-256 b7f1e1caa41f845084fdb136719cda296a3801559669de082d0f832d9c4bdf73

See more details on using hashes here.
