Rasa train data generator

RasaGen: A Rasa chatbot training data generator

Installation

Install the latest version of RasaGen by running:

pip install rasa_gen

or install from source:

pip install git+https://github.com/SchweitzerGAO/rasa-train-generator

Basic usage

An Example

Although the templates in this example are in Chinese, RasaGen also works with other mainstream languages such as English, French and Japanese.

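from rasa_gen import NLUTemplate, Generator

if __name__ == '__main__':
    # Sentence templates: each {} is a slot to be filled, and {{ }} are
    # escaped literal braces. The fixed Chinese text reads roughly
    # "set ... to" (把), "air conditioner" (空调) and "degrees" (度).
    sentence_template = [
        '[{}](operation_set_temp)[{}]{{"entity":"value","role":"temperature"}}度',
        '把[{}](operation_set_temp)[{}]{{"entity":"value","role":"temperature"}}度',
        '空调[{}](operation_set_temp)[{}]{{"entity":"value","role":"temperature"}}度',
        '把空调[{}](operation_set_temp)[{}]{{"entity":"value","role":"temperature"}}度',
    ]
    # Phrases such as "lower the temperature to" and "raise the temperature to"
    word_template = [
        '温度降低到', '温度升高到', '温度升高至', '温度降低至', '温度调整到', '温度调整至', '温度调到', '温度调至',
    ]
    # Build the template by chaining calls: sentences, words, then the
    # bounds (16 and 30) for the random temperature value
    template = NLUTemplate().add_sentence(sentence_template)\
                            .add_word(word_template)\
                            .add_random_val(16, 30)
    # Kept inside the __main__ guard so that `template` is always defined here
    generator = Generator('test_intent').add_template(template)
    generator.generate_from_template(50, './test_template.yml')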

A more detailed example is available in example.py.

Creating an NLUTemplate

As shown in the example, you create a template for generating NLU training data by instantiating NLUTemplate and then chaining calls that add the sentences, the words and the random value range used to fill in the template.
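
The placeholders in the sentence templates follow the conventions of Python's str.format: each {} is a slot and {{ }} stands for literal braces. The sketch below is not RasaGen's implementation. It is a minimal illustration that assumes the first slot takes a word from the word list and the second takes a random integer from an inclusive range; the helper name fill_template is hypothetical.

```python
import random


def fill_template(sentence, words, low, high, rng):
    """Fill one sentence template with a random word and a random integer.

    Hypothetical helper: assumes the first {} takes a word and the
    second {} takes an integer drawn from [low, high] inclusive.
    """
    word = rng.choice(words)
    value = rng.randint(low, high)
    # str.format turns the escaped {{ and }} into literal braces
    return sentence.format(word, value)


sentence = '[{}](operation_set_temp)[{}]{{"entity":"value","role":"temperature"}}度'
words = ['温度调到', '温度调至']
rng = random.Random(0)  # seeded so repeated runs give the same samples

examples = [fill_template(sentence, words, 16, 30, rng) for _ in range(3)]
for example in examples:
    print(example)
```

Each printed line is a Rasa-style annotated sentence in which the word is tagged as operation_set_temp and the number carries the entity and role given in the braces.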

Using a Generator

  1. Create a Generator

Create a Generator instance by passing the name of the key for the training data: the intent name when generating data for an intent, or the lookup name when generating a lookup table.

Note: If you are generating data from a CSV or TSV file that has the lookup name in the first column, you do not need to specify the name when creating the Generator.

For now, only the intent and lookup data types of the nlu class are supported. Other types, such as rule and story, will be supported in the future.

  2. Generate data with a Generator

There are two supported ways to generate data with a Generator:

  • From a template (Recommended for intent data)

Add a template instance with the add_template method, then generate the data with the generate_from_template method.

  • From a file (Recommended for lookup data)

There is no need to create template instances. Specify the input file and the output file, then call the generate_from_file method.
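
To make the file-based route more concrete, the sketch below converts CSV rows into the lookup-table layout of Rasa's NLU YAML format, using only the standard library. It does not call RasaGen, and generate_from_file may read and write files differently. The function name csv_to_lookup_yaml, the row layout (lookup name in the first column, entries in the remaining cells) and the version string are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict


def csv_to_lookup_yaml(csv_text, version="3.1"):
    """Render lookup tables in Rasa NLU YAML layout from CSV text.

    Assumption: column 1 is the lookup name; every other non-empty
    cell in the row is an entry for that lookup. Entries are written
    as-is, without YAML escaping.
    """
    tables = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or not row[0].strip():
            continue  # skip blank lines and rows without a name
        name = row[0].strip()
        tables[name].extend(cell.strip() for cell in row[1:] if cell.strip())

    lines = [f'version: "{version}"', "nlu:"]
    for name, entries in tables.items():
        lines.append(f"- lookup: {name}")
        lines.append("  examples: |")
        lines.extend(f"    - {entry}" for entry in entries)
    return "\n".join(lines) + "\n"


sample = "city,Shanghai,Beijing\ncity,Paris\nfruit,apple\n"
yaml_text = csv_to_lookup_yaml(sample)
print(yaml_text)
```

Rows that share a lookup name are merged into one table, which is why no name has to be supplied separately in this layout.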

Coming Soon...

  • Generating story, synonym and rule data

  • Detailed documentation will be released soon.
