
JanexNLG

JanexNLG is a version of Janex that lets you train a Janex model (saved as a .bin file) on your own datasets stored in txt files, and then generate new text using a Recurrent Neural Network. Versions prior to v0.0.4 used a Feed-Forward Neural Network instead.

The program breaks down the sentences in your txt files, learns which words are commonly used next to each other, and accounts for trends in sentence structure. The more data you provide, the better.
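As a rough illustration of what "which words are commonly used next to each other" means, the sketch below counts adjacent word pairs in plain Python. It is not JanexNLG's actual implementation, which relies on spaCy and a neural network. The function name `count_adjacent_pairs` and the whitespace tokenization are assumptions made for this example only.

```python
from collections import Counter


def count_adjacent_pairs(sentences):
    """Count how often each pair of words appears side by side."""
    pairs = Counter()
    for sentence in sentences:
        # Naive tokenization (an assumption for this sketch): lowercase, split on whitespace.
        words = sentence.lower().split()
        # zip(words, words[1:]) pairs each word with the word that follows it.
        pairs.update(zip(words, words[1:]))
    return pairs


sample = ["the cat sat on the mat", "the cat ate the fish"]
pair_counts = count_adjacent_pairs(sample)
print(pair_counts.most_common(1))  # [(('the', 'cat'), 2)]
```

With more text, the most frequent pairs start to reflect real usage, which is one reason more data helps.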

This library is in the Alpha stage, as I am still learning this area of Machine Learning, but for now here's how you can use it to build your own Text Generation Model!

Training the model

First, I recommend creating a file named 'train.py', which you will use to create the binary file.

In this file, you would write:

from JanexNLG.trainer import NLGTraining

NLG = NLGTraining()  # Create an instance of the JanexNLG training module.

# Name of a folder in the same directory as train.py. It holds all the txt files you wish to train the model with.
NLG.set_directory("./files")

# Any spaCy model of your choosing. en_core_web_sm is recommended for weak or older hardware.
NLG.set_spacy_model("en_core_web_md")

# Finally, train on the data. This saves everything collected into a .bin file in your program's directory.
NLG.train_data()
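Before training, it can help to check how much data the folder actually contains. The helper below is a minimal sketch using only the standard library; `summarize_training_folder` is a hypothetical name and is not part of JanexNLG.

```python
from pathlib import Path


def summarize_training_folder(directory):
    """Return (number of txt files, total whitespace-separated words) for a folder."""
    txt_files = sorted(Path(directory).glob("*.txt"))
    total_words = 0
    for path in txt_files:
        # errors="ignore" drops undecodable bytes instead of raising an exception.
        text = path.read_text(encoding="utf-8", errors="ignore")
        total_words += len(text.split())
    return len(txt_files), total_words
```

For example, `summarize_training_folder("./files")` returns the file count and an approximate word count for the folder passed to set_directory.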

Optional GPU support:

NLG.set_device("cuda")
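If you are not sure whether a GPU is available, a small guard like the one below can choose the device string for you. This sketch assumes that "cuda" refers to PyTorch's CUDA backend and that set_device also accepts "cpu". Neither is confirmed by this README, and `pick_device` is a hypothetical helper.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # Assumption: the "cuda" device refers to PyTorch's CUDA backend.
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

You could then pass the result on with `NLG.set_device(device)`.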

Finetuning the model

In versions later than 0.0.2, a finetuning feature is available. After training your model, if you want to adapt it for a specific purpose, put the new data in a separate folder, set the directory to that folder, and then finetune the model.

from JanexNLG.trainer import NLGTraining

NLG = NLGTraining()
NLG.set_directory("./files_for_finetuning")  # Folder containing the new txt files.
NLG.set_spacy_model("en_core_web_md")

# Pass your model's file name so the library knows which model it is finetuning.
NLG.finetune_model("janex.bin")
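Finetuning needs both an existing model file and a folder of new txt files, so a quick check before running it can save a failed run. This is a minimal sketch using only the standard library; `check_finetune_inputs` is a hypothetical helper and not part of JanexNLG.

```python
from pathlib import Path


def check_finetune_inputs(model_path, data_directory):
    """Return a list of problems found; an empty list means the inputs look usable."""
    problems = []
    if not Path(model_path).is_file():
        problems.append(f"model file not found: {model_path}")
    data_dir = Path(data_directory)
    if not data_dir.is_dir():
        problems.append(f"data folder not found: {data_directory}")
    elif not any(data_dir.glob("*.txt")):
        problems.append(f"no txt files in: {data_directory}")
    return problems
```

For example, call `check_finetune_inputs("janex.bin", "./files_for_finetuning")` and only continue to finetune_model when the returned list is empty.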

Using the model

Once you've created the binary file, which effectively teaches the model the connections between words and sentence structures, you can use it to generate text.

from JanexNLG import NLG

# Your chosen spaCy model and the name of the .bin file generated by the training program.
Generator = NLG("en_core_web_md", "janex.bin")

input_sentence = input("You: ")
ResponseOutput = Generator.generate_sentence(input_sentence)
print(ResponseOutput)
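The example above handles a single input. To keep a conversation going, you could wrap the call in a loop, as in the sketch below. `chat_loop` and the "quit" keyword are assumptions for illustration. The function accepts any callable that maps a string to a string, so it can be tried without a trained model.

```python
def chat_loop(generate, read=input, write=print):
    """Read input and print generated replies until the user types "quit"."""
    while True:
        sentence = read("You: ")
        if sentence.strip().lower() == "quit":
            break
        write(generate(sentence))
```

With a trained model loaded as above, this would be called as `chat_loop(Generator.generate_sentence)`.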

Warning:

The larger the txt files, the larger the .bin file will be, so make sure you are using appropriate hardware. The more diverse the data in the txt files, the more accurate and coherent the responses will be. I hope this comes in useful! :)

Thank you for using JanexNLG <3

Download files


Source Distribution

JanexNLG-0.0.7.tar.gz (6.4 kB view details)

Uploaded Source

File details

Details for the file JanexNLG-0.0.7.tar.gz.

File metadata

  • Download URL: JanexNLG-0.0.7.tar.gz
  • Upload date:
  • Size: 6.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 colorama/0.4.4 importlib-metadata/4.6.4 keyring/23.5.0 pkginfo/1.8.2 readme-renderer/34.0 requests-toolbelt/0.9.1 requests/2.31.0 rfc3986/1.5.0 tqdm/4.66.1 urllib3/1.26.5 CPython/3.10.12

File hashes

Hashes for JanexNLG-0.0.7.tar.gz
Algorithm    Hash digest
SHA256       d6378d65b6fd93a2e40b71f1d61c67b309001ba5a7276faf102d16df904647d3
MD5          2d636bae9937930a160d7a832bf9968f
BLAKE2b-256  43c2cc9e7a3ba34367dea46e4de6a209db0ee8c0a0cb8ba34eba496db2d386ef
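To check a downloaded archive against the SHA256 digest listed above, you can compute the digest locally with Python's standard hashlib module. The helper name `sha256_of_file` is an assumption for this sketch.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If `sha256_of_file("JanexNLG-0.0.7.tar.gz")` does not match the listed SHA256 value, the download should not be trusted.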
