JanexNLG
JanexNLG is a version of Janex that lets you train a Janex model, saved as a .bin file, on your own datasets stored in txt files, and then generate new text with it. Since v0.0.4 the model is a Recurrent Neural Network; earlier versions used a Feed-Forward Neural Network.
This program works by breaking down the sentences in your txt files and learning which words tend to appear next to each other, as well as the common sentence structures. The more data you provide, the better the results.
JanexNLG is in the Alpha stage, as I am still learning this area of Machine Learning, but for now here's how you can use this library to build your own Text Generation Model!
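To make the idea of "words commonly used next to each other" concrete, here is a toy sketch that counts adjacent word pairs in plain Python. It is not how JanexNLG works internally (the library trains a neural network and uses spaCy); the function names `count_bigrams` and `most_likely_next` are made up for this illustration.

```python
from collections import Counter, defaultdict


def count_bigrams(sentences):
    """Count, for every word, which words directly follow it.

    Illustrative only: JanexNLG itself trains a neural network,
    it does not use this counting scheme.
    """
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair each word with the word that comes right after it.
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def most_likely_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]
```

With the sentences "the cat sat", "the cat ran" and "the dog sat", the word "the" is followed by "cat" twice and "dog" once, so `most_likely_next` would pick "cat". More training text gives such statistics more to work with, which is the intuition behind "the more data, the better".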
Training the model
First, I would recommend creating a file named 'train.py', which you will use to create the binary file.
In this file, write:
from JanexNLG.trainer import *
NLG = NLGTraining() # Create an instance of the JanexNLG training module.
NLG.set_directory("./files") # Set this to the name of a folder in the same directory as your train.py file. This folder will contain all of your txt files you wish to train the model with.
NLG.set_spacy_model("en_core_web_md") # You can set this to any Spacy model of your choosing. I would recommend en_core_web_sm for weak or older hardware.
NLG.train_data() # Finally, train the data. This will save everything collected into a .bin file in your program's directory.
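Before starting a long training run, it can help to confirm that the folder you pass to set_directory actually contains txt files and to see how much text is in it. The sketch below uses only the standard library; `list_training_files` and `total_size_bytes` are hypothetical helper names and are not part of JanexNLG.

```python
from pathlib import Path


def list_training_files(directory):
    """Return the .txt files found directly inside `directory`, sorted by name."""
    folder = Path(directory)
    if not folder.is_dir():
        raise FileNotFoundError(f"Training folder not found: {folder}")
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".txt"
    )


def total_size_bytes(files):
    """Sum the on-disk size of the given files, in bytes."""
    return sum(p.stat().st_size for p in files)
```

For example, calling `list_training_files("./files")` at the top of train.py and stopping early when the list is empty avoids training on an empty or misspelled folder. The total size gives a rough feel for how large the resulting .bin file might become.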
Optional GPU support:
NLG.set_device("cuda")
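If the same script has to run on machines with and without a GPU, one option is to choose the device name at runtime. This sketch assumes that JanexNLG runs on PyTorch and that set_device also accepts "cpu"; neither is stated above, so treat both as assumptions. `pick_device` is a hypothetical helper, not a library function.

```python
import importlib.util


def pick_device():
    """Return "cuda" when PyTorch is installed and can see a GPU, else "cpu".

    Assumption: JanexNLG is PyTorch-based, so torch.cuda.is_available()
    is the relevant check. Adjust if the library uses another backend.
    """
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch
    return "cuda" if torch.cuda.is_available() else "cpu"
```

You could then write `NLG.set_device(pick_device())`, presumably before calling train_data, instead of hard-coding "cuda".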
Finetuning the model
For versions > 0.0.2, a finetuning feature is available. After training your model, if you wish to adapt it for a specific purpose, put the additional txt files in a new folder, set the directory to that folder, and then finetune the model.
from JanexNLG.trainer import *
NLG = NLGTraining()
NLG.set_directory("./files_for_finetuning")
NLG.set_spacy_model("en_core_web_md")
NLG.finetune_model("janex.bin") # Pass the name of your model file so the library knows which model to finetune.
Using the model
Once you've created the binary file, which effectively teaches the AI the connections between words and sentence structures, you can use it to generate text.
from JanexNLG import *
Generator = NLG("en_core_web_md", "janex.bin") # Your chosen spacy model and the name of the .bin file generated by the training program.
input_sentence = input("You: ")
ResponseOutput = Generator.generate_sentence(input_sentence)
print(ResponseOutput)
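The snippet above answers a single input. For a longer conversation, the same generate_sentence call can be wrapped in a loop. This is a minimal sketch: `chat` is a hypothetical helper, and the only thing it assumes about the generator is the generate_sentence method shown above.

```python
def chat(generator, read=input, write=print):
    """Repeatedly read a line, generate a response, and write it out.

    `generator` is any object with a generate_sentence(text) method,
    such as the Generator created above. The loop ends on an empty
    line or on "quit" / "exit".
    """
    while True:
        user_text = read("You: ").strip()
        if user_text.lower() in {"", "quit", "exit"}:
            break
        write(generator.generate_sentence(user_text))
```

Calling `chat(Generator)` keeps the model loaded between turns, so the .bin file is only read once. The `read` and `write` parameters exist so the loop can be tested without a keyboard or a trained model.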
Warning:
The larger the txt files, the larger the .bin file will be, so make sure you are using appropriate hardware. The more diverse the data in the txt files, the more accurate and coherent the responses will be. I hope this comes in useful! :)
Thank you for using JanexNLG <3