OpenAI Model Trainer and Formatter
codara-model-trainer
Overview
codara-model-trainer is a Python package designed to assist in creating datasets for fine-tuning machine learning models, particularly language models. It simplifies the process of gathering and formatting training data in JSON Lines (JSONL) format.
Features
- Easy creation of training data sets in JSONL format.
- Methods to set system instructions, training prompts, and generative responses.
- Automatically handles file creation and appending data in the correct format.
Installation
pip install codara-model-trainer
Usage
- Create the data set with system instructions, training prompts, and generative responses as needed:
from codara_model_trainer import create_data_set

# openai_api_call is a placeholder for however you obtain the model's response.
gpt_response = openai_api_call("User prompt here")
create_data_set("System instructions here", "User prompt here", gpt_response, "optional-filepath/filename.jsonl")
If the filepath isn't set, the data will be saved in the model-training/fine-tune-data-set.jsonl file.
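To make the appending behavior concrete, here is a minimal sketch of what a call like the one above might do internally. It is not the package's actual source: the name `append_example`, its parameters, and the directory-creation step are assumptions made for illustration. Only the default path and the three-message record layout come from this document.

```python
import json
from pathlib import Path

# Default path taken from the text above; everything else here is an assumption.
DEFAULT_PATH = "model-training/fine-tune-data-set.jsonl"


def append_example(system, user, assistant, filepath=DEFAULT_PATH):
    """Hypothetical helper: append one chat-format record as a single JSONL line."""
    record = {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ]
    }
    path = Path(filepath)
    # Create the parent directory on first use so the default path works.
    path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode creates the file if it is missing and keeps earlier records.
    with path.open("a", encoding="utf-8") as f:
        # json.dumps escapes embedded newlines, so each record stays on one line.
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path
```

Opening the file in append mode is what lets repeated calls build up a data set one example at a time without rewriting the file.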
Structure of Data
The data is structured in JSON Lines format, where each line is a valid JSON object. The example below is spread over several lines for readability; in the file, each object occupies a single line:
{
  "messages": [
    {
      "role": "system",
      "content": "System instructions here"
    },
    {
      "role": "user",
      "content": "User prompt here"
    },
    {
      "role": "assistant",
      "content": "Model response here"
    }
  ]
}
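Before uploading a file for fine-tuning, it can help to confirm that every line parses and follows the structure shown above. The following is a minimal sketch of such a check; `load_examples` and `EXPECTED_ROLES` are hypothetical names and not part of codara-model-trainer, and the strict system/user/assistant order is an assumption based on the example.

```python
import json

# Role order taken from the example record above (assumed to be fixed).
EXPECTED_ROLES = ["system", "user", "assistant"]


def load_examples(filepath):
    """Parse a JSONL file and check each record has the three-message layout."""
    examples = []
    with open(filepath, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            # json.loads raises json.JSONDecodeError on a malformed line.
            record = json.loads(line)
            roles = [m["role"] for m in record["messages"]]
            if roles != EXPECTED_ROLES:
                raise ValueError(f"line {line_no}: unexpected roles {roles}")
            examples.append(record)
    return examples
```

Calling `load_examples("model-training/fine-tune-data-set.jsonl")` would return the parsed records as a list of dictionaries, or raise on the first line that does not match.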
Hashes for codara-model-trainer-2.0.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 4fb466f5ad0885438933c900941b5e9dd8ca68a5cbe9d23f3c7c7179dc22d106
MD5 | 4f69e5b6e8c7d964f37b962ba2509fa4
BLAKE2b-256 | 5f97e1edffb5d36746a3b9c114d4714bf8187b87280364b48d696cbf2c0db90d
Hashes for codara_model_trainer-2.0.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 83877bed0cccc6f1031d872ff71f813f04e43a15c07c8da4970519628cdc4683
MD5 | 36b169a61cd563bd467896cd835bd84d
BLAKE2b-256 | eec35665642aac12ae2a885a4086ead4a113889414cb236007f377416866fd08