SFT-DPO-QLora Trainer Package

Overview

Welcome to the SFT-DPO-QLora Trainer package! This package streamlines fine-tuning of large language models (LLMs) using SFT (Supervised Fine-Tuning) and QLoRA, tailored for Direct Preference Optimization (DPO) scenarios. The trainer handles the entire pipeline, from dataset processing to model fine-tuning, so you can adapt it to your own use case.

Installation

To install the SFT-DPO-QLora Trainer package from GitHub, run the following command (replace YourUsername with the repository owner):

pip install git+https://github.com/YourUsername/SFT-DPO-QLora.git

Usage

1. Import the Trainer and Config classes

from sft_dpo_qlora import Trainer, Config

2. Create a Config object

config = Config(
    MODEL_ID="Model/quantized-model",      # base model identifier
    # [dataset name, instruction column, target column]
    DATA=["YourHuggingFaceDataset/dataset-name", "Questions", "Answers"],
    BITS=4,                                # quantization bit width
    LORA_R=8,                              # LoRA rank
    LORA_ALPHA=8,                          # LoRA scaling factor
    LORA_DROPOUT=0.1,
    TARGET_MODULES=["q_proj", "v_proj"],   # modules that receive adapters
    BIAS="none",
    TASK_TYPE="CAUSAL_LM",
    BATCH_SIZE=8,
    OPTIMIZER="paged_adamw_32bit",
    LR=2e-4,
    NUM_TRAIN_EPOCHS=1,
    MAX_STEPS=250,
    FP16=True,                             # mixed-precision training
    DATASET_SPLIT="test_prefs",
    MAX_LENGTH=512,
    MAX_TARGET_LENGTH=256,
    MAX_PROMPT_LENGTH=256,
    INFERENCE_MODE=False,
    LOGGING_FIRST_STEP=True,
    LOGGING_STEPS=10,
    OUTPUT_DIR="your_output_directory",
    PUSH_TO_HUB=True,
)

3. Initialize the Trainer with the Config object

trainer = Trainer(config)

4. Train the model

trainer.train()

Configuring the Trainer

The Config class allows you to customize various parameters for the training process. Key parameters include:

MODEL_ID: The identifier of the base model to use.
DATA: A list of three items: the Hugging Face dataset name, the instruction column, and the target column.
BITS: Number of bits for quantization (e.g., 4).
LORA_R, LORA_ALPHA, LORA_DROPOUT: LoRA adapter rank, scaling factor, and dropout.
TARGET_MODULES: Modules the LoRA adapter is applied to (e.g., "q_proj", "v_proj").
BIAS: Bias setting for the LoRA adapter (e.g., "none").
TASK_TYPE: Type of the task (e.g., "CAUSAL_LM").
BATCH_SIZE: Training batch size.
OPTIMIZER: Optimizer for training (e.g., "paged_adamw_32bit").
LR: Learning rate.
NUM_TRAIN_EPOCHS: Number of training epochs.
MAX_STEPS: Maximum number of training steps.
FP16: Enable FP16 mixed-precision training.
DATASET_SPLIT: Split of the dataset to use.
MAX_LENGTH, MAX_TARGET_LENGTH, MAX_PROMPT_LENGTH: Maximum sequence lengths, in tokens.
INFERENCE_MODE: Enable inference mode for the LoRA adapter.
LOGGING_FIRST_STEP, LOGGING_STEPS: Logging configuration.
OUTPUT_DIR: Output directory for training results.
PUSH_TO_HUB: Push results to the Hugging Face Hub.

Adjust these parameters to match your requirements and the characteristics of your preference optimization task.
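
To get a feel for how a few of these settings interact, here is a minimal sketch using only the standard library. It is not part of the package: the LoraSettings class, the helper names, and the 4096-wide projection are hypothetical and chosen for illustration. It relies on the standard LoRA convention that the low-rank update is scaled by alpha / r, and it estimates how many examples a run touches on a single device.

```python
from dataclasses import dataclass


@dataclass
class LoraSettings:
    """Hypothetical mirror of the LORA_R and LORA_ALPHA Config fields."""
    r: int = 8       # LORA_R
    alpha: int = 8   # LORA_ALPHA


def lora_scaling(settings: LoraSettings) -> float:
    # Standard LoRA multiplies the low-rank update by alpha / r.
    return settings.alpha / settings.r


def lora_params_per_linear(d_in: int, d_out: int, r: int) -> int:
    # Matrix A is (r x d_in) and matrix B is (d_out x r),
    # so one adapted linear layer adds r * (d_in + d_out) weights.
    return r * (d_in + d_out)


def examples_seen(max_steps: int, batch_size: int, grad_accum: int = 1) -> int:
    # Single-device estimate. grad_accum is an assumed extra knob,
    # not a field listed in Config.
    return max_steps * batch_size * grad_accum


settings = LoraSettings()
print(lora_scaling(settings))                           # 1.0
print(lora_params_per_linear(4096, 4096, settings.r))   # 65536
print(examples_seen(250, 8))                            # 2000
```

With the example values above, the adapter update is applied at scale 1.0 and a run of 250 steps at batch size 8 sees about 2000 examples. If the package forwards MAX_STEPS and NUM_TRAIN_EPOCHS to Hugging Face TrainingArguments (an assumption, since the source does not show this), a positive MAX_STEPS takes precedence over the epoch count.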

Dataset Processing

The Trainer class provides a method to download and process the training dataset:

_dpo_data: Downloads and processes the DPO dataset.
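
The internals of _dpo_data are not shown here, so the following is only a sketch of the kind of reshaping such a step typically performs. The DPOTrainer in trl expects each example to carry "prompt", "chosen", and "rejected" fields. The function name to_dpo_records and the "Rejected" column are hypothetical; the source only names an instruction column and a target column.

```python
from typing import Dict, Iterable, List


def to_dpo_records(
    rows: Iterable[Dict[str, str]],
    prompt_col: str,
    chosen_col: str,
    rejected_col: str,
) -> List[Dict[str, str]]:
    """Map raw rows to the prompt/chosen/rejected layout used for DPO."""
    records = []
    for row in rows:
        prompt = str(row[prompt_col]).strip()
        chosen = str(row[chosen_col]).strip()
        rejected = str(row[rejected_col]).strip()
        # Skip rows that cannot express a preference.
        if not prompt or not chosen or not rejected or chosen == rejected:
            continue
        records.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return records


rows = [
    {"Questions": "What is LoRA?",
     "Answers": "A low-rank adapter method.",
     "Rejected": "I don't know."},
    {"Questions": "  ", "Answers": "x", "Rejected": "y"},  # empty prompt, dropped
]
records = to_dpo_records(rows, "Questions", "Answers", "Rejected")
print(len(records))  # 1
```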

Model Preparation

The train method prepares the model and runs the training loop on the configured dataset. It performs the following steps:

Downloads and processes the DPO dataset.
Prepares the model.
Sets training arguments.
Initializes the DPOTrainer from the trl library.
Trains the model.
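
For readers curious about what the final training step optimizes, here is a minimal sketch of the standard sigmoid DPO objective for a single preference pair, written with the standard library only. It is an illustration, not the package's or trl's actual code. The function name, the example log-probabilities, and beta=0.1 are assumptions.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Sigmoid DPO loss for one (chosen, rejected) pair of sequence log-probs."""
    # How much more the policy favors the chosen answer than the reference does.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    z = beta * margin
    # -log(sigmoid(z)) computed in a numerically stable way.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Policy identical to the reference: the loss equals log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# Policy shifted toward the chosen answer: the loss drops below log(2).
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)
print(baseline > improved)  # True
```

The loss falls as the policy assigns relatively more probability to the chosen response than the frozen reference model does, which is the signal the adapter weights are trained on.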

Conclusion

With the SFT-DPO-QLora Trainer package, you can fine-tune large language models with Direct Preference Optimization for your custom use case. Experiment with different configurations, datasets, and models to find what works best. If you encounter any issues or have questions, please refer to the documentation or reach out to our support community. Happy training!

Release history

0.1.4 (Jan 16, 2024)
0.1.3 (Jan 16, 2024)
0.1.2 (Jan 15, 2024)
0.1.1 (Jan 15, 2024)
0.1.0 (Jan 15, 2024), this version

Download files

Source Distribution

sft_dpo_qlora-0.1.0.tar.gz (8.9 kB), uploaded Jan 15, 2024, Source

Built Distribution

sft_dpo_qlora-0.1.0-py3-none-any.whl (9.6 kB), uploaded Jan 15, 2024, Python 3

File details

Details for the file sft_dpo_qlora-0.1.0.tar.gz.

File metadata

Download URL: sft_dpo_qlora-0.1.0.tar.gz
Upload date: Jan 15, 2024
Size: 8.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for sft_dpo_qlora-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`07b4d94f9c3f4dadada64fa2871b9d07f61d2bdd6f8622221590a388657fc469`
MD5	`f51dede9e906f1d01bc60d5d196a9974`
BLAKE2b-256	`a98e7cfc41f3f164be6ba38e0aa87532306af05fdf9cd96a95a2459d8e5e9de3`

File details

Details for the file sft_dpo_qlora-0.1.0-py3-none-any.whl.

File metadata

Download URL: sft_dpo_qlora-0.1.0-py3-none-any.whl
Upload date: Jan 15, 2024
Size: 9.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for sft_dpo_qlora-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8efc547db71a4b7e5fa5fa79c8d99c6a97f353940cd68bf70dc7ea7d1ddb8f93`
MD5	`bdcec4c06454fc89e18def908a0bb7cd`
BLAKE2b-256	`21911c76bb917105f534697ec57af0d2448d73664dfb7e69ef7c46828c5b5657`

sft-dpo-qlora 0.1.0
