OptiNet is a Python library for optimizing traditional machine learning models.

Project description

OptiNet - A Versatile Library for ML and NLP Model Training

OptiNet is a Python library designed to simplify and optimize traditional Machine Learning (ML) and Natural Language Processing (NLP) workflows. Through a single interface, OptiNet lets you prepare datasets, train models, and evaluate performance for both traditional ML models and large language models (LLMs). It supports scikit-learn models as well as transformer-based models from Hugging Face, and offers LoRA and QLoRA for parameter-efficient fine-tuning.

Features

Unified Interface: Train and evaluate both traditional ML models and transformer-based NLP models.
Data Preparation: Quickly load, split, and prepare data for training.
Tokenizer Integration: Easily tokenize text datasets using Hugging Face's transformers for NLP tasks.
Model Training: Train ML models (e.g., scikit-learn estimators) and fine-tune large language models through Hugging Face's Trainer.
LoRA & QLoRA Support: Fine-tune large language models with Low-Rank Adaptation (LoRA) and Quantized LoRA (QLoRA) for efficient training.
Scalable Evaluations: Evaluate trained models and get performance metrics like accuracy.

Installation

You can install OptiNet using pip:

pip install OptiNet

Usage

1. Import and Initialize OptiNet

OptiNet can be used for both ML models (e.g., scikit-learn classifiers) and NLP models (e.g., transformers). Here is how you can get started:

from optinet import OptiNet
from sklearn.ensemble import RandomForestClassifier
from transformers import AutoModelForSequenceClassification

# Example ML Model
ml_model = RandomForestClassifier()
optinet_ml = OptiNet(model=ml_model, model_type='ml')

# Example NLP Model
llm_model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
optinet_nlp = OptiNet(model=llm_model, model_type='llm', model_name='distilbert-base-uncased')

2. Prepare Data

For ML models, OptiNet can load and split datasets like digits from scikit-learn:

# Prepare data for ML model
X_train, X_test, y_train, y_test = optinet_ml.prepare_data(dataset='digits')
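
For orientation, here is a minimal sketch of an equivalent load-and-split step written directly against scikit-learn. How prepare_data() works internally is not documented here, so the 80/20 split, the stratification, and the random seed below are assumptions chosen for illustration, not OptiNet's confirmed defaults.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

# Load the digits dataset as a feature matrix X and a label vector y.
X, y = load_digits(return_X_y=True)

# Hold out 20% of the samples for testing.
# The ratio, seed, and stratification are assumptions, not OptiNet defaults.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```

Each digits sample is an 8x8 image flattened into 64 features, so both splits have 64 columns.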

For NLP models, you can load datasets from Hugging Face's datasets library:

# Prepare data for NLP model
nlp_dataset = optinet_nlp.prepare_data(dataset='imdb')  # e.g., IMDB movie reviews dataset

Custom Dataset Support

If you have a custom dataset (e.g., loaded from a file or a database), you can pass the dataset directly using the dataset_obj parameter:

# Prepare data from a custom dataset
from datasets import load_dataset

my_dataset = load_dataset("csv", data_files="my_custom_data.csv")
nlp_dataset = optinet_nlp.prepare_data(dataset_obj=my_dataset)

This approach allows flexibility to use any custom dataset, without being restricted to the built-in ones.
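
As a rough illustration of what such a CSV file might contain, the sketch below writes and re-reads a two-row file using only the standard library. The column names "text" and "label" and the sample rows are hypothetical; the columns OptiNet actually expects are not stated here.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical rows: the column names "text" and "label" are assumptions.
rows = [
    {"text": "A wonderful film with a clever script.", "label": 1},
    {"text": "Dull and far too long.", "label": 0},
]

# Write the file into a temporary directory to avoid touching the project folder.
csv_path = Path(tempfile.mkdtemp()) / "my_custom_data.csv"
with csv_path.open("w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["text", "label"])
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to confirm the layout (csv returns every value as a string).
with csv_path.open(newline="", encoding="utf-8") as f:
    loaded = list(csv.DictReader(f))

print(loaded[0]["text"], loaded[0]["label"])
```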

3. Tokenize Data (For NLP Models)

If you're working with NLP models, you need to tokenize the data before training:

# Tokenize NLP dataset
tokenized_dataset = optinet_nlp.tokenize_data(nlp_dataset)
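
To make the result of tokenization concrete, here is a toy sketch of what a tokenized batch looks like: integer input_ids padded or truncated to a fixed length, plus an attention_mask marking real tokens. It uses a whitespace tokenizer with an ad hoc vocabulary, which is an assumption made for illustration; Hugging Face tokenizers such as DistilBERT's use subword vocabularies and special tokens instead.

```python
def toy_tokenize(texts, max_length=6, pad_id=0):
    """Whitespace tokenizer that pads or truncates to max_length (illustration only)."""
    # Build a vocabulary from the input texts; id 0 is reserved for padding.
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab) + 1)

    input_ids, attention_mask = [], []
    for text in texts:
        # Map words to ids, then truncate to the maximum length.
        ids = [vocab[word] for word in text.lower().split()][:max_length]
        padding = max_length - len(ids)
        # The mask is 1 for real tokens and 0 for padding positions.
        attention_mask.append([1] * len(ids) + [0] * padding)
        input_ids.append(ids + [pad_id] * padding)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = toy_tokenize(["great movie", "not a great movie at all really"])
print(batch["input_ids"])
print(batch["attention_mask"])
```

The first text is padded with four zeros; the second has seven words, so its last word is truncated.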

4. Train the Model

You can train both ML and NLP models using the train_model() method. For NLP models, this is also where you choose to fine-tune with LoRA or QLoRA by passing the relevant parameters.

Train ML model:

# Train ML model
optinet_ml.train_model(X_train, y_train)

Train NLP model with LoRA:

# Train NLP model with LoRA fine-tuning
optinet_nlp.train_model(
    tokenized_dataset['train'],
    output_dir="./output_lora",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    lora_r=4,
    lora_alpha=32,
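    target_modules=["q_lin", "v_lin"],  # must match layer names in the model; DistilBERT uses q_lin/v_lin, LLaMA-style models use q_proj/v_proj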
    lora_dropout=0.1,
    task_type="SEQ_CLS"
)
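
To give a sense of why LoRA is parameter-efficient, the sketch below counts the trainable adapter weights implied by lora_r=4 when the query and value projections of DistilBERT-base (hidden size 768, 6 layers) are adapted. In standard LoRA, each adapted weight matrix W gets two small matrices A (r x d_in) and B (d_out x r), and the update B·A is scaled by lora_alpha / lora_r. The count covers only the adapter matrices; the classification head, which is typically also trained for sequence classification, is left out. The helper name is hypothetical.

```python
def lora_trainable_params(d_in, d_out, r):
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


hidden_size = 768        # DistilBERT-base hidden size
num_layers = 6           # DistilBERT-base transformer layers
adapted_per_layer = 2    # query and value projections in each layer
lora_r, lora_alpha = 4, 32

per_module = lora_trainable_params(hidden_size, hidden_size, lora_r)
total_lora = per_module * adapted_per_layer * num_layers
full_matrix = hidden_size * hidden_size   # weights in one full projection matrix
scaling = lora_alpha / lora_r             # factor applied to the low-rank update

print(f"per adapter: {per_module}, all adapters: {total_lora}")
print(f"one full projection matrix: {full_matrix}, scaling: {scaling}")
```

Under these assumptions, all twelve adapters together hold fewer weights than a single full 768 x 768 projection matrix.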

Train NLP model with QLoRA (4-bit quantization):

# Train NLP model with QLoRA (using 4-bit quantization)
import torch  # needed for torch.float16 below

optinet_nlp.train_model(
    tokenized_dataset['train'],
    output_dir="./output_qlora",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    quantization_config={"load_in_4bit": True, "bnb_4bit_compute_dtype": torch.float16},
    lora_r=4,
    lora_alpha=16,
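    target_modules=["q_lin", "v_lin"],  # must match layer names in the model; DistilBERT uses q_lin/v_lin, LLaMA-style models use q_proj/v_proj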
    lora_dropout=0.1,
    task_type="SEQ_CLS"
)
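
As a back-of-the-envelope illustration of what 4-bit quantization saves, the sketch below compares weight storage at 16 bits and 4 bits per parameter. The parameter count of roughly 66 million for distilbert-base-uncased is approximate, and the estimate is an idealized bound: it ignores quantization constants, layers that stay unquantized, activations, and optimizer state. The helper name is hypothetical.

```python
def weight_memory_mb(num_params, bits_per_param):
    """Approximate storage for model weights in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_param / 8 / 1e6


num_params = 66_000_000  # rough parameter count of distilbert-base-uncased

fp16_mb = weight_memory_mb(num_params, 16)
int4_mb = weight_memory_mb(num_params, 4)

print(f"fp16: {fp16_mb:.0f} MB, 4-bit: {int4_mb:.0f} MB")
```

The ratio between the two figures is 4 regardless of model size, which is why the savings matter most for models with billions of parameters.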

5. Evaluate the Model

Evaluate the performance of your trained model:

# Evaluate ML model
accuracy = optinet_ml.evaluate_model(X_test, y_test)
print(f"ML Model Accuracy: {accuracy:.2f}")

# Evaluate NLP model
results = optinet_nlp.evaluate_model(tokenized_dataset['test'])
print("NLP Model Evaluation:", results)
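
For reference, accuracy is the fraction of predictions that match the true labels. The following standalone sketch computes it by hand on made-up labels; it illustrates the metric only and does not show how evaluate_model() is implemented.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels: four of the five predictions are correct.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
print(f"Accuracy: {accuracy(y_true, y_pred):.2f}")
```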

Requirements

OptiNet depends on several popular Python packages for ML and NLP tasks:

scikit-learn
transformers
datasets
torch
peft (for LoRA and QLoRA support)

To install these requirements, you can use the following command:

pip install scikit-learn transformers datasets torch peft
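
To check which of these packages are already available in the current environment, a small standard-library sketch like the following can help. The helper name is hypothetical; note that scikit-learn is imported under the module name sklearn.

```python
from importlib.util import find_spec

# Map each pip package name to the module name used in import statements.
REQUIRED = {
    "scikit-learn": "sklearn",
    "transformers": "transformers",
    "datasets": "datasets",
    "torch": "torch",
    "peft": "peft",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    return [pip_name for pip_name, module in required.items() if find_spec(module) is None]


missing = missing_packages()
print("Missing:", ", ".join(missing) if missing else "none")
```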

License

This project is licensed under the MIT License - see the LICENSE file for details.

Authors

Vishwanath Akuthota
Ganesh Thota
Krishna Avula

Contributing

We welcome contributions to improve OptiNet. Please feel free to submit issues and pull requests on the GitHub repository.

Project details

Release history

0.1.7 (this version): Mar 27, 2025
0.1.6: Mar 27, 2025
0.1.5: Mar 1, 2025
0.1.4: Oct 25, 2024
0.1.3: Oct 25, 2024
0.1.2: Jul 18, 2024
0.1.1: Jul 16, 2024
0.1.0: Jul 15, 2024

Download files

Source Distribution

optinet-0.1.7.tar.gz (5.5 kB)

Uploaded Mar 27, 2025 Source

Built Distribution

optinet-0.1.7-py3-none-any.whl (5.5 kB)

Uploaded Mar 27, 2025 Python 3

File details

Details for the file optinet-0.1.7.tar.gz.

File metadata

Download URL: optinet-0.1.7.tar.gz
Upload date: Mar 27, 2025
Size: 5.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for optinet-0.1.7.tar.gz
Algorithm	Hash digest
SHA256	`00a55cb54b0a515dc3f7901b0122d40369c7377e6bdb34b4f5f8ab00d8b7a0d7`
MD5	`5ac97dc22523c47d4d72b538d5908652`
BLAKE2b-256	`cffc1adfcfcaff428fd49e6889c4d43a64d1f796a2f25d28213e9b9585e7b07b`

File details

Details for the file optinet-0.1.7-py3-none-any.whl.

File metadata

Download URL: optinet-0.1.7-py3-none-any.whl
Upload date: Mar 27, 2025
Size: 5.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for optinet-0.1.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7a8086949a3f7292905267b8ba743ad34fb78c3a18e2897b50acf8dc503c5acc`
MD5	`514a9e16c351882f299198a1b79cdf1b`
BLAKE2b-256	`6766f65de69522ea675b654adbe46994d27bba0115de813cff6bb651e0a2345f`
