A package for automated hyperparameter tuning and machine learning workflows. Build an end-to-end pipeline or fine-tune an LLM on consumer hardware in a few lines of code.

These details have not been verified by PyPI

Project description

AutoHPSearch

A Python package for automatic hyperparameter tuning of machine learning models for cross-sectional data. AutoHPSearch simplifies the process of hyperparameter optimization for various machine learning models by providing a unified interface to tune hyperparameters across multiple model types.

AutoHPSearch also provides full end-to-end pipelines that cover data cleaning, parameter search, model evaluation, and automated production of data reports in markdown format (example here), as well as fine-tuning of large language models (LLMs) with just a few lines of code.

The hyperparameter search space is navigated with grid, random, or Bayesian search. Random search is faster than grid search but covers the search space less comprehensively. CUDA-enabled computing for neural network implementations is included.
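To make the coverage trade-off concrete, the sketch below counts candidates with scikit-learn's ParameterGrid and ParameterSampler rather than with AutoHPSearch itself. The search space (param_space) is a made-up illustration, not the grid that the package generates.

```python
from sklearn.model_selection import ParameterGrid, ParameterSampler

# Hypothetical search space, for illustration only
param_space = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, 10, None],
    "min_samples_split": [2, 5],
}

# Grid search evaluates every combination: 3 * 4 * 2 = 24 candidates
grid_candidates = list(ParameterGrid(param_space))

# Random search evaluates a fixed budget of sampled combinations
random_candidates = list(ParameterSampler(param_space, n_iter=8, random_state=0))

print(len(grid_candidates), len(random_candidates))
```

With a budget of 8 draws, random search fits a third as many candidates as the full grid, at the cost of leaving most combinations untested.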

Installation

pip install autohpsearch

Or install directly from the repository:

git clone https://github.com/rudyvdbrink/autohpsearch.git
cd autohpsearch
pip install -e .

To enable CUDA you need to manually install the right version of torch+cuda depending on your GPU and system.
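After installing a CUDA build of torch, a quick check along the following lines can confirm whether a GPU is visible. This is a minimal sketch that only uses standard PyTorch calls; the helper name cuda_status is hypothetical and not part of AutoHPSearch.

```python
def cuda_status():
    """Return a short description of whether PyTorch can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "torch is installed, but no CUDA device is visible (CPU only)"

print(cuda_status())
```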

Usage

Example Scripts

Classification - Demonstrates simple binary classification
Regression - Simple regression example
Neural Network Usage - Syntax examples for using scikit-learn compatible neural networks
Iris Example - Examples of both classification and regression solving using real data
Pipeline Example - An example of a full automated end-to-end pipeline
LLM Example - An example of how to fine-tune an LLM for a sequence classification task

Creating and Fitting a Full End-To-End Automatic Pipeline

# Import requirements
from autohpsearch.datasets.dataloaders import fetch_housing
from autohpsearch import AutoMLPipeline

# Load an example dataset
X_train, X_test, y_train, y_test = fetch_housing()

# Fit the pipeline: this will clean the data, run hyperparameter search, train models, and evaluate them
pipeline = AutoMLPipeline(task_type='regression')
pipeline.fit(X_train=X_train, X_test=X_test, y_train=y_train, y_test=y_test)

Automated Reports on Data Distributions And Model Performance

AutoHPSearch can generate a report on the data that includes plots of feature distributions before and after cleaning, along with statistics on requested properties of the data, such as the number of outliers. The report also includes plots for the best-performing model so you can examine its performance on the test set. You can find an example report here. To create a report, run:

# Write a report in markdown format 
pipeline.generate_data_report()

Example Classification With Specified Models

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

from autohpsearch import tune_hyperparameters, generate_hypergrid

# Generate synthetic data
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Generate hyperparameter grid for multiple models
hypergrid = generate_hypergrid(['logistic_regression', 'random_forest_clf', 'xgboost_clf'])

# Tune hyperparameters
results = tune_hyperparameters(
    X_train, y_train, 
    X_test, y_test, 
    hypergrid=hypergrid, 
    scoring='balanced_accuracy',
    search_type='random',
    cv=5
)

# Access best model and results
best_model = results['best_model'] # The winning model
optimal_params = results['optimal_params'] # Best parameters for each model
performance_results = results['results'] # cross-validation and test score table

print(f"Best model: {type(best_model).__name__}")
print(f"Optimal parameters: {optimal_params}")
print(f"Results summary:\n{performance_results}")

Fitting Neural Network Models

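from sklearn.preprocessing import StandardScaler

from autohpsearch.models.nn import AutoHPSearchClassifier

# Scale the features (X_train, X_test, and y_train come from the classification example above)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Create a neural network classifier with custom parameters
nn_clf = AutoHPSearchClassifier(
    hidden_layers=(64, 32),
    activation='relu',
    dropout_rate=0.2,
    learning_rate=0.001,
    optimizer='adam',
    batch_size=32,
    epochs=100
)

# Train the model
nn_clf.fit(X_train_scaled, y_train)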

# Make predictions
y_pred = nn_clf.predict(X_test_scaled)

Fine Tuning Large Language Models

AutoHPSearch includes functionality for low-rank adaptation (LoRA) of large language models. The fitting process is integrated with the transformers library, so pre-trained base models are downloaded from Hugging Face. Model classes also contain methods for pushing trained models to the Hugging Face Hub.

from autohpsearch import AutoLoraForSeqClass
from autohpsearch.datasets.dataloaders import fetch_imdb

# Get the data (a selection of imdb reviews, which can be positive or negative)
dataset = fetch_imdb()

# Initialize the model with a base model, and LoRA parameters
model = AutoLoraForSeqClass(base_model='bert-base-uncased',
                            r=2,
                            train_batch_size=8,
                            eval_batch_size=8,
                            num_train_epochs=3,
                            )

# Fit the model on the dataset
model.fit(dataset)

# Push the model to the Hugging Face Hub
model.push()
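For readers new to low-rank adaptation, the NumPy sketch below illustrates the general idea behind the r parameter: the pre-trained weight stays frozen and only two small matrices are trained. It is a conceptual illustration, not AutoHPSearch's implementation. The layer size (768, the hidden size of bert-base-uncased), the scaling factor alpha, and the initialization are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in = 768, 768   # assumed layer size (hidden size of bert-base-uncased)
r = 2                    # LoRA rank, matching r=2 in the example above
alpha = 4                # hypothetical scaling factor

W = rng.normal(size=(d_out, d_in))       # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable, small random init
B = np.zeros((d_out, r))                 # trainable, zero init so the update starts at 0

# Effective weight used in the forward pass: W plus a rank-r update
W_adapted = W + (alpha / r) * (B @ A)

full_params = W.size
lora_params = A.size + B.size
print(full_params, lora_params)
```

For this single matrix, the adapter trains 3,072 values instead of 589,824, which is why a small rank keeps fine-tuning feasible on consumer hardware.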

Available Models

AutoHPSearch supports the following model types for end-to-end training:

Classification Models

logistic_regression: Logistic regression classifier (including L1 / L2 / elastic net regularization)
random_forest_clf: Random forest classifier
gradient_boosting_clf: Gradient boosting classifier
svm_clf: Support vector machine classifier
knn_clf: K-nearest neighbors classifier
xgboost_clf: XGBoost classifier
dnn_clf: Deep neural network classifier

Regression Models

linear_regression: Linear regression
ridge: Ridge regression
lasso: Lasso regression
elastic_net: Elastic Net regression
random_forest_reg: Random forest regressor
gradient_boosting_reg: Gradient boosting regressor
svr: Support vector regression
knn_reg: K-nearest neighbors regressor
xgboost_reg: XGBoost regressor
dnn_reg: Deep neural network regressor

Large Language Models

The base models for sequence tasks are drawn from the Hugging Face Hub (huggingface.co), so any model hosted there is supported in principle. These include popular pre-trained models such as Meta's Llama models, Mistral, GPT-Neo, and others.

Hyperparameter Tuning

The generate_hypergrid() function creates a comprehensive grid of hyperparameters for each model type. You can:

Generate grids for all supported models: generate_hypergrid(task_type='classification')
Generate a grid for a specific model: generate_hypergrid('random_forest_clf') or generate_hypergrid('random_forest_reg', task_type='regression')
Generate grids for multiple models: generate_hypergrid(['logistic_regression', 'xgboost_clf'])

The tune_hyperparameters() function performs a cross-validated hyperparameter search (grid, random, or Bayesian, as selected with search_type) on the specified models and returns:

The best overall model
Optimal parameters for each model
Performance metrics for each model
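The three outputs listed above can be pictured with a plain scikit-learn loop. The sketch below is not how tune_hyperparameters() is implemented; it only mirrors the idea of searching several models and keeping the best one. The search spaces and the variable names (candidates, fitted, scores) are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = make_classification(n_samples=300, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Hypothetical search spaces; not the grids produced by generate_hypergrid()
candidates = {
    "logistic_regression": (LogisticRegression(max_iter=1000), {"C": [0.01, 0.1, 1.0, 10.0]}),
    "random_forest_clf": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [25, 50], "max_depth": [3, 5, None]},
    ),
}

optimal_params, scores, fitted = {}, {}, {}
for name, (estimator, space) in candidates.items():
    search = RandomizedSearchCV(
        estimator, space, n_iter=4, scoring="balanced_accuracy", cv=5, random_state=0
    )
    search.fit(X_train, y_train)
    fitted[name] = search.best_estimator_
    optimal_params[name] = search.best_params_
    scores[name] = {
        "cv_score": search.best_score_,
        "test_score": balanced_accuracy_score(y_test, search.predict(X_test)),
    }

# Pick the overall winner by cross-validation score
best_name = max(scores, key=lambda n: scores[n]["cv_score"])
best_model = fitted[best_name]
print(best_name, optimal_params[best_name], scores[best_name])
```

Selecting the winner on the cross-validation score, and reporting the test score separately, keeps the test set out of the model selection step.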

Neural Network Models / LLMs

AutoHPSearch includes custom neural network implementations that are compatible with scikit-learn:

AutoHPSearchClassifier: For classification tasks
AutoHPSearchRegressor: For regression tasks

These models provide flexibility in architecture design and training configuration while maintaining the familiar scikit-learn API.
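In practice, scikit-learn compatibility means an estimator can be dropped into pipelines and cross-validation helpers. The sketch below uses scikit-learn's own MLPClassifier as a stand-in so that it runs without AutoHPSearch installed. It is an assumption, not something verified here, that AutoHPSearchClassifier can take the same slot.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=10, random_state=42)

# Stand-in estimator: MLPClassifier is used so the sketch runs without AutoHPSearch.
# Assumption: AutoHPSearchClassifier could be placed in the same slot.
estimator = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)

# Scaling inside the pipeline is refit on each training fold, which avoids leakage
model = make_pipeline(StandardScaler(), estimator)
fold_scores = cross_val_score(model, X, y, cv=3, scoring="balanced_accuracy")
print(fold_scores.mean())
```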

Large language model classes:

AutoLoraForSeqClass: For sequence classification tasks
AutoLoraForSeqReg: For sequence regression tasks
AutoLoraForSeqDual: For dual-task models that fit both a regression and a classification head simultaneously

Author

Rudy van den Brink

Project details

Release history

1.1.3: Sep 26, 2025
1.1.2: Sep 11, 2025
1.1.1: Sep 11, 2025
1.1.0: Sep 10, 2025 (this version)
1.0.2: Sep 10, 2025
1.0.1: Aug 6, 2025
1.0.0: Jun 19, 2025
0.6.4: Jun 19, 2025
0.6.3: Jun 19, 2025
0.6.2: Jun 13, 2025
0.6.1: Jun 13, 2025
0.6.0: Jun 11, 2025
0.5.1: Jun 5, 2025
0.5.0: Jun 5, 2025
0.4.0: Jun 5, 2025
0.3.0: Jun 4, 2025
0.2.0: Jun 4, 2025
0.1.0: Jun 4, 2025

Download files

Source Distribution

autohpsearch-1.1.0.tar.gz (79.9 kB view details)

Uploaded Sep 10, 2025 Source

Built Distribution

autohpsearch-1.1.0-py3-none-any.whl (75.0 kB view details)

Uploaded Sep 10, 2025 Python 3

File details

Details for the file autohpsearch-1.1.0.tar.gz.

File metadata

Download URL: autohpsearch-1.1.0.tar.gz
Upload date: Sep 10, 2025
Size: 79.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for autohpsearch-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`9888b7db7cefdb3ba52cb5fd710e14af6cea35f19c79e7c11db6084d9450d300`
MD5	`45a5e48416c32bd1a61213b55b6d747b`
BLAKE2b-256	`a68c0bc83f4ea18a718ed07176d348536a4b1daf91415c50a89ba7f9917714b0`
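A downloaded archive can be checked against the SHA256 digest listed above with Python's standard hashlib module. This is a minimal sketch; the helper name sha256_of and the local file path are assumptions for illustration.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Digest listed above for autohpsearch-1.1.0.tar.gz
expected = "9888b7db7cefdb3ba52cb5fd710e14af6cea35f19c79e7c11db6084d9450d300"

# Assumed local path of the downloaded archive
archive = Path("autohpsearch-1.1.0.tar.gz")
if archive.exists():
    print("match" if sha256_of(archive) == expected else "MISMATCH")
else:
    print(f"{archive} not found; download it first")
```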

File details

Details for the file autohpsearch-1.1.0-py3-none-any.whl.

File metadata

Download URL: autohpsearch-1.1.0-py3-none-any.whl
Upload date: Sep 10, 2025
Size: 75.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for autohpsearch-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`614b8a8e73a8c510ca7eda18c08cecf299f468d5f80bdaacf231d179f89fa271`
MD5	`cc63a920052f6427608584888cce58c4`
BLAKE2b-256	`c7194027627d475b640f16605970696e5a521a74e2454bc3cbff1b8f24f17f0a`
