GridSearchReductor

⚠️ Disclaimer: This project was almost purely vibecoded with assistance from aider.chat. While it includes comprehensive pytest tests, the author doesn't have the mathematical expertise to verify that the Latin Hypercube Sampling implementation is mathematically sound. Use at your own discretion for production workloads.

A Python package for optimizing hyperparameter search using Latin Hypercube Sampling principles.

Inspired by NightHawkInLight's video on Taguchi arrays.

Do fewer experiments than grid search, but do the right ones using Latin Hypercube Sampling!

Why use GridSearchReductor?

This library is designed to work with scikit-learn's ParameterGrid. It accepts either a plain parameter dictionary or a ParameterGrid and returns a reduced set of combinations that you iterate over in place of the full grid.

When tuning machine learning models, the number of experiments in a traditional grid search grows exponentially with the number of parameters. GridSearchReductor reduces the number of experiments while still exploring the parameter space effectively.
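To make the growth concrete, here is a small standard-library sketch (not part of the package; `full_grid_size` is a hypothetical helper) that counts how many combinations an exhaustive grid needs when every parameter has five candidate values.

```python
import math


def full_grid_size(levels_per_param):
    """Number of combinations an exhaustive grid search must evaluate."""
    return math.prod(levels_per_param)


# Five candidate values for each of d hyperparameters gives 5**d combinations.
sizes = {d: full_grid_size([5] * d) for d in range(1, 7)}
for d, n in sizes.items():
    print(f"{d} parameters -> {n} combinations")
```

Each extra parameter multiplies the total by its number of candidate values, which is why even modest grids become expensive quickly.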

Instead of testing every possible combination of parameters (which can be computationally expensive), this package uses Latin Hypercube Sampling principles to:

  1. Reduce the number of experiments needed
  2. Maintain excellent coverage of the parameter space through stratified sampling
  3. Ensure each parameter dimension is sampled uniformly
  4. Provide better space-filling properties than random sampling
  5. Generate deterministic results by default - the same parameter grid will always produce the same reduced combinations

Getting started

Installation

  • From PyPI:
    • Via uv: uv pip install GridSearchReductor
    • Via pip: pip install GridSearchReductor
  • From GitHub:
    • Clone this repo then pip install .

Basic Usage

from sklearn.model_selection import ParameterGrid
from GridSearchReductor import GridSearchReductor

grid_converter = GridSearchReductor()

sample_grid = {
    'kernel': ['linear', 'rbf', 'poly'],
    'C': [0.1, 1, 10],
    'gamma': ['scale', 'auto'],
    'verbose': [True],  # also handles length 1 lists for fixed params
}

full_grid = ParameterGrid(sample_grid)

reduced_grid = grid_converter.fit_transform(sample_grid)
# Alternative way:
# reduced_grid = grid_converter.fit_transform(full_grid)

# Use the reduced grid in your experiments
for params in reduced_grid:
    # Your training/evaluation code here
    print(params)

For this grid, the full experiments list would contain 18 combinations (3×3×2×1). The reduced list is smaller while still covering every parameter dimension through Latin Hypercube Sampling.
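As a sanity check on the count of 18, the full grid can be enumerated with only the standard library. This is a stand-in for what ParameterGrid produces, not the package's own code. It also shows that a length-1 list such as 'verbose' adds a fixed value without multiplying the grid size.

```python
from itertools import product

sample_grid = {
    'kernel': ['linear', 'rbf', 'poly'],
    'C': [0.1, 1, 10],
    'gamma': ['scale', 'auto'],
    'verbose': [True],
}

# Cartesian product over all value lists, one dict per combination.
names = sorted(sample_grid)
full = [
    dict(zip(names, combo))
    for combo in product(*(sample_grid[name] for name in names))
]

print(len(full))
```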

Advanced Usage

Reproducible Results

GridSearchReductor is deterministic by default (using random_state=42). The same parameter grid will always produce the same reduced combinations.

# Default behavior - deterministic results
grid_converter = GridSearchReductor()
reduced_grid = grid_converter.fit_transform(sample_grid)

# Use a different random_state if needed
grid_converter = GridSearchReductor(random_state=123)
reduced_grid = grid_converter.fit_transform(sample_grid)

# Use global random state (non-deterministic)
grid_converter = GridSearchReductor(random_state=None)
reduced_grid = grid_converter.fit_transform(sample_grid)
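The effect of random_state can be illustrated without the package. The sketch below uses a hypothetical `pick_subset` helper built on the standard library's `random.Random`; it performs plain random selection rather than Latin Hypercube Sampling and only demonstrates the seeding behaviour. Passing `None` here draws a fresh seed from the operating system, which is an assumption that approximates the non-deterministic case described above.

```python
import random


def pick_subset(combos, n, random_state=42):
    """Hypothetical helper: pick n combinations, seeded unless random_state is None."""
    rng = random.Random(random_state)  # None -> seeded from system entropy
    return rng.sample(combos, n)


combos = list(range(18))  # stand-ins for the 18 parameter combinations

a = pick_subset(combos, 6)                      # default seed
b = pick_subset(combos, 6)                      # same seed, same result
c = pick_subset(combos, 6, random_state=123)    # different seed
d = pick_subset(combos, 6, random_state=None)   # may differ on every run

print(a == b)
```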

Verbose Logging

# Enable verbose logging to see the sampling process
grid_converter = GridSearchReductor(verbose=True)
reduced_grid = grid_converter.fit_transform(sample_grid)

How it works

The converter takes a parameter grid (similar to scikit-learn's ParameterGrid) and:

  1. Separates fixed parameters (single values) from variable parameters
  2. Determines the number of levels for each variable parameter
  3. Generates Latin Hypercube Samples in normalized [0,1] space
  4. Maps these samples to discrete parameter indices
  5. Creates a reduced set ensuring uniform coverage across all parameter dimensions
  6. Removes duplicate combinations and ensures the result is smaller than the full grid
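The steps above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the package's implementation: the function name `lhs_reduce` and the explicit `n_samples` argument are hypothetical, and how GridSearchReductor actually chooses the sample count is not described here.

```python
import random


def lhs_reduce(param_grid, n_samples, random_state=42):
    """Illustrative LHS-style reduction of a dict of value lists (hypothetical)."""
    rng = random.Random(random_state)

    # 1. Separate fixed parameters (single value) from variable ones.
    fixed = {k: v[0] for k, v in param_grid.items() if len(v) == 1}
    variable = {k: v for k, v in param_grid.items() if len(v) > 1}
    names = sorted(variable)

    # 2. Number of levels for each variable parameter.
    levels = {k: len(variable[k]) for k in names}

    # 3. Latin Hypercube Samples in [0, 1): per dimension, one point in each
    #    of n_samples equal-width strata, then shuffled independently.
    columns = {}
    for k in names:
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(points)
        columns[k] = points

    seen, reduced = set(), []
    for row in range(n_samples):
        # 4. Map each normalized sample to a discrete level index.
        idx = tuple(
            min(int(columns[k][row] * levels[k]), levels[k] - 1) for k in names
        )
        # 6. Skip duplicate combinations.
        if idx in seen:
            continue
        seen.add(idx)
        # 5. Build the combination and re-attach the fixed parameters.
        combo = {k: variable[k][i] for k, i in zip(names, idx)}
        combo.update(fixed)
        reduced.append(combo)
    return reduced


sample_grid = {
    'kernel': ['linear', 'rbf', 'poly'],
    'C': [0.1, 1, 10],
    'gamma': ['scale', 'auto'],
    'verbose': [True],
}
reduced = lhs_reduce(sample_grid, n_samples=6)
for params in reduced:
    print(params)
```

In this sketch the result is smaller than the full grid as long as `n_samples` is smaller than the full grid size, because de-duplication can only shrink the list further.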

Latin Hypercube Sampling Benefits

Latin Hypercube Sampling (LHS) provides superior space-filling properties compared to random sampling:

  • Stratified sampling: Each parameter dimension is divided into equally probable intervals
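  • Uniform coverage: Exactly one sample falls in each interval along every dimension, which prevents clustering within any single parameter
  • Better convergence: The parameter space is typically explored with fewer samples than plain random sampling needs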
  • Reproducible: When using a fixed random_state
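The stratification property is easy to check in one dimension. The sketch below (standard library only, with hypothetical helper names `lhs_1d` and `strata_hit`) draws ten Latin Hypercube points and ten plain random points, then counts how many of the ten equal-width intervals each set touches.

```python
import random


def lhs_1d(n, rng):
    """One point in each of n equal-width strata of [0, 1), in shuffled order."""
    points = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(points)
    return points


def strata_hit(points, n):
    """Set of stratum indices (0..n-1) that contain at least one point."""
    return {min(int(p * n), n - 1) for p in points}


rng = random.Random(42)
n = 10
lhs_points = lhs_1d(n, rng)
random_points = [rng.random() for _ in range(n)]

# LHS touches every stratum by construction; plain random sampling
# usually leaves some strata empty and puts several points in others.
print(len(strata_hit(lhs_points, n)))
print(len(strata_hit(random_points, n)))
```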

This approach is particularly useful when:

  • You have limited computational resources
  • You need comprehensive parameter space exploration with fewer experiments
  • You want better coverage than random search
  • You need reproducible hyperparameter optimization results

Dependencies

  • numpy
  • scikit-learn
  • joblib
