GridSearcher simplifies running grid searches for machine learning projects in Python, emphasizing parallel execution and GPU scheduling without dependencies on SLURM or other workload managers.

These details have not been verified by PyPI

Project links

Repository

Project description

GridSearcher 𖣯🔍

GridSearcher is a pure Python project designed to simplify the process of running grid searches for Machine Learning projects. It serves as a robust alternative to traditional bash scripts, providing a more flexible and user-friendly way to manage and execute multiple programs in parallel.

⚠️ It is designed for systems where users have direct SSH access to machines and can run their python scripts right away.

Features ✨󠁇󠁇󠁇

Grid Search Made Easy: Define parameter grids effortlessly and the cartesian product of your hyper-parameters will be computed automatically and an instance of your script will be run for all possible combinations.
Parallel Execution: Run multiple programs concurrently, maximizing your computational resources.
GPU Scheduling: Built-in GPU allocation ensures efficient use of available GPUs. Specify the number of GPUs and jobs per GPU, and GridSearcher will handle the rest
Flexible Configuration: Easily control the number of parallel jobs and GPU assignments through a scheduling dictionary.
Pure Python: No more dealing with complex bash scripts. GridSearcher is written entirely in Python, making it easy to integrate into your existing Python workflows.

Why GridSearcher? 🤔

User-Friendly: Simplifies the setup and execution of grid searches, allowing you to focus on your Machine Learning models.
Efficient Resource Management: Optimize the use of your GPUs and computational resources.
Pythonic Approach: Seamlessly integrates with your Python projects and leverages Python's rich ecosystem.
Direct SSH Access: Ideal for systems where users have direct SSH access to machines, providing a straightforward setup and execution process without the need for SLURM or other workload managers, ensuring a smooth and efficient operation.

Installation 🛠️

Install GridSearcher via pip:

pip install gridsearcher

How to use GridSearcher?

We provide a minimal working example in the file example.py. Just set debug=True with debug=False in the run method call to run on GPUs. The output of example.py is the following:

GridSearcher PID: 8940
command 1: python3 myscript.py --batch_size 128 --epochs 100 --lr_decay_at 82 123 --wandb_project cifar10-training --wandb_group cifar10_rn18_adamw_E=100_bs=128 --wandb_job_type lr=1e-2_wd=1e-2_beta1=0.9_beta2=0.999_eps=1e-8 --wandb_name seed=1_2024-06-19_23-04-23 --seed 1 --lr 1e-2 --wd 1e-2 --beta1 0.9 --beta2 0.999 --eps 1e-8 --root_folder ./results/cifar10-training/cifar10_rn18_adamw_E=100_bs=128/lr=1e-2_wd=1e-2_beta1=0.9_beta2=0.999_eps=1e-8/seed=1_2024-06-19_23-04-23
command 2: python3 myscript.py --batch_size 128 --epochs 100 --lr_decay_at 82 123 --wandb_project cifar10-training --wandb_group cifar10_rn18_adamw_E=100_bs=128 --wandb_job_type lr=1e-2_wd=1e-3_beta1=0.9_beta2=0.999_eps=1e-8 --wandb_name seed=1_2024-06-19_23-04-23 --seed 1 --lr 1e-2 --wd 1e-3 --beta1 0.9 --beta2 0.999 --eps 1e-8 --root_folder ./results/cifar10-training/cifar10_rn18_adamw_E=100_bs=128/lr=1e-2_wd=1e-3_beta1=0.9_beta2=0.999_eps=1e-8/seed=1_2024-06-19_23-04-23
command 3: python3 myscript.py --batch_size 128 --epochs 100 --lr_decay_at 82 123 --wandb_project cifar10-training --wandb_group cifar10_rn18_adamw_E=100_bs=128 --wandb_job_type lr=1e-3_wd=1e-2_beta1=0.9_beta2=0.999_eps=1e-8 --wandb_name seed=1_2024-06-19_23-04-23 --seed 1 --lr 1e-3 --wd 1e-2 --beta1 0.9 --beta2 0.999 --eps 1e-8 --root_folder ./results/cifar10-training/cifar10_rn18_adamw_E=100_bs=128/lr=1e-3_wd=1e-2_beta1=0.9_beta2=0.999_eps=1e-8/seed=1_2024-06-19_23-04-23
command 4: python3 myscript.py --batch_size 128 --epochs 100 --lr_decay_at 82 123 --wandb_project cifar10-training --wandb_group cifar10_rn18_adamw_E=100_bs=128 --wandb_job_type lr=1e-3_wd=1e-3_beta1=0.9_beta2=0.999_eps=1e-8 --wandb_name seed=1_2024-06-19_23-04-23 --seed 1 --lr 1e-3 --wd 1e-3 --beta1 0.9 --beta2 0.999 --eps 1e-8 --root_folder ./results/cifar10-training/cifar10_rn18_adamw_E=100_bs=128/lr=1e-3_wd=1e-3_beta1=0.9_beta2=0.999_eps=1e-8/seed=1_2024-06-19_23-04-23
command 5: python3 myscript.py --batch_size 128 --epochs 100 --lr_decay_at 82 123 --wandb_project cifar10-training --wandb_group cifar10_rn18_adamw_E=100_bs=128 --wandb_job_type lr=1e-2_wd=1e-2_beta1=0.9_beta2=0.999_eps=1e-8 --wandb_name seed=2_2024-06-19_23-04-23 --seed 2 --lr 1e-2 --wd 1e-2 --beta1 0.9 --beta2 0.999 --eps 1e-8 --root_folder ./results/cifar10-training/cifar10_rn18_adamw_E=100_bs=128/lr=1e-2_wd=1e-2_beta1=0.9_beta2=0.999_eps=1e-8/seed=2_2024-06-19_23-04-23
command 6: python3 myscript.py --batch_size 128 --epochs 100 --lr_decay_at 82 123 --wandb_project cifar10-training --wandb_group cifar10_rn18_adamw_E=100_bs=128 --wandb_job_type lr=1e-2_wd=1e-3_beta1=0.9_beta2=0.999_eps=1e-8 --wandb_name seed=2_2024-06-19_23-04-23 --seed 2 --lr 1e-2 --wd 1e-3 --beta1 0.9 --beta2 0.999 --eps 1e-8 --root_folder ./results/cifar10-training/cifar10_rn18_adamw_E=100_bs=128/lr=1e-2_wd=1e-3_beta1=0.9_beta2=0.999_eps=1e-8/seed=2_2024-06-19_23-04-23
command 7: python3 myscript.py --batch_size 128 --epochs 100 --lr_decay_at 82 123 --wandb_project cifar10-training --wandb_group cifar10_rn18_adamw_E=100_bs=128 --wandb_job_type lr=1e-3_wd=1e-2_beta1=0.9_beta2=0.999_eps=1e-8 --wandb_name seed=2_2024-06-19_23-04-23 --seed 2 --lr 1e-3 --wd 1e-2 --beta1 0.9 --beta2 0.999 --eps 1e-8 --root_folder ./results/cifar10-training/cifar10_rn18_adamw_E=100_bs=128/lr=1e-3_wd=1e-2_beta1=0.9_beta2=0.999_eps=1e-8/seed=2_2024-06-19_23-04-23
command 8: python3 myscript.py --batch_size 128 --epochs 100 --lr_decay_at 82 123 --wandb_project cifar10-training --wandb_group cifar10_rn18_adamw_E=100_bs=128 --wandb_job_type lr=1e-3_wd=1e-3_beta1=0.9_beta2=0.999_eps=1e-8 --wandb_name seed=2_2024-06-19_23-04-23 --seed 2 --lr 1e-3 --wd 1e-3 --beta1 0.9 --beta2 0.999 --eps 1e-8 --root_folder ./results/cifar10-training/cifar10_rn18_adamw_E=100_bs=128/lr=1e-3_wd=1e-3_beta1=0.9_beta2=0.999_eps=1e-8/seed=2_2024-06-19_23-04-23
command 9: python3 myscript.py --batch_size 128 --epochs 100 --lr_decay_at 82 123 --wandb_project cifar10-training --wandb_group cifar10_rn18_adamw_E=100_bs=128 --wandb_job_type lr=1e-2_wd=1e-2_beta1=0.9_beta2=0.999_eps=1e-8 --wandb_name seed=3_2024-06-19_23-04-23 --seed 3 --lr 1e-2 --wd 1e-2 --beta1 0.9 --beta2 0.999 --eps 1e-8 --root_folder ./results/cifar10-training/cifar10_rn18_adamw_E=100_bs=128/lr=1e-2_wd=1e-2_beta1=0.9_beta2=0.999_eps=1e-8/seed=3_2024-06-19_23-04-23
command 10: python3 myscript.py --batch_size 128 --epochs 100 --lr_decay_at 82 123 --wandb_project cifar10-training --wandb_group cifar10_rn18_adamw_E=100_bs=128 --wandb_job_type lr=1e-2_wd=1e-3_beta1=0.9_beta2=0.999_eps=1e-8 --wandb_name seed=3_2024-06-19_23-04-23 --seed 3 --lr 1e-2 --wd 1e-3 --beta1 0.9 --beta2 0.999 --eps 1e-8 --root_folder ./results/cifar10-training/cifar10_rn18_adamw_E=100_bs=128/lr=1e-2_wd=1e-3_beta1=0.9_beta2=0.999_eps=1e-8/seed=3_2024-06-19_23-04-23
command 11: python3 myscript.py --batch_size 128 --epochs 100 --lr_decay_at 82 123 --wandb_project cifar10-training --wandb_group cifar10_rn18_adamw_E=100_bs=128 --wandb_job_type lr=1e-3_wd=1e-2_beta1=0.9_beta2=0.999_eps=1e-8 --wandb_name seed=3_2024-06-19_23-04-23 --seed 3 --lr 1e-3 --wd 1e-2 --beta1 0.9 --beta2 0.999 --eps 1e-8 --root_folder ./results/cifar10-training/cifar10_rn18_adamw_E=100_bs=128/lr=1e-3_wd=1e-2_beta1=0.9_beta2=0.999_eps=1e-8/seed=3_2024-06-19_23-04-23
command 12: python3 myscript.py --batch_size 128 --epochs 100 --lr_decay_at 82 123 --wandb_project cifar10-training --wandb_group cifar10_rn18_adamw_E=100_bs=128 --wandb_job_type lr=1e-3_wd=1e-3_beta1=0.9_beta2=0.999_eps=1e-8 --wandb_name seed=3_2024-06-19_23-04-23 --seed 3 --lr 1e-3 --wd 1e-3 --beta1 0.9 --beta2 0.999 --eps 1e-8 --root_folder ./results/cifar10-training/cifar10_rn18_adamw_E=100_bs=128/lr=1e-3_wd=1e-3_beta1=0.9_beta2=0.999_eps=1e-8/seed=3_2024-06-19_23-04-23

SBATCH wrapper for SLURM (NEW in version 1.0.4)

We also added a wrapper for SBATCH that allows running SLURM jobs directly from Python!

SBATCH(
    script='script.sh', # the shell script that will be launched using sbatch
    env_vars=dict( # these variables will be set in the --export argument of sbatch and will be available in the h100-eval.sh script
        root=ROOT,
        wandb_project=wandb_project,
        wandb_group=wandb_group,
        wandb_job_type=wandb_job_type,
        wandb_name=wandb_name,
        task='imagenet',
        batch_size=128,
    ),
    sbatch_args={ # the variables below will be added to the sbatch command (e.g. --job_name)
        'exclude': 'machine-1,machine-2',
        'job-name': f'job-name',
        'error': 'slurm_output/%j-%x.err',
        'output': 'slurm_output/%j-%x.out',
        'ntasks': 1,
        'cpus-per-task': 20,
        'time': '1-00:00:00',
        'mem': '200G',
        'partition': 'gpu',
        'gres': 'gpu:H100:1',
    }
).run()

Contribute 🤝

We welcome contributions! If you have suggestions for new features or improvements, feel free to open an issue or submit a pull request.

Versions history:

1.1.6 @ 2026-06-30:
- added function wait_for_resources() at the beginning of the run function. The user can now wait for certain GPUs to be free or delay the start of the script. Detailed example of the arguments:
  - --wait_pids 123 456 789 - wait for the processes with the specified PIDs to finish before running GridSearcher
  - --wait_secs 10 - wait 10 seconds before running GridSearcher
  - --wait_current - waits for the current jobs to finish (automatically detects the PIDs of current jobs)
1.1.5 @ 2026-03-17:
- fixed typo when distributed_training=False in waiting_worker
1.1.4 @ 2025-11-14:
- return commands when debug=True, which is useful to combine with SBATCH on a cluster that can exclusively be used via SLURM
1.1.3 @ 2025-11-13:
- introduced SchedulingConfig and TorchRunConfig to control rdzv parameters and MASTER_ADDR/PORT for torchrun
1.1.2 @ 2025-01-23:
- added create_state_finished parameter to control whether the process writes the file state.finished when returns with error code 0 or not
1.1.1: fixed import issues
1.1.0: removed specific arguments and replaced them with dictionaries to offer flexibility to use any SBATCH params
1.0.4: added SBATCH class, which can be used in a completely separated manner from GridSearcher, allowing running slurm jobs from python
1.0.3: do not check whether the script ends with .py extension anymore
1.0.2: checking the return code of os.system and create file state.finished only if code == 0
1.0.1: added assert statement to make sure that all values in the scheduling["params_values"] are of type list
1.0.0: added initial project

Project details

These details have not been verified by PyPI

Project links

Repository

Release history Release notifications | RSS feed

This version

1.1.6

Jun 30, 2026

1.1.5

Mar 17, 2026

1.1.4

Nov 14, 2025

1.1.3

Nov 13, 2025

1.1.2

Jan 23, 2025

1.1.1

Oct 27, 2024

1.1.0

Oct 27, 2024

1.0.4

Jul 22, 2024

1.0.3

Jul 20, 2024

1.0.2

Jun 21, 2024

1.0.1

Jun 21, 2024

1.0.0

Jun 19, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gridsearcher-1.1.6.tar.gz (25.8 kB view details)

Uploaded Jun 30, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

gridsearcher-1.1.6-py3-none-any.whl (24.1 kB view details)

Uploaded Jun 30, 2026 Python 3

File details

Details for the file gridsearcher-1.1.6.tar.gz.

File metadata

Download URL: gridsearcher-1.1.6.tar.gz
Upload date: Jun 30, 2026
Size: 25.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for gridsearcher-1.1.6.tar.gz
Algorithm	Hash digest
SHA256	`eb5b2c4e469d38cb80686b2a4516d809d772934cdb09853796de1881e8c3565c`
MD5	`65d39af985f7bb3ceba7ee2fa663556f`
BLAKE2b-256	`36678b3ea8eef57d65928b6c91e0560190bb87e172e693c581e20030d85d490c`

See more details on using hashes here.

File details

Details for the file gridsearcher-1.1.6-py3-none-any.whl.

File metadata

Download URL: gridsearcher-1.1.6-py3-none-any.whl
Upload date: Jun 30, 2026
Size: 24.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for gridsearcher-1.1.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c0b866174f19bff979597ee7b1c589a4ef98c10e940340c320cfa5ba0276bc10`
MD5	`bf3080b1c5413128710fa4bcf50931ad`
BLAKE2b-256	`1954b61ad0f9c5986fdf9d1a01e64a476bf9ea2a8774cab8d44f429814961964`

See more details on using hashes here.

gridsearcher 1.1.6

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

GridSearcher 𖣯🔍

Features ✨󠁇󠁇󠁇

Why GridSearcher? 🤔

Installation 🛠️

How to use GridSearcher?

SBATCH wrapper for SLURM (NEW in version 1.0.4)

Contribute 🤝

Versions history:

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes