Experiment-Scheduler

Running the same experiments is boring, and Kubernetes is hard to install.
For anyone who needs an easy-to-use multi-experiment environment, here's experiment-scheduler.

About Experiment-Scheduler

Experiment-Scheduler is an open-source tool for automating repeated experiments.
In environments where k8s is not supported and all you have are ssh servers with python, repeatedly running the same experiments with different parameters is annoying and boring.
With minimal settings and minimal effort, we provide a distributed multi-experiment environment without affecting your existing server setup.
Our goal is to let you concentrate only on your experiments by providing an experiment tool that is easy and fast to set up.

Quick Start

Installation

pip3 install experiment-scheduler

Write your own experiment yaml

# sample.yaml
name: sample
tasks:
- cmd: torch train --lr 0.01
  condition:
    gpu: 1
  name: hpo_1
- cmd: torch train --lr 0.02
  condition:
    gpu: 1
  name: hpo_2

Run master and task manager

exs init_master
exs init_task_manager

Run experiment scheduler

exs execute -f sample.yaml

How to write your own yaml file

Currently, only a few reserved words are supported. All of them appear in the example below.

name : This is an experiment name
tasks : list of tasks
    - cmd : sh command you want to run
      name : task name
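
When a sweep has many parameter values, the yaml file can be generated instead of written by hand. The following is a minimal sketch, not part of experiment-scheduler itself: the function name build_experiment_yaml is hypothetical, and the output simply mirrors sample.yaml from Quick Start, including its condition and gpu keys.

```python
def build_experiment_yaml(name, learning_rates, gpus=1):
    """Return yaml text with one task per learning rate.

    Hypothetical helper: the layout copies sample.yaml from Quick Start.
    """
    lines = [f"name: {name}", "tasks:"]
    for index, lr in enumerate(learning_rates, start=1):
        lines.append(f"- cmd: torch train --lr {lr}")
        lines.append("  condition:")
        lines.append(f"    gpu: {gpus}")
        lines.append(f"  name: hpo_{index}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the yaml so it can be redirected into a file.
    print(build_experiment_yaml("sample", [0.01, 0.02]), end="")
```

Saved as, say, make_sample.py, it can be run as `python3 make_sample.py > sample.yaml` to reproduce the Quick Start file.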

exs [command] explanation

exs execute -f(--file) : Request experiments to run. Pass the yaml file describing the experiments with the -f(--file) argument.
exs delete -t(--task) : Delete a single task. Requires the -t(--task) argument.
exs list : List all experiments. To list a specific experiment, pass its experiment id with the -e(--experiment) argument. Id values are truncated by default; use the -v(--verbose) argument to show full values.
exs status : Get the status of a task. Requires the -t(--task) argument with a task id.
exs init_master : Run the master server. While the command runs, master server logs are printed continuously. To run it as a daemon, use the -d(--daemon) argument.
exs init_task_manager : Run a task manager server. If you have more than one server, run it on each of them. As with the master, task manager server logs are printed by default. To run it as a daemon, use the -d(--daemon) argument.
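
If you drive exs from a script, the commands above can be assembled as argument lists. This is only a sketch: exs_command is a hypothetical helper, it uses only the short flags listed above, and TASK_ID is a placeholder. It builds and prints the commands without running them.

```python
import shlex

# Short flags taken from the command list above.
VALUE_FLAGS = {"file": "-f", "task": "-t", "experiment": "-e"}
SWITCHES = {"verbose": "-v", "daemon": "-d"}


def exs_command(command, **options):
    """Build the argv list for one exs command (hypothetical helper)."""
    argv = ["exs", command]
    for key, value in options.items():
        if key in VALUE_FLAGS:
            argv += [VALUE_FLAGS[key], str(value)]
        elif key in SWITCHES:
            if value:
                argv.append(SWITCHES[key])
        else:
            raise ValueError(f"unknown option: {key}")
    return argv


if __name__ == "__main__":
    for argv in (
        exs_command("init_master", daemon=True),
        exs_command("init_task_manager", daemon=True),
        exs_command("execute", file="sample.yaml"),
        exs_command("list", verbose=True),
        exs_command("status", task="TASK_ID"),
    ):
        print(shlex.join(argv))
```

Each list can be passed to subprocess.run on a machine where exs is installed.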

How to set experiment_scheduler.cfg

Each server needs an address to communicate with the other servers. Default settings exist, but you can modify them. Currently, two elements are available:

master_address : "IP:port"
task_manager_address : ["IP:port", "IP:port", ...]

Experiment scheduler uses ConfigParser, so you should write [default] at the head of the file. task_manager_address should be wrapped in square brackets even if you use a single node.

Below is the default setting.

[default]
master_address = "localhost:50052"
task_manager_address = ["localhost:50051"]
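
To check a config file before starting the servers, it can be read with Python's standard configparser. This is a sketch under one assumption: the quoted values are decoded here with json.loads, which may differ from how experiment-scheduler parses them internally. The load_addresses name is hypothetical.

```python
import configparser
import json

DEFAULT_CFG = """\
[default]
master_address = "localhost:50052"
task_manager_address = ["localhost:50051"]
"""


def load_addresses(cfg_text):
    """Return (master_address, task_manager_addresses) from cfg text."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    section = parser["default"]
    # Assumption: the values are valid JSON (a quoted string and a list).
    master = json.loads(section["master_address"])
    task_managers = json.loads(section["task_manager_address"])
    if not isinstance(task_managers, list):
        raise ValueError("task_manager_address must be wrapped in square brackets")
    return master, task_managers


if __name__ == "__main__":
    master, task_managers = load_addresses(DEFAULT_CFG)
    print("master:", master)
    print("task managers:", task_managers)
```

A single-node value written without brackets, such as task_manager_address = "localhost:50051", raises ValueError in this sketch.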

Roadmap

What we plan to work on from v0.3 over the next few months:

Done

RUD on experiment (Start: Oct 3 2022, End: Dec 31 2022)
Register on PyPI (Start: Oct 3 2022, End: Dec 31 2022)
Comment all code for future docs (Start: Oct 3 2022, End: Dec 31 2022)
Detailed README.md (Feb 6 2023)

In Progress

Detailed --help command (Start: Feb 6 2023)
Set local DB for master (Start: Feb 6 2023)
Refined GPU selection algorithm (Start: Feb 6 2023)
Additional yaml file syntax (Start: Feb 6 2023)
Support multiple gRPC versions (Start: Feb 6 2023)
Single execution for exs execute (Start: Feb 6 2023)
Specify package version. (Start: Feb 6 2023)

To Do

Support multi node environment (v0.4)
Improve test code coverage (v0.4)
Detailed error log (v0.4)

Web Page for Experiment Tracking

Create a web page running on localhost to check the current status of the master and task managers.
Home, Logs, Status, and Experiments pages will be served.

Autotesting for further development and dockerization

Release history

1.1 : Jun 18, 2023 (this version)
1.0 : May 11, 2023
0.1 : Nov 6, 2022

Download files

Source Distribution : experiment-scheduler-1.0.tar.gz (3.9 kB), uploaded May 11, 2023
Built Distribution : experiment_scheduler-1.0-py3-none-any.whl (4.2 kB), uploaded May 11, 2023, Python 3

Hashes for experiment-scheduler-1.0.tar.gz

Algorithm	Hash digest
SHA256	`4c54c5e8620a782138cac1d95002afe89dad818f6b8a6a654ca7f7893d910b50`
MD5	`fe1719b3fd9a207f0a94fc15622da1d7`
BLAKE2b-256	`7ee123f9839db10e1916549e38eec0bedbc1ced6937b957587fd4556f90928bd`

Hashes for experiment_scheduler-1.0-py3-none-any.whl

Algorithm	Hash digest
SHA256	`c86e34c958107d2d407e57dc262ca9fb9efffbc28f7c3089afe7cf4a9e248e73`
MD5	`07c334f9ac3a14e2fd119cb888123ad1`
BLAKE2b-256	`db873ea7accee5547e21f305fbb8ea14d2dd3dfc8af54929cb0fe7beeca26827`