A streamlined MLflow orchestrator for hybrid LLM training and tuning

A Lightweight Orchestrator for Machine Learning Experiments
Running machine learning experiments often involves a series of steps, such as data processing, training, and evaluation. Managing the dependencies and parameters for each step can become complex. Flowkestra simplifies this process by allowing you to define your entire workflow in a single configuration file.
This approach makes it easier to run, reproduce, and track your experiments, whether you are doing initial exploration on your local machine or preparing for more complex workflows.
Core Features
- YAML-based Workflows: Define your experiment as a series of tasks in a simple config.yml file.
- Sequential Task Execution: Runs your Python scripts in the order you define them.
- MLflow Integration: Automatically logs your runs, parameters, and artifacts to an MLflow tracking server.
- Local Execution: Currently supports running experiments on your local machine.
Getting Started
1. Installation
pip install flowkestra
(Note: To install from a local checkout of the source instead, run pip install . in the project root.)
2. Create a Configuration File
Create a config.yml file to define your experiment. This file specifies the scripts to run, their inputs/outputs, and any parameters.
Here is an example for a local run:
# URI of the MLflow tracking server.
mlflow_uri: "http://localhost:5000"
# A descriptive name for your MLflow experiment.
experiment_name: "example_experiment"
# Define your experiment instances. Each instance represents a distinct run
# with its own configuration.
instances:
  - mode: local  # Currently, 'local' is the supported execution mode.
    # The working directory for the instance.
    workdir: "./test_data"
    # The target directory where the training run and its virtual environment will be created.
    target_workdir: "./local_train"
    # Path to a requirements.txt file for this instance's environment.
    requirements: "requirements_local.txt"
    # Define pipelines (e.g., 'train', 'evaluate') with their scripts and arguments.
    pipelines:
      train:
        script: "mlflow_example.py"  # The Python script to execute.
        args: ["--epoch", "30"]  # Optional arguments to pass to the script.
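The configuration refers to mlflow_example.py without showing it. As a minimal sketch of what such a script could look like (not the project's actual example), the following parses the --epoch argument supplied through args and logs a placeholder loss. The helper names parse_args and train are hypothetical. It also assumes MLflow is already pointed at the tracking server when the script starts; how Flowkestra hands mlflow_uri and experiment_name to the script is not described here.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments that config.yml passes through `args`."""
    parser = argparse.ArgumentParser(description="Toy training script")
    parser.add_argument("--epoch", type=int, default=10, help="number of epochs")
    return parser.parse_args(argv)


def train(epochs, log_metric):
    """Stand-in training loop that reports a made-up, shrinking loss."""
    loss = 1.0
    for step in range(epochs):
        loss *= 0.9  # placeholder for a real optimisation step
        log_metric("loss", loss, step)
    return loss


def main(argv=None):
    args = parse_args(argv)
    try:
        import mlflow  # imported lazily so the helpers work without MLflow
    except ImportError:
        # Fall back to printing so the sketch still runs without MLflow installed.
        train(args.epoch, lambda key, value, step: print(step, key, value))
        return
    # Assumption: the tracking URI and experiment are already configured.
    with mlflow.start_run():
        mlflow.log_param("epoch", args.epoch)
        train(args.epoch, lambda key, value, step: mlflow.log_metric(key, value, step=step))


if __name__ == "__main__":
    main()
```

Keeping the training loop separate from the logging call makes the script easy to test without a tracking server.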
3. Run Your Experiment
Execute your experiment using the Flowkestra CLI, pointing it to your configuration file.
flowkestra -f config.yml
Flowkestra will then run your defined tasks in order.
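To make "in order" concrete, here is a rough sketch of sequential execution over a pipelines mapping shaped like the one in config.yml. It is an illustration under assumptions, not Flowkestra's implementation: build_command and run_pipelines are hypothetical names, and stopping at the first failing script is an assumed policy.

```python
import subprocess
import sys


def build_command(pipeline, python=sys.executable):
    """Turn one pipeline entry into an argv list: interpreter, script, then args."""
    return [python, pipeline["script"], *pipeline.get("args", [])]


def run_pipelines(pipelines, cwd=None, runner=subprocess.run):
    """Run each pipeline in definition order and stop at the first failure.

    Returns a dict mapping pipeline name to exit code for every pipeline
    that was actually started.
    """
    results = {}
    for name, pipeline in pipelines.items():  # dicts keep insertion order
        completed = runner(build_command(pipeline), cwd=cwd)
        results[name] = completed.returncode
        if completed.returncode != 0:
            break  # assumed policy: later steps depend on earlier ones
    return results


# Mirrors the `pipelines` section of the example configuration.
pipelines = {
    "train": {"script": "mlflow_example.py", "args": ["--epoch", "30"]},
}
print(build_command(pipelines["train"]))
```

Calling run_pipelines(pipelines, cwd="./local_train") would launch the script with the current interpreter. The runner parameter exists only so the loop can be exercised without spawning processes.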
Potential Use Cases
- Organizing Experiments: Structure your ML code into reusable scripts and orchestrate them for different experiments.
- Reproducible Runs: Keep your configuration, parameters, and scripts together, ensuring that you can easily rerun an experiment.
- Basic ML Pipelines: Create simple, sequential pipelines for tasks like data preprocessing followed by model training.
Roadmap & Next Features
- Remote Training: Support for executing tasks on remote machines via SSH.
- Automated Parameter Tuning: Integrate with libraries to automate hyperparameter searches.
- Expanded Cloud Support: Add direct support for cloud environments.
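Until automated tuning is available, a parameter sweep can be approximated by generating one args list per parameter combination and placing each in its own instance. The sketch below shows only the expansion step. expand_grid is a hypothetical helper, and --lr is an invented flag used for illustration; only --epoch appears in the example configuration.

```python
from itertools import product


def expand_grid(grid):
    """Yield one args list per combination of the values in `grid`.

    Keys become `--name` flags and values are converted to strings, matching
    the flat list format used by `args` in config.yml.
    """
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        args = []
        for name, value in zip(names, values):
            args += [f"--{name}", str(value)]
        yield args


# Hypothetical grid: two epoch settings crossed with two learning rates.
grid = {"epoch": [10, 30], "lr": [0.1, 0.01]}
for args in expand_grid(grid):
    print(args)
```

Each generated list can be pasted into the args field of a separate instance, which keeps every run's parameters recorded in the configuration file.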
Notes
- This project is in its early stages. Contributions and feedback are welcome.