All-in-Python configuration and execution engine for R&D workloads.
Project description
KohakuEngine
All-in-Python Configuration and Execution Engine for R&D Workloads
Write configs as pure Python — no YAML, no JSON, often no boilerplate at all. KohakuEngine takes your existing Python scripts and runs them with different configurations without modifying your code: it imports the script, injects global variables from your config, finds your entrypoint, and calls it.
pip install kohaku-engine
A 30-second tour
Your script — unchanged:
# train.py
learning_rate = 0.001
batch_size = 32
epochs = 10
def train():
print(f"lr={learning_rate} bs={batch_size} epochs={epochs}")
if __name__ == "__main__":
train()
Your config — just variables, no boilerplate:
# config.py
learning_rate = 0.01
batch_size = 64
epochs = 5
Run it:
kogine run train.py --config config.py
# or:
kogine run train.py --set learning_rate=0.01 --set batch_size=64
That's it. No Config(...) construction, no def config_gen():, no imports
in the config file.
Why KohakuEngine
R&D workflows live in the gap between "throw global variables at it" and "set up a real config system." KohakuEngine embraces the global-variable style and gives you proper machinery on top:
- Python-first configs. Computed values, functions, closures — anything Python can express, your config can express.
- Non-invasive. Works with scripts you already have. No refactor required.
- Generators for sweeps. A
_sweepdict expands to a cartesian product; or write aconfig_gen()generator for irregular sweeps. - Parallel execution. Subprocess-isolated workers for true
__main__semantics across sweep iterations. - Config cells (Idea 7). Mark a region of your script as configuration — Kogine evaluates it once and freezes the values into the module, so expensive setup runs at most once per script run (fork-mode-safe).
Configs are just data
The minimum config file is just variables:
# config.py
learning_rate = 0.01
batch_size = 64
Want positional args, kwargs, or metadata? Use the reserved underscore names:
# config.py
learning_rate = 0.01
_args = []
_kwargs = {"resume_from": "ckpt.pt"}
_metadata = {"experiment": "baseline"}
Want to include a locally-defined function or class? Just define it — locally- defined callables are captured automatically:
# config.py
learning_rate = 0.01
def lr_schedule(epoch):
return learning_rate * 0.95**epoch
class ModelConfig:
hidden_size = 256
For imported callables/classes (which the loader can't disambiguate),
wrap with use():
# config.py
import math
from kohakuengine import use
loss_fn = use(math.sqrt)
Sweeps
Declarative grid (_sweep)
# sweep.py
epochs = 5
_sweep = {
"learning_rate": [0.001, 0.01, 0.1],
"batch_size": [32, 64],
}
# expands to 6 configs (3 x 2)
Paired (zip mode)
_sweep = {
"__mode__": "zip",
"learning_rate": [0.001, 0.01, 0.1],
"batch_size": [32, 64, 128],
}
# expands to 3 configs (paired)
From the CLI
kogine run train.py --sweep learning_rate=0.001,0.01,0.1 --sweep batch_size=32,64
Generators for arbitrary sweeps
# sweep.py
from kohakuengine import Config
def config_gen():
for lr in [0.001, 0.01, 0.1]:
yield Config(
globals_dict={"learning_rate": lr},
metadata={"lr": lr},
)
Entrypoint discovery (Idea 5)
Kogine tries, in priority order:
--entrypoint NAMECLI flagscript.py:funccolon syntax@kogine.entrypointdecorator (most explicit, least magic)if __name__ == "__main__":AST detectionmain()orrun()by convention
The decorator removes all ambiguity:
import kohakuengine as kogine
@kogine.entrypoint
def train():
...
Config cells (Idea 7)
Mark a region of your script as configuration; Kogine evaluates it once and freezes the resolved values into the module:
# train.py
import torch
from data import load_huge_index
# %% kogine:config
learning_rate = 0.001
batch_size = 32
index = load_huge_index() # expensive, runs at most once per script run
# %% kogine:script
def train():
loader = DataLoader(index, batch_size=batch_size, num_workers=4)
# Worker subprocesses that fork from this process inherit the
# already-evaluated state -- load_huge_index() is NOT re-run.
for batch in loader:
...
if __name__ == "__main__":
train()
The cell is rewritten in memory — no temp file, tracebacks still point
at the original train.py line numbers, debuggers still work.
Scope: "evaluate once" applies per script run. A sweep over N configs is N independent script runs (each evaluating its own cell once). Fork-mode worker children inherit. Spawn-mode children re-import from disk and will re-execute the cell — documented limitation.
CLI overview
# Single run
kogine run script.py --config config.py
kogine run script.py --set lr=0.01 --set bs=64
kogine run script.py --sweep lr=0.001,0.01,0.1
# Sequential / parallel workflows
kogine workflow sequential a.py b.py c.py --config c.py
kogine workflow parallel train.py --config sweep.py --workers 4
# Config utilities
kogine config validate config.py # syntax / shape check
kogine config show config.py # lowered Config form
kogine config check script.py --config c.py # diff vs. script defaults, typo hints
kogine config check is especially useful — it warns about overrides that
don't match any script default (typos like batch_sz vs batch_size).
Python API
from kohakuengine import Config, Script, Flow, run
# One-shot convenience
run("train.py", globals_dict={"learning_rate": 0.01})
# Build it up
config = Config.from_file("config.py")
script = Script("train.py", config=config)
script.run()
# Workflow
flow = Flow([Script("train.py", config=c) for c in configs], mode="parallel", max_workers=4)
flow.run()
Migration notes (v0.0.x → v0.2)
| Before | After |
|---|---|
def config_gen(): return Config.from_globals() |
Delete it. Bare files work. |
lr_scheduler = use(local_func) |
lr_scheduler = local_func (no wrapper) |
def config_gen(): for ...: yield Config(...) for grids |
_sweep = {"lr": [...], "bs": [...]} |
Config(args=..., kwargs=...) in config_gen |
_args = ..., _kwargs = ... at top level |
capture_globals() context manager |
Deprecated; emits warning. Removed in v0.3. |
Existing def config_gen() and CONFIG = ... patterns continue to work
unchanged. Migration is opt-in.
Documentation
Full documentation lives in docs/:
- Tutorials
- How-to guides
- Reference
- Explanation
- Project
Historical design notes are kept under
plans/ux-redesign-v0.2/ for reference.
License
Apache-2.0
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file kohaku_engine-0.3.0-py3-none-any.whl.
File metadata
- Download URL: kohaku_engine-0.3.0-py3-none-any.whl
- Upload date:
- Size: 41.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
4a1264141d8ebc6538ed6928255ab6063b57224acc84d521c1c7ab68290c73d7
|
|
| MD5 |
ae0636057fd4a7d1b021f006cca42cca
|
|
| BLAKE2b-256 |
096ad48795e8c4fc2aea8e39bf20a8eb6f796a80cec1ce8bdefbfd32d4df3a88
|