🐇 Pyoco

A workflow engine with sugar syntax

pyoco is a minimal, pure-Python DAG engine for defining and running simple task-based workflows.

Repository Move

Pyoco's canonical Git remote has moved to https://github.com/hachiware-labs/pyoco.git. Development for 0.8.0 and later continues on main there.

If you have an older clone, update origin with:

git remote set-url origin https://github.com/hachiware-labs/pyoco.git

✨ Why It Feels Easy

  • Try it in minutes: a tiny local workflow is enough to get your first success.
  • 🧩 Grow without changing tools: when your flow becomes reusable, move to plug-ins + tasks.<local_name>.use.
  • 🪶 Stay lightweight: no scheduler cluster, no metadata DB, no “platform first” setup.

Pyoco is intentionally much smaller than full-scale workflow engines like Airflow. It is built for local development, single-machine execution, and “I want to run this now” workflows.

🚦 Pick Your Route

  • Fastest first success: write one tiny task and run it locally. Great for learning or debugging an idea.
  • Recommended project route: package reusable tasks as entry point plug-ins, then bind them in flow.yaml with tasks.<local_name>.use.

If you are new to Pyoco, do the quick win first. If you are building something you want to keep, learn the plug-in route right after.

✨ Features

  • Pure Python: No external services or heavy dependencies required.
  • Minimal DAG model: Tasks and dependencies are defined directly in code.
  • Task-oriented: Focus on "small workflows" that should be easy to read and maintain.
  • Graph DSL controls: >> pipelines, node_name: task_ref aliases, and pipe/switch/repeat/foreach/until for branching, reuse, and loops in flow.yaml.
  • Friendly trace logs: Runs can be traced step by step from the terminal with cute (or plain) logs.
  • Parallel Execution: Automatically runs independent tasks in parallel.
  • Artifact Management: Easily save and manage task outputs and files.
  • Observability: Track execution with unique Run IDs and detailed state transitions.
  • Control: Cancel running workflows gracefully with Ctrl+C.

📦 Installation

pip install pyoco

🚀 Quick Win: Run Something in 60 Seconds

This is the shortest possible hello. It keeps everything in one file so you can feel the engine immediately.

from pyoco import task
from pyoco.core.models import Flow
from pyoco.core.engine import Engine

@task
def fetch_data(ctx):
    print("🐰 Fetching data...")
    return {"id": 1, "value": "carrot"}

@task
def process_data(ctx, data):
    print(f"🥕 Processing: {data['value']}")
    return data['value'].upper()

@task
def save_result(ctx, result):
    print(f"✨ Saved: {result}")

# Define the flow
flow = Flow(name="hello_pyoco")
flow >> fetch_data >> process_data >> save_result

# Wire inputs (explicitly for this example)
process_data.task.inputs = {"data": "$node.fetch_data.output"}
save_result.task.inputs = {"result": "$node.process_data.output"}

if __name__ == "__main__":
    engine = Engine()
    engine.run(flow)

Run it:

python examples/hello_pyoco.py

Output:

🐇 pyoco > start flow=hello_pyoco
🏃 start node=fetch_data
🐰 Fetching data...
✅ done node=fetch_data (0.30 ms)
🏃 start node=process_data
🥕 Processing: carrot
✅ done node=process_data (0.23 ms)
🏃 start node=save_result
✨ Saved: CARROT
✅ done node=save_result (0.30 ms)
🥕 done flow=hello_pyoco

See examples/hello_pyoco.py for the full code.

🧭 Build It the Recommended Way

When a task should be reused, shared, or documented, prefer this shape:

  1. Publish a Task subclass from a plug-in package.
  2. Give it a stable public name such as vision/image_classify.
  3. Bind that public name to a local workflow name with tasks.<local_name>.use.

That is the model Pyoco now treats as the default for real projects.
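
As a sketch, the packaging side of this route is a standard Python entry point. The group name pyoco.tasks comes from the Task Discovery section below; the package name, module path, and class here are hypothetical, and the exact registration contract is in docs/plugins.md:

```toml
[project]
name = "acme-pyoco-vision"      # hypothetical plug-in package
version = "0.1.0"

# pyoco auto-loads entry points in this group (see Task Discovery below).
# Assumption: pyoco imports the listed module, and the Task subclasses
# inside it register public names such as "vision/image_classify".
[project.entry-points."pyoco.tasks"]
vision = "acme_vision.tasks"
```

After installing the package, a flow binds the public name locally with tasks.<local_name>.use, and pyoco plugins list should show the registration.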

🧾 flow.yaml Graph DSL

This is the model to learn once you move past a one-file experiment. flow.yaml keeps the graph readable, and plug-in task names keep reuse clean.

For production-style task sharing, prefer entry point plug-ins that register Task subclasses and bind them in flow.yaml via tasks.<local_name>.use. Keep tasks.<name>.callable as an explicit local override or migration path.

version: 1

tasks:
  prepare:
    use: "demo/prepare"
  choose_mode:
    use: "demo/choose_mode"
  run_batch:
    use: "demo/run_batch"
  process_item:
    use: "demo/process_item"
  poll_status:
    use: "demo/poll_status"
  finish:
    use: "demo/finish"

flow:
  defaults:
    mode: "batch"
    items: ["A", "B", "C"]
    done: false
  graph: |
    prepare
    >> choose_mode
    >> switch(on={{mode}}){
      batch: first_batch: run_batch >> second_batch: run_batch;
      default: run_batch;
    }
    >> foreach(over={{items}}, item=it, index=idx){ process_item }
    >> until(cond={{params.done}}, max_iter=5){ poll_status }
    >> finish

DSL elements used above:

  • >>: sequential dependency
  • node_name: task_ref: reuse one task definition with a distinct runtime node name
  • tasks.<local_name>.use: bind a registered public task name such as demo/run_batch to a local graph name
  • pipe(NAME): inline expansion from top-level pipes
  • switch(on=...){ ... }: single-branch selection
  • repeat / foreach / until: control loops

Want a gentle walkthrough instead of reading specs? Start with docs/tutorial/index.md.

🏗️ Architecture

Pyoco is designed with a simple flow:

+-----------+        +------------------+        +-----------------+
| User Code |  --->  | pyoco.core.Flow  |  --->  | trace/logger    |
| (Tasks)   |        | (Engine)         |        | (Console/File)  |
+-----------+        +------------------+        +-----------------+
  1. User Code: You define tasks and workflows using Python decorators.
  2. Core Engine: The engine resolves dependencies and executes tasks (in parallel where possible).
  3. Trace: Execution events are sent to the trace backend for logging (cute or plain).

🎭 Modes

Pyoco has two output modes:

  • Cute Mode (Default): Uses emojis and friendly messages. Best for local development and learning.
  • Non-Cute Mode: Plain text logs. Best for CI/CD and production monitoring.

You can switch modes using an environment variable:

export PYOCO_CUTE=0  # Disable cute mode

Or via CLI flag:

pyoco run --non-cute ...

🔭 Observability / Server (Archived)

Observability and server-related docs are archived and out of scope for the current release.
See docs/archive/observability.md and docs/archive/roadmap.md.

🌐 Distributed Execution with pyoco-server

pyoco focuses on local/single-machine workflow execution.
For distributed workers, queueing, and remote run management, use pyoco-server.

  • The practical win of the plug-in model is distribution: packaged task sets can travel as wheels instead of ad-hoc source copies.
  • pyoco-server provides the worker/server side for that model, so reusable task packages fit naturally when you want to fan out execution beyond one machine.
  • Repository: https://github.com/kitfactory/pyoco-server
  • Detailed setup, operations, and compatibility are documented in pyoco-server.

🧩 Plug-ins

Need to share domain-specific tasks? Publish an entry point under pyoco.tasks and pyoco will auto-load it; this is the recommended default path. Register Task subclasses (plain callables still work, but emit warnings), give them stable public names like vision/image_classify, and bind them with tasks.<local_name>.use in flow.yaml. See docs/plugins.md for a quickstart and examples, plus the pyoco plugins list and pyoco plugins lint commands.

Another reason this path matters: once tasks live in a package, they are much easier to distribute to pyoco-server workers as versioned plug-ins.

Big data note: pass handles, not copies. For large tensors/images, stash paths or handles in ctx.artifacts/ctx.scratch and let downstream tasks materialize only when needed. For lazy pipelines (e.g., DataPipe), log the pipeline when you actually iterate (typically the training task) instead of materializing upstream.
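
The handle-passing idea can be sketched without any pyoco specifics. Here Ctx is a stand-in for the task context, and the dict-like artifacts attribute mirrors how ctx.artifacts is described above; the real ctx API may differ:

```python
import os
import tempfile
from dataclasses import dataclass, field
from pathlib import Path

@dataclass
class Ctx:
    """Stand-in for a pyoco task context; the real ctx API may differ."""
    artifacts: dict = field(default_factory=dict)

def produce_features(ctx: Ctx) -> str:
    """Write a large payload to disk and stash only its path (a handle)."""
    fd, name = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    Path(name).write_bytes(b"\x00" * 1024)  # pretend this is a large tensor
    ctx.artifacts["features"] = name        # handle, not the bytes themselves
    return "features"

def train(ctx: Ctx, key: str) -> int:
    """Materialize the payload only where it is actually consumed."""
    data = Path(ctx.artifacts[key]).read_bytes()
    return len(data)

ctx = Ctx()
key = produce_features(ctx)
print(train(ctx, key))  # prints 1024
```

The same shape works for lazy pipelines: stash the pipeline object or its spec, and iterate only inside the task that trains.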

🧭 Task Discovery (Security)

To reduce the risk of importing unexpected code, Pyoco does not allow configuring the discovery scope in flow.yaml (the discovery: key is rejected).

  • Entry point plug-ins: auto-loaded from importlib.metadata.entry_points(group="pyoco.tasks")
  • Extra imports (ops-controlled): set PYOCO_DISCOVERY_MODULES (comma/space-separated module names), e.g. PYOCO_DISCOVERY_MODULES=tasks,myapp.extra_tasks
  • Flow-local bindings: prefer tasks.<local_name>.use: "namespace/task_name" for registered plug-in tasks
  • Explicit callables: keep tasks.<name>.callable for local overrides or small ad-hoc flows

📚 Documentation

💖 Contributing

We love contributions! Please feel free to submit a Pull Request.


Made with 🥕 by the Pyoco Team.

