barca

The invisible asset orchestrator.
Rust plans it. Python runs it. You just write functions.


Barca is an asset orchestrator that adds almost no perceptible overhead to your Python pipelines: about 38 ms end to end for a trivial pipeline in the benchmarks below. A compiled Rust binary handles parsing, DAG construction, and execution planning. Python does what it's best at: running your code.

# pipeline.py
from barca import asset

@asset()
def raw_data() -> list[dict]:
    return [{"x": 1}, {"x": 2}, {"x": 3}]

@asset(inputs={"data": raw_data})
def summary(data: list[dict]) -> dict:
    return {"count": len(data), "total": sum(d["x"] for d in data)}

$ barca run pipeline.py
{"elapsed_seconds":0.042,"steps_executed":2,"phases":2,"final_output":{"count":3,"total":6}}

No config files. No YAML. No daemon. Just functions and a fast binary.

Install

pip install barca

This gives you:

  • The barca CLI binary (compiled Rust)
  • Python decorator stubs for @asset, @sensor, @effect (IDE autocomplete + type checking)
  • The execution worker (barca._worker)

All in one wheel, built with maturin.

From source

git clone https://github.com/recursia-io/barca.git
cd barca
cargo build --release
maturin develop --release    # installs into current .venv

Quick start

# assets.py
from barca import asset

@asset()
def hello() -> dict:
    return {"message": "Hello from barca!"}

barca run assets.py

That's it. Barca parses your Python source with ruff's AST parser (no import, pure static analysis), builds a dependency graph, generates a phased execution plan, spawns Python workers, and persists results to a local SQLite database -- all in under 40ms for a trivial asset.

How it works

                    ┌─────────────────────────────────────┐
                    │          barca run pipeline.py       │
                    └──────────────┬──────────────────────┘
                                   │
                    ┌──────────────▼──────────────────────┐
                    │         Rust binary (barca)          │
                    │                                      │
                    │  1. Parse Python source (ruff AST)   │
                    │  2. Build DAG (petgraph)              │
                    │  3. Generate execution plan           │
                    │  4. Initialize DB (.barca/metadata.db)│
                    │  5. Spawn Python workers per phase    │
                    │  6. Collect outputs, persist to DB    │
                    └──────────────┬──────────────────────┘
                                   │
                    ┌──────────────▼──────────────────────┐
                    │      Python worker (per phase)       │
                    │                                      │
                    │  - Loads modules via importlib        │
                    │  - Executes steps in tier order       │
                    │  - LRU cache for in-process results   │
                    │  - Emits JSON lines to stdout         │
                    └─────────────────────────────────────┘
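
To make the worker box above more concrete, here is a minimal sketch of a loop that runs steps in order, keeps results in memory, and emits one JSON line per step. It is not barca's actual `_worker.py`: the `run_steps` helper, the `(name, function, inputs)` step tuple, and the `step`/`output` event fields are all assumptions made for illustration, and a plain dict stands in for the LRU cache.

```python
import io
import json
import sys
from typing import Callable, TextIO

# Hypothetical step shape: (step name, function, {parameter: upstream step name}).
Step = tuple[str, Callable, dict[str, str]]


def run_steps(steps: list[Step], out: TextIO = sys.stdout) -> dict:
    """Run steps in the given order and emit one JSON line per finished step."""
    results: dict = {}  # in-process results (a plain dict, not a real LRU cache)
    for name, fn, inputs in steps:
        # Resolve each parameter from the result of its upstream step.
        kwargs = {param: results[upstream] for param, upstream in inputs.items()}
        results[name] = fn(**kwargs)
        # Assumed event format; the real worker's fields may differ.
        out.write(json.dumps({"step": name, "output": results[name]}) + "\n")
    return results


def raw_data() -> list[dict]:
    return [{"x": 1}, {"x": 2}, {"x": 3}]


def summary(data: list[dict]) -> dict:
    return {"count": len(data), "total": sum(d["x"] for d in data)}


buffer = io.StringIO()
results = run_steps(
    [("raw_data", raw_data, {}), ("summary", summary, {"data": "raw_data"})],
    out=buffer,
)
events = [json.loads(line) for line in buffer.getvalue().splitlines()]
```

Writing to a passed-in stream rather than directly to stdout keeps the sketch easy to test; a parent process would read the same lines from the worker's stdout.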

Key design decisions:

  • Static analysis only -- Rust never imports your Python code. It parses source text and extracts decorator metadata from the AST.
  • Phased execution -- The planner decomposes the DAG into sequential phases. Within each phase, independent streams run in parallel workers.
  • No framework lock-in -- Decorators are identity functions, so decorated functions stay plain Python callables that you can call and test directly.
  • Single binary -- One pip install gives you everything. No JVM, no Docker, no scheduler service.
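
The static-analysis idea can be illustrated in Python itself. The sketch below uses the standard-library `ast` module (not ruff's parser, which barca uses from Rust) to read `@asset(...)` decorators and their `inputs=` mapping from source text without ever importing or executing it. The `extract_assets` helper is hypothetical and only handles the call form of the decorator shown in this README.

```python
import ast

SOURCE = '''
from barca import asset

@asset()
def raw_data() -> list[dict]:
    return [{"x": 1}, {"x": 2}, {"x": 3}]

@asset(inputs={"data": raw_data})
def summary(data: list[dict]) -> dict:
    return {"count": len(data), "total": sum(d["x"] for d in data)}
'''


def extract_assets(source: str) -> dict[str, list[str]]:
    """Map each @asset function name to the upstream names in its inputs= dict."""
    assets: dict[str, list[str]] = {}
    for node in ast.parse(source).body:  # parses text only; nothing is executed
        if not isinstance(node, ast.FunctionDef):
            continue
        for dec in node.decorator_list:
            is_asset_call = (
                isinstance(dec, ast.Call)
                and isinstance(dec.func, ast.Name)
                and dec.func.id == "asset"
            )
            if not is_asset_call:
                continue
            upstream: list[str] = []
            for kw in dec.keywords:
                if kw.arg == "inputs" and isinstance(kw.value, ast.Dict):
                    # Each dict value is a bare name referring to another function.
                    upstream = [v.id for v in kw.value.values if isinstance(v, ast.Name)]
            assets[node.name] = upstream
    return assets


graph = extract_assets(SOURCE)
print(graph)
```

Because the source is only parsed, the `from barca import asset` line inside the string never runs, which mirrors the claim that the planner does not need to import user code.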

Decorators

from barca import asset, sensor, effect, sink, unsafe
from barca import Always, Manual, Schedule
from barca import partitions, partitions_from, collect, asset_ref

@asset

Cached computation node. The workhorse.

@asset()
def prices() -> dict:
    return {"AAPL": 150, "MSFT": 380}

@asset(inputs={"data": prices})
def report(data: dict) -> str:
    return f"Tracked {len(data)} tickers"
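
Since the package describes its decorators as no-op identity stubs, the behavior can be approximated with a few lines of plain Python. This is a sketch under that assumption, not the code shipped in `python/barca/__init__.py`; the local `asset` below accepts only `inputs`, whereas the real stub may accept more parameters.

```python
from typing import Callable, Optional, TypeVar

F = TypeVar("F", bound=Callable)


def asset(inputs: Optional[dict] = None) -> Callable[[F], F]:
    """Stand-in decorator factory: records nothing and returns the function unchanged."""

    def decorate(fn: F) -> F:
        return fn

    return decorate


@asset()
def prices() -> dict:
    return {"AAPL": 150, "MSFT": 380}


@asset(inputs={"data": prices})
def report(data: dict) -> str:
    return f"Tracked {len(data)} tickers"


# The decorated functions are still ordinary callables.
message = report(prices())
```

With an identity decorator, the dependency information lives only in the source text, which is why a static parser can recover it without running anything.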

@sensor

Observes external state. Returns (update_detected, output).

from pathlib import Path

@sensor()
def inbox_files() -> tuple[bool, list[str]]:
    files = list(Path("inbox").glob("*.csv"))
    return bool(files), [str(f) for f in files]
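
The `(update_detected, output)` contract is easy to exercise outside barca. The sketch below is a variant of the sensor above that takes the inbox directory as a parameter (an assumption made purely so it can run against a temporary directory) and shows the two possible outcomes.

```python
import tempfile
from pathlib import Path


def inbox_files(inbox: Path) -> tuple[bool, list[str]]:
    """Return (update_detected, output) for CSV files found in `inbox`."""
    files = sorted(inbox.glob("*.csv"))
    return bool(files), [str(f) for f in files]


with tempfile.TemporaryDirectory() as tmp:
    inbox = Path(tmp)
    # Empty directory: nothing detected, empty output.
    empty_result = inbox_files(inbox)
    # After a file arrives, the sensor reports an update and lists it.
    (inbox / "orders.csv").write_text("id,total\n1,9.99\n")
    updated, paths = inbox_files(inbox)
    names = [Path(p).name for p in paths]
```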

@effect

Side-effect leaf node. Never cached, can't be used as input.

@effect(inputs={"report": report})
def publish(report: str) -> None:
    print(f"Publishing: {report}")

@sink

Stacks on @asset to write outputs to files.

@asset()
@sink("output/data.json", serializer="json")
def my_data() -> dict:
    return {"rows": 42}

Freshness markers

Marker                  Behavior
Always                  Auto-materializes whenever stale (default for @effect)
Manual                  Only runs on explicit refresh
Schedule("0 5 * * *")   Cron expression

Partitions

Fan a single asset definition into N independent materializations:

# get_price is your own lookup function; it is not provided by barca.
@asset(partitions={"ticker": partitions(["AAPL", "MSFT", "GOOG"])})
def prices(ticker: str) -> dict:
    return {"ticker": ticker, "price": get_price(ticker)}

Function                 Purpose
partitions(values)       Static list of partition keys
partitions_from(source)  Derive partitions from upstream asset
collect(asset_fn)        Aggregate all partitions of an upstream
asset_ref(ref_string)    Canonical asset reference
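
Conceptually, a partitioned asset is one function called once per partition key. The sketch below imitates that fan-out in plain Python without barca: `fan_out` is a hypothetical helper, `get_price` is a stand-in lookup, and the GOOG price is a placeholder value chosen for the example.

```python
def get_price(ticker: str) -> int:
    """Stand-in price lookup with fixed example values."""
    return {"AAPL": 150, "MSFT": 380, "GOOG": 140}[ticker]


def prices(ticker: str) -> dict:
    return {"ticker": ticker, "price": get_price(ticker)}


def fan_out(fn, param: str, values: list[str]) -> dict:
    """Call `fn` once per partition key and index the results by key."""
    return {value: fn(**{param: value}) for value in values}


# One independent materialization per ticker.
materialized = fan_out(prices, "ticker", ["AAPL", "MSFT", "GOOG"])

# Roughly what a downstream asset would receive when aggregating all partitions.
collected = list(materialized.values())
```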

CLI

barca run <file.py> [file.py ...]     Parse, plan, and execute
barca plan <file.py> [file.py ...]    Emit execution plan as JSON
barca --help                          Show help

barca plan -- inspect without running

$ barca plan pipeline.py
{
  "total_steps": 2,
  "phases": [
    { "reason": "Independent", "streams": [{"stream_id": 0, "steps": ["raw_data"]}] },
    { "reason": "Dependent",   "streams": [{"stream_id": 1, "steps": ["summary"]}] }
  ]
}
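
The phase grouping in a plan like this can be approximated by layering the DAG: every node whose dependencies are already done goes into the next phase. The sketch below does that with the standard-library `graphlib` module. It is an illustration of the idea only; barca's planner is written in Rust and also splits phases into streams, which this hypothetical `plan_phases` helper does not attempt.

```python
from graphlib import TopologicalSorter


def plan_phases(deps: dict[str, list[str]]) -> list[list[str]]:
    """Group nodes into sequential phases; `deps` maps node -> upstream nodes."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    phases: list[list[str]] = []
    while sorter.is_active():
        # Everything returned here has all of its upstream nodes finished.
        ready = sorted(sorter.get_ready())
        phases.append(ready)
        sorter.done(*ready)
    return phases


pipeline_phases = plan_phases({"raw_data": [], "summary": ["raw_data"]})

# A small diamond: b and c depend on a, d depends on both.
diamond_phases = plan_phases({"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]})
```

Nodes within one phase have no dependencies on each other, which is what makes them candidates for parallel workers.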

barca run -- execute the full plan

This parses the source, builds the DAG, spawns workers, collects outputs, and persists results to .barca/metadata.db.

Output is a JSON summary:

{
  "elapsed_seconds": 0.042,
  "steps_executed": 2,
  "phases": 2,
  "final_output": {"count": 3, "total": 6}
}
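
Because the summary is plain JSON on stdout, it is straightforward to consume from a script. The sketch below parses the sample summary shown in this README; the `run_barca` wrapper is a hypothetical helper that shells out to `barca run` and is defined but not called here, since it needs the CLI installed.

```python
import json
import subprocess


def run_barca(path: str) -> dict:
    """Hypothetical wrapper: run `barca run <path>` and parse the JSON summary on stdout."""
    completed = subprocess.run(
        ["barca", "run", path], capture_output=True, text=True, check=True
    )
    return json.loads(completed.stdout)


# Sample summary copied from the README, so this part runs without barca.
sample = '{"elapsed_seconds":0.042,"steps_executed":2,"phases":2,"final_output":{"count":3,"total":6}}'
summary = json.loads(sample)

elapsed_ms = round(summary["elapsed_seconds"] * 1000)
total = summary["final_output"]["total"]
```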

Diagnostics go to stderr:

[barca] 2 nodes, 1 edges, 2 phases, 2 streams | plan: 1.2ms | exec: 38ms | total: 40ms

Benchmarks

All benchmarks were measured with hyperfine (3 warmup runs, 10 measured runs) on the same machine. Barca is compared against Dagster and Prefect running equivalent pipelines.

Trivial (1 asset, zero work)

Measures pure framework overhead -- how long it takes to do nothing.

Framework  Mean       Relative
barca      38.0 ms    1.00x
dagster    538.1 ms   14.2x
prefect    3977.7 ms  104.7x

Barca's total overhead (parse + plan + spawn + persist) is 38ms. Dagster needs ~0.5s. Prefect needs ~4s.
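
The Relative column is each framework's mean divided by barca's mean. A quick check of that arithmetic, using only the numbers from the table above:

```python
# Mean wall-clock times in milliseconds, copied from the table.
means_ms = {"barca": 38.0, "dagster": 538.1, "prefect": 3977.7}

baseline = means_ms["barca"]
relative = {name: round(mean / baseline, 1) for name, mean in means_ms.items()}

for name, ratio in relative.items():
    print(f"{name}: {ratio}x")
```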

Benchmark suite

The benchmarks/ directory contains 12 scenarios covering a range of DAG topologies and workloads:

Benchmark             Assets  Topology                  What it tests
trivial               1       single node               Pure framework overhead
chain_100             100     linear chain              Sequential dependency resolution
fan_out_500           500     flat (independent)        Wide parallelism, process spawning
fan_out_500_50ms      500     flat + 50ms sleep         Parallelism under I/O latency
deep_diamond          18      diamond (5-wide, 6-deep)  Fan-out/fan-in patterns
wide_layers           varies  parallel layers           Tier-based parallel execution
large_payloads        varies  varied                    JSON serialization overhead
map_reduce            varies  map-reduce                Scatter-gather pattern
mixed_io_cpu          varies  varied                    Mixed I/O and CPU workloads
multi_file_discovery  varies  multi-file                Cross-file asset discovery
iris_pipeline         varies  diamond                   ML pipeline (iris dataset)
spaceflights          10      diamond (3-wide, 6-deep)  Full ML pipeline (Kedro-style)

Run any benchmark:

cd benchmarks/trivial
./bench.sh 10    # 10 measured runs

Each benchmark includes equivalent Dagster and Prefect implementations for apples-to-apples comparison.

Architecture

Cargo.toml                  Rust workspace root
crates/
  barca-core/               Core library: models, parser, DAG, planner, hashing
  barca-cli/                CLI binary (the `barca` command)
python/barca/
  __init__.py               No-op decorator stubs (identity functions)
  _worker.py                Execution worker (invoked by Rust binary)
  py.typed                  PEP 561 marker
pyproject.toml              Maturin build config

Tech stack

Layer           Technology
Parser          ruff Python AST (static, no import)
DAG             petgraph
Database        Turso/libSQL (local SQLite)
Serialization   serde + serde_json
Hashing         SHA-256 (content-addressed artifacts)
Build           maturin (Rust binary + Python stubs in one wheel)
Python runtime  Any Python >= 3.10
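
Content addressing with SHA-256 means an artifact's key is derived from its bytes, so identical content maps to the same key. The sketch below shows the general technique in Python with `hashlib`; exactly what barca hashes and how it canonicalizes values is not described here, so the `content_key` helper and its sorted-key JSON encoding are assumptions.

```python
import hashlib
import json


def content_key(value) -> str:
    """Hash a JSON-serializable value via a canonical encoding (sorted keys, no spaces)."""
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


key_a = content_key({"count": 3, "total": 6})
key_b = content_key({"total": 6, "count": 3})  # same content, different key order
key_c = content_key({"count": 4, "total": 6})  # different content
```

Canonicalizing before hashing matters because two dicts with the same items but different insertion order would otherwise serialize, and hash, differently.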

Node kinds

Kind    Decorator  Cached  Can be input
asset   @asset()   Yes     Yes
sensor  @sensor()  No      Yes
effect  @effect()  No      No (leaf)

Development

git clone https://github.com/recursia-io/barca.git
cd barca

# Build
cargo build --release
maturin develop --release

# Test
cargo test

# Run
barca run examples/basic_app/example_project/assets.py
barca plan examples/basic_app/example_project/assets.py

Project status

Barca is in active development. The core pipeline (parse -> DAG -> plan -> execute -> persist) is working and benchmarked. See the guide for a walkthrough.

License

MIT
