barca
The invisible asset orchestrator.
Rust plans it. Python runs it. You just write functions.
Barca is an asset orchestrator that adds very little overhead to your Python pipelines: about 38 ms for a trivial one-asset run (see Benchmarks). A compiled Rust binary handles parsing, DAG construction, and execution planning. Python does what it's best at: running your code.
# pipeline.py
from barca import asset

@asset()
def raw_data() -> list[dict]:
    return [{"x": 1}, {"x": 2}, {"x": 3}]

@asset(inputs={"data": raw_data})
def summary(data: list[dict]) -> dict:
    return {"count": len(data), "total": sum(d["x"] for d in data)}

$ barca run pipeline.py
{"elapsed_seconds":0.042,"steps_executed":2,"phases":2,"final_output":{"count":3,"total":6}}
No config files. No YAML. No daemon. Just functions and a fast binary.
Install
pip install barca
This gives you:
- The `barca` CLI binary (compiled Rust)
- Python decorator stubs for `@asset`, `@sensor`, `@effect` (IDE autocomplete + type checking)
- The execution worker (`barca._worker`)
All in one wheel, built with maturin.
From source
git clone https://github.com/recursia-io/barca.git
cd barca
cargo build --release
maturin develop --release # installs into current .venv
Quick start
# assets.py
from barca import asset

@asset()
def hello() -> dict:
    return {"message": "Hello from barca!"}

barca run assets.py
That's it. Barca parses your Python source with ruff's AST parser (no import, pure static analysis), builds a dependency graph, generates a phased execution plan, spawns Python workers, and persists results to a local SQLite database -- all in under 40ms for a trivial asset.
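The static-analysis step can be illustrated with Python's standard `ast` module. This is only a sketch of the idea: barca itself does this in Rust with ruff's parser, and the helper name `extract_assets` is hypothetical. It reads decorator metadata from source text without ever importing the module.

```python
import ast

SOURCE = '''
from barca import asset

@asset()
def raw_data() -> list[dict]:
    return [{"x": 1}, {"x": 2}, {"x": 3}]

@asset(inputs={"data": raw_data})
def summary(data: list[dict]) -> dict:
    return {"count": len(data), "total": sum(d["x"] for d in data)}
'''


def extract_assets(source: str) -> dict[str, list[str]]:
    """Map each @asset function name to the upstream names in its inputs= dict."""
    assets: dict[str, list[str]] = {}
    for node in ast.parse(source).body:
        if not isinstance(node, ast.FunctionDef):
            continue
        for dec in node.decorator_list:
            # Only the call form used in this README is handled: @asset(...)
            is_asset_call = (
                isinstance(dec, ast.Call)
                and isinstance(dec.func, ast.Name)
                and dec.func.id == "asset"
            )
            if not is_asset_call:
                continue
            upstream: list[str] = []
            for kw in dec.keywords:
                if kw.arg == "inputs" and isinstance(kw.value, ast.Dict):
                    upstream = [v.id for v in kw.value.values if isinstance(v, ast.Name)]
            assets[node.name] = upstream
    return assets


print(extract_assets(SOURCE))
# {'raw_data': [], 'summary': ['raw_data']}
```

Because the source is only parsed, module-level side effects in the pipeline file never run during this step.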
How it works
┌────────────────────────────────────────┐
│  barca run pipeline.py                 │
└───────────────────┬────────────────────┘
                    │
┌───────────────────▼────────────────────┐
│  Rust binary (barca)                   │
│                                        │
│  1. Parse Python source (ruff AST)     │
│  2. Build DAG (petgraph)               │
│  3. Generate execution plan            │
│  4. Initialize DB (.barca/metadata.db) │
│  5. Spawn Python workers per phase     │
│  6. Collect outputs, persist to DB     │
└───────────────────┬────────────────────┘
                    │
┌───────────────────▼────────────────────┐
│  Python worker (per phase)             │
│                                        │
│  - Loads modules via importlib         │
│  - Executes steps in tier order        │
│  - LRU cache for in-process results    │
│  - Emits JSON lines to stdout          │
└────────────────────────────────────────┘
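The worker half of the diagram can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function name `run_steps`, the `Step` tuple shape, and the `{"step", "output"}` record format are hypothetical, and a plain dict stands in for the LRU cache. The real `barca._worker` may differ.

```python
import io
import json
import sys
from typing import Any, Callable, TextIO

# (step name, function, {parameter name: upstream step name}); hypothetical shape.
Step = tuple[str, Callable[..., Any], dict[str, str]]


def run_steps(steps: list[Step], out: TextIO = sys.stdout) -> dict[str, Any]:
    """Run steps in order, writing one JSON object per line to `out`."""
    results: dict[str, Any] = {}  # in-process results keyed by step name
    for name, fn, inputs in steps:
        # Resolve each parameter from the result of an earlier step.
        kwargs = {param: results[upstream] for param, upstream in inputs.items()}
        results[name] = fn(**kwargs)
        out.write(json.dumps({"step": name, "output": results[name]}) + "\n")
    return results


def raw_data() -> list[dict]:
    return [{"x": 1}, {"x": 2}, {"x": 3}]


def summary(data: list[dict]) -> dict:
    return {"count": len(data), "total": sum(d["x"] for d in data)}


buffer = io.StringIO()
run_steps(
    [("raw_data", raw_data, {}), ("summary", summary, {"data": "raw_data"})],
    out=buffer,
)
print(buffer.getvalue(), end="")
```

One JSON object per line keeps the parent process's job simple: it can read stdout line by line and parse each record independently.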
Key design decisions:
- Static analysis only -- Rust never imports your Python code. It parses source text and extracts decorator metadata from the AST.
- Phased execution -- The planner decomposes the DAG into sequential phases. Within each phase, independent streams run in parallel workers.
- No framework lock-in -- Decorators are identity functions, so decorated functions stay plain Python that you can import and call directly, without running the barca binary.
- Single binary -- One `pip install` gives you everything. No JVM, no Docker, no scheduler service.
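The phased-execution idea from the list above can be sketched as a level-by-level topological sort. This is a minimal illustration, not barca's planner: the name `plan_phases` is hypothetical, it assumes every upstream name is also a key in `deps`, and it ignores how the real planner splits a phase into streams.

```python
def plan_phases(deps: dict[str, list[str]]) -> list[list[str]]:
    """Group nodes into phases; each phase depends only on earlier phases."""
    remaining = {node: set(upstream) for node, upstream in deps.items()}
    phases: list[list[str]] = []
    while remaining:
        # Nodes with no unmet dependencies can run in this phase.
        ready = sorted(n for n, ups in remaining.items() if not ups)
        if not ready:
            raise ValueError("dependency cycle detected")
        phases.append(ready)
        for n in ready:
            del remaining[n]
        for ups in remaining.values():
            ups.difference_update(ready)
    return phases


print(plan_phases({"raw_data": [], "summary": ["raw_data"]}))
# [['raw_data'], ['summary']]
```

Nodes that land in the same phase have no dependencies on each other, which is what makes it safe to run them in parallel.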
Decorators
from barca import asset, sensor, effect, sink, unsafe
from barca import Always, Manual, Schedule
from barca import partitions, partitions_from, collect, asset_ref
@asset
Cached computation node. The workhorse.
@asset()
def prices() -> dict:
    return {"AAPL": 150, "MSFT": 380}

@asset(inputs={"data": prices})
def report(data: dict) -> str:
    return f"Tracked {len(data)} tickers"

@sensor
Observes external state. Returns (update_detected, output).
from pathlib import Path

@sensor()
def inbox_files() -> tuple[bool, list[str]]:
    files = list(Path("inbox").glob("*.csv"))
    return bool(files), [str(f) for f in files]

@effect
Side-effect leaf node. Never cached, can't be used as input.
@effect(inputs={"report": report})
def publish(report: str) -> None:
    print(f"Publishing: {report}")

@sink
Stacks on @asset to write outputs to files.
@asset()
@sink("output/data.json", serializer="json")
def my_data() -> dict:
    return {"rows": 42}

Freshness markers
| Marker | Behavior |
|---|---|
| `Always` | Auto-materializes whenever stale (default for @effect) |
| `Manual` | Only runs on explicit refresh |
| `Schedule("0 5 * * *")` | Cron expression |
Partitions
Fan a single asset definition into N independent materializations:
@asset(partitions={"ticker": partitions(["AAPL", "MSFT", "GOOG"])})
def prices(ticker: str) -> dict:
    # get_price is a placeholder for your own lookup function (not part of barca)
    return {"ticker": ticker, "price": get_price(ticker)}

| Function | Purpose |
|---|---|
| `partitions(values)` | Static list of partition keys |
| `partitions_from(source)` | Derive partitions from upstream asset |
| `collect(asset_fn)` | Aggregate all partitions of an upstream |
| `asset_ref(ref_string)` | Canonical asset reference |
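To make the fan-out concrete, here is a rough sketch of how a static partition spec could expand into one call per key. The helper `expand_partitions` is hypothetical, and treating multiple partition dimensions as a cross product is an assumption, not documented barca behavior.

```python
from itertools import product


def expand_partitions(partitions: dict[str, list[str]]) -> list[dict[str, str]]:
    """Return one keyword-argument dict per partition combination."""
    names = list(partitions)
    return [dict(zip(names, combo)) for combo in product(*(partitions[n] for n in names))]


def prices(ticker: str) -> dict:
    # Stand-in body; a real asset would look the price up somewhere.
    return {"ticker": ticker}


runs = [prices(**kwargs) for kwargs in expand_partitions({"ticker": ["AAPL", "MSFT", "GOOG"]})]
print(runs)
# [{'ticker': 'AAPL'}, {'ticker': 'MSFT'}, {'ticker': 'GOOG'}]
```

Each element of `runs` corresponds to one independent materialization of the same function definition.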
CLI
barca run <file.py> [file.py ...]     Parse, plan, and execute
barca plan <file.py> [file.py ...]    Emit execution plan as JSON
barca --help                          Show help
barca plan -- inspect without running
$ barca plan pipeline.py
{
  "total_steps": 2,
  "phases": [
    { "reason": "Independent", "streams": [{"stream_id": 0, "steps": ["raw_data"]}] },
    { "reason": "Dependent", "streams": [{"stream_id": 1, "steps": ["summary"]}] }
  ]
}
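Since the plan is plain JSON, it is easy to inspect from a script. The sketch below parses the example output shown above and lists the steps in each phase. The helper name `steps_by_phase` is hypothetical, and it relies only on the fields visible in that example.

```python
import json

PLAN_TEXT = """
{
  "total_steps": 2,
  "phases": [
    { "reason": "Independent", "streams": [{"stream_id": 0, "steps": ["raw_data"]}] },
    { "reason": "Dependent", "streams": [{"stream_id": 1, "steps": ["summary"]}] }
  ]
}
"""


def steps_by_phase(plan_text: str) -> list[list[str]]:
    """Flatten each phase's streams into a single list of step names."""
    plan = json.loads(plan_text)
    return [
        [step for stream in phase["streams"] for step in stream["steps"]]
        for phase in plan["phases"]
    ]


print(steps_by_phase(PLAN_TEXT))
# [['raw_data'], ['summary']]
```

In practice the text would come from the stdout of `barca plan pipeline.py`, for example captured with `subprocess.run(..., capture_output=True, text=True)`.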
barca run -- execute the full plan
Parses source, builds DAG, spawns workers, collects outputs, persists to .barca/metadata.db.
Output is a JSON summary:
{
  "elapsed_seconds": 0.042,
  "steps_executed": 2,
  "phases": 2,
  "final_output": {"count": 3, "total": 6}
}
Diagnostics go to stderr:
[barca] 2 nodes, 1 edges, 2 phases, 2 streams | plan: 1.2ms | exec: 38ms | total: 40ms
Benchmarks
All benchmarks measured with hyperfine (3 warmup runs, 10 measured runs) on the same machine. Barca is compared against Dagster and Prefect running equivalent pipelines.
Trivial (1 asset, zero work)
Measures pure framework overhead -- how long it takes to do nothing.
| Framework | Mean | Relative |
|---|---|---|
| barca | 38.0 ms | 1.00x |
| dagster | 538.1 ms | 14.2x |
| prefect | 3977.7 ms | 104.7x |
Barca's total overhead (parse + plan + spawn + persist) is 38ms. Dagster needs ~0.5s. Prefect needs ~4s.
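The Relative column is each framework's mean divided by barca's mean. A quick check of that arithmetic using the figures from the table (the helper name `relative_to_baseline` is just for illustration):

```python
def relative_to_baseline(means_ms: dict[str, float], baseline: str) -> dict[str, float]:
    """Divide every mean by the baseline's mean, rounded to one decimal place."""
    base = means_ms[baseline]
    return {name: round(mean / base, 1) for name, mean in means_ms.items()}


means_ms = {"barca": 38.0, "dagster": 538.1, "prefect": 3977.7}
print(relative_to_baseline(means_ms, "barca"))
# {'barca': 1.0, 'dagster': 14.2, 'prefect': 104.7}
```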
Benchmark suite
The benchmarks/ directory contains 12 scenarios covering a range of DAG topologies and workloads:
| Benchmark | Assets | Topology | What it tests |
|---|---|---|---|
| `trivial` | 1 | single node | Pure framework overhead |
| `chain_100` | 100 | linear chain | Sequential dependency resolution |
| `fan_out_500` | 500 | flat (independent) | Wide parallelism, process spawning |
| `fan_out_500_50ms` | 500 | flat + 50ms sleep | Parallelism under I/O latency |
| `deep_diamond` | 18 | diamond (5-wide, 6-deep) | Fan-out/fan-in patterns |
| `wide_layers` | varies | parallel layers | Tier-based parallel execution |
| `large_payloads` | varies | varied | JSON serialization overhead |
| `map_reduce` | varies | map-reduce | Scatter-gather pattern |
| `mixed_io_cpu` | varies | varied | Mixed I/O and CPU workloads |
| `multi_file_discovery` | varies | multi-file | Cross-file asset discovery |
| `iris_pipeline` | varies | diamond | ML pipeline (iris dataset) |
| `spaceflights` | 10 | diamond (3-wide, 6-deep) | Full ML pipeline (Kedro-style) |
Run any benchmark:
cd benchmarks/trivial
./bench.sh 10 # 10 measured runs
Each benchmark includes equivalent Dagster and Prefect implementations for apples-to-apples comparison.
Architecture
Cargo.toml              Rust workspace root
crates/
  barca-core/           Core library: models, parser, DAG, planner, hashing
  barca-cli/            CLI binary (the `barca` command)
python/barca/
  __init__.py           No-op decorator stubs (identity functions)
  _worker.py            Execution worker (invoked by Rust binary)
  py.typed              PEP 561 marker
pyproject.toml          Maturin build config
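For readers wondering what a "no-op decorator stub" looks like, here is a minimal sketch. It is an assumption about the general shape, not the actual contents of `python/barca/__init__.py`: a decorator factory that accepts keyword metadata and hands the function back unchanged.

```python
from typing import Any, Callable, TypeVar

F = TypeVar("F", bound=Callable[..., Any])


def asset(**_metadata: Any) -> Callable[[F], F]:
    """Accept metadata such as inputs=..., then return the function untouched."""
    def decorator(fn: F) -> F:
        return fn
    return decorator


@asset()
def raw_data() -> list[dict]:
    return [{"x": 1}, {"x": 2}, {"x": 3}]


@asset(inputs={"data": raw_data})
def summary(data: list[dict]) -> dict:
    return {"count": len(data), "total": sum(d["x"] for d in data)}


# The decorated functions are ordinary callables.
print(summary(raw_data()))
# {'count': 3, 'total': 6}
```

Because the metadata is only ever read from source text by the Rust side, the Python stub has nothing to record at import time.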
Tech stack
| Layer | Technology |
|---|---|
| Parser | ruff Python AST (static, no import) |
| DAG | petgraph |
| Database | Turso/libSQL (local SQLite) |
| Serialization | serde + serde_json |
| Hashing | SHA-256 (content-addressed artifacts) |
| Build | maturin (Rust binary + Python stubs in one wheel) |
| Python runtime | Any Python >= 3.10 |
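Content addressing with SHA-256 generally means hashing a canonical byte form of a value, so that equal values always map to the same key. The sketch below shows that idea with the standard library. The canonical form (sorted-key compact JSON) and the name `content_hash` are assumptions; how barca actually serializes artifacts before hashing is not stated here.

```python
import hashlib
import json
from typing import Any


def content_hash(value: Any) -> str:
    """Hash a JSON-serializable value via a canonical (sorted, compact) encoding."""
    payload = json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


a = content_hash({"count": 3, "total": 6})
b = content_hash({"total": 6, "count": 3})  # same content, different key order
print(a == b)
# True
```

Sorting keys before hashing is what makes the digest depend on content rather than on dict insertion order.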
Node kinds
| Kind | Decorator | Cached | Can be input |
|---|---|---|---|
| asset | `@asset()` | Yes | Yes |
| sensor | `@sensor()` | No | Yes |
| effect | `@effect()` | No | No (leaf) |
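The "Can be input" column implies a simple graph rule: no edge may start at an effect. A small sketch of that check follows. The function `invalid_edges` and the `audit` node are hypothetical, and this is not barca's own validation code.

```python
# Copied from the table above: kind -> may it feed another node?
CAN_BE_INPUT = {"asset": True, "sensor": True, "effect": False}


def invalid_edges(kinds: dict[str, str], deps: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (upstream, downstream) pairs whose upstream kind cannot be an input."""
    return [
        (upstream, node)
        for node, upstreams in deps.items()
        for upstream in upstreams
        if not CAN_BE_INPUT[kinds[upstream]]
    ]


kinds = {"prices": "asset", "report": "asset", "publish": "effect", "audit": "asset"}
deps = {"report": ["prices"], "publish": ["report"], "audit": ["publish"]}
print(invalid_edges(kinds, deps))
# [('publish', 'audit')]
```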
Development
git clone https://github.com/recursia-io/barca.git
cd barca
# Build
cargo build --release
maturin develop --release
# Test
cargo test
# Run
barca run examples/basic_app/example_project/assets.py
barca plan examples/basic_app/example_project/assets.py
Project status
Barca is in active development. The core pipeline (parse -> DAG -> plan -> execute -> persist) is working and benchmarked. See the guide for a walkthrough.
License