Provenance tracking, intelligent caching, and data virtualization for scientific simulation workflows.

Consist

Requires Python 3.11+. License: BSD 3-Clause.

Consist is a caching and provenance layer for scientific simulation workflows. It records the code, configuration, input data, and output artifacts behind each run so expensive steps can be skipped safely and results remain queryable after the fact.

Consist is useful when a workflow has:

  • long-running model steps that should cache-hit when inputs are unchanged;
  • scenario variants that need explicit lineage and comparison;
  • file-based tools that need stable local paths but still need canonical provenance;
  • post-run questions like "which config produced this output?"

Installation

pip install consist

Optional integrations are installed as extras:

pip install "consist[ingest]"
pip install "consist[docker]"

Note: Consist is pre-1.0. It is ready for real workflows, but minor releases may still include breaking changes while the API settles.
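
Because minor releases may still change the API, a workflow may want to check which version is installed before it runs. The following is a minimal sketch using only the standard library's importlib.metadata. The helper names installed_version and same_release_series, and the "0.1" series used in the demo, are illustrative assumptions and not part of Consist.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def same_release_series(installed: str, series: str) -> bool:
    """True if `installed` is exactly `series` or extends it with a dot segment.

    "0.1.3" is in series "0.1", but "0.10.0" is not.
    """
    return installed == series or installed.startswith(series + ".")


if __name__ == "__main__":
    found = installed_version("consist")
    if found is None:
        print("consist is not installed")
    else:
        # "0.1" is only an example; use the series your workflow was tested with.
        print(found, same_release_series(found, "0.1"))
```

Pinning the tested series in a requirements file (for example with a compatible-release specifier) achieves the same goal at install time; the runtime check is only a second line of defense.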

Quick Example

from pathlib import Path

import pandas as pd

import consist
from consist import ExecutionOptions, Tracker

tracker = Tracker(run_dir="./runs", db_path="./provenance.duckdb")


def clean_data(raw: Path, threshold: float = 0.5) -> dict[str, Path]:
    df = pd.read_parquet(raw)
    out = Path("./cleaned.parquet")
    df[df["value"] > threshold].to_parquet(out)
    return {"cleaned": out}


first = tracker.run(
    fn=clean_data,
    inputs={"raw": Path("raw.parquet")},
    config={"threshold": 0.5},
    outputs=["cleaned"],
    execution_options=ExecutionOptions(input_binding="paths"),
)

# Same function, inputs, and config as the first call, so this run should be a cache hit.
second = tracker.run(
    fn=clean_data,
    inputs={"raw": Path("raw.parquet")},
    config={"threshold": 0.5},
    outputs=["cleaned"],
    execution_options=ExecutionOptions(input_binding="paths"),
)

print(first.cache_hit, second.cache_hit)  # False, True
cleaned = consist.load_df(second.outputs["cleaned"])

In this example, input_binding="paths" tells Consist to pass local Path objects into the callable instead of loading the input files itself. Those same paths are still hashed and recorded, so they count toward cache identity and lineage. For tools that need inputs copied to specific local filenames, see the Usage Guide.
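
To make the idea of cache identity concrete, here is a generic sketch of how a key can be derived from code identity, configuration, and input file contents. It is not Consist's implementation, which this page does not describe. The names file_digest and cache_key are hypothetical, and code_version is an assumed stand-in for however the code is identified (for example a commit hash).

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def cache_key(code_version: str, inputs: dict[str, Path], config: dict) -> str:
    """Combine code identity, config, and input contents into one digest.

    Hypothetical helper: `config` must be JSON-serializable.
    """
    payload = {
        "code": code_version,
        "config": config,
        "inputs": {name: file_digest(path) for name, path in inputs.items()},
    }
    # sort_keys makes the serialization independent of dict insertion order.
    blob = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        raw = Path(tmp) / "raw.bin"
        raw.write_bytes(b"example input")
        first = cache_key("v1", {"raw": raw}, {"threshold": 0.5})
        second = cache_key("v1", {"raw": raw}, {"threshold": 0.5})
        changed = cache_key("v1", {"raw": raw}, {"threshold": 0.9})
        print(first == second)   # True: nothing changed, so the key is stable
        print(first == changed)  # False: a config change produces a new key
```

Hashing file contents instead of file names means a renamed but otherwise identical input still maps to the same key, while any edit to the data produces a new one.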

Documentation

Start here           Use it for
Quickstart           First tracked run and cache hit
First Workflow       Two-step pipeline with explicit artifact links
Usage Guide          Choosing between run, trace, and scenario
Caching & Hydration  Cache identity, hit behavior, and output recovery concepts
Historical Recovery  Restoring archived outputs and staging inputs
CLI Reference        Inspecting runs, artifacts, lineage, and schemas
API Reference        Public Python API and generated signatures

Etymology

In railroad terminology, a consist is the lineup of locomotives and cars that make up a train. In this library, a consist is the immutable record of the code, config, inputs, and outputs coupled together to produce a result.

Download files

Source Distribution

consist-0.1.3.tar.gz (403.3 kB)

Uploaded Source

Built Distribution

consist-0.1.3-py3-none-any.whl (451.0 kB)

Uploaded Python 3

File details

Details for the file consist-0.1.3.tar.gz.

File metadata

  • Download URL: consist-0.1.3.tar.gz
  • Size: 403.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for consist-0.1.3.tar.gz

Algorithm    Hash digest
SHA256       f1fa1ad4d1dfd6dc8f33699bb3f2d77e16dda66094f564a449b8b225c8f24056
MD5          e1864c3e1594dfc97cc9d7b1a6110c00
BLAKE2b-256  a062777ff64fb71fab7d8040d998ed65fc022a00fe052dec6bd23f4b699a7525
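
A published digest is useful only if it is compared against the file that was actually downloaded. The sketch below shows one way to do that with the standard library's hashlib. The function names sha256_of and matches_sha256 are illustrative, and the local path in the demo is an assumption about where the source distribution was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for consist-0.1.3.tar.gz.
EXPECTED_SHA256 = "f1fa1ad4d1dfd6dc8f33699bb3f2d77e16dda66094f564a449b8b225c8f24056"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of the file at `path`."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_sha256(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the archive was saved.
    archive = Path("consist-0.1.3.tar.gz")
    if archive.exists():
        print("digest matches:", matches_sha256(archive, EXPECTED_SHA256))
    else:
        print(f"{archive} not found; download it first")
```

pip can perform the same check at install time when a requirements file lists hashes and hash-checking mode is enabled.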

File details

Details for the file consist-0.1.3-py3-none-any.whl.

File metadata

  • Download URL: consist-0.1.3-py3-none-any.whl
  • Size: 451.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for consist-0.1.3-py3-none-any.whl

Algorithm    Hash digest
SHA256       221cfdb5eb13b1bd26adab62428afe47843fdc6218954a2d0f71fb943d9b8e89
MD5          8a4157e988d5c4f1abe91163a2cc4eb9
BLAKE2b-256  5de4959b1fac25dec77dac10ac524cf2a2fc821d1f5ce7ac4b9ab08b67e2cca8
