A declarative data engineering framework - Explicit over implicit, Stories over magic

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

odibiengineering

These details have not been verified by PyPI

Project description

Odibi

Declarative data pipelines. YAML in, star schemas out.

Note: Personal open-source project. See IP_NOTICE.md for details.

Odibi is a framework for building data pipelines. You describe what you want in YAML; Odibi handles how. Every run generates a "Data Story" — an audit report showing exactly what happened to your data.

🤖 AI/LLM Users: For comprehensive context, see docs/ODIBI_DEEP_CONTEXT.md — 2,200+ lines covering all patterns, transformers, validation, connections, and runtime behavior.

🎯 Try Odibi in 5 Minutes (No Install Needed)

Click the badge above → run 3 cells → see your first simulation. No Python install, no cloning, no setup.

The notebook walks you through:

- pip install odibi (runs in the cloud)
- Define a simulation in YAML (sensors, sales data, or industrial equipment)
- Run the pipeline → see the output → chart it with Altair

When you're ready for more: 38 simulation configs covering buildings, compressors, reactors, cooling towers, wastewater, production lines, and sales pipelines.

⚡ Quick Start (Local)

pip install odibi

Option 1: Simulate data from YAML

Create sim.yaml:

project: my_first_sim
engine: pandas
connections:
  output:
    type: local
    base_path: ./data
story:
  connection: output
  path: stories/
system:
  connection: output
pipelines:
  - pipeline: demo
    nodes:
      - name: sensors
        read:
          connection: null
          format: simulation
          options:
            simulation:
              scope:
                start_time: "2026-01-01T00:00:00Z"
                timestep: "5m"
                row_count: 100
                seed: 42
              entities:
                count: 3
                id_prefix: "sensor_"
              columns:
                - name: sensor_id
                  data_type: string
                  generator: {type: constant, value: "{entity_id}"}
                - name: timestamp
                  data_type: timestamp
                  generator: {type: timestamp}
                - name: temperature
                  data_type: float
                  generator:
                    type: random_walk
                    start: 22.0
                    min: 16.0
                    max: 30.0
                    volatility: 0.3
                    mean_reversion: 0.15
        write:
          connection: output
          format: parquet
          path: bronze/sensors.parquet
          mode: overwrite

Run it:

python -c "from odibi.pipeline import PipelineManager; PipelineManager.from_yaml('sim.yaml').run()"

Output: data/bronze/sensors.parquet, containing 300 rows (100 rows for each of the 3 sensors) of realistic sensor data with memory, drift, and mean reversion. No database needed.
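To build intuition for the random_walk parameters in the config above (start, min, max, volatility, mean_reversion, seed), here is a minimal stdlib sketch of a bounded, mean-reverting random walk. It is an illustration under stated assumptions, not Odibi's actual generator: reverting toward the start value, Gaussian steps, and clamping at the bounds are all assumptions, and the function name is hypothetical.

```python
import random


def mean_reverting_walk(n, start, lo, hi, volatility, mean_reversion, seed):
    """Bounded random walk that is pulled back toward `start` on every step."""
    rng = random.Random(seed)  # seeded, so repeated runs produce identical data
    value = start
    series = []
    for _ in range(n):
        series.append(value)
        pull = mean_reversion * (start - value)  # drift back toward the start level
        shock = rng.gauss(0.0, volatility)       # random step scaled by volatility
        value = min(hi, max(lo, value + pull + shock))  # clamp to [lo, hi]
    return series


# Same numbers as the temperature column in sim.yaml.
temps = mean_reverting_walk(100, 22.0, 16.0, 30.0, 0.3, 0.15, seed=42)
print(len(temps), min(temps) >= 16.0, max(temps) <= 30.0)
```

A larger mean_reversion keeps the series closer to the start value; a larger volatility makes each step noisier.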

Option 2: Build a star schema from CSV

odibi init my_project --template star-schema
cd my_project
odibi run odibi.yaml
odibi story last          # View the audit report

Option 3: Clone the reference example

git clone https://github.com/henryodibi11/Odibi.git
cd Odibi/docs/examples/canonical/runnable
odibi run 04_fact_table.yaml

This builds a complete star schema in seconds:

- 3 dimension tables (customer, product, date)
- 1 fact table with FK lookups and orphan handling
- HTML audit report

See the full breakdown →
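For readers new to star schemas, a date dimension is one row per calendar day with pre-computed attributes. The sketch below shows the idea using only the standard library. The column names and the YYYYMMDD key format are illustrative assumptions, not Odibi's output schema.

```python
from datetime import date, timedelta


def build_date_dimension(start, end):
    """Return one dict per calendar day from `start` to `end`, inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            "date_sk": int(day.strftime("%Y%m%d")),  # assumed key format, e.g. 20250101
            "date": day.isoformat(),
            "year": day.year,
            "quarter": (day.month - 1) // 3 + 1,
            "month": day.month,
            "day_of_week": day.isoweekday(),         # 1 = Monday ... 7 = Sunday
            "is_weekend": day.isoweekday() >= 6,
        })
        day += timedelta(days=1)
    return rows


dim_date = build_date_dimension(date(2025, 1, 1), date(2025, 12, 31))
print(len(dim_date), dim_date[0]["date"], dim_date[-1]["date"])
```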

📖 The Canonical Example

pipelines:
  - pipeline: build_dimensions
    nodes:
      - name: dim_customer
        read:
          connection: source
          format: csv
          path: customers.csv
        pattern:
          type: dimension
          params:
            natural_key: customer_id
            surrogate_key: customer_sk
            scd_type: 1
        write:
          connection: gold
          format: parquet
          path: dim_customer

      - name: dim_date
        pattern:
          type: date_dimension
          params:
            start_date: "2025-01-01"
            end_date: "2025-12-31"
        write:
          connection: gold
          format: parquet
          path: dim_date

  - pipeline: build_facts
    nodes:
      - name: fact_sales
        depends_on: [dim_customer, dim_date]
        read:
          connection: source
          format: csv
          path: orders.csv
        pattern:
          type: fact
          params:
            grain: [order_id, line_item_id]
            dimensions:
              - source_column: customer_id
                dimension_table: dim_customer
                dimension_key: customer_id
                surrogate_key: customer_sk
            orphan_handling: unknown
        write:
          connection: gold
          format: parquet
          path: fact_sales

Full runnable example →
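Conceptually, the fact pattern above swaps each natural key for the matching dimension's surrogate key, and orphan_handling: unknown decides what happens when no match exists. The following is a rough sketch of that lookup in plain Python. The function name, the sample rows, and the use of 0 as the "unknown member" key are assumptions for illustration; Odibi's real behavior may differ.

```python
UNKNOWN_SK = 0  # assumed placeholder key for the "unknown" dimension member


def resolve_surrogate_keys(fact_rows, dim_rows, source_column, dimension_key, surrogate_key):
    """Attach the dimension's surrogate key to each fact row via a natural-key lookup."""
    lookup = {d[dimension_key]: d[surrogate_key] for d in dim_rows}
    resolved = []
    for row in fact_rows:
        out = dict(row)
        # Orphans (no matching dimension row) are mapped to the unknown member.
        out[surrogate_key] = lookup.get(row[source_column], UNKNOWN_SK)
        resolved.append(out)
    return resolved


# Hypothetical sample data.
dim_customer = [
    {"customer_id": "C1", "customer_sk": 1},
    {"customer_id": "C2", "customer_sk": 2},
]
orders = [
    {"order_id": 100, "line_item_id": 1, "customer_id": "C1"},
    {"order_id": 101, "line_item_id": 1, "customer_id": "C9"},  # orphan: no such customer
]

fact_sales = resolve_surrogate_keys(
    orders, dim_customer,
    source_column="customer_id", dimension_key="customer_id", surrogate_key="customer_sk",
)
print([row["customer_sk"] for row in fact_sales])
```

Mapping orphans to a dedicated unknown member keeps every fact row loadable while still making the missing references easy to count.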

🚀 Key Features

Feature	Description
Data Stories	Every run generates an HTML audit report
Dimensional Patterns	6 built-in patterns: SCD1/SCD2, date dimension, fact tables, merge, aggregation
56 Transformers	Comprehensive library for data manipulation and quality
Validation & Contracts	Fail-fast checks, quarantine bad rows
Multi-Engine	Pandas, Polars, and Spark — same config across all engines
Production Ready	Retry, alerting, secrets, Delta Lake support
Battle-Tested	5500+ tests ensure reliability and correctness
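The "Validation & Contracts" row describes two behaviors: stopping a run on the first violation (fail-fast) and setting bad rows aside (quarantine). The sketch below illustrates both ideas in plain Python. All names and checks here are hypothetical; this is not Odibi's validation API.

```python
def assert_contract(rows, checks):
    """Fail fast: raise on the first row that breaks any check."""
    for index, row in enumerate(rows):
        for name, check in checks.items():
            if not check(row):
                raise ValueError(f"row {index} failed check {name!r}")


def split_valid_and_quarantined(rows, checks):
    """Quarantine: keep good rows, set bad rows aside with the reasons they failed."""
    valid, quarantined = [], []
    for row in rows:
        failed = [name for name, check in checks.items() if not check(row)]
        if failed:
            quarantined.append({**row, "_failed_checks": failed})
        else:
            valid.append(row)
    return valid, quarantined


# Hypothetical contract for an orders feed.
checks = {
    "customer_id_present": lambda row: bool(row.get("customer_id")),
    "amount_non_negative": lambda row: row.get("amount", 0) >= 0,
}
orders = [
    {"order_id": 1, "customer_id": "C1", "amount": 25.0},
    {"order_id": 2, "customer_id": "", "amount": 10.0},
    {"order_id": 3, "customer_id": "C2", "amount": -5.0},
]

valid, quarantined = split_valid_and_quarantined(orders, checks)
print(len(valid), [q["_failed_checks"] for q in quarantined])
```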

📚 Documentation

Goal	Link
Get running in 10 minutes	Golden Path
Copy THE working example	THE_REFERENCE.md
Solve a specific problem	Playbook
Understand when to use what	Decision Guide
See all config options	YAML Schema

📦 Installation

# Standard (Pandas engine)
pip install odibi

# With Polars engine
pip install "odibi[polars]"

# With Spark + Azure support
pip install "odibi[spark,azure]"

# All engines and features
pip install "odibi[all]"
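After installing, it can be handy to check which engine libraries are importable in the current environment. The sketch below uses only the standard library. The mapping from engine name to module name (in particular that the spark extra provides pyspark) is an assumption, not something stated by the project.

```python
from importlib.util import find_spec

# Assumed mapping from engine name to the top-level module it relies on.
ENGINE_MODULES = {"pandas": "pandas", "polars": "polars", "spark": "pyspark"}


def available_engines():
    """Report which engine libraries can be imported, without importing them."""
    return {
        engine: find_spec(module) is not None
        for engine, module in ENGINE_MODULES.items()
    }


print(available_engines())
```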

🎯 Who is this for?

- Solo data engineers building pipelines without a team
- Analytics engineers moving from dbt to Python-based pipelines
- Anyone tired of writing the same boilerplate for every project

🤝 Contributing

We welcome contributions! See CONTRIBUTING.md.

Maintainer: Henry Odibi (@henryodibi11)
License: Apache 2.0

Project details

Release history

Version	Release date
3.9.0 (this version)	Apr 20, 2026
3.8.2	Apr 6, 2026
3.8.1	Apr 5, 2026
3.8.0	Apr 1, 2026
3.7.4	Apr 1, 2026
3.7.3	Apr 1, 2026
3.7.2	Apr 1, 2026
3.7.1	Apr 1, 2026
3.7.0	Apr 1, 2026
3.6.3	Mar 31, 2026
3.6.2	Mar 31, 2026
3.6.1	Mar 31, 2026
3.6.0	Mar 31, 2026
3.5.0	Mar 29, 2026
3.4.7	Mar 26, 2026
3.4.6	Mar 24, 2026
3.4.5	Mar 24, 2026
3.4.4	Mar 22, 2026
3.4.3	Mar 17, 2026
3.4.2	Mar 14, 2026
3.4.1	Mar 14, 2026
3.4.0	Mar 12, 2026
3.3.0	Mar 12, 2026
2.23.0	Mar 11, 2026
2.22.1	Mar 10, 2026
2.22.0	Mar 9, 2026
2.21.0	Mar 9, 2026
2.20.1	Mar 9, 2026
2.20.0	Mar 9, 2026
2.18.0	Mar 5, 2026
2.17.0	Feb 5, 2026
2.16.0	Feb 5, 2026
2.15.3	Jan 26, 2026
2.15.2	Jan 26, 2026
2.15.1	Jan 26, 2026
2.15.0	Jan 25, 2026
2.14.0	Jan 24, 2026
2.13.1	Jan 24, 2026
2.12.0	Jan 24, 2026
2.11.2	Jan 23, 2026
2.11.1	Jan 23, 2026
2.11.0	Jan 23, 2026
2.10.0	Jan 22, 2026
2.9.0	Jan 20, 2026
2.8.0	Jan 19, 2026
2.7.0	Jan 14, 2026
2.6.6	Jan 14, 2026
2.6.5	Jan 14, 2026
2.6.4	Jan 14, 2026
2.6.3	Jan 14, 2026
2.6.2	Jan 14, 2026
2.6.1	Jan 11, 2026
2.5.0	Jan 10, 2026
2.4.0	Jan 10, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

odibi-3.9.0.tar.gz (910.7 kB)

Uploaded Apr 20, 2026 Source

Built Distribution

odibi-3.9.0-py3-none-any.whl (914.8 kB)

Uploaded Apr 20, 2026 Python 3

File details

Details for the file odibi-3.9.0.tar.gz.

File metadata

Download URL: odibi-3.9.0.tar.gz
Upload date: Apr 20, 2026
Size: 910.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for odibi-3.9.0.tar.gz
Algorithm	Hash digest
SHA256	`eb52934a45abe92a90471e0c8f3f234441fb344e327893445b245eed7f8462d6`
MD5	`4341610a83c3a50275fc36627261b0f1`
BLAKE2b-256	`e8cea018f11ef80ed78540ff3c97fafdf18b28cd910f59b35196313dc591dd08`
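A downloaded file can be checked against the published SHA256 digest with the standard library. This is a minimal sketch; the helper names are hypothetical, and the file path in the usage note assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True when the file's SHA256 equals the published digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_hash("odibi-3.9.0.tar.gz", "eb52934a45abe92a90471e0c8f3f234441fb344e327893445b245eed7f8462d6") should return True for an intact download of the source distribution.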

Provenance

The following attestation bundles were made for odibi-3.9.0.tar.gz:

Publisher: publish.yml on henryodibi11/Odibi

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: odibi-3.9.0.tar.gz
- Subject digest: eb52934a45abe92a90471e0c8f3f234441fb344e327893445b245eed7f8462d6
- Sigstore transparency entry: 1342289089
- Sigstore integration time: Apr 20, 2026
Source repository:
- Permalink: henryodibi11/Odibi@13c791207e3f893152273d08579b35f288ab85ba
- Branch / Tag: refs/tags/v3.9.0
- Owner: https://github.com/henryodibi11
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@13c791207e3f893152273d08579b35f288ab85ba
- Trigger Event: release

File details

Details for the file odibi-3.9.0-py3-none-any.whl.

File metadata

Download URL: odibi-3.9.0-py3-none-any.whl
Upload date: Apr 20, 2026
Size: 914.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for odibi-3.9.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`469deeca511228565dd89ac0ee899b7175563e550632c336305e0d46ab7c25a7`
MD5	`838578c6b9f8618c64ed2473f2506cf9`
BLAKE2b-256	`e7a03b9c6ddd5a49dcadeeb894f48048dc7c3a65fe3271eb5fb7d4c67f5b4ced`

Provenance

The following attestation bundles were made for odibi-3.9.0-py3-none-any.whl:

Publisher: publish.yml on henryodibi11/Odibi

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: odibi-3.9.0-py3-none-any.whl
- Subject digest: 469deeca511228565dd89ac0ee899b7175563e550632c336305e0d46ab7c25a7
- Sigstore transparency entry: 1342289121
- Sigstore integration time: Apr 20, 2026
Source repository:
- Permalink: henryodibi11/Odibi@13c791207e3f893152273d08579b35f288ab85ba
- Branch / Tag: refs/tags/v3.9.0
- Owner: https://github.com/henryodibi11
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@13c791207e3f893152273d08579b35f288ab85ba
- Trigger Event: release
