A library for explicit data pipelines

tacit

Pydantic-style schemas for DataFrame pipelines, built on ibis and pandera.

Every DataFrame operation makes implicit assumptions about the data — which columns exist, their types, whether nulls are allowed. Tacit makes them explicit: you define schemas as Python classes and enforce contracts on the functions that transform them. From that single definition:

  • Catch errors where they happen — pandera validates actual data at pipeline boundaries. Missing columns, wrong types, constraint violations — caught where bad data enters, not three stages downstream.
  • Catch errors before they happen — type checkers (mypy, pyright, ty, pyrefly) verify that every pipeline stage respects the contract before your code runs.
  • Make contracts self-documenting — "go to definition" on any schema shows every column, its type, and its constraints. No Slack threads, no stale wiki pages. The code has the full context — for teammates, for your future self, and for coding agents that can discover schemas without extra context files.
  • Make changes safe — rename a column in a schema and your type checker flags every function that needs updating — across teams, across repos.

Works across any ibis-supported backend — DuckDB, Spark, BigQuery, Snowflake, Polars, Postgres, and more.

Documentation

Install

uv add tacit

# or with pip directly
pip install tacit

Quick example

import ibis
import tacit


class Iris(tacit.Schema):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float
    species: str


class IrisFeatures(Iris):
    sepal_ratio: float
    petal_ratio: float
    petal_area: float


@tacit.contract
def engineer_features(df: tacit.DataFrame[Iris]) -> tacit.DataFrame[IrisFeatures]:
    return df.mutate(
        sepal_ratio=df.sepal_length / df.sepal_width,
        petal_ratio=df.petal_length / df.petal_width,
        petal_area=df.petal_length * df.petal_width,
    )


con = ibis.duckdb.connect()
raw = con.read_csv("iris.csv")

iris = Iris.parse(raw)
features = engineer_features(iris)

Schemas are Python classes — your editor autocompletes column names from them. parse() coerces types and validates at the boundary. @contract enforces input/output schemas at runtime. DataFrame[S] is an ibis Table, so you get the full ibis expression API with no wrapping.
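
As a rough illustration of why a class definition is enough for editors and tools to discover columns, the sketch below reads annotations with the standard library only. The classes here are plain Python stand-ins (they do not subclass tacit.Schema), and `columns_of` is a hypothetical helper, not part of tacit's API.

```python
from typing import get_type_hints


# Plain stand-in classes; the real schemas would subclass tacit.Schema.
class Iris:
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float
    species: str


class IrisFeatures(Iris):
    sepal_ratio: float
    petal_ratio: float
    petal_area: float


def columns_of(schema: type) -> dict[str, type]:
    """Hypothetical helper: collect annotated columns, inherited ones included."""
    # get_type_hints merges annotations along the MRO, base classes first.
    return get_type_hints(schema)


# Columns that the feature-engineering step is expected to add.
added = set(columns_of(IrisFeatures)) - set(columns_of(Iris))
print(sorted(added))
```

Because the derived schema inherits the base columns, the difference between the two annotation sets is exactly what a transforming function has to produce.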

What else

Constraints — go beyond column names and types with value-level checks, powered by pandera:

from typing import Annotated

class Order(tacit.Schema):
    amount: Annotated[float, tacit.Check.ge(0)]
    status: Annotated[str, tacit.Check.isin(["pending", "shipped"])]
    notes: Annotated[str, tacit.Nullable()]
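
For readers unfamiliar with `Annotated`, the following stdlib-only sketch shows how metadata attached to a type hint can be read back at runtime. `Ge`, `Nullable`, and `column_metadata` are hypothetical stand-ins written for this example; how tacit and pandera actually consume the metadata is not shown here.

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_origin, get_type_hints


@dataclass(frozen=True)
class Ge:
    """Hypothetical marker for a 'greater than or equal' check."""
    bound: float


@dataclass(frozen=True)
class Nullable:
    """Hypothetical marker that allows nulls in a column."""


class Order:  # plain class, not a tacit.Schema
    amount: Annotated[float, Ge(0)]
    notes: Annotated[str, Nullable()]
    status: str


def column_metadata(schema: type) -> dict[str, tuple[type, tuple]]:
    """Map each column to (base type, tuple of attached markers)."""
    result = {}
    # include_extras=True keeps the Annotated wrapper instead of stripping it.
    for name, hint in get_type_hints(schema, include_extras=True).items():
        if get_origin(hint) is Annotated:
            base, *markers = get_args(hint)
            result[name] = (base, tuple(markers))
        else:
            result[name] = (hint, ())
    return result


meta = column_metadata(Order)
print(meta["amount"])
```

The base type stays visible to type checkers, while the markers travel alongside it for a validator to pick up.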

cast() vs parse(): parse() runs full validation (executes queries). cast() checks column names and types only, with zero execution cost, for internal pipeline steps where the data has already been validated.
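
The difference in cost can be sketched with a toy lazy table whose schema is known up front but whose rows are only produced on execution. Everything below (`LazyTable`, `structural_check`, `full_check`) is a hypothetical stand-in, not tacit's or ibis's implementation, and type coercion is left out for brevity.

```python
from typing import Callable, Iterable


class LazyTable:
    """Toy lazy table: the schema is metadata, rows load only on execute()."""

    def __init__(self, schema: dict[str, type], load: Callable[[], Iterable[dict]]):
        self.schema = schema
        self._load = load
        self.executions = 0  # counts how often data was actually read

    def execute(self) -> list[dict]:
        self.executions += 1
        return list(self._load())


def structural_check(table: LazyTable, expected: dict[str, type]) -> None:
    """cast()-like: compare column names and types only; never reads rows."""
    for name, typ in expected.items():
        if name not in table.schema:
            raise TypeError(f"missing column: {name}")
        if table.schema[name] is not typ:
            raise TypeError(f"column {name}: expected {typ.__name__}")


def full_check(
    table: LazyTable,
    expected: dict[str, type],
    checks: dict[str, Callable[[object], bool]],
) -> None:
    """parse()-like: structural check plus value-level checks on real rows."""
    structural_check(table, expected)
    for row in table.execute():  # this is where the data is actually read
        for name, check in checks.items():
            if not check(row[name]):
                raise ValueError(f"column {name}: bad value {row[name]!r}")


orders = LazyTable({"amount": float}, lambda: [{"amount": 5.0}, {"amount": -1.0}])
structural_check(orders, {"amount": float})  # passes without loading any rows
```

In this toy model the structural check cannot see the negative amount, which is why a full check belongs wherever unvalidated data first enters.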

validate=True: @contract uses cast() by default. Pass validate=True at pipeline entry points to run full parse() validation on inputs and outputs.
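
One way to picture a contract decorator with an opt-in validation flag is the stdlib-only sketch below. It is an assumption-laden toy: tables are plain dicts of column name to list of values, functions are annotated with the schema class directly rather than with tacit.DataFrame[...], and this `contract` is a hypothetical re-implementation, not tacit's own.

```python
import functools
from typing import Callable, get_type_hints


def contract(func=None, *, validate: bool = False):
    """Hypothetical sketch: check annotated input and output schemas."""

    def decorate(f: Callable):
        hints = get_type_hints(f)

        def check(table: dict, schema: type, where: str) -> None:
            for col, typ in get_type_hints(schema).items():
                # Cheap structural check: the column must exist.
                if col not in table:
                    raise TypeError(f"{f.__name__} {where}: missing column {col!r}")
                # Value-level work only happens when validate=True.
                if validate and not all(isinstance(v, typ) for v in table[col]):
                    raise ValueError(f"{f.__name__} {where}: bad values in {col!r}")

        @functools.wraps(f)
        def wrapper(table: dict):
            param = next(name for name in hints if name != "return")
            check(table, hints[param], "input")
            result = f(table)
            check(result, hints["return"], "output")
            return result

        return wrapper

    # Support both contract(f) and contract(f, validate=True).
    return decorate if func is None else decorate(func)


class Line:  # stand-in schema
    price: float
    qty: int


class LineTotal(Line):
    total: float


def _add_total(t: Line) -> LineTotal:
    return {**t, "total": [p * q for p, q in zip(t["price"], t["qty"])]}


add_total = contract(_add_total)                        # structural checks only
add_total_strict = contract(_add_total, validate=True)  # also inspects values
```

With this split, only the strict variant pays for scanning values, which mirrors the idea of validating fully at entry points and cheaply in between.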

See the documentation for the full guide, API reference, and examples.

FAQ

Does this work with pandas? — Tacit builds on ibis, which moved away from pandas as a backend. If your data currently lives in pandas DataFrames, you can use a well-supported engine like DuckDB or Polars as the execution backend — ibis reads from and converts back to pandas seamlessly, while giving you a modern query engine underneath.

Which backends are supported? — Any engine that ibis supports. Tacit delegates all query execution to ibis, so backend support is inherited automatically. See the ibis backends page for the full list.

Which checks and constraints are available? — Tacit delegates constraint validation to pandera's ibis backend. Anything in pandera's Check API that has ibis support will work. See pandera's ibis compatibility status for what's currently available.

Status

Early development. The API is not stable.
