CentauroDB
Docs: centaurodb.dev
Pydantic-native storage for time-series and evolving data models. SQLite + PostgreSQL backends, schema evolution without migrations, built-in Polars DataFrames.
pip install centaurodb
Optional extras:
pip install "centaurodb[polars]" # for .df / sql_select() → DataFrame
pip install "centaurodb[postgres]" # for PostgreSQL backend
Why CentauroDB?
CentauroDB is not another ORM. It targets a different problem:
- You define data with Pydantic models, not table schemas.
- Models are stored as JSON blobs, so adding a field never requires a migration — just give it a default value.
- Time-series data is a first-class citizen via TimeSeriesCollection, with one-line conversion to a Polars DataFrame.
- Same API on SQLite (zero-config, embedded) and PostgreSQL (production): pick the URL, the dialect adapts.
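The "pick the URL" point can be pictured as a small dispatch on the connection string. The sketch below illustrates the general idea only and is not CentauroDB's actual parsing: pick_dialect is a hypothetical helper, and it handles just the three URL shapes that appear in the Quickstart.

```python
from urllib.parse import urlsplit


def pick_dialect(url: str) -> tuple[str, str]:
    """Return (dialect, target) for a connection string.

    Hypothetical helper: it mirrors the URL shapes shown in this README,
    not the library's real rules.
    """
    parts = urlsplit(url)
    if parts.scheme in ("postgresql", "postgres"):
        # Hand the full URL to the PostgreSQL driver unchanged.
        return "postgresql", url
    if parts.scheme == "sqlite":
        # A bare "sqlite://" carries no location, so treat it as in-memory.
        return "sqlite", (parts.netloc + parts.path) or ":memory:"
    if parts.scheme == "":
        # No scheme at all: assume a plain SQLite file path.
        return "sqlite", url
    raise ValueError(f"unsupported connection string: {url!r}")


for candidate in ("library.db", "sqlite://", "postgresql://user:pw@host/db"):
    print(candidate, "->", pick_dialect(candidate))
```

Keeping the choice in one function is what lets the rest of the code stay backend-agnostic: everything downstream only sees the dialect name.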
Who is this for?
Built for data and analytics pipelines, IoT/sensor ingestion, quant backtesting, and analytics-shaped backends — workloads where schemas keep evolving, where the read path ends in a DataFrame, and where the write side is a process or a small fleet of workers, not a thousand concurrent web requests.
| You want | Use |
|---|---|
| Web app with relational schema, FK constraints, complex joins | SQLAlchemy / SQLModel |
| High-concurrency web backend with auth / RLS / connection pooling | Supabase + SQLAlchemy |
| Document store with full-text search and replication | MongoDB |
| Pydantic models in, structured storage out, DataFrames back | CentauroDB |
| Time-series + metadata with evolving schemas | CentauroDB |
Quickstart
from centaurodb import Engine, Collection, CentauroModel
class Book(CentauroModel):
    __centauro_name__ = "book"
    title: str = ""
    author: str = ""
    rating: float = 0.0

engine = Engine("library.db")  # or "sqlite://" for in-memory,
                               # or "postgresql://user:pw@host/db"
books = Collection(engine, "library")
# Write
dune = Book(title="Dune", author="Herbert", rating=4.8)
books.write_object(dune)
# Update
dune.rating = 5.0
books.update_object(dune)
# Query (JSON fields, AND/OR conditions)
top = books.read_objects(Book.fields.rating > 4.5)
for b in top:
    print(b.title, b.rating)
# Paginate large result sets
page = books.read_objects(Book.fields.rating > 4.5, limit=20, offset=40)
total = books.count_objects(Book.fields.rating > 4.5)
# Delete
books.delete_object(dune)
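An expression such as Book.fields.rating > 4.5 suggests that comparison operators build a query object instead of returning a bool. Below is a minimal sketch of that general technique, not CentauroDB's implementation. Field, Cond, the & and | combinators, and the json_extract-based SQL are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cond:
    """A SQL fragment plus its bound parameters (hypothetical type)."""
    sql: str
    params: tuple = ()

    def __and__(self, other: "Cond") -> "Cond":
        return Cond(f"({self.sql} AND {other.sql})", self.params + other.params)

    def __or__(self, other: "Cond") -> "Cond":
        return Cond(f"({self.sql} OR {other.sql})", self.params + other.params)


class Field:
    """Stand-in for something like Book.fields.rating (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name

    def _cmp(self, op: str, value) -> Cond:
        # json_extract is SQLite's JSON accessor; "meta" is assumed to be
        # the column holding the JSON blob. The value is bound, never inlined.
        return Cond(f"json_extract(meta, '$.{self.name}') {op} ?", (value,))

    def __gt__(self, value) -> Cond:
        return self._cmp(">", value)

    def __eq__(self, value) -> Cond:  # type: ignore[override]
        return self._cmp("=", value)


rating, author = Field("rating"), Field("author")
where = (rating > 4.5) & (author == "Herbert")
print(where.sql)
print(where.params)
```

The parentheses around each comparison matter in this sketch, because & and | bind more tightly than > and == in Python.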
Time-series with Polars
from datetime import datetime
import polars as pl
from centaurodb import Engine, TimeSeriesCollection, CentauroModelSeries
class StockPrice(CentauroModelSeries):
    __centauro_name__ = "stock"
    ticker: str = ""
    exchange: str = "NYSE"

engine = Engine("stocks.db")
prices = TimeSeriesCollection(engine, "portfolio")

df = pl.DataFrame({
    "time": [datetime(2026, 1, 1), datetime(2026, 1, 2)],
    "value": [185.20, 187.55],
})
apple = StockPrice(ticker="AAPL", values=df)
prices.write_object(apple)
# Read back, filtered by JSON metadata
[apple] = prices.read_objects(StockPrice.fields.ticker == "AAPL")
# .df is a Polars DataFrame with (time, value)
print(apple.df)
If your current tooling is pandas.read_csv plus a folder of Parquet files, CentauroDB is the natural next step.
Schema evolution without migrations
Add a field — give it a default. Old rows still load fine:
class Book(CentauroModel):
    __centauro_name__ = "book"
    title: str = ""
    author: str = ""
    rating: float = 0.0
    pages: int | None = None  # NEW: old rows just see None
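To see why a defaulted field needs no migration when objects are stored as JSON blobs, here is a stdlib-only sketch. It uses dataclasses and sqlite3 rather than CentauroDB or Pydantic. The table layout, the load helper, and the BookV1/BookV2 names are assumptions for illustration.

```python
import json
import sqlite3
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class BookV1:
    title: str = ""
    rating: float = 0.0


@dataclass
class BookV2:
    title: str = ""
    rating: float = 0.0
    pages: Optional[int] = None  # new field, defaulted


def load(cls, blob: str):
    """Build cls from a JSON blob, ignoring keys the class does not declare."""
    known = {f.name for f in fields(cls)}
    data = json.loads(blob)
    return cls(**{k: v for k, v in data.items() if k in known})


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY, meta TEXT)")

# A row written by the old model version: the blob has no "pages" key.
old = BookV1(title="Dune", rating=4.8)
con.execute("INSERT INTO objects (meta) VALUES (?)", (json.dumps(asdict(old)),))

# The new model version reads the same row; the missing key falls back
# to the declared default, so the table itself never changes.
(blob,) = con.execute("SELECT meta FROM objects").fetchone()
book = load(BookV2, blob)
print(book)
```

The table schema is identical before and after the model change. Only the Python class moved, which is the property the default value buys.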
Rename a field — declare the old name as an alias:
from centaurodb import renamed_from
class Book(CentauroModel):
    __centauro_name__ = "book"
    page_count: int = renamed_from("pages", default=0)
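Conceptually, a rename alias means "when the new key is absent, look under the old key before falling back to the default". The sketch below shows that lookup order with plain dictionaries. It is an assumption about the general technique, not how renamed_from is implemented, and RENAMED, DEFAULTS, and upgrade are hypothetical names.

```python
import json

# Hypothetical tables: current field defaults, and new name -> old name.
DEFAULTS = {"title": "", "page_count": 0}
RENAMED = {"page_count": "pages"}


def upgrade(blob: str) -> dict:
    """Map a stored JSON blob onto the current field names."""
    stored = json.loads(blob)
    current = {}
    for name, default in DEFAULTS.items():
        old_name = RENAMED.get(name)
        if name in stored:
            current[name] = stored[name]        # already written under the new name
        elif old_name is not None and old_name in stored:
            current[name] = stored[old_name]    # written before the rename
        else:
            current[name] = default             # never written at all
    return current


# 412 is an illustrative value only.
old_row = json.dumps({"title": "Dune", "pages": 412})
new_row = json.dumps({"title": "Dune", "page_count": 412})
print(upgrade(old_row))
print(upgrade(new_row))
```

Both rows come back in the same shape, so readers of the data never need to know which version of the model wrote them.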
Three guardrails enforced at class-definition time keep stored data readable across versions:
- All fields must have a default value.
- extra='forbid' is rejected (old keys must be silently dropped).
- __centauro_name__ is mandatory; it's the stable storage identifier.
Async
AsyncCollection and AsyncTimeSeriesCollection mirror the sync API and run under asyncio:
import asyncio
from centaurodb import Engine, AsyncCollection

async def main():
    engine = Engine("postgresql://localhost/mydb")
    books = AsyncCollection(engine, "library")
    # Book is the model defined in the Quickstart
    results = await books.read_objects(Book.fields.rating > 4.5)

asyncio.run(main())
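The known-limits section further down notes that AsyncCollection runs the synchronous driver inside asyncio.to_thread. The general shape of that pattern looks like the sketch below, where blocking_query is a hypothetical stand-in for a synchronous database call and no real driver is involved.

```python
import asyncio
import time


def blocking_query(n: int) -> int:
    """Hypothetical stand-in for a synchronous database call."""
    time.sleep(0.05)
    return n * 2


async def read(n: int) -> int:
    # The event loop stays responsive while the call runs, but each call
    # still occupies one worker thread for its full duration.
    return await asyncio.to_thread(blocking_query, n)


async def main() -> list[int]:
    # gather preserves argument order in its result list.
    return await asyncio.gather(*(read(i) for i in range(4)))


results = asyncio.run(main())
print(results)
```

This keeps an async application from stalling on a blocking call, which is the ergonomic benefit. Throughput is still bounded by the thread pool and, presumably, by whatever connection the threads share.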
Status
- 591 tests passing (610 collected; the 14 PostgreSQL integration tests skip without a CENTAURODB_PG_URL env var and run in CI against a real postgres:16 service container), ~4,000 LOC, fully type-hinted (PEP 561 py.typed).
- v0.9.0: beta. Stable storage format (five canonical columns: id, name, write_time, edit_time, meta; committed per ADR-0002, storage-format guarantees; future system features land in meta._sys.* or sibling tables, never as new columns) and stable public API on the surfaces marked stable.
- See CHANGELOG.md for release history.
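For a concrete picture of a five-column layout, the sketch below creates such a table in an in-memory SQLite database. Only the column names come from the list above. The table name, the column types, the timestamp format, the reading of name as the model's storage identifier, and the example_flag key under _sys are all assumptions.

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    """
    CREATE TABLE objects (
        id         TEXT PRIMARY KEY,
        name       TEXT NOT NULL,   -- assumed: the model's storage identifier
        write_time TEXT NOT NULL,   -- assumed: ISO-8601 text
        edit_time  TEXT NOT NULL,
        meta       TEXT NOT NULL    -- JSON blob holding the model's fields
    )
    """
)

# System-level data lives under a reserved key inside the blob,
# so the column list never has to grow.
meta = {"title": "Dune", "_sys": {"example_flag": True}}
con.execute(
    "INSERT INTO objects VALUES (?, ?, ?, ?, ?)",
    ("1", "book", "2026-01-01T00:00:00", "2026-01-01T00:00:00", json.dumps(meta)),
)

columns = [row[1] for row in con.execute("PRAGMA table_info(objects)")]
print(columns)
```

Because new system features go into the JSON blob or a sibling table, a database file written today keeps the same column list under later releases.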
Production status & known limits
Honest about what's not in the box yet — so you can decide whether the gaps matter for your workload:
- No connection pooling. A PostgresEngine holds a single connection. Fine for batch jobs, ETL, ingestion workers, and pipelines. Not yet sized for high-concurrency web backends.
- AsyncCollection runs sync psycopg inside asyncio.to_thread. It works under FastAPI, but it does not unlock native-async concurrency. Treat it as ergonomic compatibility, not a perf win. A native psycopg_pool.AsyncConnectionPool path is on the roadmap.
- No automatic pagination. read_objects accepts limit/offset (and count_objects for totals); use them on any list endpoint that could grow unbounded.
- No row-level security / built-in auth. This is a storage library, not a backend. Enforce auth at the application boundary.
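Since pagination is manual, a small generator can hide the limit/offset bookkeeping. The sketch below is library-agnostic: iter_pages is a hypothetical helper that works with any callable taking (limit, offset), and the list-backed fake_fetch stands in for a real collection.

```python
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def iter_pages(
    fetch: Callable[[int, int], Sequence[T]], page_size: int = 100
) -> Iterator[T]:
    """Yield every item by calling fetch(limit, offset) until a short page arrives."""
    offset = 0
    while True:
        page = fetch(page_size, offset)
        yield from page
        if len(page) < page_size:
            return  # a short (or empty) page means there is nothing left
        offset += page_size


rows = list(range(250))  # stand-in for stored objects


def fake_fetch(limit: int, offset: int) -> list[int]:
    return rows[offset:offset + limit]


collected = list(iter_pages(fake_fetch, page_size=100))
print(len(collected))
```

With CentauroDB, fetch could plausibly be a lambda that forwards to read_objects with limit=limit and offset=offset, as in the Quickstart, assuming the result supports len(). Note that offset-based paging can skip or repeat rows if the collection is modified between page reads.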
If your use case is data pipelines, analytics, time-series, IoT, or quant research, none of these are blockers. If it's a high-concurrency multi-tenant web app, reach for SQLAlchemy + Supabase instead.
License
MIT — see LICENSE.