Database utilities for scientific computing with SQLite3 and PostgreSQL
scitex-db
Database utilities for scientific computing — SQLite3 + PostgreSQL with NumPy-aware storage.
Full Documentation · uv pip install scitex-db[all]
Problem and Solution
| # | Problem | Solution |
|---|---|---|
| 1 | Storing ndarrays in SQLite means `pickle.dumps` → BLOB: no compression, no dtype/shape, no deterministic hashing | `db.save_array(table, arr)` / `load_array(...)`: typed compressed BLOBs round-trip with `dtype`, `shape`, `is_compressed`, `_hash` columns |
| 2 | The `sqlite3` API is low-level, so every project rewrites connect / transaction / execute boilerplate | `with db.transaction(): ...`: context-managed transactions, health checks, dedup, schema inspection built in |
| 3 | Switching SQLite ↔ Postgres rewrites every call site | Mixin composition: SQLite3 and PostgreSQL share `_BaseMixins/`; the same call site works against either backend |
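To make row 1 concrete, the following is a minimal sketch of the general technique: a compressed BLOB stored next to the dtype, shape, and a content hash needed to rebuild and verify it. It uses only the standard-library `sqlite3`, `zlib`, `hashlib`, and `json` modules plus NumPy. It is not scitex-db's implementation; the table layout and the helper names `save_array_sketch` and `load_array_sketch` are assumptions made for illustration.

```python
import hashlib
import json
import sqlite3
import zlib

import numpy as np


def save_array_sketch(conn, table, arr):
    """Store arr as a zlib-compressed BLOB plus the metadata needed to rebuild it.

    Hypothetical helper; `table` is interpolated into SQL, so it must be trusted.
    """
    arr = np.ascontiguousarray(arr)
    raw = arr.tobytes()
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "id INTEGER PRIMARY KEY, data BLOB, dtype TEXT, shape TEXT, "
        "is_compressed INTEGER, hash TEXT)"
    )
    cur = conn.execute(
        f"INSERT INTO {table} (data, dtype, shape, is_compressed, hash) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            zlib.compress(raw),
            arr.dtype.str,                    # e.g. '<f4', includes byte order
            json.dumps(arr.shape),            # tuple is stored as a JSON list
            1,
            hashlib.sha256(raw).hexdigest(),  # hash of the uncompressed bytes
        ),
    )
    return cur.lastrowid


def load_array_sketch(conn, table, row_id):
    """Rebuild the array and verify it against the stored hash."""
    data, dtype, shape, is_compressed, digest = conn.execute(
        f"SELECT data, dtype, shape, is_compressed, hash FROM {table} WHERE id = ?",
        (row_id,),
    ).fetchone()
    raw = zlib.decompress(data) if is_compressed else data
    if hashlib.sha256(raw).hexdigest() != digest:
        raise ValueError("stored array failed its hash check")
    return np.frombuffer(raw, dtype=np.dtype(dtype)).reshape(json.loads(shape))


conn = sqlite3.connect(":memory:")
original = np.arange(12, dtype=np.float32).reshape(4, 3)
row_id = save_array_sketch(conn, "features", original)
restored = load_array_sketch(conn, "features", row_id)
```

Unlike a pickled BLOB, the stored bytes here depend only on the array's contents and dtype, so the hash is reproducible and the dtype and shape can be queried without deserializing anything.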
Installation
pip install scitex-db # SQLite3 only
pip install scitex-db[postgresql] # add psycopg2 driver
pip install scitex-db[all] # everything
Configuration
Defaults work out of the box. To override, drop a config.yaml next to
your script, or point SCITEX_DB_CONFIG at one — see
.env.example for the full env-var list and
resolution order.
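As a rough illustration of how such a lookup could work, the sketch below resolves a config path from the two sources named above. The precedence shown (environment variable first, then a `config.yaml` beside the script) is an assumption, not documented behavior; `.env.example` remains the authority on the real resolution order. The function name `resolve_config_path` is hypothetical.

```python
import os
from pathlib import Path


def resolve_config_path(script_dir, environ=None):
    """Return the config file to use, or None to fall back to defaults.

    Assumed precedence (not confirmed by the package docs):
    1. the SCITEX_DB_CONFIG environment variable, if set
    2. config.yaml in the script's directory, if it exists
    """
    environ = os.environ if environ is None else environ
    explicit = environ.get("SCITEX_DB_CONFIG")
    if explicit:
        return Path(explicit)
    local = Path(script_dir) / "config.yaml"
    if local.is_file():
        return local
    return None


# The environment variable wins when it is set.
chosen = resolve_config_path("/nonexistent-dir", {"SCITEX_DB_CONFIG": "/tmp/custom.yaml"})
# With neither source available, the caller would use built-in defaults.
fallback = resolve_config_path("/nonexistent-dir", {})
```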
Quick Start
from scitex_db import SQLite3
import numpy as np
db = SQLite3("experiments.db")
db.create_table("results", {
    "id": "INTEGER PRIMARY KEY",
    "experiment": "TEXT",
    "accuracy": "REAL",
})
db.insert_many("results", [
    {"experiment": "exp1", "accuracy": 0.95},
    {"experiment": "exp2", "accuracy": 0.92},
])
# NumPy arrays round-trip with dtype/shape preserved
db.save_array("features", np.random.rand(1000, 50), column="embeddings",
              additional_columns={"model": "bert"})
features = db.load_array("features", "embeddings", where="model = 'bert'")
2 Interfaces
Python API ⭐⭐⭐ primary surface
from scitex_db import SQLite3, PostgreSQL, check_health, inspect
# Backends
db = SQLite3("experiments.db")
db = PostgreSQL(host=..., user=..., dbname=...)
# CRUD
db.insert("results", {"experiment": "exp1", "accuracy": 0.95})
db.insert_many("results", rows, batch_size=1000)
rows = db.get_rows("results", where="accuracy > 0.9")
db.update("results", {"accuracy": 0.97}, where="id = 1")
db.delete("results", where="id = 1")
# Arrays / Blobs
db.save_array(table, arr, column="data")
db.load_array(table, "data", where=...)
db.save_blob(table, obj, column="checkpoint")
db.load_blob(table, "checkpoint", where=...)
# Transactions / maintenance
with db.transaction():
    db.insert("a", {...})
    db.insert("b", {...})
db.summary # schema + row counts
inspect("experiments.db") # standalone helper
check_health("experiments.db", fix_issues=True)
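For context on what `with db.transaction():` saves you from writing, here is a sketch of a context-managed transaction on the plain standard-library `sqlite3` module. The helper name `transaction` is hypothetical and this is not scitex-db's implementation; it only shows the commit-on-success, rollback-on-error pattern.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Commit if the block succeeds, roll back if it raises."""
    conn.execute("BEGIN")
    try:
        yield conn
    except Exception:
        conn.execute("ROLLBACK")
        raise
    else:
        conn.execute("COMMIT")


# isolation_level=None turns off sqlite3's implicit transaction handling,
# so the explicit BEGIN above is the only transaction in play.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (x INTEGER NOT NULL)")

try:
    with transaction(conn):
        conn.execute("INSERT INTO a VALUES (1)")
        conn.execute("INSERT INTO b VALUES (NULL)")  # violates NOT NULL
except sqlite3.IntegrityError:
    pass  # the first insert was rolled back together with the failed one

rows_after_failure = conn.execute("SELECT COUNT(*) FROM a").fetchone()[0]

with transaction(conn):
    conn.execute("INSERT INTO a VALUES (2)")

rows_after_success = conn.execute("SELECT COUNT(*) FROM a").fetchone()[0]
```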
CLI ⭐⭐ scitex-db <subcommand>
scitex-db --help-recursive # all subcommands at once
scitex-db inspect-db experiments.db # schema + row counts
scitex-db inspect-db experiments.db --tables results --json
scitex-db check-health experiments.db --fix --yes
scitex-db check-health experiments.db --dry-run
scitex-db list-python-apis # introspect public Python surface
Every subcommand supports -h/--help, --json, and the safety pair
--dry-run / --yes where it mutates state.
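One common way to wire such a safety pair with `argparse` is sketched below. The `demo-tool` program name, the way `--fix` interacts with the other flags, and the `plan` helper are all hypothetical; none of this is taken from scitex-db's source.

```python
import argparse


def build_parser():
    """Parser for a hypothetical mutating subcommand."""
    parser = argparse.ArgumentParser(prog="demo-tool")
    parser.add_argument("path")
    parser.add_argument("--fix", action="store_true", help="apply repairs")
    parser.add_argument("--dry-run", action="store_true", help="report only, change nothing")
    parser.add_argument("--yes", action="store_true", help="skip the confirmation prompt")
    return parser


def plan(args):
    """Decide what a mutating command should do for these flags."""
    if args.dry_run or not args.fix:
        return "report"   # never touch the database
    return "apply" if args.yes else "confirm"


parser = build_parser()
dry = plan(parser.parse_args(["exp.db", "--fix", "--dry-run"]))
unattended = plan(parser.parse_args(["exp.db", "--fix", "--yes"]))
interactive = plan(parser.parse_args(["exp.db", "--fix"]))
```

In this sketch `--dry-run` always wins, so adding it to any command line is safe, while `--yes` only matters for runs that would otherwise stop to ask.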
Architecture
scitex_db/
├── __init__.py ← public API (SQLite3, PostgreSQL, check_health, inspect)
├── __main__.py ← `scitex-db` CLI entry
├── _BaseMixins/ ← backend-agnostic mixins (CRUD, schema, batch, ...)
├── _sqlite3/ ← SQLite3 driver
│ └── _SQLite3Mixins/ ← SQLite3-specific mixin overrides
├── _postgresql/ ← PostgreSQL driver
│ └── _PostgreSQLMixins/ ← PostgreSQL-specific mixin overrides
├── _check_health.py ← `scitex-db check-health`
├── _inspect.py ← `scitex-db inspect-db`
├── _inspect_optimized.py ← faster path for large DBs
├── _delete_duplicates.py ← duplicate-row cleanup
├── _utils.py ← shared helpers
└── _skills/ ← agent-facing skill files
Each backend composes its _*Mixins/ folder onto _BaseMixins/, so
swapping SQLite3 ↔ PostgreSQL does not change call sites.
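The composition idea can be sketched in a few lines. The class names below (`BaseCRUDMixin`, `RecordingBackend`, `SQLiteLike`, `PostgresLike`) are hypothetical stand-ins, not scitex-db classes, and the backend records SQL instead of executing it. The one real difference it illustrates is the parameter placeholder: `sqlite3` uses `?` while psycopg2 uses `%s`.

```python
class BaseCRUDMixin:
    """Backend-agnostic logic; the backend supplies `placeholder` and `_execute`."""

    placeholder = "?"

    def insert(self, table, row):
        columns = ", ".join(row)
        marks = ", ".join(self.placeholder for _ in row)
        sql = f"INSERT INTO {table} ({columns}) VALUES ({marks})"
        return self._execute(sql, tuple(row.values()))


class RecordingBackend:
    """Stand-in for a real driver: records statements instead of running them."""

    def __init__(self):
        self.statements = []

    def _execute(self, sql, params):
        self.statements.append((sql, params))
        return sql


class SQLiteLike(BaseCRUDMixin, RecordingBackend):
    placeholder = "?"    # sqlite3 uses qmark-style parameters


class PostgresLike(BaseCRUDMixin, RecordingBackend):
    placeholder = "%s"   # psycopg2 uses format-style parameters


def record_result(db):
    """Call site that does not know which backend it was given."""
    return db.insert("results", {"experiment": "exp1", "accuracy": 0.95})


sqlite_sql = record_result(SQLiteLike())
postgres_sql = record_result(PostgresLike())
```

`record_result` is identical for both backends; only the class-level override changes the SQL that reaches the driver.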
Demo
flowchart LR
U["user code"] --> A["SQLite3('exp.db')"]
U --> B["PostgreSQL(host=..., user=...)"]
A --> M["_BaseMixins (CRUD · schema · batch · maintenance)"]
B --> M
A -.-> SM["_SQLite3Mixins<br/>(backend overrides)"]
B -.-> PM["_PostgreSQLMixins<br/>(backend overrides)"]
M --> H["scitex-db check-health<br/>(fix orphans, vacuum)"]
M --> I["scitex-db inspect-db<br/>(schema + row counts)"]
Part of SciTeX
scitex-db is part of SciTeX. Install via the
umbrella with pip install scitex[db], then import as scitex.db or
invoke scitex db <subcommand> — the standalone scitex-db package
remains the source of truth.
Four Freedoms for Research
- The freedom to run your research anywhere — your machine, your terms.
- The freedom to study how every step works — from raw data to final manuscript.
- The freedom to redistribute your workflows, not just your papers.
- The freedom to modify any module and share improvements with the community.
AGPL-3.0 — because we believe research infrastructure deserves the same freedoms as the software it runs on.
File details
Details for the file scitex_db-0.1.11.tar.gz.
File metadata
- Download URL: scitex_db-0.1.11.tar.gz
- Upload date:
- Size: 8.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 849313444465ac3014533d83a404fb44a3c855745952278766a3ee35885b6a28 |
| MD5 | a9aec1580e071b42c84cd64167586aeb |
| BLAKE2b-256 | 3d24ff1b6b846c2084dfc8047f42d3a8fc4e3a63055a41f7aca1785b592b5bdc |
Provenance
The following attestation bundles were made for scitex_db-0.1.11.tar.gz:
Publisher: pypi-publish-and-github-release-on-tag.yml on ywatanabe1989/scitex-db
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: scitex_db-0.1.11.tar.gz
- Subject digest: 849313444465ac3014533d83a404fb44a3c855745952278766a3ee35885b6a28
- Sigstore transparency entry: 1630150593
- Sigstore integration time:
- Permalink: ywatanabe1989/scitex-db@8f1f1da91eb1c76595e503659c1b208f532b0928
- Branch / Tag: refs/tags/v0.1.11
- Owner: https://github.com/ywatanabe1989
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish-and-github-release-on-tag.yml@8f1f1da91eb1c76595e503659c1b208f532b0928
- Trigger Event: push
File details
Details for the file scitex_db-0.1.11-py3-none-any.whl.
File metadata
- Download URL: scitex_db-0.1.11-py3-none-any.whl
- Upload date:
- Size: 8.6 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dd187a777069edd0dd92f96aabe80ffaf4885fcc7ff9270cc68416326fd6835f |
| MD5 | c89adbb89e7a1467ea4cd9f5d79ee02a |
| BLAKE2b-256 | 539037e5edbf06ca110d5354d6e6303c1c856798d84c6f2fa22c01c192ef5de8 |
Provenance
The following attestation bundles were made for scitex_db-0.1.11-py3-none-any.whl:
Publisher: pypi-publish-and-github-release-on-tag.yml on ywatanabe1989/scitex-db
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: scitex_db-0.1.11-py3-none-any.whl
- Subject digest: dd187a777069edd0dd92f96aabe80ffaf4885fcc7ff9270cc68416326fd6835f
- Sigstore transparency entry: 1630150635
- Sigstore integration time:
- Permalink: ywatanabe1989/scitex-db@8f1f1da91eb1c76595e503659c1b208f532b0928
- Branch / Tag: refs/tags/v0.1.11
- Owner: https://github.com/ywatanabe1989
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish-and-github-release-on-tag.yml@8f1f1da91eb1c76595e503659c1b208f532b0928
- Trigger Event: push