Database utilities for scientific computing with SQLite3 and PostgreSQL

scitex-db

License: AGPL v3

Database utilities for scientific computing.

Interfaces: Python ⭐⭐⭐ (primary) · CLI ⭐ · MCP — · Skills ⭐⭐ · Hook — · HTTP —

Problem and Solution

1. Problem: Storing ndarrays in SQLite means pickle.dumps → BLOB, with no compression, no type info, and no deterministic hashing.
   Solution: SQLite3.save_array(name, arr) / load_array(name) provide compressed BLOB storage with typed round-trip; compatible with pandas via to_df.

2. Problem: The sqlite3 API is low-level, so every project re-writes connect/transaction/execute boilerplate.
   Solution: "with db:" context-manager transactions, with health checks, duplicate removal, and schema inspection built in.

Overview

scitex-db wraps SQLite3 and PostgreSQL with database operations designed for scientific research.

Features

SQLite3 with Scientific Extensions:

  • 📊 Array Storage - Store/retrieve NumPy arrays efficiently
  • 🔬 Blob Storage - Serialize Python objects with metadata
  • 📦 Batch Operations - High-performance bulk inserts
  • 🔍 Advanced Queries - Scientific query patterns
  • 🗂️ Git Integration - Version control for databases
  • 📤 Import/Export - CSV, JSON, DataFrame conversions
  • 🔧 Maintenance Tools - Health checks, deduplication

PostgreSQL Support:

  • Full-featured PostgreSQL wrapper
  • Optimized for scientific datasets

CLI Tools:

scitex-db inspect database.db
scitex-db health database.db --fix

Installation

pip install scitex-db

For PostgreSQL:

pip install scitex-db[postgresql]

For all features:

pip install scitex-db[all]

Quick Start

Basic Usage

from scitex_db import SQLite3

# Initialize
db = SQLite3("experiments.db")

# Create table
db.create_table("results", {
    "id": "INTEGER PRIMARY KEY",
    "experiment": "TEXT",
    "accuracy": "REAL"
})

# Insert data
db.insert_many("results", [
    {"experiment": "exp1", "accuracy": 0.95},
    {"experiment": "exp2", "accuracy": 0.92}
])

# Query
results = db.get_rows("results", where="accuracy > 0.9")
print(results)

Array Storage

import numpy as np

# Save arrays
data = np.random.rand(1000, 50)
db.save_array("features", data,
              column="embeddings",
              additional_columns={"model": "bert"})

# Load arrays
loaded = db.load_array("features", "embeddings",
                       where="model = 'bert'")
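
To make the idea of compressed, typed array storage concrete, here is a minimal sketch that uses only the Python standard library. It is not scitex-db's implementation: the helpers pack_array and unpack_array, the JSON header layout, and the choice of zlib are assumptions made for illustration, and array.array stands in for a NumPy ndarray.

```python
import array
import json
import sqlite3
import zlib


def pack_array(values, typecode="d", shape=None):
    """Hypothetical helper: compress the raw bytes and keep type info in a JSON header."""
    arr = array.array(typecode, values)
    header = json.dumps({"typecode": typecode, "shape": shape or [len(arr)]})
    return header, zlib.compress(arr.tobytes())


def unpack_array(header, blob):
    """Hypothetical helper: rebuild the typed array and its shape from header and blob."""
    meta = json.loads(header)
    arr = array.array(meta["typecode"])
    arr.frombytes(zlib.decompress(blob))
    return arr, meta["shape"]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (model TEXT, header TEXT, embeddings BLOB)")

# Store a flat 2x3 array alongside a metadata column.
header, blob = pack_array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5], shape=[2, 3])
conn.execute("INSERT INTO features VALUES (?, ?, ?)", ("bert", header, blob))

# Load it back, filtering on the metadata column.
row = conn.execute(
    "SELECT header, embeddings FROM features WHERE model = ?", ("bert",)
).fetchone()
restored, restored_shape = unpack_array(*row)
```

Keeping the type code and shape next to the blob is what makes the round-trip typed: the reader does not have to guess how to interpret the bytes.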

Blob Storage

# Store arbitrary objects
model = {"weights": np.random.rand(100), "config": {...}}
db.save_blob("models", model,
             column="checkpoint",
             additional_columns={"epoch": 10})

# Retrieve
model = db.load_blob("models", "checkpoint", where="epoch = 10")
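
For comparison, the pattern that a blob helper of this kind replaces can be sketched with the standard library alone. This is an assumption about the general technique (pickle plus a metadata column), not a description of how scitex-db serializes objects; the table layout and the sample checkpoint contents are invented for the example.

```python
import pickle
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE models (epoch INTEGER, checkpoint BLOB)")

# Illustrative object; any picklable Python value works the same way.
checkpoint = {"weights": [0.1, 0.2, 0.3], "config": {"lr": 0.001}}
conn.execute(
    "INSERT INTO models (epoch, checkpoint) VALUES (?, ?)",
    (10, pickle.dumps(checkpoint, protocol=pickle.HIGHEST_PROTOCOL)),
)

# Retrieve by the metadata column and deserialize.
(raw,) = conn.execute(
    "SELECT checkpoint FROM models WHERE epoch = ?", (10,)
).fetchone()
restored = pickle.loads(raw)  # only unpickle data from a source you trust
```

Unpickling can execute arbitrary code, so a database that holds pickled blobs should be treated like executable content when it is shared.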

Git Integration

from scitex_db import SQLite3

db = SQLite3("versioned.db")
db.init_git()  # Initialize git tracking

# Automatic commits on changes
db.insert("results", {"value": 42})
# Commits with message: "Insert 1 row(s) into results"

Advanced Features

Transaction Management

with db.transaction():
    db.insert("table1", {...})
    db.insert("table2", {...})
    # Auto-commit on success, rollback on error
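
The commit-or-rollback behaviour described in the comment above mirrors what the standard library's sqlite3 connection does when used as a context manager. The sketch below shows that stdlib behaviour only; whether db.transaction() is built on it is an assumption, and the table definitions are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (value INTEGER)")
conn.execute("CREATE TABLE table2 (value INTEGER NOT NULL)")

# Success path: both inserts are committed together when the block exits.
with conn:
    conn.execute("INSERT INTO table1 VALUES (?)", (1,))
    conn.execute("INSERT INTO table2 VALUES (?)", (1,))

# Failure path: the second insert violates NOT NULL, so the first is rolled back too.
try:
    with conn:
        conn.execute("INSERT INTO table1 VALUES (?)", (2,))
        conn.execute("INSERT INTO table2 VALUES (?)", (None,))
except sqlite3.IntegrityError:
    pass

table1_rows = conn.execute("SELECT value FROM table1").fetchall()
table2_rows = conn.execute("SELECT value FROM table2").fetchall()
```

After both blocks run, each table holds only the row from the successful transaction.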

Batch Operations

# High-performance bulk insert
large_dataset = [{"id": i, "value": i**2} for i in range(10000)]
db.insert_many("data", large_dataset, batch_size=1000)
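
One common way to implement batched inserts is to slice the rows into chunks and hand each chunk to executemany inside its own transaction. The sketch below illustrates that technique with the standard library; insert_in_batches is a hypothetical helper, not the scitex-db API, and it assumes every row has the same keys and that the table and column names come from trusted code.

```python
import sqlite3


def insert_in_batches(conn, table, rows, batch_size=1000):
    """Hypothetical helper: insert dict rows in chunks, one transaction per chunk."""
    if not rows:
        return 0
    columns = list(rows[0])
    # Identifiers cannot be bound as parameters, so they are formatted in directly.
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), ", ".join(":" + c for c in columns)
    )
    batches = 0
    for start in range(0, len(rows), batch_size):
        with conn:  # commit this chunk, or roll it back on error
            conn.executemany(sql, rows[start:start + batch_size])
        batches += 1
    return batches


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, value INTEGER)")

large_dataset = [{"id": i, "value": i**2} for i in range(10000)]
num_batches = insert_in_batches(conn, "data", large_dataset, batch_size=1000)
row_count = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
```

Committing per chunk bounds how much work is lost if one chunk fails, at the cost of the whole load no longer being atomic.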

Database Inspection

# Get comprehensive summary
db.summary  # or db()

# Inspect specific table
db.inspect_table("results")

# Health check
from scitex_db import check_health
check_health("database.db", fix_issues=True)
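
The kind of information an inspection or health check reports can be obtained from SQLite itself through its schema catalog and PRAGMA statements. The sketch below uses only the standard library and is an illustration of those SQLite facilities, not of what db.inspect_table or check_health do internally; the results table is recreated here so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (id INTEGER PRIMARY KEY, experiment TEXT, accuracy REAL)"
)

# List tables from the schema catalog.
tables = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Column names and declared types for one table.
# Each table_info row is (cid, name, type, notnull, dflt_value, pk).
columns = [
    (name, col_type)
    for _cid, name, col_type, *_rest in conn.execute("PRAGMA table_info(results)")
]

# integrity_check yields a single row containing 'ok' when no problems are found.
integrity = conn.execute("PRAGMA integrity_check").fetchone()[0]
```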

Part of SciTeX Ecosystem

  • scitex-core - Core infrastructure
  • scitex-io - Data I/O (can use scitex-db)
  • scitex-writer - Academic writing
  • scitex-scholar - Paper management
  • scitex - Main package

License

MIT License - see LICENSE file for details.

