Production-grade database utilities with universal interface and hash collision detection

Production DB Utils

A production-grade database utilities package that provides a unified interface for working with multiple database types. Built with enterprise requirements in mind, it includes hash collision detection, automatic schema validation, and robust error handling.

📖 Documentation Guide

👋 New to db_utils? Start here:

  1. README.md (this file) - Overview, features, and quick start
  2. INSTALL.md - Detailed installation guide for all databases
  3. check_imports.py - Examples that verify the database interface imports
  4. examples.py - Basic db_utils usage examples
  5. data_approaches_demo.py - End-to-end data handling patterns

🔧 For developers:

  • SETUP.md - Development environment setup and contribution guide

Features

  • Universal Database Interface: Work with 12+ database types using a single API (see the sketch after the Supported Databases list)
  • Hash Collision Detection: Automatic duplicate prevention using configurable hash columns
  • Schema Validation: Built-in validation for table schemas and data integrity
  • Query Builder: Type-safe query building with parameter binding
  • Production Ready: Comprehensive error handling and logging
  • Type Hints: Full type annotation support for better IDE integration
  • Transaction Management: Built-in transaction handling with rollback support

Supported Databases

  • SQLite
  • PostgreSQL
  • MySQL/MariaDB
  • Microsoft SQL Server
  • Azure SQL Database
  • Oracle
  • Snowflake
  • Databricks
  • Amazon Redshift
  • IBM DB2
  • Teradata
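
Because application code talks only to DBManager, switching between these backends comes down to constructing a different interface object. A minimal sketch, using the API introduced in the Quick Start below:

from db_utils import DBManager
from db_utils.interfaces import SQLiteInterface

# Any other supported interface can be swapped in here, e.g.
# PostgreSQLInterface(host=..., user=..., password=..., database=...)
interface = SQLiteInterface("example.db")
manager = DBManager(interface)

manager.create_table("events", {"id": "INTEGER PRIMARY KEY", "payload": "VARCHAR(255)"})
manager.insert("events", {"id": 1, "payload": "hello"})
rows = manager.select("events", conditions={"id": 1})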

Installation

Basic Installation

pip install pyspdbutils

Database-Specific Dependencies

Install with specific database support:

# PostgreSQL
pip install pyspdbutils[postgresql]

# MySQL
pip install pyspdbutils[mysql]

# SQL Server
pip install pyspdbutils[sqlserver]

# Oracle
pip install pyspdbutils[oracle]

# Snowflake
pip install pyspdbutils[snowflake]

# Teradata
pip install pyspdbutils[teradata]

# Databricks
pip install pyspdbutils[databricks]

# MariaDB
pip install pyspdbutils[mariadb]

# Azure SQL
pip install pyspdbutils[azuresql]

# All databases
pip install pyspdbutils[all]

Quick Start

SQLite Example

from db_utils import DBManager, DBConfig
from db_utils.interfaces import SQLiteInterface

# Initialize database interface
interface = SQLiteInterface("example.db")

# Create manager with hash collision detection
manager = DBManager(interface, hash_columns=["id", "email"])

# Create table
schema = {
    "id": "INTEGER PRIMARY KEY",
    "name": "VARCHAR(100)",
    "email": "VARCHAR(255) UNIQUE",
    "created_at": "TIMESTAMP DEFAULT CURRENT_TIMESTAMP"
}
manager.create_table("users", schema)

# Insert data with automatic duplicate detection
user_data = {"id": 1, "name": "John Doe", "email": "john@example.com"}
result = manager.insert("users", user_data)

if result == "duplicate":
    print("User already exists!")
else:
    print("User created successfully!")

# Query data
users = manager.select("users", conditions={"name": "John Doe"})
print(f"Found {len(users)} users")

PostgreSQL Example

from db_utils import DBManager, DBConfig
from db_utils.interfaces import PostgreSQLInterface

# Initialize PostgreSQL interface
interface = PostgreSQLInterface(
    host="localhost",
    port=5432,
    user="username", 
    password="password",
    database="mydb"
)

# Or use environment variables with DBConfig
config = DBConfig.from_env("postgresql")
from db_utils.interfaces import SQLAlchemyInterface
interface = SQLAlchemyInterface(config.connection_string)

manager = DBManager(interface)

# Rest of the code is the same...

Using Environment Variables

Create a .env file:

POSTGRESQL_HOST=localhost
POSTGRESQL_PORT=5432
POSTGRESQL_USER=myuser
POSTGRESQL_PASSWORD=mypassword
POSTGRESQL_DATABASE=mydatabase

MYSQL_HOST=localhost
MYSQL_USER=root
MYSQL_PASSWORD=password
MYSQL_DATABASE=testdb

SNOWFLAKE_ACCOUNT=your-account
SNOWFLAKE_USER=your-user
SNOWFLAKE_PASSWORD=your-password
SNOWFLAKE_WAREHOUSE=your-warehouse
SNOWFLAKE_DATABASE=your-database
SNOWFLAKE_SCHEMA=your-schema

Then use DBConfig:

from db_utils import DBConfig
from db_utils.interfaces import SQLAlchemyInterface

# Load configuration from environment
config = DBConfig.from_env("postgresql")
interface = SQLAlchemyInterface(config.connection_string)
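
The same pattern should work for any backend whose variables are set; for example, with the SNOWFLAKE_* variables from the .env file above:

from db_utils import DBConfig
from db_utils.interfaces import SQLAlchemyInterface

# Reads SNOWFLAKE_ACCOUNT, SNOWFLAKE_USER, SNOWFLAKE_PASSWORD, etc.
config = DBConfig.from_env("snowflake")
interface = SQLAlchemyInterface(config.connection_string)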

Advanced Usage

Hash Collision Detection

# Configure hash columns for duplicate detection
manager = DBManager(interface, hash_columns=["email", "phone"])

# Insert will automatically check for duplicates
data = {"name": "Jane Doe", "email": "jane@example.com", "phone": "+1234567890"}
result = manager.insert("users", data, skip_duplicates=True)

if result == "duplicate":
    print("Record with same email/phone already exists")

Transaction Management

# Using context manager for automatic transaction handling
with interface:
    manager.insert("users", user1_data)
    manager.insert("users", user2_data)
    # Automatically commits on success, rolls back on error
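
To see the rollback path, raise inside the block. Assuming the context manager re-raises after rolling back, nothing inserted before the error is committed:

try:
    with interface:
        manager.insert("users", {"id": 99, "name": "Temp", "email": "temp@example.com"})
        raise RuntimeError("simulated failure")  # forces a rollback
except RuntimeError:
    pass

# The insert above was rolled back, so the row should not exist
assert len(manager.select("users", conditions={"id": 99})) == 0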

Custom Schema Creation

# Create complex table with constraints
schema = {
    "id": "SERIAL PRIMARY KEY",
    "username": "VARCHAR(50) UNIQUE NOT NULL",
    "email": "VARCHAR(100) UNIQUE NOT NULL", 
    "password_hash": "VARCHAR(255) NOT NULL",
    "is_active": "BOOLEAN DEFAULT TRUE",
    "created_at": "TIMESTAMP DEFAULT CURRENT_TIMESTAMP",
    "updated_at": "TIMESTAMP DEFAULT CURRENT_TIMESTAMP"
}

manager.create_table("users", schema)

# Add indexes (database-specific)
if hasattr(interface, 'execute_query'):
    interface.execute_query("CREATE INDEX idx_users_email ON users(email)")

Batch Operations

# Bulk insert with duplicate checking
users_data = [
    {"name": "User 1", "email": "user1@example.com"},
    {"name": "User 2", "email": "user2@example.com"},
    {"name": "User 3", "email": "user3@example.com"},
]

inserted_count = 0
duplicate_count = 0

for user_data in users_data:
    result = manager.insert("users", user_data)
    if result == "duplicate":
        duplicate_count += 1
    else:
        inserted_count += 1

print(f"Inserted: {inserted_count}, Duplicates: {duplicate_count}")

Query Building

from db_utils import QueryBuilder

# Custom query building
builder = QueryBuilder("postgresql")

# Build complex SELECT query
query, params = builder.build_query_params(
    "SELECT",
    "users",
    columns=["id", "name", "email"],
    conditions={"is_active": True, "created_at": "2024-01-01"},
    limit=10,
    offset=20
)

results = interface.execute_query(str(query), params, fetch="all")

Error Handling

The package includes comprehensive error handling:

from db_utils.exception import (
    DBOperationError,
    ValidationError,
    ConnectionError,
    ConfigurationError
)

try:
    manager.insert("users", invalid_data)
except ValidationError as e:
    print(f"Data validation failed: {e}")
except DBOperationError as e:
    print(f"Database operation failed: {e}")
except ConnectionError as e:
    print(f"Database connection failed: {e}")

Configuration

Supported Configuration Methods

  1. Direct instantiation:

    config = DBConfig("postgresql", host="localhost", user="user", password="pass")
    
  2. Environment variables:

    config = DBConfig.from_env("postgresql")
    
  3. Mixed approach:

    config = DBConfig("postgresql", host="custom-host")  # Other params from env
    

Database-Specific Configuration

Snowflake

config = DBConfig("snowflake",
    account="your-account",
    user="username",
    password="password",
    warehouse="compute_wh",
    database="analytics",
    schema="public",
    role="analyst"
)

Databricks

config = DBConfig("databricks",
    host="your-workspace.cloud.databricks.com",
    password="your-token",  # Personal access token
    database="/your/database/path"
)

Testing

Run the test suite:

# Install development dependencies
pip install pyspdbutils[dev]

# Run tests
pytest

# Run with coverage
pytest --cov=db_utils --cov-report=html

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Add tests for your changes
  5. Run the test suite (pytest)
  6. Commit your changes (git commit -m 'Add amazing feature')
  7. Push to the branch (git push origin feature/amazing-feature)
  8. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog

Version 1.0.0

  • Initial release
  • Support for 12+ database types
  • Hash collision detection
  • Schema validation
  • Production-grade error handling
  • Comprehensive test suite

Related Projects

  • SQLAlchemy - The Python SQL toolkit
  • Pandas - Data analysis and manipulation tool
  • Alembic - Database migration tool
