Skip to main content

PostgreSQL helper functions for data analysis with enterprise-grade security

Project description

PostgreSQL Helper Functions

A Python package providing robust utilities for PostgreSQL database operations, query management, and data analysis workflows with enterprise-grade security.

Features

  • ๐Ÿ”„ Automatic retry logic with exponential backoff for unreliable connections
  • ๐Ÿ“Š Seamless pandas integration for data analysis
  • ๐Ÿ”ง SQL query templating and parameter substitution
  • ๐Ÿ”Š Cross-platform notifications when long queries complete
  • โš™๏ธ Environment-based configuration for secure credential management
  • ๐Ÿ›ก๏ธ Comprehensive error handling and logging
  • ๐Ÿ”’ SSL/TLS encryption with CA certificate verification for AWS RDS and other cloud databases
  • Updated in v1.3.3
    • โœ… Recommended install via PyPi: Use 'pip install pg_helpers' to install
  • Enhanced in v1.3.2
    • ๐Ÿ“ Flexible credential file location: Support for CREDENTIALS_DIR and CREDENTIALS_FILE environment variables
    • ๐Ÿ”„ Enhanced configuration: Specify custom .env file paths for different projects and environments
    • ๐Ÿข Centralized credentials: Share credential directories across multiple projects
  • Enhanced in v1.3.1:
    • ๐Ÿงช Comprehensive test suite with 40+ test cases covering all functionality
    • โœ… Cross-platform validation ensuring reliability on Windows, macOS, and Linux
    • ๐Ÿ” Function name improvements for better API clarity
  • Previous in v1.2.0:
    • ๐Ÿ” Full SSL support with optional CA certificate verification
    • ๐Ÿ›ก๏ธ Man-in-the-middle attack prevention for production environments
    • โœ… SSL connection testing and diagnostics
  • Previous in v1.1.0:
    • ๐Ÿš€ Advanced fallback methods for SQLAlchemy/pandas compatibility issues
    • ๐Ÿ” Enhanced debugging and diagnostic capabilities

What's New in Version 1.3.2 ๐Ÿงช

Flexible Credential File Configuration

  • Custom credential locations: Use CREDENTIALS_DIR and CREDENTIALS_FILE environment variables
  • Perfect for shared credentials: Point multiple projects to a centralized credential directory
  • Environment-specific files: Use different credential files for dev/staging/production
  • Fully backwards compatible: Existing projects continue to work without any changes
import os

# Use centralized credentials directory
os.environ['CREDENTIALS_DIR'] = r'C:\Documents\Credentials'
from pg_helpers import createPostgresqlEngine
engine = createPostgresqlEngine()  # Loads C:\Documents\Credentials\.env

# Use custom filename for database credentials
os.environ['CREDENTIALS_FILE'] = '.env.database'
from pg_helpers import createPostgresqlEngine
engine = createPostgresqlEngine()  # Loads ./.env.database

# Combine both for full flexibility
os.environ['CREDENTIALS_DIR'] = r'C:\Documents\Project\Assets'
os.environ['CREDENTIALS_FILE'] = '.env.production'
from pg_helpers import createPostgresqlEngine
engine = createPostgresqlEngine()  # Loads C:\Documents\Project\Assets\.env.production

Installation

From PyPi (Recommended)

pip install pg_helpers

From Source (Development)

git clone https://github.com/lenwood/pg_helpers.git
cd pg_helpers
pip install -e .

From GitHub (Specific Version)

pip install git+https://github.com/lenwood/pg_helpers.git@v1.3.3

Dependencies

pip install pandas psycopg2-binary sqlalchemy python-dotenv

Quick Start

1. Environment Setup

Create a .env file in your project root (NOT in this repository):

# Required database credentials
DB_USER=your_username
DB_PASSWORD=your_password
DB_HOST=your_host
DB_PORT=5432
DB_NAME=your_database

# SSL Configuration (optional)
# SSL Mode options: disable, allow, prefer, require, verify-ca, verify-full
DB_SSL_MODE=require

# Optional: Path to CA certificate for maximum security
# DB_SSL_CA_CERT=/path/to/rds-ca-2019-root.pem

# For AWS RDS with maximum security:
# DB_SSL_MODE=verify-full
# DB_SSL_CA_CERT=./certs/rds-ca-2019-root.pem

Flexible Credential Location Setup NEW in v1.3.2

You can now specify custom locations for your credential files using environment variables:

Option 1: Centralized credential directory

import os
# Point to a shared credentials folder
os.environ['CREDENTIALS_DIR'] = r'C:\Users\Me\Documents\Credentials'
# Will load: C:\Users\Me\Documents\Credentials\.env

from pg_helpers import createPostgresqlEngine
engine = createPostgresqlEngine()

Option 2: Custom credential filename

import os
# Use a specific credential file name in current directory
os.environ['CREDENTIALS_FILE'] = '.env.database'
# Will load: ./.env.database

from pg_helpers import createPostgresqlEngine
engine = createPostgresqlEngine()

Option 3: Both custom directory and filename

import os
# Full custom path
os.environ['CREDENTIALS_DIR'] = r'C:\Documents\Project\Assets'
os.environ['CREDENTIALS_FILE'] = '.env.production'
# Will load: C:\Documents\Project\Assets\.env.production

from pg_helpers import createPostgresqlEngine
engine = createPostgresqlEngine()

Configuration Priority:

  • Both CREDENTIALS_DIR and CREDENTIALS_FILE set โ†’ {CREDENTIALS_DIR}/{CREDENTIALS_FILE}
  • Only CREDENTIALS_DIR set โ†’ {CREDENTIALS_DIR}/.env
  • Only CREDENTIALS_FILE set โ†’ ./{CREDENTIALS_FILE}
  • Neither set (default) โ†’ ./.env

Benefits of flexible credentials:

  • ๐Ÿ” Centralized security: Keep all credentials in one secure location
  • ๐Ÿข Team workflows: Share credential directories across team members
  • ๐ŸŒ Environment-specific: Use .env.production, .env.staging, .env.dev
  • ๐Ÿ”„ Multi-project: Reuse credentials across multiple analysis projects
  • โœ… Backwards compatible: Existing code works without changes

2. Basic Usage

from pg_helpers import createPostgresqlEngine, dataGrabber, queryCleaner, check_ssl_connection

# Create secure database connection (uses SSL by default)
engine = createPostgresqlEngine()

# Verify SSL connection
ssl_info = check_ssl_connection(engine)
print(f"SSL Active: {ssl_info.get('ssl_active', 'Unknown')}")

# Execute a simple query
data = dataGrabber("SELECT * FROM users LIMIT 10", engine)
print(data.head())

# Enable debugging for troubleshooting
data = dataGrabber("SELECT * FROM complex_view", engine, debug=True)

# Use query templates
query = queryCleaner(
    'queries/user_analysis.sql',
    list1=[100, 200, 300],
    varString1='$USER_IDS',
    startDate='2023-01-01',
    endDate='2023-12-31'
)
results = dataGrabber(query, engine)

3. Custom SSL Configuration

from pg_helpers import createPostgresqlEngineWithCustomSSL

# Maximum security for production
engine = createPostgresqlEngineWithCustomSSL(
    ssl_ca_cert="/path/to/rds-ca-2019-root.pem",
    ssl_mode="verify-full"
)

# Basic SSL without certificate verification
engine = createPostgresqlEngineWithCustomSSL(
    ssl_mode="require"
)

# Disable SSL (not recommended for production)
engine = createPostgresqlEngineWithCustomSSL(
    ssl_mode="disable"
)

Function Reference

Database Operations

createPostgresqlEngine()

Creates a SQLAlchemy engine for PostgreSQL connections using environment variables with SSL support.

Returns: sqlalchemy.Engine

Environment variables:

  • Required: DB_USER, DB_PASSWORD, DB_HOST, DB_NAME
  • Optional: DB_PORT (default: 5432)
  • SSL Options:
    • DB_SSL_MODE (default: "require")
    • DB_SSL_CA_CERT (path to CA certificate)
    • DB_SSL_CERT (client certificate)
    • DB_SSL_KEY (client key)

createPostgresqlEngineWithCustomSSL(ssl_ca_cert=None, ssl_mode='require', ssl_cert=None, ssl_key=None) v1.2.0

Creates a SQLAlchemy engine with custom SSL configuration, overriding environment variables.

Parameters:

  • ssl_ca_cert (str, optional): Path to CA certificate file
  • ssl_mode (str): SSL mode - "disable", "allow", "prefer", "require", "verify-ca", "verify-full"
  • ssl_cert (str, optional): Path to client certificate file
  • ssl_key (str, optional): Path to client key file

Returns: sqlalchemy.Engine

SSL Modes Explained:

  • disable: No SSL connection
  • allow: Try SSL, fallback to non-SSL
  • prefer: Try SSL first, fallback to non-SSL
  • require: Default - Require SSL, fail if unavailable
  • verify-ca: Require SSL and verify server certificate
  • verify-full: Most secure - Require SSL, verify certificate and hostname

check_ssl_connection(engine=None) v1.2.0, improved v1.3.0

Tests SSL connection and displays SSL information. (Renamed from test_ssl_connection in v1.3.0 for API clarity)

Parameters:

  • engine (optional): SQLAlchemy engine (creates one if not provided)

Returns: dict with SSL connection details including:

  • ssl_active: Whether SSL is active
  • ssl_cipher: Encryption cipher used
  • ssl_version: SSL/TLS version
  • client_cert_present: Whether client certificate is present

dataGrabber(query, engine, limit='None', debug=False) ENHANCED in v1.1.0

Executes SQL queries and returns pandas DataFrames with robust error handling.

Parameters:

  • query (str): SQL query to execute
  • engine: SQLAlchemy engine
  • limit (str, optional): Row limit for results
  • debug (bool, optional): Enable detailed logging and debugging output

Returns: pandas.DataFrame

Features:

  • Multiple fallback methods for compatibility issues
  • Automatic detection and handling of metadata interpretation errors
  • Comprehensive error logging and debugging
  • Execution timing display
  • Cross-platform sound notifications
  • Automatic error propagation for retry logic

recursiveDataGrabber(query_dict, results_dict, n=1, max_attempts=50)

Executes multiple queries with automatic retry and exponential backoff.

Parameters:

  • query_dict (dict): Dictionary of {query_name: sql_string}
  • results_dict (dict): Dictionary to store results
  • n (int): Current attempt number
  • max_attempts (int): Maximum retry attempts

Returns: dict with DataFrames or None for failed queries

Use case: Perfect for running multiple large queries overnight with unreliable connections.

diagnose_connection_and_query(engine, query, limit=10) ENHANCED in v1.1.0

Diagnostic function to help troubleshoot SQLAlchemy/pandas compatibility issues.

Parameters:

  • engine: SQLAlchemy engine
  • query (str): SQL query to diagnose
  • limit (int): Number of rows to test with

Returns: dict with diagnostic information

Query Utilities

queryCleaner(file, list1='empty', varString1='empty', ...)

Loads SQL files and substitutes parameters for dynamic queries.

Parameters:

  • file (str): Path to SQL file
  • list1: List to substitute (converts to comma-separated string)
  • varString1 (str): Placeholder string in SQL file
  • startDate/endDate: Date range parameters

Returns: str - Processed SQL query

listPrep(iList)

Converts Python lists to SQL-compatible comma-separated strings.

Parameters:

  • iList: List of integers, floats, strings, or single value

Returns: str - SQL-formatted string

Testing and Quality Assurance NEW in v1.3.0

Running Tests

The package includes a comprehensive test suite with 40+ test cases covering all functionality:

# Install test dependencies
pip install pytest pytest-cov

# Run all tests
python -m pytest tests/ -v

# Run tests with coverage report
python -m pytest tests/ --cov=pg_helpers --cov-report=html --cov-report=term-missing

# Run specific test categories
python -m pytest tests/test_database.py::TestQueryUtils -v      # Query utilities
python -m pytest tests/test_database.py::TestConfig -v         # Configuration
python -m pytest tests/test_database.py::TestDatabase -v       # Database operations
python -m pytest tests/test_database.py::TestNotifications -v  # Cross-platform notifications

What the Tests Validate

Core Functionality Tests

  • โœ… Database engine creation with various SSL configurations
  • โœ… Query execution with multiple fallback methods
  • โœ… SSL connection testing and diagnostics
  • โœ… Configuration validation and environment variable handling
  • โœ… Query template processing with parameter substitution
  • โœ… Error handling for network issues and compatibility problems

Cross-Platform Compatibility

  • โœ… File operations on Windows, macOS, and Linux
  • โœ… Sound notifications across different operating systems
  • โœ… Environment variable handling with different shell environments
  • โœ… Temporary file management and cleanup

Error Condition Testing

  • โœ… Missing environment variables and configuration errors
  • โœ… SSL certificate validation failures and missing files
  • โœ… Database connection failures and retry logic
  • โœ… Pandas/SQLAlchemy compatibility issues and fallback methods
  • โœ… Invalid query parameters and malformed SQL

Integration Testing

  • โœ… End-to-end workflows combining multiple functions
  • โœ… Configuration to engine creation pipelines
  • โœ… Query templating to execution workflows

Test Output Example

================================================= test session starts =================================================
platform win32 -- Python 3.12.9, pytest-8.4.1, pluggy-1.5.0
collected 40 items

tests/test_database.py::TestQueryUtils::test_listPrep_empty_list PASSED                                          [  2%]
tests/test_database.py::TestQueryUtils::test_listPrep_floats PASSED                                              [  5%]
tests/test_database.py::TestQueryUtils::test_listPrep_integers PASSED                                            [  7%]
tests/test_database.py::TestQueryUtils::test_listPrep_single_value PASSED                                        [ 10%]
tests/test_database.py::TestQueryUtils::test_listPrep_strings PASSED                                             [ 12%]
... (34 more tests) ...
tests/test_database.py::TestNotifications::test_play_notification_sound_windows PASSED                           [100%]

================================================= 40 passed in 1.03s ==================================================

Flexible Credential File Configuration NEW in v1.3.2

  • Custom credential locations: Use CREDENTIALS_DIR and CREDENTIALS_FILE environment variables
  • Perfect for shared credentials: Point multiple projects to a centralized credential directory
  • Environment-specific files: Use different credential files for dev/staging/production
  • Fully backwards compatible: Existing projects continue to work without any changes
import os

# Use centralized credentials directory
os.environ['CREDENTIALS_DIR'] = r'C:\Documents\Credentials'
from pg_helpers import createPostgresqlEngine
engine = createPostgresqlEngine()  # Loads C:\Documents\Credentials\.env

# Use custom filename for database credentials
os.environ['CREDENTIALS_FILE'] = '.env.database'
from pg_helpers import createPostgresqlEngine
engine = createPostgresqlEngine()  # Loads ./.env.database

# Combine both for full flexibility
os.environ['CREDENTIALS_DIR'] = r'C:\Documents\Project\Assets'
os.environ['CREDENTIALS_FILE'] = '.env.production'
from pg_helpers import createPostgresqlEngine
engine = createPostgresqlEngine()  # Loads C:\Documents\Project\Assets\.env.production

Continuous Integration Setup NEW in v1.3.0

For automated testing across multiple environments, add this GitHub Actions workflow:

# .github/workflows/test.yml
name: Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        python-version: [3.8, 3.9, "3.10", "3.11", "3.12"]
    steps:
    - uses: actions/checkout@v4
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install pytest pytest-cov
        pip install -e .
    - name: Run tests
      run: python -m pytest tests/ -v --tb=short

Test Coverage Goals

  • Target Coverage: >90% (currently achieved)
  • Critical Functions: 100% coverage for database operations
  • Error Paths: All exception handling paths tested
  • Edge Cases: Empty lists, missing files, invalid configurations

Advanced Usage

SSL Security Configuration v1.2.0

For AWS RDS (Recommended)

from pg_helpers import createPostgresqlEngineWithCustomSSL, check_ssl_connection

# Download RDS CA certificate first
# wget https://s3.amazonaws.com/rds-downloads/rds-ca-2019-root.pem

# Maximum security configuration
engine = createPostgresqlEngineWithCustomSSL(
    ssl_ca_cert="./certs/rds-ca-2019-root.pem",
    ssl_mode="verify-full"
)

# Test the secure connection
ssl_info = check_ssl_connection(engine)
print(f"SSL Cipher: {ssl_info.get('ssl_cipher')}")
print(f"SSL Version: {ssl_info.get('ssl_version')}")

Environment-based SSL Configuration

# Set in .env file:
# DB_SSL_MODE=verify-full
# DB_SSL_CA_CERT=./certs/rds-ca-2019-root.pem

from pg_helpers import createPostgresqlEngine, check_ssl_connection

# Uses environment variables
engine = createPostgresqlEngine()

# Verify security
ssl_info = check_ssl_connection(engine)
if ssl_info.get('ssl_active'):
    print("โœ… Secure SSL connection established")
else:
    print("โš ๏ธ SSL not active - check configuration")

SSL Mode Comparison

# Different security levels for different environments

# Development (basic encryption)
dev_engine = createPostgresqlEngineWithCustomSSL(ssl_mode="require")

# Staging (certificate verification)
staging_engine = createPostgresqlEngineWithCustomSSL(
    ssl_mode="verify-ca",
    ssl_ca_cert="./certs/staging-ca.pem"
)

# Production (maximum security)
prod_engine = createPostgresqlEngineWithCustomSSL(
    ssl_mode="verify-full",
    ssl_ca_cert="./certs/rds-ca-2019-root.pem"
)

Debugging and Troubleshooting

from pg_helpers import createPostgresqlEngine, dataGrabber, diagnose_connection_and_query

engine = createPostgresqlEngine()

# Enable detailed debugging for problematic queries
data = dataGrabber("""
    SELECT * FROM complex_view 
    WHERE date_column > '2023-01-01'
""", engine, debug=True)

# Diagnose specific issues
diagnostics = diagnose_connection_and_query(engine, "SELECT * FROM problematic_table")
print(f"Engine info: {diagnostics['engine_info']}")
print(f"Test results: {diagnostics['test_results']}")
print(f"Recommendations: {diagnostics['recommendations']}")

Batch Query Processing

from pg_helpers import recursiveDataGrabber, queryCleaner

# Prepare multiple queries
queries = {
    'user_stats': queryCleaner('sql/user_stats.sql', startDate='2023-01-01'),
    'sales_data': queryCleaner('sql/sales.sql', list1=[1,2,3], varString1='$REGIONS'),
    'inventory': 'SELECT * FROM inventory WHERE status = "active"'
}

# Execute with automatic retry
results = {}
final_results = recursiveDataGrabber(queries, results)

# Access individual results
user_df = final_results['user_stats']
sales_df = final_results['sales_data']

Credential Management Strategies NEW in v1.3.2

Strategy 1: Centralized Credentials for Multiple Projects

Keep one credential directory for all your projects:

C:\Users\Me\Documents\Credentials\
โ”œโ”€โ”€ .env                    # General/shared credentials
โ”œโ”€โ”€ .env.database          # Database-specific credentials  
โ”œโ”€โ”€ .env.api               # API credentials
โ””โ”€โ”€ .env.production        # Production environment

Project1\
โ”œโ”€โ”€ analysis.py
โ””โ”€โ”€ (no .env file needed)

Project2\
โ”œโ”€โ”€ dashboard.py
โ””โ”€โ”€ (no .env file needed)

Setup in each project:

import os
os.environ['CREDENTIALS_DIR'] = r'C:\Users\Me\Documents\Credentials'
os.environ['CREDENTIALS_FILE'] = '.env.database'

from pg_helpers import createPostgresqlEngine
engine = createPostgresqlEngine()

Strategy 2: Environment-Specific Credentials

Use different credential files for different environments:

import os

# Detect environment (from ENV variable, config file, etc.)
environment = os.getenv('APP_ENV', 'development')

# Set credential file based on environment
os.environ['CREDENTIALS_DIR'] = r'C:\Documents\Credentials'
os.environ['CREDENTIALS_FILE'] = f'.env.{environment}'

from pg_helpers import createPostgresqlEngine
engine = createPostgresqlEngine()
# Loads .env.development, .env.staging, or .env.production

Strategy 3: Project-Specific Overrides

Use centralized credentials with project-specific overrides:

import os
from pathlib import Path

# Try project-local first, fall back to centralized
if Path('.env.local').exists():
    os.environ['CREDENTIALS_FILE'] = '.env.local'
else:
    os.environ['CREDENTIALS_DIR'] = r'C:\Documents\Credentials'
    os.environ['CREDENTIALS_FILE'] = '.env.database'

from pg_helpers import createPostgresqlEngine
engine = createPostgresqlEngine()

Best Practices for Credential Files

Security:

  • โœ… Store credential directories outside of project repositories
  • โœ… Use restrictive file permissions (e.g., chmod 600 on Linux/macOS)
  • โœ… Never commit credential file paths to version control
  • โœ… Document credential file locations in team wikis, not in code

Organization:

  • ๐Ÿ“ Use descriptive filenames: .env.database, .env.api, .env.aws
  • ๐Ÿ“ Group by environment: .env.production, .env.staging
  • ๐Ÿ“ Keep a .env.template in your repository with dummy values

Team Workflows:

# config.py (committed to repository)
import os
from pathlib import Path

def setup_credentials():
    """Load credentials from standard team location or local override"""
    # Check for local override first (for development)
    if Path('.env.local').exists():
        os.environ['CREDENTIALS_FILE'] = '.env.local'
        return
    
    # Use team shared credentials location
    team_creds = Path.home() / 'TeamShared' / 'Credentials'
    if team_creds.exists():
        os.environ['CREDENTIALS_DIR'] = str(team_creds)
        os.environ['CREDENTIALS_FILE'] = '.env.database'
        return
    
    # Fallback to default behavior
    print("โš ๏ธ No credentials found - using .env in current directory")

# Call this before importing pg_helpers
setup_credentials()

## Security Best Practices

### SSL/TLS Configuration
- **Always use SSL in production** - Set `DB_SSL_MODE=require` minimum
- **Use CA certificate verification** for cloud databases (AWS RDS, GCP Cloud SQL)
- **Download official CA certificates** from your cloud provider
- **Use `verify-full` mode** for maximum security in production
- **Test SSL connections** regularly with `check_ssl_connection()`

### AWS RDS Security Setup
```bash
# Download AWS RDS CA certificate
mkdir -p ./certs
wget -O ./certs/rds-ca-2019-root.pem \
  https://s3.amazonaws.com/rds-downloads/rds-ca-2019-root.pem

# Set environment variables
echo "DB_SSL_MODE=verify-full" >> .env
echo "DB_SSL_CA_CERT=./certs/rds-ca-2019-root.pem" >> .env

General Security

  • Never commit .env files - add to .gitignore
  • Use environment variables for all credentials
  • Rotate database passwords regularly
  • Use database connection pooling for production
  • Monitor SSL certificate expiration dates

File Structure for Your Projects

your_project/
โ”œโ”€โ”€ .env                 # Database credentials (DO NOT COMMIT)
โ”œโ”€โ”€ .gitignore          # Include .env and certs/
โ”œโ”€โ”€ main.py             # Your analysis code
โ”œโ”€โ”€ certs/              # SSL certificates (DO NOT COMMIT private keys)
โ”‚   โ”œโ”€โ”€ rds-ca-2019-root.pem
โ”‚   โ””โ”€โ”€ custom-ca.pem
โ”œโ”€โ”€ queries/            # SQL template files
โ”‚   โ”œโ”€โ”€ user_analysis.sql
โ”‚   โ”œโ”€โ”€ sales_report.sql
โ”‚   โ””โ”€โ”€ inventory.sql
โ”œโ”€โ”€ data/              # Output data files
โ”‚   โ””โ”€โ”€ results.pkl
โ””โ”€โ”€ tests/             # Test files (if contributing)
    โ””โ”€โ”€ test_database.py

Troubleshooting

Common Issues

SSL certificate verification failed

  • Ensure CA certificate file exists and is readable
  • Download the latest CA certificate from your provider
  • Check that ssl_mode is set correctly
  • Use check_ssl_connection() to diagnose SSL issues

"SSL connection required"

  • Check that your database server supports SSL
  • Verify DB_SSL_MODE is set correctly
  • For development, you can temporarily use ssl_mode="disable"

"Missing required environment variables"

  • Ensure .env file exists in your project root
  • Check that all required variables are set: DB_USER, DB_PASSWORD, DB_HOST, DB_NAME

Connection timeouts

  • The recursiveDataGrabber will automatically retry with exponential backoff
  • Check network connectivity and database server status
  • Consider increasing max_attempts for very unreliable connections

SQLAlchemy/pandas compatibility errors IMPROVED in v1.1.0

  • The package now automatically handles most compatibility issues
  • Enable debug=True to see which fallback method succeeds
  • Use diagnose_connection_and_query() for detailed troubleshooting

"immutabledict" or metadata interpretation errors FIXED in v1.1.0

  • These errors are now automatically detected and handled
  • The package will try alternative pandas parameters and manual DataFrame construction
  • No action needed from users - fallback methods handle this automatically

Sound notifications not working

  • This is normal and won't affect functionality
  • Ensure system sound is enabled
  • On Linux, you may need to install additional audio packages

Test failures during development NEW in v1.3.0

  • Ensure all dependencies are installed: pip install pytest pytest-cov
  • Check that you're running tests from the project root directory
  • Use python -m pytest tests/ -v for detailed output
  • Some tests require write access to temporary directories

Changelog

Version 1.3.3 (Current) ๐Ÿ“

  • โœ… Updated installation instructions: Recommend using 'pip install pg_helpers'

Version 1.3.2

  • ๐Ÿ“ Flexible credential file location: Support for CREDENTIALS_DIR and CREDENTIALS_FILE environment variables
  • ๐Ÿ”„ Enhanced configuration: Specify custom .env file paths for different projects and environments
  • ๐Ÿข Centralized credentials: Share credential directories across multiple projects
  • ๐ŸŒ Environment-specific files: Easy support for .env.production, .env.staging, .env.dev
  • โœ… Backwards compatible: Defaults to .env in current directory when no variables set
  • ๐Ÿ“– Documentation: New credential management strategies and best practices guide

Version 1.3.1

  • ๐Ÿงช Comprehensive test suite: 40+ test cases with >90% code coverage
  • โœ… Cross-platform validation: Tests confirm functionality on Windows, macOS, and Linux
  • ๐Ÿ”ง API improvements: test_ssl_connection() renamed to check_ssl_connection() for clarity
  • ๐Ÿš€ CI/CD ready: GitHub Actions workflow for automated testing across environments
  • ๐Ÿ“‹ Enhanced documentation: Detailed testing guide and troubleshooting section
  • ๐Ÿ›ก๏ธ Quality assurance: All functions tested including error conditions and edge cases

Version 1.2.0

  • ๐Ÿ”’ SSL/TLS support: Full SSL encryption with optional CA certificate verification
  • ๐Ÿ›ก๏ธ Security enhancements: Man-in-the-middle attack prevention for production environments
  • โœ… SSL testing: New SSL connection diagnostics
  • ๐Ÿ”ง Custom SSL configuration: Programmatic SSL parameter override
  • ๐Ÿ“‹ Environment SSL config: Optional SSL settings via environment variables
  • ๐Ÿ”„ Backward compatibility: Existing code continues to work without changes

Version 1.1.0

  • โœจ Enhanced error handling: Multiple fallback methods for pandas/SQLAlchemy compatibility
  • ๐Ÿ” Improved debugging: Comprehensive logging and diagnostic capabilities
  • ๐Ÿ› ๏ธ Better reliability: Automatic detection and handling of metadata interpretation errors
  • ๐Ÿ“Š Manual DataFrame construction: Fallback method for complex data type issues
  • ๐Ÿ”ง Alternative parameter testing: Tries different pandas configurations automatically

Version 1.0.0

  • Initial release with core functionality
  • Basic retry logic and PostgreSQL integration
  • Query templating and notification system

Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Add tests for new functionality: python -m pytest tests/ -v
  4. Ensure all tests pass and maintain >90% coverage
  5. Run the full test suite: python -m pytest tests/ --cov=pg_helpers
  6. Submit a pull request

Development Setup

git clone https://github.com/lenwood/pg_helpers.git
cd pg_helpers
pip install -e .
pip install pytest pytest-cov
python -m pytest tests/ -v

License

MIT License - feel free to use in your projects!

Author

Chris Leonard


This package was designed for data analysts and engineers who work with PostgreSQL databases and need reliable, automated query execution with enterprise-grade security. Version 1.3.2 adds flexible credential management while ensuring this reliability through comprehensive testing across multiple platforms and Python versions.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pg_helpers-1.3.3.tar.gz (33.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pg_helpers-1.3.3-py3-none-any.whl (25.9 kB view details)

Uploaded Python 3

File details

Details for the file pg_helpers-1.3.3.tar.gz.

File metadata

  • Download URL: pg_helpers-1.3.3.tar.gz
  • Upload date:
  • Size: 33.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for pg_helpers-1.3.3.tar.gz
Algorithm Hash digest
SHA256 1a828c0b393e469df5dbba5c00388b6b8d69d9d86999d2ea54904acefaa46607
MD5 52fd348fa8262a8d7607ee0a00a744a9
BLAKE2b-256 c4882cfdbcceb6cf97e902e78db66b4e2717665d5c4e461648a8ba33b3bd473e

See more details on using hashes here.

File details

Details for the file pg_helpers-1.3.3-py3-none-any.whl.

File metadata

  • Download URL: pg_helpers-1.3.3-py3-none-any.whl
  • Upload date:
  • Size: 25.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.11

File hashes

Hashes for pg_helpers-1.3.3-py3-none-any.whl
Algorithm Hash digest
SHA256 160d4343c2c8bd4dbbb2927cc23b4a5ffb429cb64710868aac36186af2b095cc
MD5 529ae981de3cc8d28e143b7953443fc7
BLAKE2b-256 dc751eed47b6154a22ef3cbc45ffb73a97a70e81d78b1b5d588a1aa7e1fb8a63

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page