🛠️ Pyutils Collection

Enterprise-grade Python utilities - 320+ type-safe, tested functions across 23 specialized modules for async operations, data processing, file handling, security, and more.

🎯 What is This?

A curated collection of 320+ utility functions across 23 specialized modules - designed for copy-paste reuse or pip install. Each function is self-contained with type hints, docstrings, and handles its own dependencies gracefully.

Philosophy:

  • 📋 Copy-paste friendly - Functions work standalone
  • 🔒 Type-safe - Complete type hints (Python 3.10+)
  • 📝 Self-documenting - NumPy-style docstrings with examples
  • ✅ Well-tested - 88%+ coverage with 5500+ test cases
  • 🎨 Optional deps - Functions gracefully handle missing libraries
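
The "optional deps" idea above can be illustrated with a common import-guard pattern. This is a minimal sketch, not code taken from the collection: `dump_config` is a hypothetical helper, and the assumption is only that PyYAML may or may not be installed.

```python
import json
from typing import Any

try:  # PyYAML is treated as optional in this sketch
    import yaml
except ImportError:
    yaml = None


def dump_config(data: dict[str, Any]) -> str:
    """Serialize to YAML when PyYAML is available, otherwise fall back to JSON."""
    if yaml is not None:
        return yaml.safe_dump(data)
    return json.dumps(data, indent=2)


print(dump_config({"debug": True}))
```

The function stays importable either way; only the output format changes when the optional library is missing.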

📦 Quick Start

# Install from PyPI
pip install pyutils-collection

# Or clone and copy what you need
git clone https://github.com/MForofontov/pyutils-collection.git
cd pyutils-collection/pyutils_collection

# Or install for development (run from the repository root)
pip install -e ".[dev]"

📦 Modules Overview

Core Modules (23 categories)

Module | Count | Description
🔄 asyncio_functions | 17 | Async/await, connection pools, rate limiting, HTTP
🗜️ compression_functions | 27 | GZIP, BZ2, LZMA, Snappy, Zstandard, polyline encoding
🗄️ database_functions | 23 | SQLAlchemy utils, transactions, schema inspection
📅 datetime_functions | 27 | Timezone conversion, business days, humanization
🎨 decorators | 50+ | Caching, retry, timeout, type checking, profiling
📁 file_functions | 32 | I/O, hashing, search, temp files, format conversion
🌐 http_functions | 9 | REST operations, downloads, query strings
🔄 iterable_functions | 55 | Chunking, filtering, grouping, flattening
🧮 mathematical_functions | 5 | GCD, LCM, primes, factorial, fibonacci
🔐 security_functions | 12 | Encryption (AES/RSA), hashing, JWT tokens
📊 serialization_functions | 28 | CSV, Excel, Parquet with streaming & conversion
🔌 ssh_functions | 12 | Remote execution, SFTP, key generation
🧪 testing_functions | 24 | Fixtures, mocks, assertions, test data generators
🌐 network_functions | 28 | IP utilities, DNS, port scanning, connectivity
🌐 web_scraping_functions | 18 | HTML/CSS/XPath parsing, table extraction
🎭 playwright_functions | 6 | Browser automation, screenshots, session management
🔗 url_functions | 8 | Parse, build, validate, normalize URLs
regex_functions | 5 | Email/phone/URL validation & extraction
⚙️ cli_functions | 16 | System info, process management, environment vars
📝 logger_functions | 7 | Logger setup, function logging, rotation
🔄 multiprocessing_functions | 19 | Parallel processing, pool management
🔧 batch_processing_functions | 2 | Chunked processing, streaming aggregation
🌿 env_config_functions | 6 | Config loading (env, YAML, TOML)
✅ data_validation | Many | Type/schema validation, Pydantic/Cerberus support

🔑 Key Features

Database-Agnostic Design

All database functions use SQLAlchemy for maximum portability:

  • ✅ PostgreSQL
  • ✅ MySQL / MariaDB
  • ✅ SQLite
  • ✅ Oracle
  • ✅ SQL Server

Type Safety

  • Complete type hints using modern Python syntax (list[str], dict[str, Any])
  • Runtime type checking with decorators
  • mypy-compliant codebase
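
To make "runtime type checking with decorators" concrete, here is a small sketch of how such a decorator can be built from the standard library. It is an illustration only and not the collection's `enforce_types`; the names `check_types` and `repeat` are hypothetical, and the sketch checks plain class annotations only.

```python
import functools
import inspect
from typing import Any, Callable, get_origin, get_type_hints


def _is_plain_class(annotation: Any) -> bool:
    # Skip typing.Any and parameterized generics such as list[str]
    return (
        annotation is not Any
        and get_origin(annotation) is None
        and isinstance(annotation, type)
    )


def check_types(func: Callable[..., Any]) -> Callable[..., Any]:
    """Raise TypeError when an argument does not match a plain-class annotation."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            kind = signature.parameters[name].kind
            if kind in (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD):
                continue  # *args / **kwargs are not checked in this sketch
            expected = hints.get(name)
            if _is_plain_class(expected) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@check_types
def repeat(text: str, times: int) -> str:
    return text * times
```

Calling `repeat("ab", 2)` succeeds, while `repeat("ab", "2")` raises `TypeError` before the function body runs.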

Comprehensive Testing

  • 88%+ test coverage
  • 150+ test files with 5500+ test cases
  • Pytest-based testing framework
  • Comprehensive edge case coverage

Documentation

  • NumPy-style docstrings for all functions
  • Examples in docstrings
  • Time/space complexity notes for algorithms
  • Comprehensive README with usage examples
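
As an illustration of the documentation conventions listed above, the following sketch shows a NumPy-style docstring with an example and complexity notes. `chunk_list` is a hypothetical function written for this illustration, not one taken from the collection.

```python
def chunk_list(items: list[int], size: int) -> list[list[int]]:
    """
    Split a list into consecutive chunks of at most ``size`` elements.

    Parameters
    ----------
    items : list[int]
        Values to split.
    size : int
        Maximum chunk length; must be positive.

    Returns
    -------
    list[list[int]]
        Chunks in original order; the last one may be shorter.

    Raises
    ------
    ValueError
        If ``size`` is not positive.

    Examples
    --------
    >>> chunk_list([1, 2, 3, 4, 5], 2)
    [[1, 2], [3, 4], [5]]

    Notes
    -----
    Time complexity: O(n). Space complexity: O(n).
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]
```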

📚 Usage Examples

Database Operations

from database_functions import create_connection, atomic_transaction, execute_query
from database_functions.schema_inspection import (
    get_table_info,
    find_duplicate_rows,
    get_foreign_key_dependencies
)

# Create connection
conn = create_connection("postgresql://user:pass@localhost/db")

# Safe transaction
with atomic_transaction(conn) as trans:
    execute_query(trans, "INSERT INTO users VALUES (:name)", {"name": "John"})

# Schema inspection
table_info = get_table_info(conn, "users")
print(f"Columns: {table_info['columns']}")

# Find duplicates
duplicates = find_duplicate_rows(conn, "users", ["email"])

# Get FK dependencies for safe operations
deps = get_foreign_key_dependencies(conn)
print(f"Safe drop order: {deps['ordered_tables']}")
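
For readers unfamiliar with the commit-or-rollback behaviour behind a "safe transaction", the sketch below shows the general idea. It deliberately uses the standard-library `sqlite3` module instead of SQLAlchemy so it runs without extra dependencies, and `atomic` is a hypothetical helper, not the collection's `atomic_transaction`.

```python
import sqlite3
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def atomic(conn: sqlite3.Connection) -> Iterator[sqlite3.Connection]:
    """Commit if the block succeeds, roll back if it raises."""
    try:
        yield conn
    except Exception:
        conn.rollback()
        raise
    else:
        conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

with atomic(conn) as tx:
    tx.execute("INSERT INTO users VALUES (:name)", {"name": "John"})

try:
    with atomic(conn) as tx:
        tx.execute("INSERT INTO users VALUES (:name)", {"name": "Jane"})
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass  # the second insert was rolled back

print(conn.execute("SELECT name FROM users").fetchall())
```

Only the first insert survives, because the failure inside the second block triggers a rollback before the error propagates.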

Async Operations

import asyncio

from asyncio_functions import async_batch, fetch_multiple_urls, AsyncConnectionPool

# Batch processing (process_item stands in for your own per-item function)
async def process_items():
    results = await async_batch(
        items=range(100),
        func=process_item,
        batch_size=10
    )
    return results

async def main():
    # HTTP fetching
    urls = ["https://api.example.com/1", "https://api.example.com/2"]
    responses = await fetch_multiple_urls(urls, max_concurrent=5)

    # Connection pooling
    async with AsyncConnectionPool("postgresql://...") as pool:
        async with pool.acquire() as conn:
            result = await conn.fetch("SELECT * FROM users")

# await is only valid inside a coroutine, so drive everything from asyncio.run
asyncio.run(main())
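
The batch-processing call above can be pictured with plain `asyncio`. The following is a rough sketch of one way batching might work, assuming the worker is a coroutine function; `run_in_batches` and `double` are hypothetical names, and this is not the implementation of `async_batch`.

```python
import asyncio
from collections.abc import Awaitable, Callable, Iterable
from typing import TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_in_batches(
    items: Iterable[T],
    func: Callable[[T], Awaitable[R]],
    batch_size: int,
) -> list[R]:
    """Await func on every item, running at most batch_size calls at a time."""
    pending = list(items)
    results: list[R] = []
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        # gather returns results in the same order as its inputs
        results.extend(await asyncio.gather(*(func(item) for item in batch)))
    return results


async def double(value: int) -> int:
    await asyncio.sleep(0)  # stand-in for real I/O
    return value * 2


print(asyncio.run(run_in_batches(range(5), double, batch_size=2)))
# prints [0, 2, 4, 6, 8]
```

Processing one batch at a time keeps the number of in-flight calls bounded, at the cost of waiting for the slowest call in each batch.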

Decorators

from decorators import cache, retry, timeout, enforce_types

@cache(maxsize=128, ttl=3600)
@retry(max_attempts=3, backoff=2.0)
@timeout(seconds=30)
@enforce_types
def fetch_user_data(user_id: int) -> dict:
    # Function logic here
    return {"id": user_id, "name": "John"}
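
To show what a retry decorator with a backoff factor typically does, here is a minimal sketch. It assumes `backoff` means "multiply the wait after each failure", which the example above does not confirm; `retry_sketch`, its `delay` parameter, and `flaky` are hypothetical and not part of the collection.

```python
import functools
import time
from typing import Any, Callable


def retry_sketch(max_attempts: int = 3, backoff: float = 2.0, delay: float = 0.01):
    """Retry on any Exception, multiplying the wait by backoff after each failure."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            wait = delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    time.sleep(wait)
                    wait *= backoff

        return wrapper

    return decorator


calls = {"count": 0}


@retry_sketch(max_attempts=3, backoff=2.0)
def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"
```

Python applies stacked decorators bottom-up, so in the example above `enforce_types` wraps the function first and `cache` is the outermost layer.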

File Operations

from file_functions import read_file_lines, hash_file, find_files_by_pattern
from file_functions import temp_file_context

# Read file
lines = read_file_lines("data.txt", encoding="utf-8")

# Hash file
file_hash = hash_file("document.pdf", algorithm="sha256")

# Find files
python_files = find_files_by_pattern("/project", "*.py")

# Temp file context
with temp_file_context(suffix=".txt") as temp_path:
    # Use temp file
    temp_path.write_text("temporary data")
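
File hashing for large files is usually done in chunks so the whole file never sits in memory. The sketch below shows that technique with `hashlib`; `hash_file_sketch` and its `chunk_size` parameter are hypothetical and not the collection's `hash_file`.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_file_sketch(path: Path, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" at end of file
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes(b"temporary data")
    print(hash_file_sketch(sample))
```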

Data Serialization

from serialization_functions import (
    stream_csv_chunks,
    csv_to_parquet,
    read_excel_sheet
)

# Stream large CSV
for chunk in stream_csv_chunks("large_file.csv", chunk_size=10000):
    process_chunk(chunk)

# Convert formats
csv_to_parquet("input.csv", "output.parquet", compression="snappy")

# Read Excel
data = read_excel_sheet("report.xlsx", sheet_name="Sales")
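
Streaming a CSV in chunks, as in the first example above, can be sketched with the standard library alone. `iter_csv_chunks` is a hypothetical helper that yields lists of row dictionaries; the collection's `stream_csv_chunks` may return a different type.

```python
import csv
import io
from collections.abc import Iterator
from itertools import islice
from typing import TextIO


def iter_csv_chunks(handle: TextIO, chunk_size: int) -> Iterator[list[dict[str, str]]]:
    """Yield lists of up to chunk_size rows so only one chunk is held at a time."""
    reader = csv.DictReader(handle)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


sample = io.StringIO("id,name\n1,Ann\n2,Bob\n3,Cy\n")
for rows in iter_csv_chunks(sample, chunk_size=2):
    print(len(rows), rows[0]["name"])
```

With three data rows and a chunk size of 2, the loop sees one chunk of two rows followed by one chunk of a single row.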

📋 Requirements

  • Python: 3.10+
  • Philosophy: Functions handle missing deps gracefully - install only what you need
  • Common deps: numpy, aiohttp, sqlalchemy, psutil, tqdm
  • Optional: playwright, paramiko, bcrypt, pydantic, cerberus, etc.

🧪 Testing

# Run all 5500+ tests
python -m pytest

# Coverage report (88%+)
python -m pytest --cov=. --cov-report=html

🤝 Contributing

See .github/copilot-instructions.md for detailed guidelines:

  • NumPy-style docstrings with examples
  • Complete type hints (Python 3.10+ syntax)
  • 95%+ test coverage per function
  • Self-contained, copy-paste friendly code

📄 License

MIT License - see LICENSE file for details.

👤 Author

MForofontov

⭐ Star this repository if you find it useful!

✨ Key Features

  • 🎯 Self-contained functions - Copy one file, get everything you need
  • 🔒 Type-safe - Full type hints with modern Python syntax
  • 📝 Well-documented - NumPy-style docstrings with examples & complexity
  • ✅ Tested - 88% coverage, 5500+ test cases across 150+ files
  • 🔧 Graceful degradation - Optional deps handled automatically
  • 🗄️ DB-agnostic - SQLAlchemy support for PostgreSQL, MySQL, SQLite, Oracle, SQL Server

Usage Examples

# Import from installed package
from pyutils_collection.decorators import cache, retry, timeout

# Or copy decorators locally and use
from decorators import cache, retry, timeout

@cache(maxsize=128, ttl=3600)
@retry(max_attempts=3, backoff=2.0)
@timeout(seconds=30)
def fetch_user_data(user_id: int) -> dict:
    return {"id": user_id, "name": "John"}

from pyutils_collection.asyncio_functions import async_batch, fetch_multiple_urls

urls = ["https://api.example.com/1", "https://api.example.com/2"]
# Call this from inside an async function
responses = await fetch_multiple_urls(urls, max_concurrent=5)

from pyutils_collection.database_functions import create_connection, atomic_transaction, execute_query

conn = create_connection("postgresql://user:pass@localhost/db")
with atomic_transaction(conn) as trans:
    execute_query(trans, "INSERT INTO users VALUES (:name)", {"name": "John"})

from pyutils_collection.serialization_functions import stream_csv_chunks, csv_to_parquet

for chunk in stream_csv_chunks("large.csv", chunk_size=10000):
    process_chunk(chunk)
csv_to_parquet("input.csv", "output.parquet", compression="snappy")
