Pyutils Collection
Enterprise-grade Python utilities - 320+ type-safe, tested functions across 23 specialized modules for async operations, data processing, file handling, security, and more.
What is This?
A curated collection of 320+ utility functions across 23 specialized modules - designed for copy-paste reuse or pip install. Each function is self-contained with type hints, docstrings, and handles its own dependencies gracefully.
Philosophy:
- Copy-paste friendly - Functions work standalone
- Type-safe - Complete type hints (Python 3.10+)
- Self-documenting - NumPy-style docstrings with examples
- Well-tested - 88%+ coverage with 5500+ test cases
- Optional deps - Functions gracefully handle missing libraries
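One common way to get the "optional deps" behaviour is to guard the import and fall back to the standard library. The sketch below is not taken from the package's source: the helper name `dump_config` is hypothetical, and PyYAML stands in for any optional third-party library.

```python
import json
from typing import Any

try:
    import yaml  # optional third-party dependency (PyYAML); may be absent
except ImportError:
    yaml = None


def dump_config(data: dict[str, Any]) -> str:
    """Serialize to YAML when PyYAML is installed, otherwise fall back to JSON."""
    if yaml is not None:
        return yaml.safe_dump(data, sort_keys=True)
    # Fallback path: the function still works with the standard library only.
    return json.dumps(data, sort_keys=True, indent=2)
```

Callers never see an ImportError; they only get a different output format depending on what is installed.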
Quick Start
# Install from PyPI
pip install pyutils-collection
# Or clone and copy what you need
git clone https://github.com/MForofontov/pyutils-collection.git
cd pyutils-collection/pyutils_collection
# Or install for development (run from the repository root)
pip install -e ".[dev]"
Modules Overview
Core Modules (23 categories)
| Module | Count | Description |
|---|---|---|
| asyncio_functions | 17 | Async/await, connection pools, rate limiting, HTTP |
| compression_functions | 27 | GZIP, BZ2, LZMA, Snappy, Zstandard, polyline encoding |
| database_functions | 23 | SQLAlchemy utils, transactions, schema inspection |
| datetime_functions | 27 | Timezone conversion, business days, humanization |
| decorators | 50+ | Caching, retry, timeout, type checking, profiling |
| file_functions | 32 | I/O, hashing, search, temp files, format conversion |
| http_functions | 9 | REST operations, downloads, query strings |
| iterable_functions | 55 | Chunking, filtering, grouping, flattening |
| mathematical_functions | 5 | GCD, LCM, primes, factorial, fibonacci |
| security_functions | 12 | Encryption (AES/RSA), hashing, JWT tokens |
| serialization_functions | 28 | CSV, Excel, Parquet with streaming & conversion |
| ssh_functions | 12 | Remote execution, SFTP, key generation |
| testing_functions | 24 | Fixtures, mocks, assertions, test data generators |
| network_functions | 28 | IP utilities, DNS, port scanning, connectivity |
| web_scraping_functions | 18 | HTML/CSS/XPath parsing, table extraction |
| playwright_functions | 6 | Browser automation, screenshots, session management |
| url_functions | 8 | Parse, build, validate, normalize URLs |
| regex_functions | 5 | Email/phone/URL validation & extraction |
| cli_functions | 16 | System info, process management, environment vars |
| logger_functions | 7 | Logger setup, function logging, rotation |
| multiprocessing_functions | 19 | Parallel processing, pool management |
| batch_processing_functions | 2 | Chunked processing, streaming aggregation |
| env_config_functions | 6 | Config loading (env, YAML, TOML) |
| data_validation | Many | Type/schema validation, Pydantic/Cerberus support |
Key Features
Database-Agnostic Design
All database functions use SQLAlchemy for maximum portability:
- PostgreSQL
- MySQL / MariaDB
- SQLite
- Oracle
- SQL Server
Type Safety
- Complete type hints using modern Python syntax (list[str], dict[str, Any])
- Runtime type checking with decorators
- mypy-compliant codebase
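To make "runtime type checking with decorators" concrete, here is a minimal sketch of the idea using only the standard library. It is an illustration, not the package's `enforce_types` implementation: the name `check_types` is hypothetical, and this version only checks plain-class annotations such as `int` or `str`, skipping generics like `list[str]`.

```python
import functools
import inspect
from typing import Any, Callable, get_origin, get_type_hints


def check_types(func: Callable[..., Any]) -> Callable[..., Any]:
    """Raise TypeError when an argument does not match a plain-class annotation."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)
    skipped_kinds = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            if signature.parameters[name].kind in skipped_kinds:
                continue  # *args / **kwargs are not checked in this sketch
            expected = hints.get(name)
            # Parameterized generics (list[str], dict[str, Any]) are skipped.
            is_plain_class = get_origin(expected) is None and isinstance(expected, type)
            if is_plain_class and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@check_types
def double(x: int) -> int:
    return x * 2
```

With this sketch, `double(2)` returns 4, while `double("2")` raises TypeError before the function body runs.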
Comprehensive Testing
- 88%+ test coverage
- 150+ test files with 5500+ test cases
- Pytest-based testing framework
- Comprehensive edge case coverage
Documentation
- NumPy-style docstrings for all functions
- Examples in docstrings
- Time/space complexity notes for algorithms
- Comprehensive README with usage examples
Usage Examples
Database Operations
from database_functions import create_connection, atomic_transaction, execute_query
from database_functions.schema_inspection import (
    get_table_info,
    find_duplicate_rows,
    get_foreign_key_dependencies
)
# Create connection
conn = create_connection("postgresql://user:pass@localhost/db")
# Safe transaction
with atomic_transaction(conn) as trans:
    execute_query(trans, "INSERT INTO users VALUES (:name)", {"name": "John"})
# Schema inspection
table_info = get_table_info(conn, "users")
print(f"Columns: {table_info['columns']}")
# Find duplicates
duplicates = find_duplicate_rows(conn, "users", ["email"])
# Get FK dependencies for safe operations
deps = get_foreign_key_dependencies(conn)
print(f"Safe drop order: {deps['ordered_tables']}")
Async Operations
import asyncio
from asyncio_functions import async_batch, fetch_multiple_urls, AsyncConnectionPool
# Batch processing (process_item is your own per-item function)
async def process_items():
    results = await async_batch(
        items=range(100),
        func=process_item,
        batch_size=10
    )
    return results
# await is only valid inside a coroutine, so the remaining calls live in main()
async def main():
    # HTTP fetching
    urls = ["https://api.example.com/1", "https://api.example.com/2"]
    responses = await fetch_multiple_urls(urls, max_concurrent=5)
    # Connection pooling
    async with AsyncConnectionPool("postgresql://...") as pool:
        async with pool.acquire() as conn:
            result = await conn.fetch("SELECT * FROM users")
asyncio.run(main())
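The `max_concurrent` argument above suggests a cap on in-flight work. How the package enforces that cap is not shown here, so the following is only a sketch of one standard approach, using `asyncio.Semaphore`; the names `run_limited` and `square` are hypothetical.

```python
import asyncio
from collections.abc import Awaitable, Callable, Iterable
from typing import TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_limited(
    items: Iterable[T],
    worker: Callable[[T], Awaitable[R]],
    max_concurrent: int = 5,
) -> list[R]:
    """Apply `worker` to every item with at most `max_concurrent` running at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item: T) -> R:
        # Each task waits here until one of the max_concurrent slots is free.
        async with semaphore:
            return await worker(item)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(item) for item in items))


async def square(n: int) -> int:
    await asyncio.sleep(0)  # stand-in for real I/O such as an HTTP request
    return n * n
```

Calling `asyncio.run(run_limited(range(5), square, max_concurrent=2))` yields the squares in input order, even though only two workers run at a time.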
Decorators
from decorators import cache, retry, timeout, enforce_types
@cache(maxsize=128, ttl=3600)
@retry(max_attempts=3, backoff=2.0)
@timeout(seconds=30)
@enforce_types
def fetch_user_data(user_id: int) -> dict:
    # Function logic here
    return {"id": user_id, "name": "John"}
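For readers wondering what a retry decorator with a `backoff` factor typically does, here is a small standard-library sketch. It assumes `backoff` is a multiplier applied to the delay after each failed attempt; that interpretation, the `base_delay` parameter, and the name `retry_sketch` are assumptions for illustration and may not match the package's `retry`.

```python
import functools
import time
from typing import Any, Callable


def retry_sketch(max_attempts: int = 3, backoff: float = 2.0, base_delay: float = 0.01):
    """Retry the wrapped call, multiplying the delay by `backoff` after each failure."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            delay = base_delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    time.sleep(delay)
                    delay *= backoff  # exponential growth of the wait time

        return wrapper

    return decorator


calls = {"count": 0}


@retry_sketch(max_attempts=3, backoff=2.0)
def flaky() -> str:
    """Fail twice, then succeed, to exercise the retry loop."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"
```

Python applies stacked decorators bottom-up, so in the example above `enforce_types` wraps the function first and `cache` is the outermost layer.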
File Operations
from file_functions import read_file_lines, hash_file, find_files_by_pattern
from file_functions import temp_file_context
# Read file
lines = read_file_lines("data.txt", encoding="utf-8")
# Hash file
file_hash = hash_file("document.pdf", algorithm="sha256")
# Find files
python_files = find_files_by_pattern("/project", "*.py")
# Temp file context
with temp_file_context(suffix=".txt") as temp_path:
    # Use temp file
    temp_path.write_text("temporary data")
Data Serialization
from serialization_functions import (
    stream_csv_chunks,
    csv_to_parquet,
    read_excel_sheet
)
# Stream large CSV
for chunk in stream_csv_chunks("large_file.csv", chunk_size=10000):
    process_chunk(chunk)
# Convert formats
csv_to_parquet("input.csv", "output.parquet", compression="snappy")
# Read Excel
data = read_excel_sheet("report.xlsx", sheet_name="Sales")
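Chunked CSV streaming can be sketched with the standard library's `csv` module and `itertools.islice`. This is an illustration of the general technique, not the package's `stream_csv_chunks`; the name `iter_csv_chunks` and the choice to yield lists of dicts are assumptions.

```python
import csv
from collections.abc import Iterator
from itertools import islice


def iter_csv_chunks(path: str, chunk_size: int) -> Iterator[list[dict[str, str]]]:
    """Yield lists of up to `chunk_size` rows, each row a dict keyed by the header."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        while True:
            # islice pulls at most chunk_size rows without reading the whole file.
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                return
            yield chunk
```

Only one chunk is held in memory at a time, so the same loop works for files much larger than available RAM.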
Requirements
- Python: 3.10+
- Philosophy: Functions handle missing deps gracefully - install only what you need
- Common deps: numpy, aiohttp, sqlalchemy, psutil, tqdm
- Optional: playwright, paramiko, bcrypt, pydantic, cerberus, etc.
Testing
# Run all 5500+ tests
python -m pytest
# Coverage report (88%+)
python -m pytest --cov=. --cov-report=html
Contributing
See .github/copilot-instructions.md for detailed guidelines:
- NumPy-style docstrings with examples
- Complete type hints (Python 3.10+ syntax)
- 95%+ test coverage per function
- Self-contained, copy-paste friendly code
License
MIT License - see LICENSE file for details.
Author
MForofontov
- GitHub: @MForofontov
Links
- Repository: https://github.com/MForofontov/pyutils-collection
- Issues: https://github.com/MForofontov/pyutils-collection/issues
- Documentation: https://github.com/MForofontov/pyutils-collection#readme
Star this repository if you find it useful!
Key Features
- Self-contained functions - Copy one file, get everything you need
- Type-safe - Full type hints with modern Python syntax
- Well-documented - NumPy-style docstrings with examples & complexity
- Tested - 88% coverage, 5500+ test cases across 150+ files
- Graceful degradation - Optional deps handled automatically
- DB-agnostic - SQLAlchemy support for PostgreSQL, MySQL, SQLite, Oracle, SQL Server
Usage Examples
# Import from installed package
from pyutils_collection.decorators import cache, retry, timeout
# Or copy decorators locally and use
from decorators import cache, retry, timeout
@cache(maxsize=128, ttl=3600)
@retry(max_attempts=3, backoff=2.0)
@timeout(seconds=30)
def fetch_user_data(user_id: int) -> dict:
    return {"id": user_id, "name": "John"}
from pyutils_collection.asyncio_functions import async_batch, fetch_multiple_urls
urls = ["https://api.example.com/1", "https://api.example.com/2"]
# await must run inside a coroutine
responses = await fetch_multiple_urls(urls, max_concurrent=5)
from pyutils_collection.database_functions import create_connection, atomic_transaction, execute_query
conn = create_connection("postgresql://user:pass@localhost/db")
with atomic_transaction(conn) as trans:
    execute_query(trans, "INSERT INTO users VALUES (:name)", {"name": "John"})
from pyutils_collection.serialization_functions import stream_csv_chunks, csv_to_parquet
for chunk in stream_csv_chunks("large.csv", chunk_size=10000):
    process_chunk(chunk)
csv_to_parquet("input.csv", "output.parquet", compression="snappy")