Bolster's Brain, you've been warned
Bolster
A comprehensive Python utility library for data science, web scraping, cloud services, and general development workflows. Originally designed as a personal toolkit, Bolster has evolved into a robust collection of utilities that enhance productivity across data analysis, system administration, and software development tasks.
🚀 Quick Start
Installation
pip install bolster
Basic Usage
import bolster
# Efficient data processing with built-in progress tracking
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
results = bolster.poolmap(lambda x: x**2, data)
print(results) # {1: 1, 2: 4, 3: 9, 4: 16, ...}
# Smart retry logic with exponential backoff
@bolster.backoff(Exception, tries=3, delay=1, backoff=2)
def unreliable_api_call():
    # Your potentially failing code here
    return "Success!"
# Efficient tree/dict navigation
nested_data = {
    "users": {
        "active": [{"name": "Alice", "age": 25}, {"name": "Bob", "age": 30}],
        "inactive": [{"name": "Charlie", "age": 35}],
    }
}
# Find all ages recursively
ages = bolster.get_recursively(nested_data, "age")
print(ages) # [25, 30, 35]
# Flatten nested structures
flat = bolster.flatten_dict(nested_data)
print(flat["users:active:0:name"]) # 'Alice'
🎯 Core Features
Concurrency & Performance
- poolmap(): ThreadPoolExecutor wrapper with progress monitoring and robust error handling
- exceptional_executor(): Graceful handling of failed futures in concurrent operations
- backoff(): Exponential backoff retry decorator for unreliable operations
- memoize(): Instance method caching with hit/miss tracking for performance optimization
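To make the thread-pool idea concrete, here is a minimal standard-library sketch of mapping a function over inputs concurrently while collecting failures instead of aborting. The name `threaded_map` and its return shape are assumptions made for illustration; this is not Bolster's implementation of poolmap() or exceptional_executor().

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def threaded_map(func, items, max_workers=4):
    """Apply func to each item on a thread pool.

    Hypothetical helper, not part of Bolster. Returns two dicts:
    {item: result} for successes and {item: exception} for failures.
    Items must be hashable because they are used as dict keys.
    """
    results = {}
    errors = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the input that produced it.
        futures = {pool.submit(func, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()
            except Exception as exc:  # record the failure and keep going
                errors[item] = exc
    return results, errors


squares, failed = threaded_map(lambda x: x**2, [1, 2, 3, 4])
print(sorted(squares.items()))  # [(1, 1), (2, 4), (3, 9), (4, 16)]
print(failed)  # {}
```

Keying results by input keeps the output independent of completion order, which varies from run to run with threads.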
Data Processing & Transformation
- aggregate(): Pandas-like groupby operations for dictionaries and lists
- transform_(): Flexible data transformation with key mapping and function application
- batch()/chunks(): Efficient sequence partitioning for processing large datasets
- Compression utilities: compress_for_relay()/decompress_from_relay() for data serialization
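As an illustration of sequence partitioning, the sketch below splits any iterable into fixed-size lists using only the standard library. `iter_chunks` is a hypothetical name chosen here; the signatures and edge-case behaviour of Bolster's own batch()/chunks() may differ.

```python
from itertools import islice


def iter_chunks(iterable, size):
    """Yield successive lists of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return  # input exhausted
        yield chunk


print(list(iter_chunks(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
```

Because it consumes the input lazily, this approach also works for generators and other iterables that do not support slicing.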
Tree & Dictionary Navigation
- get_recursively(): Extract values from deeply nested structures by key
- flatten_dict(): Convert nested dictionaries to flat key-value pairs
- Tree analysis: breadth(), depth(), leaves(), leaf_paths() for structure inspection
- Path navigation: keys_at(), items_at() for level-specific data access
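The two core ideas here, recursive key lookup and flattening to joined paths, can be sketched in plain Python as follows. `find_values` and `flatten` are hypothetical stand-ins written for this example, and the ":" separator is an assumption taken from the Quick Start output; Bolster's get_recursively() and flatten_dict() may handle edge cases differently.

```python
def find_values(node, key):
    """Collect every value stored under `key`, at any depth, in traversal order."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            else:
                found.extend(find_values(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_values(item, key))
    return found


def flatten(node, sep=":", prefix=""):
    """Flatten nested dicts and lists into {joined_path: leaf_value}."""
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, list):
        children = enumerate(node)  # list positions become path segments
    else:
        return {prefix: node}
    flat = {}
    for k, v in children:
        path = f"{prefix}{sep}{k}" if prefix else str(k)
        flat.update(flatten(v, sep, path))
    return flat


nested = {
    "users": {
        "active": [{"name": "Alice", "age": 25}, {"name": "Bob", "age": 30}],
        "inactive": [{"name": "Charlie", "age": 35}],
    }
}
print(find_values(nested, "age"))  # [25, 30, 35]
print(flatten(nested)["users:active:0:name"])  # Alice
```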
Development & Debugging
- arg_exception_logger(): Decorator for debugging function calls with automatic argument logging
- MultipleErrors: Accumulate and handle multiple exceptions in complex workflows
- working_directory(): Context manager for safe directory operations
- pretty_print_request(): HTTP request debugging with automatic auth redaction
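For readers unfamiliar with the directory-switching pattern, here is a minimal sketch of a context manager that changes the working directory and always restores it, even if the body raises. `in_directory` is a hypothetical name used for illustration and is not Bolster's working_directory(); on Python 3.11+ the standard library offers `contextlib.chdir` for the same purpose.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def in_directory(path):
    """Temporarily change the working directory, restoring it on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)  # runs even when the body raises


start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with in_directory(tmp):
        inside = os.getcwd()
print(os.getcwd() == start)  # True
```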
📊 Data Sources
Bolster includes specialized modules for working with Northern Ireland and UK data sources:
Northern Ireland Water Quality
from bolster.data_sources.ni_water import get_water_quality, get_water_quality_by_zone
# Get comprehensive water quality data for all NI supply zones
df = get_water_quality()
print(df.shape) # Shows number of zones and parameters
# Get specific zone data
zone_data = get_water_quality_by_zone("BALM") # Belfast Malone area
print(f"Hardness: {zone_data['NI Hardness Classification']}")
Electoral Office for Northern Ireland (EONI)
from bolster.data_sources.eoni import get_election_results
# Get Assembly election results
results_2016 = get_election_results(2016)
results_2022 = get_election_results(2022)
# Compare party performance across elections
comparison = bolster.diff(results_2022, results_2016)
Companies House Data
from bolster.data_sources.companies_house import search_companies, get_company_details
# Search for companies
results = search_companies("Technology")
# Get detailed company information
company = get_company_details("12345678") # Company number
print(f"{company['name']} - Status: {company['status']}")
UK Met Office
from bolster.data_sources.metoffice import get_precipitation_data
# Get weather data for a specific location
weather = get_precipitation_data("Belfast", start_date="2024-01-01", end_date="2024-01-31")
Northern Ireland House Price Index
from bolster.data_sources.nihpi import get_house_price_index
# Get latest house price data
hpi_data = get_house_price_index()
print(f"Current average price: £{hpi_data['average_price']:,.0f}")
☁️ Cloud Services
AWS Integration
from bolster.aws import get_session, S3Handler, DynamoHandler
# Get configured AWS session
session = get_session(profile="production")
# S3 operations with best practices
s3 = S3Handler(session)
s3.upload_file("local_file.txt", "bucket-name", "remote/path/file.txt")
# DynamoDB operations
dynamo = DynamoHandler(session)
items = dynamo.scan_table("user-data", filters={"status": "active"})
Azure Integration
from bolster.azure import AzureHandler
# Azure Blob Storage operations
azure = AzureHandler(connection_string="DefaultEndpointsProtocol=https;...")
azure.upload_blob("container", "blob_name", data)
🌐 Web Scraping & HTTP
from bolster.web import safe_request, parse_html_table
# Robust HTTP requests with automatic retries
response = safe_request("https://api.example.com/data", max_retries=3, timeout=30)
# Parse HTML tables into pandas DataFrames
tables = parse_html_table("https://example.com/tables")
print(tables[0].head()) # First table as DataFrame
🖥️ Command Line Interface
Bolster includes a CLI for common operations:
# Get precipitation data
bolster get-precipitation --location "Belfast" --start-date "2024-01-01"
# Get help on available commands
bolster --help
🔧 Advanced Examples
Concurrent Data Processing
import bolster
from datetime import datetime
# Process large datasets with progress tracking
def process_user_data(user_id):
    # Simulate data processing
    return {"user_id": user_id, "processed_at": datetime.now()}
user_ids = range(1000) # 1000 users to process
# Process with automatic progress bar and error handling
results = bolster.poolmap(
    process_user_data,
    user_ids,
    max_workers=10,
    progress=True,  # Shows progress bar
)
print(f"Processed {len(results)} users successfully")
Smart Caching and Memoization
class DataProcessor:
    @bolster.memoize
    def expensive_calculation(self, data_hash):
        # Expensive operation that we want to cache
        import time

        time.sleep(2)  # Simulate expensive operation
        return f"Processed: {data_hash}"
processor = DataProcessor()
# First call - takes 2 seconds
result1 = processor.expensive_calculation("abc123")
# Second call with same input - returns immediately from cache
result2 = processor.expensive_calculation("abc123")
# Check cache performance
print(f"Cache hits: {len(processor._memoize__hits)}")
print(f"Cache misses: {len(processor._memoize__misses)}")
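The caching behaviour in this example can be approximated with a small decorator that stores results by argument and counts hits and misses. This is a sketch under stated assumptions: `counting_memoize` and its `stats` attribute are hypothetical, it wraps plain functions rather than instance methods, and it is not how bolster.memoize is implemented.

```python
import functools


def counting_memoize(func):
    """Cache results by positional arguments and count hits and misses."""
    cache = {}
    stats = {"hits": 0, "misses": 0}

    @functools.wraps(func)
    def wrapper(*args):
        if args in cache:
            stats["hits"] += 1
        else:
            stats["misses"] += 1
            cache[args] = func(*args)
        return cache[args]

    wrapper.stats = stats  # expose counters for inspection
    return wrapper


@counting_memoize
def slow_square(x):
    return x * x


slow_square(4)
slow_square(4)
slow_square(5)
print(slow_square.stats)  # {'hits': 1, 'misses': 2}
```

Arguments must be hashable for this scheme to work, since they serve as dictionary keys.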
Robust API Integration with Backoff
import requests
import bolster
@bolster.backoff((requests.RequestException, ConnectionError), tries=5, delay=1, backoff=2)
def fetch_api_data(url):
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.json()
# This will automatically retry with exponential backoff on failure
data = fetch_api_data("https://api.unreliable-service.com/data")
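To show what an exponential backoff schedule looks like in practice, the sketch below implements a retry decorator with an injectable sleep function so the waits can be recorded instead of actually slept. `retry_with_backoff` is a hypothetical helper; it assumes `tries` counts total attempts and that the wait is multiplied by `backoff` after each failure, which may not match the exact semantics of bolster.backoff.

```python
import functools
import time


def retry_with_backoff(exceptions, tries=3, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Retry the wrapped function, multiplying the wait after each failure."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == tries:
                        raise  # out of attempts: surface the last error
                    sleep(wait)
                    wait *= backoff

        return wrapper

    return decorator


waits = []


@retry_with_backoff(ValueError, tries=5, delay=1, backoff=2, sleep=waits.append)
def always_fails():
    raise ValueError("still failing")


try:
    always_fails()
except ValueError:
    pass
print(waits)  # [1, 2, 4, 8]
```

Under these assumptions, five attempts produce four waits totalling 15 seconds before the final exception propagates.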
Complex Data Transformation
# Transform API response to database format
api_response = {
    "user_name": "john_doe",
    "user_email": "john@example.com",
    "account_type": "premium",
    "signup_timestamp": "2024-01-01T12:00:00Z",
}
# Define transformation rules
rules = {
    "user_name": ("username", str.upper),  # Rename and transform
    "user_email": ("email", None),  # Keep as-is but rename
    "account_type": ("tier", lambda x: x.title()),  # Transform value
    "signup_timestamp": ("created_at", bolster.parse_iso_datetime),
}
# Apply transformation
db_record = bolster.transform_(api_response, rules)
print(db_record)
# {'username': 'JOHN_DOE', 'email': 'john@example.com',
# 'tier': 'Premium', 'created_at': datetime(2024, 1, 1, 12, 0, 0)}
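The rule format used above, old key mapped to a (new key, function) pair, can be applied with a few lines of plain Python. `apply_rules` is a hypothetical helper written for illustration; in particular, skipping keys that are absent from the input is an assumption, and bolster.transform_ may treat that case differently.

```python
def apply_rules(record, rules):
    """Rename and convert fields; rules map old_key -> (new_key, func or None)."""
    out = {}
    for old_key, (new_key, func) in rules.items():
        if old_key not in record:
            continue  # assumption: skip fields missing from the input
        value = record[old_key]
        out[new_key] = func(value) if func is not None else value
    return out


record = {"user_name": "john_doe", "account_type": "premium"}
example_rules = {
    "user_name": ("username", str.upper),  # rename and transform
    "account_type": ("tier", None),  # rename only
    "signup_timestamp": ("created_at", None),  # absent from record, skipped
}
print(apply_rules(record, example_rules))
# {'username': 'JOHN_DOE', 'tier': 'premium'}
```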
🏗️ Development Setup
Prerequisites
- Python 3.8+ (3.9, 3.10, 3.11, 3.12 supported)
- PDM (Python Dependency Management)
Installation for Development
# Clone the repository
git clone https://github.com/andrewbolster/bolster.git
cd bolster
# Install with development dependencies
pdm install -G dev
# Install pre-commit hooks
pdm run pre-commit install
# Run tests
pdm run pytest
# Run with coverage
pdm run pytest --cov=bolster --cov-report=html
# Build documentation
cd docs
pdm run make html
Running Tests
# Run all tests
pdm run pytest
# Run with verbose output and coverage
pdm run pytest -v --cov=bolster --cov-report=term-missing
# Run specific test file
pdm run pytest tests/test_core_utilities.py
# Run doctests
pdm run pytest --doctest-modules src/bolster/
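For contributors new to doctests, the snippet below shows the general shape of a docstring that `pytest --doctest-modules` would collect and execute. The `clamp` function is a hypothetical example and is not part of Bolster.

```python
def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high].

    >>> clamp(15, 0, 10)
    10
    >>> clamp(-3, 0, 10)
    0
    >>> clamp(7, 0, 10)
    7
    """
    return max(low, min(value, high))


if __name__ == "__main__":
    import doctest

    doctest.testmod()  # also lets the file check itself when run directly
```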
📚 Documentation
- Full Documentation: https://bolster.readthedocs.io
- API Reference: Auto-generated from docstrings
- Examples: See /notebooks directory for Jupyter notebook examples
- Data Sources: Detailed documentation for each data source module
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
Development Guidelines
- Testing: Ensure all new features have comprehensive tests
- Documentation: Add docstrings and update README for new features
- Code Style: Follow the existing code style (enforced by ruff)
- Type Hints: Include type annotations for all public functions
- Performance: Consider performance implications for data processing functions
📄 License
This project is licensed under the GNU General Public License v3 (GPLv3) - see the LICENSE file for details.
🐛 Bug Reports
If you encounter any bugs or issues, please file a bug report at: https://github.com/andrewbolster/bolster/issues
🔗 Links
- PyPI: https://pypi.org/project/bolster/
- GitHub: https://github.com/andrewbolster/bolster
- Documentation: https://bolster.readthedocs.io
- Author: Andrew Bolster
Built with ❤️ for data science, automation, and general productivity enhancement.
File details
Details for the file bolster-0.4.0.tar.gz.
File metadata
- Download URL: bolster-0.4.0.tar.gz
- Upload date:
- Size: 52.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.9.18 {"installer":{"name":"uv","version":"0.9.18","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bd3c3db12107bfa6b95a3818915cd392411fc6b1f927d4eda3c8b9c51aa037cc |
| MD5 | 80ec69e4d403a8e1636afc391908fab0 |
| BLAKE2b-256 | 88db86b85ce63cd3579df36fa4780e88f0072d21b33c3ccc4767dc412aff92d6 |
File details
Details for the file bolster-0.4.0-py3-none-any.whl.
File metadata
- Download URL: bolster-0.4.0-py3-none-any.whl
- Upload date:
- Size: 54.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.9.18 {"installer":{"name":"uv","version":"0.9.18","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 636c29a4801e262fbe287acf4162ea1b1edcb82c0db3f21dc818f89e65271399 |
| MD5 | cc977b8cb337d78e10fd01791310692a |
| BLAKE2b-256 | 17ab50878447d25b46f4ee05cd4eddf4b09777d1b2ec693550d7590f06202413 |