A comprehensive Python utilities package with enhanced auto-discovery

Siege Utilities

siege_utilities is the shared utilities library behind Siege Analytics workflows:

Geospatial + GeoDjango boundary/data services (tiered: [geo-lite] / [geo] / [geodjango])
Google Workspace write APIs (Sheets, Docs, Slides, Drive) with multi-account management
Census API/data selection/crosswalk tooling
Isochrone analysis with configurable CRS and domain exceptions
Configuration and profile management
Distributed processing helpers (Spark/HDFS/Databricks)
Reporting and chart generation

Related: Siege Analytics ZSH Configuration

siege_utilities works alongside siege_analytics_zshrc — a modular ZSH configuration system for data engineering environments. Together they form a two-part toolchain:

siege_utilities — Python library: geospatial, reporting, analytics, Django models, distributed computing
siege_analytics_zshrc — Shell environment: Java/Spark/Hadoop/Python version management, credential handling, cluster connectivity

They are designed to work together (the ZSH config sets up SPARK_HOME, JAVA_HOME, pyenv, and credential paths that siege_utilities expects) but each can be used independently. You don't need the ZSH config to use the Python library, and vice versa.

Python Version Support

Version	Status	CI
3.11	Fully supported (floor)	Required pass
3.12	Fully supported	Required pass
3.13	Supported	Allow-failure while stabilizing
3.14	Experimental	Not yet in CI (awaiting ecosystem wheels)

The library requires Python 3.11+. Geospatial extras ([geo], [geodjango]) depend on C-extension packages (GDAL/GEOS/PROJ bindings) whose wheel availability varies by Python version — check PyPI for your target version before installing.
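As a rough illustration of the support table above, a caller could check the interpreter version before importing heavier extras. The helper name `check_python_floor` and its status labels are hypothetical and not part of siege_utilities; the thresholds are read off the table (3.11 as the floor, 3.14 as experimental).

```python
import sys

# Assumed from the support table: 3.11 is the floor, 3.14 is experimental.
MINIMUM = (3, 11)
EXPERIMENTAL = (3, 14)


def check_python_floor(version=None):
    """Return 'unsupported', 'supported', or 'experimental' for a version tuple."""
    major, minor = (version or sys.version_info)[:2]
    if (major, minor) < MINIMUM:
        return "unsupported"
    if (major, minor) >= EXPERIMENTAL:
        return "experimental"
    return "supported"


# Report the status of the running interpreter.
print(check_python_floor())
```

This only covers the interpreter version; wheel availability for the GDAL/GEOS/PROJ bindings still has to be checked separately.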

Install

See Installation Options for all supported install commands (base, geo-lite, geo, geodjango, all).

Quick Usage

import siege_utilities as su

su.log_info("Ready.")
recommendations = su.select_census_datasets("demographics", "tract")

Error Handling

siege_utilities follows a fail-loud-over-silent-swallow policy: when a function cannot deliver its documented output, it raises a typed exception rather than returning None, False, or an empty container that looks like a legitimate "no result." This distinguishes real failures from expected empty-input paths and prevents silent data corruption in downstream pipelines.

Key exception types by subsystem:

Subsystem	Exception	Parent	Raised When
reporting (top-level)	`ReportingConfigError`	`RuntimeError`	export / import config fails
reporting.chart_types	`UnknownChartTypeError`	`LookupError`	chart type not in registry
reporting.chart_types	`ChartParameterError`	`ValueError`	required params missing / bad
reporting.chart_types	`ChartCreationError`	`RuntimeError`	`create_function` failed
reporting.client_branding	`ClientBrandingNotFoundError`	`LookupError`	named client has no config
reporting.client_branding	`ClientBrandingError`	`RuntimeError`	I/O or YAML parse failure
geo.census_geocoder	`CensusGeocodeError`	`RuntimeError`	Census API/network failure
geo.spatial_data	`SpatialDataError`	`RuntimeError`	portal dataset / OSM failure
geo.boundary_result	`BoundaryRetrievalError` (+ 6 subclasses)	`Exception`	boundary lookup problems

All new exceptions use raise ... from e chaining — inspect exc.__cause__ to see the underlying error. Because the exception types subclass standard Python exceptions, broad existing except ValueError: / except LookupError: handlers continue to work.

See docs/FAILURE_MODES.md for the catalog of silent-swallow patterns this library has eliminated and the migration guidance for callers that relied on the old return-value behavior.
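The policy above can be sketched without the library installed. In this minimal example, `UnknownChartTypeError` is a local stand-in that mirrors the parent class listed in the table, and `_REGISTRY` and `lookup_chart` are hypothetical names used only for illustration.

```python
class UnknownChartTypeError(LookupError):
    """Local stand-in for the library exception of the same name."""


# Hypothetical registry used only for this sketch.
_REGISTRY = {"bar": "render_bar", "line": "render_line"}


def lookup_chart(chart_type):
    """Fail loudly: raise a typed error instead of returning None."""
    try:
        return _REGISTRY[chart_type]
    except KeyError as e:
        # Chain the original error so callers can inspect __cause__.
        raise UnknownChartTypeError(f"unknown chart type: {chart_type!r}") from e


try:
    lookup_chart("sankey")
except LookupError as exc:  # a broad handler still catches the typed error
    caught = exc

print(type(caught).__name__, "caused by", type(caught.__cause__).__name__)
```

Because the typed error subclasses `LookupError`, the broad handler keeps working while newer callers can catch the narrower type.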

Import Philosophy

This project intentionally favors convenience access patterns, including broad function availability from the package surface. That is a design choice, not an accident.

Contributor rule: convenience imports are acceptable in explicit API-aggregation surfaces, but implementation modules should prefer explicit imports to avoid hidden collisions and reduce regression risk.
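To make the "hidden collisions" risk concrete, here is a small sketch of how an aggregation surface could be checked for names exported by more than one module. `find_collisions` and the two stand-in namespaces are hypothetical; this is not a tool shipped with the project.

```python
from types import SimpleNamespace


def find_collisions(**modules):
    """Map each public name defined by more than one module to its owners."""
    owners = {}
    for module_name, module in modules.items():
        for name in vars(module):
            if not name.startswith("_"):
                owners.setdefault(name, []).append(module_name)
    return {name: mods for name, mods in owners.items() if len(mods) > 1}


# Stand-ins for two implementation modules that both export `load`.
files = SimpleNamespace(load=lambda p: p, hash_file=lambda p: p)
config = SimpleNamespace(load=lambda p: p, save=lambda p: p)

print(find_collisions(files=files, config=config))
```

With star-style aggregation, whichever `load` is imported last silently wins, which is why implementation modules are asked to import explicitly.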

Contributor Requirements

Every PR must include:

Tests for changed behavior (and regression test for bug fixes)
Documentation updates
Notebook updates when user-facing workflows or APIs change
CodeRabbit feedback addressed for correctness/regression/API-risk findings
Required CI/PR checks green (including CodeRabbit status once enabled)

Pre-PR Validation Commands

# Test naming/location hygiene
python scripts/check_test_file_hygiene.py

# API contract tooling regression check
python scripts/contracts/generate_public_api_contract.py --output /tmp/contract_candidate.json
python scripts/contracts/compare_public_api_contracts.py \
  --baseline /tmp/contract_baseline.json \
  --candidate /tmp/contract_candidate.json \
  --release-impact <patch|minor|major> \
  --allowlist scripts/contracts/contract_allowlist.json

# Contract-tool unit tests
python -m pytest -q --no-cov tests/test_api_contract_tools.py

If a PR intentionally adds public API symbols, classify as minor and update scripts/contracts/contract_allowlist.json in the same PR.
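A minimal sketch of the classification rule, assuming a contract is just a set of public symbol names. The function `classify_release_impact` is hypothetical and does not reflect how the comparison script is implemented; treating removed symbols as major follows the usual semantic-versioning convention and is an assumption, not something stated here.

```python
def classify_release_impact(baseline, candidate):
    """Classify a public-API diff as 'patch', 'minor', or 'major'."""
    baseline, candidate = set(baseline), set(candidate)
    if baseline - candidate:   # symbols removed: assumed to be breaking
        return "major"
    if candidate - baseline:   # symbols added: minor, per the rule above
        return "minor"
    return "patch"


old_api = {"log_info", "select_census_datasets"}
new_api = old_api | {"quick_census_selection"}
print(classify_release_impact(old_api, new_api))
```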

See:

docs/policies/CODING_STYLE.md
docs/policies/PR_REVIEW_RUBRIC.md
docs/policies/CHANGE_CLASSIFICATION_AND_RELEASE_POLICY.md
docs/policies/CONTRIBUTOR_GOVERNANCE.md
docs/RELEASE_LINEAGE.md
docs/EXAMPLES.md
docs/ISOCHRONES_AND_WKLS.md
docs/MANAGED_ENVIRONMENTS.md
docs/INTENT.md — per-module purpose + divergence catalog (ELE-2416)
docs/FAILURE_MODES.md — cross-cutting anti-pattern catalog (ELE-2418)
docs/TEST_UPGRADES.md — test-quality patterns and coverage scorecard (ELE-2419)
docs/ARCHITECTURE.md + docs/adr/ — three-layer model and ADRs (ELE-2417)
docs/NOTEBOOKS.md — notebook inventory and consolidation plan (ELE-2421)

External Contributor Workflow

Use this path when contributing from a fork:

Fork this repository on GitHub, then clone your fork:

git clone https://github.com/<your-user>/siege_utilities.git
cd siege_utilities
git remote add upstream https://github.com/siege-analytics/siege_utilities.git

Create and activate a local virtual environment, then install from the cloned repo:

python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Validate notebooks and notebook outputs:

python -m pytest -q --no-cov tests/test_notebooks_output_policy.py

If your change updates user-facing workflows or APIs, update the impacted notebooks and ensure notebooks/output/ artifacts remain reviewable.

Open an issue in siege-analytics/siege_utilities describing the change for merge review, link your fork branch/PR, and include:

Reproduction or motivation
Proposed change scope
Test evidence
Documentation and notebook updates

GeoDjango Integration

Full spatial data platform with 37 concrete models, 9 population services, and 7 management commands.

Model Hierarchy

TemporalGeographicFeature (abstract — no geometry)
├── TemporalBoundary (abstract — MultiPolygon)
│   ├── CensusTIGERBoundary (abstract — GEOID + TIGER metadata)
│   │   ├── State, County, Tract, BlockGroup, Block, Place, ZCTA
│   │   ├── CongressionalDistrict, CBSA, UrbanArea
│   │   ├── StateLegislativeUpper, StateLegislativeLower, VTD, Precinct
│   │   └── SchoolDistrictElementary, Secondary, Unified
│   ├── GADMBoundary → GADMCountry, GADMAdmin1-5
│   ├── NLRBRegion, FederalJudicialDistrict
│   ├── NCESLocaleBoundary, TimezoneGeometry
│   └── Intersections (County×CD, VTD×CD, Tract×CD)
├── TemporalLinearFeature (abstract — MultiLineString)
└── TemporalPointFeature (abstract — Point)
    └── SchoolLocation

Spatial Queries

from django.contrib.gis.geos import Point
from siege_utilities.geo.django.models import Tract, County, State

# Find tract containing a point
point = Point(-122.4194, 37.7749, srid=4326)
tract = Tract.objects.containing_point(point).for_year(2020).first()

# Nearest boundaries within distance (meters)
nearby = County.objects.nearest(point, max_distance_m=50_000)

# Temporal + spatial filtering
counties_2020 = County.objects.for_state("06").for_year(2020)
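For readers without a GeoDjango setup, the idea behind a distance-limited nearest query can be sketched in plain Python. This is not how the `nearest` manager is implemented; it only illustrates filtering by great-circle distance and sorting. The function names and the city coordinates (approximate) are illustrative assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, an approximation


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest(origin, candidates, max_distance_m):
    """Return (name, distance) pairs within max_distance_m, closest first."""
    hits = []
    for name, (lon, lat) in candidates.items():
        d = haversine_m(origin[0], origin[1], lon, lat)
        if d <= max_distance_m:
            hits.append((name, d))
    return sorted(hits, key=lambda pair: pair[1])


san_francisco = (-122.4194, 37.7749)
centroids = {  # approximate, illustrative coordinates (lon, lat)
    "Oakland": (-122.2712, 37.8044),
    "San Jose": (-121.8863, 37.3382),
    "Los Angeles": (-118.2437, 34.0522),
}
print(nearest(san_francisco, centroids, max_distance_m=50_000))
```

Note the (longitude, latitude) argument order, which matches `Point(-122.4194, 37.7749, srid=4326)` above.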

Management Commands

# Census TIGER/Line boundaries
python manage.py populate_boundaries --year 2020 --type county --state CA

# Demographics from ACS
python manage.py populate_demographics --year 2020 --dataset acs5 --variables B19013_001

# PL 94-171 redistricting data
python manage.py populate_pl_demographics --year 2020 --state CA

# Boundary crosswalks (2010 → 2020)
python manage.py populate_crosswalks --source-year 2010 --target-year 2020

# NCES school district + locale data
python manage.py populate_nces --year 2020

# NLRB region boundaries
python manage.py populate_nlrb_regions --year 2024

# Timezone boundaries (from timezone-boundary-builder)
python manage.py populate_timezones --file timezones.geojson --year 2024

Services

Service	Purpose
`BoundaryPopulationService`	Load TIGER/Line shapefiles into boundary models
`DemographicPopulationService`	Fetch ACS/Decennial data into DemographicSnapshot
`CrosswalkPopulationService`	Build boundary change crosswalks between vintages
`TimeseriesService`	Auto-populate DemographicTimeSeries from snapshots
`DemographicRollupService`	Aggregate child geographies to parents (GEOID or crosswalk)
`UrbanicityClassificationService`	Classify tracts by NCES urbanicity codes
`NCESPopulationService`	Load school districts, locales, and school locations
`NLRBPopulationService`	Populate NLRB region boundaries
`TimezonePopulationService`	Load IANA timezone geometries from GeoJSON

Demographics & Rollups

from siege_utilities.geo.django.models import DemographicSnapshot, DemographicTimeSeries
from siege_utilities.geo.django.services import DemographicRollupService

# Query demographics
snapshots = DemographicSnapshot.objects.filter(
    content_type__model='tract',
    dataset='acs5',
    year=2020,
)

# Roll up tract data to county level
svc = DemographicRollupService()
results = svc.rollup(
    source_level='tract',
    target_level='county',
    year=2020,
    variables=['B19013_001', 'B01003_001'],
    state_fips='06',
    min_coverage=0.8,  # warn if <80% of child geographies have data
)

# Crosswalk-aware rollup (handles boundary changes)
results = svc.rollup(
    source_level='tract',
    target_level='county',
    year=2020,
    variables=['B01003_001'],
    crosswalk_year=2010,  # map 2010 tracts to 2020 counties via crosswalk
)
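The GEOID-based rollup and the `min_coverage` warning can be sketched as follows. This is an assumption-laden illustration, not the service's implementation: `rollup_counts` is a hypothetical helper, the tract values are made up, and it relies only on the fact that an 11-digit tract GEOID begins with its 5-digit county GEOID. It sums additive counts such as total population (B01003_001); medians such as B19013_001 cannot be combined by summation.

```python
import warnings


def rollup_counts(child_values, all_children, parent_len=5, min_coverage=0.8):
    """Sum child counts by parent GEOID prefix and warn on low coverage.

    child_values: {child GEOID: count} for children that have data.
    all_children: every child GEOID expected under the parents.
    """
    totals, expected, observed = {}, {}, {}
    for geoid in all_children:
        parent = geoid[:parent_len]
        expected[parent] = expected.get(parent, 0) + 1
        if geoid in child_values:
            observed[parent] = observed.get(parent, 0) + 1
            totals[parent] = totals.get(parent, 0) + child_values[geoid]
    for parent, n in expected.items():
        coverage = observed.get(parent, 0) / n
        if coverage < min_coverage:
            warnings.warn(f"{parent}: coverage {coverage:.0%} below {min_coverage:.0%}")
    return totals


# Illustrative tract GEOIDs and made-up population counts.
tracts = ["06075010100", "06075010200", "06001400100", "06001400200"]
values = {"06075010100": 1200, "06075010200": 800, "06001400100": 500}
print(rollup_counts(values, tracts))
```

Here county 06001 has data for only one of its two tracts, so the sketch emits a coverage warning while still returning the partial total.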

Census Data Intelligence

Consolidated Census metadata registry with intelligent dataset selection.

from siege_utilities.config.census_registry import (
    SurveyType, GeographyLevel, resolve_geographic_level,
    VARIABLE_GROUPS, CANONICAL_GEOGRAPHIC_LEVELS,
)
from siege_utilities.geo import quick_census_selection

# Resolve geography aliases
level = GeographyLevel("congressional_district")  # resolves alias → "cd"

# Quick selection for analysis
result = quick_census_selection("business", "county")
print(f"Use {result['recommendations']['primary_recommendation']['dataset']}")

# Census API with caching
from siege_utilities.geo import CensusAPIClient

client = CensusAPIClient(cache_backend='django')  # or 'sqlite', 'memory'
data = client.get_acs5(
    year=2020,
    variables=['B19013_001'],
    geography='tract',
    state='06',
)
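What a `'memory'` cache backend buys you can be illustrated with a small memoizing wrapper. `CachedFetcher` and `fake_fetch` are hypothetical stand-ins; the library's actual cache keys and backends may differ.

```python
class CachedFetcher:
    """Memoize fetches by their request parameters (in-memory only)."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable that performs the real request
        self._cache = {}
        self.misses = 0

    def get(self, year, variables, geography, state):
        # Sort variables so the same request in a different order hits the cache.
        key = (year, tuple(sorted(variables)), geography, state)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._fetch(year, variables, geography, state)
        return self._cache[key]


def fake_fetch(year, variables, geography, state):
    """Stand-in for a network call; returns a placeholder row."""
    return [{"year": year, "geography": geography, "state": state}]


fetcher = CachedFetcher(fake_fetch)
fetcher.get(2020, ["B19013_001"], "tract", "06")
fetcher.get(2020, ["B19013_001"], "tract", "06")  # served from the cache
print(fetcher.misses)
```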

Census API Client

Direct access to Census Bureau data with built-in caching and rate limiting.

from siege_utilities.geo import CensusAPIClient

client = CensusAPIClient(api_key="your-key")

# ACS 5-Year estimates
median_income = client.get_acs5(
    year=2020,
    variables=['B19013_001', 'B01003_001'],
    geography='county',
    state='06',
)

# PL 94-171 redistricting data
from siege_utilities.geo.census_files.pl_downloader import PLFileDownloader

downloader = PLFileDownloader()
pl_data = downloader.download_state("CA", year=2020)
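For orientation, this is roughly the kind of request a call like `get_acs5` would need to issue against the public Census Data API. The helper `acs5_url` is hypothetical; the URL layout follows the Census Bureau's documented ACS 5-Year endpoint. The raw API expects suffixed variable names such as `B19013_001E` (estimate); whether the client adds that suffix for you is not stated here, so the sketch passes it explicitly.

```python
from urllib.parse import urlencode

BASE = "https://api.census.gov/data"


def acs5_url(year, variables, geography, state, api_key=None):
    """Build an ACS 5-Year request URL for all geographies of one type in a state."""
    params = {
        "get": ",".join(variables),
        "for": f"{geography}:*",
        "in": f"state:{state}",
    }
    if api_key:
        params["key"] = api_key
    # Keep ':', ',' and '*' readable instead of percent-encoding them.
    return f"{BASE}/{year}/acs/acs5?{urlencode(params, safe=':,*')}"


print(acs5_url(2020, ["NAME", "B19013_001E"], "county", "06"))
```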

Hydra + Pydantic Configuration

from siege_utilities.config import HydraConfigManager

with HydraConfigManager() as manager:
    user_profile = manager.load_user_profile()
    branding = manager.load_branding_config("client_a")
    db_connections = manager.load_database_connections("client_a")

Reporting & Visualization

from siege_utilities.reporting import ReportGenerator

report_gen = ReportGenerator(client_name="Demo Company")

report_content = {
    "metadata": {"title": "Analytics Summary"},
    "sections": [{"type": "text", "title": "Overview", "content": "Report summary."}],
}
report_gen.generate_pdf_report(report_content, output_path="report.pdf")

Capabilities: 7+ map types (including choropleth, marker, 3D, heatmap, cluster, and flow), PDF reports with a table of contents, PowerPoint generation, and Google Analytics (GA) geographic analysis with Census demographic joins.

Function Categories

Category	Count	Description	Dependencies
Core	16	Logging, strings, basic utils	None
Config	54	Database, project, client setup	None
Files	21	File ops, paths, remote downloads	None
Distributed	37	Spark utilities, HDFS operations	PySpark
Geo	65+	Census data, boundaries, spatial, GeoDjango	pandas, geopandas
Analytics	45+	Google Analytics, Workspace (Sheets/Docs/Slides), Snowflake	pandas, google-api-python-client
Reporting	30+	Charts, maps, GA reports, PDF generation	matplotlib, reportlab
Testing	15	Environment setup, test runners	None
Git	9	Branch ops, commit management	None
Development	9	Architecture analysis, code hygiene	None
Hygiene	5	Docstring generation, analysis	None
Data	3	Sample data utilities	pandas

Installation Options

# Core only (pyyaml, requests, tqdm, pydantic)
pip install siege-utilities

# Add extras for what you need (quoted so shells such as zsh do not glob the brackets)
pip install "siege-utilities[geo-lite]"       # shapely, pyproj, geopy (no GDAL needed)
pip install "siege-utilities[geo]"            # geo-lite + geopandas, fiona, rtree, tobler (needs GDAL)
pip install "siege-utilities[geodjango]"      # geo + Django, DRF, PostGIS
pip install "siege-utilities[data]"           # pandas, numpy, openpyxl, faker
pip install "siege-utilities[reporting]"      # matplotlib, seaborn, folium, plotly, reportlab
pip install "siege-utilities[analytics]"      # GA4, Facebook, Snowflake, scipy, scikit-learn
pip install "siege-utilities[distributed]"    # PySpark, Apache Sedona
pip install "siege-utilities[config-extras]"  # Hydra, hydra-zen, omegaconf
pip install "siege-utilities[web]"            # BeautifulSoup, lxml
pip install "siege-utilities[database]"       # SQLAlchemy, psycopg2
pip install "siege-utilities[all]"            # Everything

# Combine extras
pip install "siege-utilities[data,geo,reporting]"

# Development
git clone https://github.com/siege-analytics/siege_utilities.git
cd siege_utilities
pip install -e ".[all,dev]"
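Since most functionality sits behind optional extras, it can help to check at runtime which ones appear to be installed. The sketch below probes one representative module per extra; the `EXTRA_PROBES` mapping is an assumption derived from the package lists in the comments above, and `available_extras` is a hypothetical helper rather than a library function.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to one representative top-level module.
EXTRA_PROBES = {
    "geo-lite": "shapely",
    "geo": "geopandas",
    "data": "pandas",
    "reporting": "matplotlib",
    "distributed": "pyspark",
}


def is_installed(module_name):
    """True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


def available_extras(probes=EXTRA_PROBES):
    """Return the extras whose probe module is importable, sorted by name."""
    return sorted(extra for extra, module in probes.items() if is_installed(module))


print(available_extras())
```

A probe only shows that one package from the extra is present, so treat the result as a hint rather than proof that the full extra is installed.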

Testing

1884 tests across all modules.

# Full suite
python -m pytest tests/ -v

# By marker
python -m pytest tests/ -m core
python -m pytest tests/ -m geo
python -m pytest tests/ -m "not requires_gdal"

# Quick smoke test
python -m pytest tests/ --tb=short -q

Architecture

siege_utilities/
├── config/              # Census registry, Hydra/Pydantic configs, client management
│   ├── census_registry.py   # Single source of truth for Census metadata
│   └── ...
├── geo/                 # Geospatial: Census API, GEOID utils, geocoding, spatial ops
│   ├── census_api_client.py
│   ├── census_files/    # PL 94-171, TIGER/Line downloaders
│   └── django/          # GeoDjango integration
│       ├── models/      # 37 concrete models (boundaries, demographics, crosswalks)
│       ├── services/    # 9 population services
│       ├── management/  # 7 management commands
│       ├── managers/    # Custom querysets (containing_point, nearest, for_year)
│       └── serializers/ # DRF GeoJSON serializers
├── distributed/         # Spark, HDFS, Databricks utilities
├── reporting/           # PDF, PowerPoint, choropleth, GA reports
├── analytics/           # GA4, Google Workspace (Sheets/Docs/Slides), Snowflake
├── files/               # File operations, hashing, remote downloads
├── core/                # Logging, string utilities
└── development/         # Architecture analysis, package management

Documentation

Sphinx Docs: siege-analytics.github.io/siege_utilities
Notebooks: 18 Jupyter notebooks covering all major features (in notebooks/)

Contributing

See CONTRIBUTING.md for the full guide: fork, clone, install locally, run tests, and submit a PR.

Quick version:

git clone https://github.com/<your-user>/siege_utilities.git
cd siege_utilities
python3.11 -m venv .venv && source .venv/bin/activate
pip install -e ".[all,dev]"
python -m pytest tests/ -v

License

Dual license model (effective March 6, 2026):

AGPL-3.0-only for open-source usage
Commercial license for proprietary/commercial usage by separate agreement

Attribution is required in both paths. See LICENSE, LICENSES/AGPL-3.0.txt, and COMMERCIAL_LICENSE.md.

Siege Utilities: Spatial Intelligence, In Python.

Release history

Version	Released
3.17.2	May 14, 2026
3.17.1	May 14, 2026
3.17.0 (this version)	May 14, 2026
3.16.0	May 13, 2026
3.15.1	May 12, 2026
3.15.0	May 11, 2026
3.14.0	May 7, 2026
3.13.2	Apr 30, 2026
3.12.0	Mar 14, 2026
3.11.0	Mar 14, 2026
3.10.0	Mar 14, 2026
3.9.1	Mar 14, 2026
3.9.0	Mar 14, 2026
3.8.4	Mar 9, 2026
3.8.3	Mar 7, 2026
3.8.2	Mar 6, 2026
3.8.1	Mar 4, 2026
3.8.0	Mar 4, 2026
3.7.0	Mar 3, 2026
3.6.0	Mar 3, 2026
3.5.0	Mar 3, 2026
3.4.1	Mar 2, 2026
3.4.0	Mar 2, 2026
3.3.3	Mar 2, 2026
3.3.2	Mar 2, 2026
3.3.1	Mar 2, 2026
3.3.0	Mar 1, 2026
3.2.0	Mar 1, 2026
3.1.0	Mar 1, 2026
3.0.1	Feb 27, 2026
3.0.0	Feb 26, 2026
2.2.0	Feb 26, 2026
2.1.0	Feb 24, 2026
2.0.0	Feb 23, 2026
1.0.0	Jun 18, 2025

Download files

Source Distribution

siege_utilities-3.17.0.tar.gz (1.1 MB), uploaded May 14, 2026, Source

Built Distribution

siege_utilities-3.17.0-py3-none-any.whl (1.3 MB), uploaded May 14, 2026, Python 3

File details

Details for the file siege_utilities-3.17.0.tar.gz.

File metadata

Download URL: siege_utilities-3.17.0.tar.gz
Upload date: May 14, 2026
Size: 1.1 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.11

File hashes

Hashes for siege_utilities-3.17.0.tar.gz
Algorithm	Hash digest
SHA256	`3a773431a99e97f68f3d157900bd2ef810c62c3ad41759df3699b2bd24729455`
MD5	`6fe4ebdb950cf0e7d17435752c87c363`
BLAKE2b-256	`41a825a4a6f3439be4df7893727320c9064be6be5264ca29260ffeb8e652d13b`
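
A downloaded file can be checked against the published SHA256 digest with the standard library. This is a generic sketch; the helper names `sha256_of` and `verify` are illustrative and not part of siege_utilities.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True when the file's SHA-256 matches the published digest."""
    return sha256_of(path) == expected_hex.lower()
```

For example, `verify("siege_utilities-3.17.0.tar.gz", "<SHA256 from the table above>")` should return True for an intact download.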


File details

Details for the file siege_utilities-3.17.0-py3-none-any.whl.

File metadata

Download URL: siege_utilities-3.17.0-py3-none-any.whl
Upload date: May 14, 2026
Size: 1.3 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.11

File hashes

Hashes for siege_utilities-3.17.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2f009db943758674d004b66f397056a92f9ead4785afd3a31ae64fe72753d13a`
MD5	`4c6a81a7849d45ae7f7ce20043f2aa60`
BLAKE2b-256	`7ebde41c6a0b8488f8f7e19bf8ad59ba17b3db6604f48d240ec7e67e1ea88bcc`

