Python toolkit for reproducible NYC 311 complaint analysis via a typed SDK and CLI.
nyc311
Status
nyc311 is now on the stable 0.2 line with a tested toolkit for loading,
analyzing, and exporting NYC 311 complaint data.
The first public stable release shipped in 0.2.0, and the 0.2.x line focuses
on packaging polish, developer ergonomics, and incremental workflow improvements
on top of the current analysis surface.
What ships in the stable 0.2 line
- load filtered NYC 311-style records from local CSV extracts or the live Socrata API
- stage reproducible local CSV snapshots from live fetches
- derive deterministic first-pass topic labels for supported complaint types
- aggregate complaint topics by borough or community district
- measure topic-rule coverage and summarize resolution gaps
- score anomalies over aggregated topic summaries
- export CSV tables, boundary-backed GeoJSON, and markdown report cards
- run the workflow through both a thin CLI and a composable functional SDK
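The coverage and anomaly items above can be illustrated with a small standard-library sketch. It is not the package's implementation: the function names, the use of a population z-score, and the threshold of 2.0 are all assumptions made for illustration, and the topic labels and district names are made up.

```python
from statistics import mean, pstdev


def rule_coverage(labels):
    """Fraction of records that received a topic label (None means unmatched)."""
    if not labels:
        return 0.0
    return sum(label is not None for label in labels) / len(labels)


def zscore_anomalies(counts, threshold=2.0):
    """Return {area: z} for areas whose count is at least `threshold`
    population standard deviations away from the mean (assumed scoring rule)."""
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return {}  # all areas identical, nothing stands out
    scores = {area: (n - mu) / sigma for area, n in counts.items()}
    return {area: z for area, z in scores.items() if abs(z) >= threshold}


# Hypothetical topic labels for four records; one record matched no rule.
labels = ["loud_music", None, "loud_music", "banging"]
coverage = rule_coverage(labels)

# Hypothetical per-district counts for a single topic.
counts = {
    "BROOKLYN 01": 10,
    "BROOKLYN 02": 12,
    "BROOKLYN 03": 11,
    "BROOKLYN 04": 9,
    "BROOKLYN 05": 10,
    "BROOKLYN 06": 40,
}
anomalies = zscore_anomalies(counts)
print(coverage, sorted(anomalies))
```

With only a handful of areas a population z-score is bounded (it cannot exceed the square root of n - 1), so a fixed threshold is only meaningful once enough areas are being compared.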
Install
Choose the dependency footprint that matches your workflow:
pip install nyc311
For the full turnkey experience:
pip install "nyc311[all]"
For pandas-backed conversion helpers:
pip install "nyc311[dataframes]"
For plotting and exploratory analysis without the geospatial stack:
pip install "nyc311[science]"
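If a script needs to adapt to whichever extras are installed, one option is to probe for the optional modules before importing them. The sketch below uses only the standard library; the mapping from feature name to module is an assumption based on the pandas and geopandas helpers mentioned in this README, not a list published by the package.

```python
from importlib.util import find_spec

# Assumed mapping from a feature name to the third-party module it relies on.
OPTIONAL_MODULES = {
    "dataframes": "pandas",
    "spatial": "geopandas",
}


def available_features(probes=OPTIONAL_MODULES):
    """Report which optional modules can be imported, without importing them."""
    return {name: find_spec(module) is not None for name, module in probes.items()}


if __name__ == "__main__":
    for name, ok in available_features().items():
        print(f"{name}: {'available' if ok else 'missing'}")
```

Probing with `find_spec` avoids the import cost of heavy libraries when the only goal is to decide which code path to take.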
Why this exists
NYC 311 data is one of the richest public records of neighborhood quality-of-life complaints in the country, but much of the useful signal is locked inside short text fields such as complaint descriptors.
This project aims to turn those records into reusable outputs for civic analysis, journalism, and research while staying honest about what is truly implemented today.
Core workflow
The stable 0.2 line focuses on a deterministic, testable workflow:
- load records from a local CSV extract or a filtered Socrata slice
- filter by date, geography, and complaint type
- assign a first-pass topic label using explicit keyword rules
- aggregate counts by borough or community district
- export a CSV summary table or boundary-backed GeoJSON artifact
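The load, filter, and aggregate steps above can be sketched with the standard library alone. This is an illustration of the workflow's shape under stated assumptions, not the nyc311 implementation: the helper names, the sample rows, and the district label format are made up, and only the column names follow the CSV layout described in this README.

```python
import csv
import io
from collections import Counter
from datetime import date, datetime

# Made-up sample extract using the column names listed under "Data assumptions".
SAMPLE_CSV = """\
unique_key,created_date,complaint_type,descriptor,borough,community_district
1,2025-01-03T10:00:00,Noise - Residential,Loud Music/Party,BROOKLYN,01 BROOKLYN
2,2025-01-15T22:30:00,Noise - Residential,Banging/Pounding,BROOKLYN,02 BROOKLYN
3,2025-02-02T08:00:00,Noise - Residential,Loud Music/Party,BROOKLYN,02 BROOKLYN
4,2025-01-20T09:15:00,Rodent,Rat Sighting,QUEENS,05 QUEENS
"""


def load_rows(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def in_scope(row, start, end, borough, complaint_type):
    """Keep rows inside the date window that match borough and complaint type."""
    created = datetime.fromisoformat(row["created_date"]).date()
    return (
        start <= created <= end
        and row["borough"] == borough
        and row["complaint_type"] == complaint_type
    )


rows = [
    row
    for row in load_rows(SAMPLE_CSV)
    if in_scope(row, date(2025, 1, 1), date(2025, 1, 31), "BROOKLYN", "Noise - Residential")
]

# Aggregate the filtered rows by community district.
counts = Counter(row["community_district"] for row in rows)
for district, n in sorted(counts.items()):
    print(district, n)
```

Because every step is a pure function of its inputs, rerunning the script on the same extract yields the same counts, which is the property the workflow is built around.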
Supported topic extraction
The current rules-based topic extractor is implemented only for:
- Blocked Driveway
- Illegal Parking
- Noise - Residential
- Rodent
This is intentionally described as first-pass topic extraction, not clustering or advanced NLP.
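To make "explicit keyword rules" concrete, here is a minimal sketch of a deterministic labeler. The rule table, topic names, and function name are hypothetical; the package's real rules are not reproduced here.

```python
import re

# Hypothetical rule table: for each complaint type, an ordered list of
# (topic, keywords) pairs. The first rule with a matching keyword wins.
RULES = {
    "Noise - Residential": [
        ("party_music", {"music", "party"}),
        ("banging", {"banging", "pounding"}),
    ],
    "Rodent": [
        ("rat_sighting", {"rat"}),
        ("mouse_sighting", {"mouse", "mice"}),
    ],
}


def label_topic(complaint_type, descriptor):
    """Return the first matching topic, or None when no rule applies."""
    # Whole-word matching avoids accidental hits inside longer words.
    tokens = set(re.findall(r"[a-z]+", descriptor.lower()))
    for topic, keywords in RULES.get(complaint_type, []):
        if tokens & keywords:
            return topic
    return None


print(label_topic("Noise - Residential", "Loud Music/Party"))
print(label_topic("Rodent", "Mouse Sighting"))
print(label_topic("Rodent", "Signs of Rodents"))
```

Records that match no rule come back as `None`, which is what makes a coverage measurement possible: the share of non-`None` labels shows how much of a complaint type the rules actually explain.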
Quick links
Docs: Home, Getting Started, CLI Reference, SDK Guide, Examples, Architecture, Contributing
Example
from datetime import date
from pathlib import Path

from nyc311 import analysis, export, models, pipeline

records = pipeline.fetch_service_requests(
    filters=models.ServiceRequestFilter(
        start_date=date(2025, 1, 1),
        end_date=date(2025, 1, 31),
        geography=models.GeographyFilter("borough", models.BOROUGH_BROOKLYN),
        complaint_types=("Noise - Residential",),
    ),
    socrata_config=models.SocrataConfig(page_size=250, max_pages=1),
)

export.export_service_requests_csv(
    records,
    models.ExportTarget("csv", Path("brooklyn-noise-snapshot.csv")),
)

assignments = analysis.extract_topics(records, models.TopicQuery("Noise - Residential"))
summary = analysis.aggregate_by_geography(assignments, geography="community_district")

export.export_topic_table(
    summary,
    models.ExportTarget("csv", Path("brooklyn-noise-topics.csv")),
)
CLI equivalent:
nyc311 fetch \
  --output brooklyn-noise-snapshot.csv \
  --complaint-type "Noise - Residential" \
  --geography borough \
  --geography-value BROOKLYN \
  --start-date 2025-01-01 \
  --end-date 2025-01-31 \
  --page-size 250 \
  --max-pages 1

nyc311 topics \
  --source brooklyn-noise-snapshot.csv \
  --complaint-type "Noise - Residential" \
  --geography community_district \
  --output brooklyn-noise-topics.csv
Live-data snapshot workflow:
nyc311 fetch \
  --output brooklyn-rodent-snapshot.csv \
  --complaint-type "Rodent" \
  --geography borough \
  --geography-value BROOKLYN \
  --start-date 2025-01-01 \
  --end-date 2025-01-31 \
  --page-size 500 \
  --max-pages 1
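For intuition about what `--page-size` and `--max-pages` bound, the sketch below builds paginated Socrata request URLs with the standard SODA `$limit`, `$offset`, `$order`, and `$where` parameters. It makes no network calls. The endpoint is a placeholder, the function name is hypothetical, and whether nyc311 constructs its requests this way internally is an assumption.

```python
from urllib.parse import urlencode

# Placeholder endpoint; substitute the real Socrata resource URL for the dataset.
BASE_URL = "https://example.org/resource/dataset-id.json"


def page_urls(where, page_size, max_pages, base_url=BASE_URL):
    """Yield one request URL per page, up to max_pages pages."""
    for page in range(max_pages):
        params = {
            "$where": where,
            "$order": "created_date",  # a stable order keeps pages from overlapping
            "$limit": page_size,
            "$offset": page * page_size,
        }
        yield f"{base_url}?{urlencode(params)}"


where = (
    "complaint_type = 'Rodent' AND borough = 'BROOKLYN' "
    "AND created_date between '2025-01-01T00:00:00' and '2025-01-31T23:59:59'"
)
urls = list(page_urls(where, page_size=500, max_pages=2))
for url in urls:
    print(url)
```

A real fetch loop would also stop early once a page returns fewer rows than the page size, since that signals the end of the filtered slice.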
Data assumptions
load_service_requests() currently supports:
- local CSV files
- live Socrata loading via SocrataConfig
CSV inputs use these columns:
- unique_key
- created_date
- complaint_type
- descriptor
- borough
- community_district or community_board
resolution_description is optional and loaded when present. It is currently
used by the resolution-gap and report-card helpers, while topic extraction
remains descriptor-driven.
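Before handing a local extract to the loader, it can help to check the header against the column list above. The following is a standalone sketch using only the standard library; `check_header` and its return shape are hypothetical and not part of the nyc311 API.

```python
import csv
import io

REQUIRED = ("unique_key", "created_date", "complaint_type", "descriptor", "borough")
# Either of these may carry the district, per the column list above.
DISTRICT_ALIASES = ("community_district", "community_board")


def check_header(fieldnames):
    """Validate a CSV header and report which optional pieces are present."""
    present = set(fieldnames or [])
    missing = [column for column in REQUIRED if column not in present]
    district = next((c for c in DISTRICT_ALIASES if c in present), None)
    if district is None:
        missing.append(" or ".join(DISTRICT_ALIASES))
    if missing:
        raise ValueError(f"missing required columns: {', '.join(missing)}")
    return {
        "district_column": district,
        "has_resolution": "resolution_description" in present,
    }


sample = io.StringIO(
    "unique_key,created_date,complaint_type,descriptor,borough,community_board\n"
)
info = check_header(csv.DictReader(sample).fieldnames)
print(info)
```

Failing fast on a bad header keeps a missing column from surfacing later as an opaque key error deep inside an analysis step.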
Public package surface
The current public package surface is organized around explicit namespaces:
- nyc311.models for dataclasses, constants, and configs
- nyc311.io for CSV and Socrata loading
- nyc311.analysis for topic extraction, coverage, gaps, and anomalies
- nyc311.geographies for packaged boundary layers and geometry helpers
- nyc311.samples for packaged sample records and sample-aligned boundaries
- nyc311.export for CSV, GeoJSON, and report exports
- nyc311.pipeline for one-call workflow helpers
- nyc311.dataframes for optional pandas conversions
- nyc311.spatial for optional geopandas helpers
- nyc311.plotting for optional plotting helpers
- nyc311.presets for reusable filter and Socrata config builders
- nyc311.cli with the topics and fetch subcommands
Documentation
The hosted docs site is the canonical reference: nyc311.readthedocs.io.
If you are browsing in GitHub, the source docs live in docs/, including
index.md, getting-started.md, cli.md, sdk.md, examples.md, api.md,
architecture.md, and contributing.md.
Runnable examples live in examples/ as self-contained consumer projects.
For local preview:
make docs
make docs-build
Development
uv sync
uv sync --all-groups --all-extras
uv run --all-extras pytest -m "not integration"
uv run ruff check .
uv run ruff format --check .
uv run mypy
uv run mkdocs serve
uv run mkdocs build --strict
uv run python scripts/audit_public_api.py
uv run pytest -m "fetch and not integration"
License
MIT.