U.S. water quality and home safety data by ZIP code — violations, lead/copper, radon, PFAS, flood risk, home values, remediation costs from 50+ federal sources
us-water-quality-data
U.S. water quality data by ZIP code, packaged for Python. Includes violation history, lead/copper levels, radon zone classification, and Home Safety Scores for 3,500+ ZIP codes, sourced from the EPA Safe Drinking Water Information System (SDWIS).
Install
pip install us-water-quality-data
Quick Start
import us_water_quality_data as water
# Look up a specific ZIP code
record = water.lookup("10001")
print(record)
# {'zip': '10001', 'city': 'New York', 'state': 'NY',
# 'home_safety_score': 36, 'home_safety_grade': 'F',
# 'total_violations': 7, 'lead_level_mg_l': 0.01, ...}
# All ZIP codes in California
ca = water.get_state("CA")
print(f"{len(ca)} ZIP codes in CA")
# 10 worst scores in the country
worst = water.get_worst(10)
for z in worst:
    print(f"{z['zip']} {z['city']}, {z['state']}: {z['home_safety_score']}")
# 10 best scores
best = water.get_best(10)
# All states in the dataset
print(water.states()) # ['AK', 'AL', 'AR', ...]
# Total ZIP codes
print(water.count()) # e.g. 1990; the exact number depends on the installed release
# Search by city
chicago = water.search_city("chicago")
# Dataset metadata
print(water.meta["updated"]) # '2026-03-17'
print(water.meta["total_zips"]) # e.g. 1990; matches water.count() for the installed release
print(water.meta["states_covered"]) # 51
API Reference
lookup(zip_code: str) -> dict | None
Look up water quality data for a specific ZIP code. Returns None if the ZIP code is not in the dataset. Short codes are zero-padded to five digits automatically.
water.lookup("10001") # dict
water.lookup("00000") # None
water.lookup("6001") # same as "06001"
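The zero-padding rule described above can be sketched as a small standalone helper. This is only an illustration of the rule, not the package's actual code; the name `normalize_zip` and its validation behavior are assumptions made for this example.

```python
def normalize_zip(zip_code) -> str:
    """Return a 5-character ZIP string, left-padded with zeros.

    Hypothetical helper: it illustrates the padding rule only and is
    not part of us-water-quality-data.
    """
    text = str(zip_code).strip()
    # Reject anything that is not 1-5 digits (assumed validation rule).
    if not text.isdigit() or len(text) > 5:
        raise ValueError(f"not a valid ZIP code: {zip_code!r}")
    return text.zfill(5)


print(normalize_zip("6001"))  # 06001
print(normalize_zip(10001))   # 10001
```

Padding matters mostly for ZIP codes in the Northeast, where leading zeros are often dropped when codes pass through spreadsheets or integer columns.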
get_state(state: str) -> list[dict]
Get all ZIP records for a given state. Case-insensitive.
water.get_state("CA") # all California ZIPs
water.get_state("ny") # works too
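Because `get_state` returns plain dicts, a state-level summary needs only the standard library. The sketch below runs on a few made-up records shaped like the documented fields; the helper name `summarize` and the sample values are illustrative assumptions, not real data.

```python
from statistics import mean


def summarize(records):
    """Aggregate a list of ZIP records (hypothetical helper)."""
    # home_safety_score may be None, so drop missing scores before averaging.
    scores = [
        r["home_safety_score"]
        for r in records
        if r.get("home_safety_score") is not None
    ]
    return {
        "zips": len(records),
        "scored": len(scores),
        "mean_score": round(mean(scores), 1) if scores else None,
        "with_health_violations": sum(
            1 for r in records if (r.get("health_violations") or 0) > 0
        ),
    }


# Made-up records for illustration only.
sample = [
    {"zip": "00001", "home_safety_score": 80, "health_violations": 0},
    {"zip": "00002", "home_safety_score": 60, "health_violations": 2},
    {"zip": "00003", "home_safety_score": None, "health_violations": 1},
]
print(summarize(sample))
```

With the package installed, the same function should accept the result of `water.get_state("CA")` directly, assuming the records carry the fields listed under Data Fields.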
get_worst(n: int = 10) -> list[dict]
Get the ZIP codes with the worst (lowest) Home Safety Scores, sorted ascending.
get_best(n: int = 10) -> list[dict]
Get the ZIP codes with the best (highest) Home Safety Scores, sorted descending.
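The two rankings can be approximated with `heapq`, which avoids sorting the whole list when `n` is small. This is a sketch of the idea rather than the package's implementation; in particular, skipping records whose score is `None` is an assumption, since the documentation does not say how unscored ZIPs are ranked.

```python
import heapq


def _scored(records):
    # Assumption: records without a score are left out of both rankings.
    return [r for r in records if r.get("home_safety_score") is not None]


def worst(records, n=10):
    """Lowest scores first (ascending)."""
    return heapq.nsmallest(n, _scored(records), key=lambda r: r["home_safety_score"])


def best(records, n=10):
    """Highest scores first (descending)."""
    return heapq.nlargest(n, _scored(records), key=lambda r: r["home_safety_score"])


# Made-up records for illustration only.
sample = [
    {"zip": "00001", "home_safety_score": 36},
    {"zip": "00002", "home_safety_score": 91},
    {"zip": "00003", "home_safety_score": None},
    {"zip": "00004", "home_safety_score": 58},
]
print([r["zip"] for r in worst(sample, 2)])  # ['00001', '00004']
print([r["zip"] for r in best(sample, 1)])   # ['00002']
```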
states() -> list[str]
Get a sorted list of all unique 2-letter state abbreviations in the dataset.
count() -> int
Get the total number of ZIP codes in the dataset.
zips() -> list[str]
Get all ZIP codes in the dataset as a list of strings.
search_city(city: str) -> list[dict]
Search ZIP codes by city name (case-insensitive partial match).
water.search_city("chicago") # all Chicago ZIPs
water.search_city("san fran") # partial match works
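A case-insensitive partial match of this kind amounts to a substring test on normalized strings. The sketch below shows one way to express it; `match_city` is a hypothetical name, and returning an empty list for a blank query is an assumption rather than documented behavior.

```python
def match_city(records, query):
    """Return records whose city contains `query`, ignoring case."""
    needle = query.strip().casefold()
    if not needle:
        # Assumption: a blank query matches nothing rather than everything.
        return []
    return [r for r in records if needle in (r.get("city") or "").casefold()]


# Made-up records for illustration only.
sample = [
    {"zip": "00001", "city": "Chicago"},
    {"zip": "00002", "city": "Chicago Heights"},
    {"zip": "00003", "city": "San Francisco"},
    {"zip": "00004", "city": "South San Francisco"},
]
print([r["city"] for r in match_city(sample, "san fran")])
```

Note that a substring match also returns neighboring municipalities such as "Chicago Heights", so filter on an exact city name afterwards if only one city is wanted.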
meta
Dataset metadata as a dict-like object.
water.meta["name"] # 'ZipCheckup U.S. Water Quality Dataset'
water.meta["license"] # 'CC-BY-4.0'
water.meta["source"] # 'U.S. EPA Safe Drinking Water Information System (SDWIS)'
water.meta["updated"] # '2026-03-17'
water.meta["total_zips"] # number of ZIPs
water.meta["states_covered"] # number of states
water.meta["fields"] # dict of field name -> description
Data Fields
| Field | Type | Description |
|---|---|---|
| zip | str | 5-digit U.S. ZIP code |
| city | str | City name |
| state | str | 2-letter state abbreviation |
| home_safety_score | int or None | Composite score 0-100 |
| home_safety_grade | str | Letter grade: A / B / C / D / F |
| total_violations | int | Total violations in past 5 years |
| health_violations | int | Health-based violations in past 5 years |
| unresolved_violations | int | Currently unresolved violations |
| contaminant_count | int | Distinct health-based contaminants |
| health_contaminant_names | str | Semicolon-separated contaminant names |
| lead_level_mg_l | float or None | 90th percentile lead level (mg/L) |
| copper_level_mg_l | float or None | 90th percentile copper level (mg/L) |
| radon_zone | int or None | EPA radon zone: 1 (highest) to 3 (lowest) |
| water_source | str | SW = Surface Water, GW = Groundwater |
| system_name | str | Primary water system name |
| pwsid | str | EPA Public Water System ID |
| population | int or None | Population served |
| latitude | float | ZIP centroid latitude |
| longitude | float | ZIP centroid longitude |
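Several fields need light post-processing before they are useful: the contaminant names arrive as one semicolon-separated string, and the lead and copper values are in mg/L while lead is usually discussed in ppb. The sketch below is one way to handle that. The helper name `interpret` is hypothetical, and the thresholds are the action levels long used under EPA's Lead and Copper Rule (0.015 mg/L for lead, 1.3 mg/L for copper); check the currently applicable values before relying on them.

```python
LEAD_ACTION_LEVEL_MG_L = 0.015   # 15 ppb, Lead and Copper Rule action level
COPPER_ACTION_LEVEL_MG_L = 1.3   # Lead and Copper Rule action level


def interpret(record):
    """Derive a few convenience values from one ZIP record (hypothetical helper)."""
    raw_names = record.get("health_contaminant_names") or ""
    names = [n.strip() for n in raw_names.split(";") if n.strip()]

    lead = record.get("lead_level_mg_l")      # may be None
    copper = record.get("copper_level_mg_l")  # may be None

    return {
        "contaminants": names,
        # 1 mg/L equals 1000 ppb (micrograms per litre).
        "lead_ppb": None if lead is None else round(lead * 1000, 3),
        "lead_exceeds_action_level": None if lead is None else lead > LEAD_ACTION_LEVEL_MG_L,
        "copper_exceeds_action_level": None if copper is None else copper > COPPER_ACTION_LEVEL_MG_L,
    }


# Made-up record for illustration only.
sample = {
    "zip": "00001",
    "health_contaminant_names": "Lead; Nitrate;",
    "lead_level_mg_l": 0.01,
    "copper_level_mg_l": None,
}
print(interpret(sample))
```

Missing measurements are passed through as `None` rather than treated as zero, so an unmeasured system is not mistaken for a clean one.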
Coverage
- ZIP codes: 3,500+ (growing with each release)
- States: All 50 U.S. states + D.C.
- Violation window: Rolling 5 years
- Update frequency: New versions published with each dataset refresh
Data Source
Violation and water-system data are derived from the EPA Safe Drinking Water Information System (SDWIS). Lead and copper levels come from EPA Lead and Copper Rule (LCR) sampling. Radon zones are county-level EPA classifications.
Home Safety Score is a composite 0-100 score that penalizes health-based violations, unresolved violations, lead exceedances, and contaminant count. Methodology: zipcheckup.com/about/home-safety-score/
Also Available
- npm package: us-water-quality-data (Node.js / TypeScript)
- Live site: zipcheckup.com (free water quality reports by ZIP)
- Open dataset (CSV/JSON): zipcheckup.com/data/
- API: api.zipcheckup.com/v1/
License
Data: CC BY 4.0. Code: MIT.
Data by ZipCheckup.com -- sourced from EPA SDWIS.
File details
Details for the file us_water_quality_data-2026.5.3.tar.gz.
File metadata
- Download URL: us_water_quality_data-2026.5.3.tar.gz
- Upload date:
- Size: 1.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0b0865707c56ba9a40510d5454f64bdd3da2c72d5546158ef767dc05ecf68fa2 |
| MD5 | 3eb94822ba5c9a72b723a90f17e07827 |
| BLAKE2b-256 | 2ab5486b9375277a822bb4f7bc54759b6ef0dde1669cd976b2df71c8716766ab |
File details
Details for the file us_water_quality_data-2026.5.3-py3-none-any.whl.
File metadata
- Download URL: us_water_quality_data-2026.5.3-py3-none-any.whl
- Upload date:
- Size: 1.7 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7a9c5e4f43e0499b946d654ed4dffdf8e604f5b503c9b4d2da9f14eaa2d43535 |
| MD5 | 86201d9d7e14587699fa676d67e40237 |
| BLAKE2b-256 | 2a78d5e6b2250c641112655bacc1eb8a640cfe25d5afc4e2a0880fc1a5828b95 |
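The published digests can be used to check a manually downloaded file before installing it. The following is a minimal sketch using only the standard library; the helper names `sha256_of_file` and `file_matches` are made up for this example, and the local path in the usage note is whatever location the file was saved to.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_matches(path, expected_hex):
    """Compare a file's SHA256 with a published digest (case-insensitive)."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())


# SHA256 published above for us_water_quality_data-2026.5.3-py3-none-any.whl
WHEEL_SHA256 = "7a9c5e4f43e0499b946d654ed4dffdf8e604f5b503c9b4d2da9f14eaa2d43535"
```

For example, `file_matches("us_water_quality_data-2026.5.3-py3-none-any.whl", WHEEL_SHA256)` should return `True` for an intact download. pip can perform a similar check itself when a requirements file pins hashes and `--require-hashes` is used.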