U.S. water quality and home safety data by ZIP code — violations, lead/copper, radon, PFAS, flood risk, home values, remediation costs from 50+ federal sources

us-water-quality-data

U.S. water quality data by ZIP code, packaged for Python. Includes violation history, lead/copper levels, radon zone classification, and Home Safety Scores for 3,500+ ZIP codes, sourced from the EPA Safe Drinking Water Information System (SDWIS).

License: CC BY 4.0 · Python 3.9+

Install

pip install us-water-quality-data

Quick Start

import us_water_quality_data as water

# Look up a specific ZIP code
record = water.lookup("10001")
print(record)
# {'zip': '10001', 'city': 'New York', 'state': 'NY',
#  'home_safety_score': 36, 'home_safety_grade': 'F',
#  'total_violations': 7, 'lead_level_mg_l': 0.01, ...}

# All ZIP codes in California
ca = water.get_state("CA")
print(f"{len(ca)} ZIP codes in CA")

# 10 worst scores in the country
worst = water.get_worst(10)
for z in worst:
    print(f"{z['zip']} {z['city']}, {z['state']}: {z['home_safety_score']}")

# 10 best scores
best = water.get_best(10)

# All states in the dataset
print(water.states())  # ['AK', 'AL', 'AR', ...]

# Total ZIP codes
print(water.count())  # 3,500+ (grows with each release)

# Search by city
chicago = water.search_city("chicago")

# Dataset metadata
print(water.meta["updated"])        # '2026-03-17'
print(water.meta["total_zips"])     # 3,500+
print(water.meta["states_covered"]) # 51

API Reference

lookup(zip_code: str) -> dict | None

Look up water quality data for a specific ZIP code. Returns None if the ZIP is not in the dataset. Short codes are zero-padded automatically.

water.lookup("10001")   # dict
water.lookup("00000")   # None
water.lookup("6001")    # same as "06001"
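
The package's internals are not shown here, so the following is only a sketch of what the zero-padding step could look like. `normalize_zip` is a hypothetical helper, not part of the package API, and the rule for rejecting malformed input is an assumption.

```python
from typing import Optional


def normalize_zip(zip_code: str) -> Optional[str]:
    """Hypothetical helper: left-pad a ZIP code with zeros to 5 digits.

    Returns None for input that cannot be a 5-digit ZIP (an assumption;
    the package may handle malformed input differently).
    """
    cleaned = zip_code.strip()
    if not cleaned.isdigit() or len(cleaned) > 5:
        return None
    # str.zfill pads on the left with "0" up to the requested width.
    return cleaned.zfill(5)


print(normalize_zip("6001"))
print(normalize_zip("10001"))
```

Keeping ZIP codes as strings rather than integers preserves the leading zeros used by ZIPs in the Northeast.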

get_state(state: str) -> list[dict]

Get all ZIP records for a given state. Case-insensitive.

water.get_state("CA")   # all California ZIPs
water.get_state("ny")   # works too
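
Because get_state returns plain dicts, the result can be summarized with the standard library. The sketch below counts letter grades across a list of records. `grade_distribution` is a hypothetical helper, and the sample records are made up for illustration rather than taken from the dataset.

```python
from collections import Counter


def grade_distribution(records):
    """Count how many records fall under each home_safety_grade."""
    return Counter(r["home_safety_grade"] for r in records)


# Made-up records shaped like the dicts described under Data Fields.
sample_records = [
    {"zip": "00001", "state": "CA", "home_safety_grade": "A"},
    {"zip": "00002", "state": "CA", "home_safety_grade": "F"},
    {"zip": "00003", "state": "CA", "home_safety_grade": "A"},
]

for grade, total in sorted(grade_distribution(sample_records).items()):
    print(grade, total)
```

With the package installed, `grade_distribution(water.get_state("CA"))` should work the same way, assuming every record carries a `home_safety_grade` key.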

get_worst(n: int = 10) -> list[dict]

Get the ZIP codes with the worst (lowest) Home Safety Scores, sorted ascending.

get_best(n: int = 10) -> list[dict]

Get the ZIP codes with the best (highest) Home Safety Scores, sorted descending.
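
The ordering described for get_worst and get_best can be reproduced on any list of records. This is a sketch, not the package's implementation: `rank_by_score` is a hypothetical helper, and skipping records whose `home_safety_score` is None is an assumption, since the documentation does not say how missing scores are ranked.

```python
def rank_by_score(records, n=10, worst=True):
    """Return up to n records ordered by home_safety_score.

    worst=True sorts ascending (lowest scores first); worst=False sorts
    descending. Records without a score are skipped (an assumption).
    """
    scored = [r for r in records if r.get("home_safety_score") is not None]
    scored.sort(key=lambda r: r["home_safety_score"], reverse=not worst)
    return scored[:n]


# Made-up records for illustration only.
sample_scores = [
    {"zip": "00001", "home_safety_score": 80},
    {"zip": "00002", "home_safety_score": None},
    {"zip": "00003", "home_safety_score": 36},
    {"zip": "00004", "home_safety_score": 55},
]

print([r["zip"] for r in rank_by_score(sample_scores, 2)])
print([r["zip"] for r in rank_by_score(sample_scores, 2, worst=False)])
```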

states() -> list[str]

Get a sorted list of all unique 2-letter state abbreviations in the dataset.

count() -> int

Get the total number of ZIP codes in the dataset.

zips() -> list[str]

Get all ZIP codes in the dataset as a list of strings.

search_city(city: str) -> list[dict]

Search ZIP codes by city name (case-insensitive partial match).

water.search_city("chicago")     # all Chicago ZIPs
water.search_city("san fran")    # partial match works
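
A case-insensitive partial match like the one described above can be sketched as a substring test on case-folded strings. `search_city_sketch` is a hypothetical name, the sample records are made up, and the real function may match differently (for example, by prefix only).

```python
def search_city_sketch(records, city):
    """Return records whose city contains the query, ignoring case."""
    needle = city.strip().casefold()
    if not needle:
        return []  # an empty query would otherwise match every record
    return [r for r in records if needle in r["city"].casefold()]


# Made-up records for illustration only.
sample_cities = [
    {"zip": "00001", "city": "San Francisco"},
    {"zip": "00002", "city": "South San Francisco"},
    {"zip": "00003", "city": "Chicago"},
]

print([r["zip"] for r in search_city_sketch(sample_cities, "san fran")])
```

Note that a substring match on "san fran" also returns "South San Francisco", which may or may not be what a caller wants.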

meta

Dataset metadata as a dict-like object.

water.meta["name"]            # 'ZipCheckup U.S. Water Quality Dataset'
water.meta["license"]         # 'CC-BY-4.0'
water.meta["source"]          # 'U.S. EPA Safe Drinking Water Information System (SDWIS)'
water.meta["updated"]         # '2026-03-17'
water.meta["total_zips"]      # number of ZIPs
water.meta["states_covered"]  # number of states
water.meta["fields"]          # dict of field name -> description

Data Fields

| Field | Type | Description |
| --- | --- | --- |
| zip | str | 5-digit U.S. ZIP code |
| city | str | City name |
| state | str | 2-letter state abbreviation |
| home_safety_score | int\|None | Composite score 0-100 |
| home_safety_grade | str | Letter grade: A / B / C / D / F |
| total_violations | int | Total violations in past 5 years |
| health_violations | int | Health-based violations in past 5 years |
| unresolved_violations | int | Currently unresolved violations |
| contaminant_count | int | Distinct health-based contaminants |
| health_contaminant_names | str | Semicolon-separated contaminant names |
| lead_level_mg_l | float\|None | 90th percentile lead level (mg/L) |
| copper_level_mg_l | float\|None | 90th percentile copper level (mg/L) |
| radon_zone | int\|None | EPA radon zone: 1 (highest) to 3 (lowest) |
| water_source | str | SW = Surface Water, GW = Groundwater |
| system_name | str | Primary water system name |
| pwsid | str | EPA Public Water System ID |
| population | int\|None | Population served |
| latitude | float | ZIP centroid latitude |
| longitude | float | ZIP centroid longitude |
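
Two fields usually need a little post-processing: `health_contaminant_names` is a single semicolon-separated string, and the lead and copper levels may be None. The sketch below splits the names and compares the 90th percentile levels with the EPA Lead and Copper Rule action levels (0.015 mg/L for lead, 1.3 mg/L for copper). Both helpers are hypothetical and not part of the package. The sample record is made up, and treating an empty string as "no contaminants" is an assumption.

```python
# EPA Lead and Copper Rule action levels, in mg/L.
LEAD_ACTION_LEVEL_MG_L = 0.015
COPPER_ACTION_LEVEL_MG_L = 1.3


def contaminant_list(record):
    """Split the semicolon-separated contaminant names into a clean list."""
    raw = record.get("health_contaminant_names") or ""
    return [name.strip() for name in raw.split(";") if name.strip()]


def exceeds_action_level(record):
    """Flag lead/copper levels above the action level; None counts as False."""
    lead = record.get("lead_level_mg_l")
    copper = record.get("copper_level_mg_l")
    return {
        "lead": lead is not None and lead > LEAD_ACTION_LEVEL_MG_L,
        "copper": copper is not None and copper > COPPER_ACTION_LEVEL_MG_L,
    }


# Made-up record for illustration only.
sample_record = {
    "health_contaminant_names": "Lead; Nitrate;",
    "lead_level_mg_l": 0.02,
    "copper_level_mg_l": None,
}

print(contaminant_list(sample_record))
print(exceeds_action_level(sample_record))
```

Action levels can change with new rulemaking, so it is worth checking current EPA guidance before relying on the constants above.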

Coverage

  • ZIP codes: 3,500+ (growing with each release)
  • States: All 50 U.S. states + D.C.
  • Violation window: Rolling 5 years
  • Update frequency: New versions published with each dataset refresh

Data Source

All data is derived from the EPA Safe Drinking Water Information System (SDWIS). Lead and copper levels come from EPA Lead and Copper Rule (LCR) sampling. Radon zones are county-level EPA classifications.

Home Safety Score is a composite 0-100 score that penalizes health-based violations, unresolved violations, lead exceedances, and contaminant count. Methodology: zipcheckup.com/about/home-safety-score/

License

Data: CC BY 4.0. Code: MIT.

Data by ZipCheckup.com -- sourced from EPA SDWIS.

Download files

Source Distribution

us_water_quality_data-2026.5.3.tar.gz (1.6 MB)

Uploaded Source

Built Distribution

us_water_quality_data-2026.5.3-py3-none-any.whl (1.7 MB)

Uploaded Python 3

File details

Details for the file us_water_quality_data-2026.5.3.tar.gz.

File hashes

Hashes for us_water_quality_data-2026.5.3.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 0b0865707c56ba9a40510d5454f64bdd3da2c72d5546158ef767dc05ecf68fa2 |
| MD5 | 3eb94822ba5c9a72b723a90f17e07827 |
| BLAKE2b-256 | 2ab5486b9375277a822bb4f7bc54759b6ef0dde1669cd976b2df71c8716766ab |

File details

Details for the file us_water_quality_data-2026.5.3-py3-none-any.whl.

File hashes

Hashes for us_water_quality_data-2026.5.3-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 7a9c5e4f43e0499b946d654ed4dffdf8e604f5b503c9b4d2da9f14eaa2d43535 |
| MD5 | 86201d9d7e14587699fa676d67e40237 |
| BLAKE2b-256 | 2a78d5e6b2250c641112655bacc1eb8a640cfe25d5afc4e2a0880fc1a5828b95 |
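
To check a manually downloaded file against the SHA256 digests listed above, something like the following sketch should work. It uses only the standard library. `sha256_of` and `matches_published_hash` are hypothetical helper names, and the file is assumed to keep its original name on disk.

```python
import hashlib
import os

# SHA256 digests copied from the File hashes sections of this page.
EXPECTED_SHA256 = {
    "us_water_quality_data-2026.5.3.tar.gz":
        "0b0865707c56ba9a40510d5454f64bdd3da2c72d5546158ef767dc05ecf68fa2",
    "us_water_quality_data-2026.5.3-py3-none-any.whl":
        "7a9c5e4f43e0499b946d654ed4dffdf8e604f5b503c9b4d2da9f14eaa2d43535",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path):
    """True only if the file name is known and its digest matches."""
    expected = EXPECTED_SHA256.get(os.path.basename(path))
    return expected is not None and sha256_of(path) == expected
```

For example, `matches_published_hash("us_water_quality_data-2026.5.3.tar.gz")` returns False for a file with an unknown name or a mismatched digest.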
