U.S. water quality and home safety data by ZIP code — violations, lead/copper, radon, PFAS, flood risk, home values, remediation costs from 50+ federal sources

us-water-quality-data

U.S. water quality data by ZIP code, packaged for Python. Includes violation history, lead/copper levels, radon zone classification, and Home Safety Scores for 3,500+ ZIP codes, sourced from the EPA Safe Drinking Water Information System (SDWIS).

License: CC BY 4.0 | Python 3.9+

Install

pip install us-water-quality-data

Quick Start

import us_water_quality_data as water

# Look up a specific ZIP code
record = water.lookup("10001")
print(record)
# {'zip': '10001', 'city': 'New York', 'state': 'NY',
#  'home_safety_score': 36, 'home_safety_grade': 'F',
#  'total_violations': 7, 'lead_level_mg_l': 0.01, ...}

# All ZIP codes in California
ca = water.get_state("CA")
print(f"{len(ca)} ZIP codes in CA")

# 10 worst scores in the country
worst = water.get_worst(10)
for z in worst:
    print(f"{z['zip']} {z['city']}, {z['state']}: {z['home_safety_score']}")

# 10 best scores
best = water.get_best(10)

# All states in the dataset
print(water.states())  # ['AK', 'AL', 'AR', ...]

# Total ZIP codes
print(water.count())  # 3,500+ (varies by release)

# Search by city
chicago = water.search_city("chicago")

# Dataset metadata
print(water.meta["updated"])        # '2026-03-17'
print(water.meta["total_zips"])     # number of ZIPs in this release
print(water.meta["states_covered"]) # 51 (50 states + D.C.)

API Reference

lookup(zip_code: str) -> dict | None

Look up the water quality record for a specific ZIP code. Short codes are zero-padded automatically. Returns None when the ZIP code is not in the dataset.

water.lookup("10001")   # dict
water.lookup("00000")   # None
water.lookup("6001")    # same as "06001"
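Leading zeros are often lost before a ZIP code ever reaches lookup(), for example when a CSV column is parsed as integers. The sketch below shows one way to normalize such values up front. normalize_zip is a hypothetical helper, not part of the package, and the package's own padding logic may differ.

```python
def normalize_zip(zip_code) -> str:
    """Return a 5-character ZIP string, restoring dropped leading zeros.

    Hypothetical helper for illustration; not part of us_water_quality_data.
    """
    text = str(zip_code).strip()
    # Keep only the 5-digit part of a ZIP+4 such as "10001-1234".
    text = text.split("-", 1)[0]
    if not (text.isascii() and text.isdigit()) or len(text) > 5:
        raise ValueError(f"not a ZIP code: {zip_code!r}")
    return text.zfill(5)


for raw in ("10001", "6001", 6001, "06001-1234"):
    print(repr(raw), "->", normalize_zip(raw))
```

The normalized string can then be passed to water.lookup(), which returns None for ZIP codes outside the dataset.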

get_state(state: str) -> list[dict]

Get all ZIP records for a given state. Case-insensitive.

water.get_state("CA")   # all California ZIPs
water.get_state("ny")   # works too
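Because get_state() returns plain dicts, a state-level summary needs only the standard library. The following is a minimal sketch, assuming records shaped like the Data Fields list; summarize is a hypothetical helper and the sample values are made up for illustration. Note that home_safety_score can be None, so it is filtered before averaging.

```python
from statistics import mean


def summarize(records):
    """Summarize a list of ZIP records (hypothetical helper)."""
    scores = [
        r["home_safety_score"]
        for r in records
        if r.get("home_safety_score") is not None
    ]
    return {
        "zips": len(records),
        "scored": len(scores),
        "mean_score": round(mean(scores), 1) if scores else None,
        "with_unresolved": sum(
            1 for r in records if (r.get("unresolved_violations") or 0) > 0
        ),
    }


# Illustrative records only; real ones would come from water.get_state("CA").
sample = [
    {"zip": "A", "home_safety_score": 60, "unresolved_violations": 0},
    {"zip": "B", "home_safety_score": None, "unresolved_violations": 2},
    {"zip": "C", "home_safety_score": 81, "unresolved_violations": 1},
]
print(summarize(sample))
```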

get_worst(n: int = 10) -> list[dict]

Get the ZIP codes with the worst (lowest) Home Safety Scores, sorted ascending.

get_best(n: int = 10) -> list[dict]

Get the ZIP codes with the best (highest) Home Safety Scores, sorted descending.
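The same kind of ranking can be reproduced on any subset of records, for instance the worst ZIP codes within one state. This is a sketch only: rank_by_score is a hypothetical helper, the sample values are made up, and dropping records whose score is None is an assumption (how get_worst() and get_best() treat missing scores is not stated here).

```python
import heapq


def rank_by_score(records, n=10, worst=True):
    """Return n records with the lowest (worst=True) or highest scores."""
    # Assumption: records without a score are left out of the ranking.
    scored = [r for r in records if r.get("home_safety_score") is not None]
    pick = heapq.nsmallest if worst else heapq.nlargest
    return pick(n, scored, key=lambda r: r["home_safety_score"])


# Illustrative records only; real ones would come from water.get_state(...).
sample = [
    {"zip": "A", "home_safety_score": 72},
    {"zip": "B", "home_safety_score": None},
    {"zip": "C", "home_safety_score": 36},
    {"zip": "D", "home_safety_score": 91},
]
print([r["zip"] for r in rank_by_score(sample, 2)])               # lowest first
print([r["zip"] for r in rank_by_score(sample, 2, worst=False)])  # highest first
```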

states() -> list[str]

Get a sorted list of all unique 2-letter state abbreviations in the dataset.

count() -> int

Get the total number of ZIP codes in the dataset.

zips() -> list[str]

Get all ZIP codes in the dataset as a list of strings.

search_city(city: str) -> list[dict]

Search ZIP codes by city name (case-insensitive partial match).

water.search_city("chicago")     # all Chicago ZIPs
water.search_city("san fran")    # partial match works
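A case-insensitive partial match can be sketched as a substring test on case-folded strings. This assumes "partial match" means substring matching; the package may match differently (for example, by prefix). match_city is a hypothetical helper and the records are illustrative.

```python
def match_city(records, query):
    """Return records whose city contains query, ignoring case."""
    needle = query.strip().casefold()
    return [r for r in records if needle in r["city"].casefold()]


# Illustrative records only; ZIP values are placeholders.
sample = [
    {"zip": "A", "city": "San Francisco"},
    {"zip": "B", "city": "South San Francisco"},
    {"zip": "C", "city": "Chicago"},
    {"zip": "D", "city": "Chicago Heights"},
]
print([r["city"] for r in match_city(sample, "san fran")])
print([r["city"] for r in match_city(sample, "CHICAGO")])
```

Under substring matching a broad query also returns neighboring cities such as "Chicago Heights", so filter on an exact city name when that matters.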

meta

Dataset metadata as a dict-like object.

water.meta["name"]            # 'ZipCheckup U.S. Water Quality Dataset'
water.meta["license"]         # 'CC-BY-4.0'
water.meta["source"]          # 'U.S. EPA Safe Drinking Water Information System (SDWIS)'
water.meta["updated"]         # '2026-03-17'
water.meta["total_zips"]      # number of ZIPs
water.meta["states_covered"]  # number of states
water.meta["fields"]          # dict of field name -> description
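The updated value can be used to check how old the installed dataset is. A minimal sketch, assuming the date is always an ISO string like the one shown above; days_since_update is a hypothetical helper, and a plain dict stands in for water.meta.

```python
from datetime import date


def days_since_update(meta, today=None):
    """Return the number of days between meta["updated"] and today."""
    updated = date.fromisoformat(meta["updated"])
    today = today or date.today()
    return (today - updated).days


# Stand-in for water.meta, using the date shown in the example above.
sample_meta = {"updated": "2026-03-17"}
print(days_since_update(sample_meta, today=date(2026, 4, 26)))  # 40
```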

Data Fields

Field                     Type         Description
zip                       str          5-digit U.S. ZIP code
city                      str          City name
state                     str          2-letter state abbreviation
home_safety_score         int|None     Composite score 0-100
home_safety_grade         str          Letter grade: A / B / C / D / F
total_violations          int          Total violations in past 5 years
health_violations         int          Health-based violations in past 5 years
unresolved_violations     int          Currently unresolved violations
contaminant_count         int          Distinct health-based contaminants
health_contaminant_names  str          Semicolon-separated contaminant names
lead_level_mg_l           float|None   90th percentile lead level (mg/L)
copper_level_mg_l         float|None   90th percentile copper level (mg/L)
radon_zone                int|None     EPA radon zone: 1 (highest) to 3 (lowest)
water_source              str          SW = Surface Water, GW = Groundwater
system_name               str          Primary water system name
pwsid                     str          EPA Public Water System ID
population                int|None     Population served
latitude                  float        ZIP centroid latitude
longitude                 float        ZIP centroid longitude
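Two of these fields usually need a little handling before use: health_contaminant_names is a single semicolon-separated string, and the lead and copper levels may be None. The sketch below splits the names and compares the 90th percentile levels with the EPA Lead and Copper Rule action levels (0.015 mg/L for lead, 1.3 mg/L for copper). The helper names are hypothetical and the sample record, including its contaminant names, is made up for illustration.

```python
LEAD_ACTION_LEVEL_MG_L = 0.015   # EPA action level for lead (15 ppb)
COPPER_ACTION_LEVEL_MG_L = 1.3   # EPA action level for copper


def contaminant_list(record):
    """Split the semicolon-separated contaminant names into a list."""
    raw = record.get("health_contaminant_names") or ""
    return [name.strip() for name in raw.split(";") if name.strip()]


def action_level_flags(record):
    """Report whether lead or copper exceeds its action level."""
    lead = record.get("lead_level_mg_l")
    copper = record.get("copper_level_mg_l")
    return {
        "lead": lead is not None and lead > LEAD_ACTION_LEVEL_MG_L,
        "copper": copper is not None and copper > COPPER_ACTION_LEVEL_MG_L,
    }


# Illustrative record only; real ones would come from water.lookup(...).
sample = {
    "health_contaminant_names": "Arsenic; Nitrate",
    "lead_level_mg_l": 0.02,
    "copper_level_mg_l": None,
}
print(contaminant_list(sample))
print(action_level_flags(sample))
```

A missing level is reported as False here, which means "no exceedance recorded", not "confirmed safe".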

Coverage

  • ZIP codes: 3,500+ (growing with each release)
  • States: All 50 U.S. states + D.C.
  • Violation window: Rolling 5 years
  • Update frequency: New versions published with each dataset refresh

Data Source

All data is derived from the EPA Safe Drinking Water Information System (SDWIS). Lead and copper levels come from EPA Lead and Copper Rule (LCR) sampling. Radon zones are county-level EPA classifications.

Home Safety Score is a composite 0-100 score that penalizes health-based violations, unresolved violations, lead exceedances, and contaminant count. Methodology: zipcheckup.com/about/home-safety-score/
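To make the idea of a penalty-based composite concrete, here is a toy sketch. It is not the published methodology: every weight below is invented for illustration, toy_score is a hypothetical function, and its output will not match the home_safety_score values in the dataset. The linked methodology page remains the authoritative description.

```python
# All weights are hypothetical and chosen only to illustrate the idea.
HYPOTHETICAL_WEIGHTS = {
    "health_violations": 8,
    "unresolved_violations": 10,
    "contaminant_count": 3,
}
HYPOTHETICAL_LEAD_PENALTY = 20
LEAD_ACTION_LEVEL_MG_L = 0.015  # EPA action level for lead


def toy_score(record):
    """Start from 100 and subtract a penalty per issue, floored at 0."""
    penalty = sum(
        weight * (record.get(field) or 0)
        for field, weight in HYPOTHETICAL_WEIGHTS.items()
    )
    lead = record.get("lead_level_mg_l")
    if lead is not None and lead > LEAD_ACTION_LEVEL_MG_L:
        penalty += HYPOTHETICAL_LEAD_PENALTY
    return max(0, 100 - penalty)


# Illustrative record only.
sample = {
    "health_violations": 2,
    "unresolved_violations": 1,
    "contaminant_count": 3,
    "lead_level_mg_l": 0.02,
}
print(toy_score(sample))  # 100 - (16 + 10 + 9 + 20) = 45
```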

License

Data: CC BY 4.0. Code: MIT.

Data by ZipCheckup.com -- sourced from EPA SDWIS.
