Skip to main content

Production-ready data connectors for 40 major government and research data sources

Project description

KRL Data Connectors

PyPI version Python Version License Documentation Status Tests Coverage Downloads

Institutional-grade, production-ready connectors for socioeconomic and policy data infrastructure

InstallationQuick StartDocumentationExamplesContributing


Overview

KRL Data Connectors provide standardized, robust interfaces for accessing a broad spectrum of socioeconomic, demographic, health, and environmental datasets. Designed for institutional workflows, these connectors ensure reproducibility, scalability, and operational reliability. KRL Data Connectors are a core component of the KRL Analytics Suite, supporting high-impact economic analysis, causal inference, and policy evaluation at scale.

Key Advantages

  • Unified API: Interact with diverse data sources via a consistent, type-safe interface.
  • Production-Ready: Engineered for operational resilience with structured logging, error handling, and retry logic.
  • Type-Safe: Full type hints and validation across all connectors.
  • Smart Caching: Minimize redundant API calls and optimize data retrieval.
  • Rich Metadata: Automatic metadata extraction and data profiling.
  • Comprehensive Testing: 2,800+ tests across 40 connectors, 80%+ coverage.
  • Quickstart Notebooks: Jupyter notebooks for rapid onboarding.
  • Secure API Key Management: Multiple secure credential resolution strategies.

Supported Data Sources

KRL Data Connectors deliver institutional access to 40 production-ready datasets across 14 domains:

Economic & Financial Data (8 connectors)

Data Source Domain Auth Required Update Frequency Coverage Status
FRED Economics Yes Daily/Real-time 800K+ series ✅ Production
BLS Labor Recommended Monthly National/State ✅ Production
BEA Economics Yes Quarterly/Annual National/Regional ✅ Production
OECD International No Varies Country-level ✅ Production
World Bank International No Annual Country-level ✅ Production
SEC Financial No Real-time Public filings ✅ Production
Treasury Financial No Daily Federal finances ✅ Production
FDIC Banking No Quarterly Bank data ✅ Production

Demographic & Labor Data (3 connectors)

Data Source Domain Auth Required Update Frequency Coverage Status
Census ACS Demographics Optional Annual All US geographies ✅ Production
Census CBP Business Optional Annual County-level ✅ Production
Census LEHD Employment No Quarterly County-level ✅ Production

Health & Wellbeing Data (5 connectors)

Data Source Domain Auth Required Update Frequency Coverage Status
HRSA Health No Annual HPSA/MUA/P ✅ Production
CDC WONDER Health No Varies County-level ✅ Production
County Health Rankings Health No Annual County-level ✅ Production
FDA Health No Real-time Drugs/devices ✅ Production
NIH Research No Daily Grants/projects ✅ Production

Environmental & Climate Data (5 connectors)

Data Source Domain Auth Required Update Frequency Coverage Status
EPA EJScreen Environment No Annual Block group ✅ Production
EPA Air Quality Environment No Hourly/Real-time Station-level ✅ Production
EPA Superfund Environment No Real-time Site-level ✅ Production
EPA Water Quality Environment No Real-time Facility-level ✅ Production
NOAA Climate Climate No Daily Station-level ✅ Production

Education Data (3 connectors)

Data Source Domain Auth Required Update Frequency Coverage Status
NCES Education No Annual School-level ✅ Production
College Scorecard Education Yes Annual Institution ✅ Production
IPEDS Education No Annual Institution ✅ Production

Housing & Urban Data (2 connectors)

Data Source Domain Auth Required Update Frequency Coverage Status
HUD Fair Market Rent Housing Yes Annual Metro/County ✅ Production
Zillow Research Housing No Monthly Metro/ZIP ✅ Production

Agricultural Data (2 connectors)

Data Source Domain Auth Required Update Frequency Coverage Status
USDA Food Atlas Agricultural Yes Annual County-level ✅ Production
USDA NASS Agricultural Yes Varies National/State ✅ Production

Crime & Justice Data (3 connectors)

Data Source Domain Auth Required Update Frequency Coverage Status
FBI UCR Crime Recommended Annual Agency-level ✅ Production
Bureau of Justice Justice No Annual National ✅ Production
Victims of Crime Justice No Annual State-level ✅ Production

Energy Data (1 connector)

Data Source Domain Auth Required Update Frequency Coverage Status
EIA Energy Yes Real-time National/State ✅ Production

Science & Research Data (2 connectors)

Data Source Domain Auth Required Update Frequency Coverage Status
USGS Geoscience No Real-time National ✅ Production
NSF Research No Daily Awards/grants ✅ Production

Transportation Data (1 connector)

Data Source Domain Auth Required Update Frequency Coverage Status
FAA Aviation No Real-time Airport/flight ✅ Production

Labor Safety Data (1 connector)

Data Source Domain Auth Required Update Frequency Coverage Status
OSHA Safety No Real-time Inspections ✅ Production

Social Services Data (2 connectors)

Data Source Domain Auth Required Update Frequency Coverage Status
Social Security Admin Social No Annual National ✅ Production
ACF Social No Annual State/County ✅ Production

Veterans Services Data (1 connector)

Data Source Domain Auth Required Update Frequency Coverage Status
VA Veterans No Real-time Facilities/benefits ✅ Production

Total: 40 Production-Ready Connectors | ✅ All Production | 🎉 100% Complete


Installation

This section describes installation options for integrating KRL Data Connectors into institutional environments.

# Basic installation
pip install krl-data-connectors

# With all optional dependencies
pip install krl-data-connectors[all]

# Development installation
pip install krl-data-connectors[dev]

Quick Start

The following examples illustrate initializing and using KRL Data Connectors for key data sources. All connectors are designed for direct integration into reproducible, scalable analytics pipelines.

County Business Patterns (CBP)

from krl_data_connectors import CountyBusinessPatternsConnector

# Initialize connector (API key detected from environment: CENSUS_API_KEY)
cbp = CountyBusinessPatternsConnector()

# Retrieve retail trade data for Rhode Island
retail_data = cbp.get_state_data(
    year=2021,
    state='44',  # Rhode Island FIPS code
    naics='44'   # Retail trade sector
)

print(f"Retrieved {len(retail_data)} records")
print(retail_data[['NAICS2017', 'ESTAB', 'EMP', 'PAYANN']].head())

LEHD Origin-Destination

from krl_data_connectors import LEHDConnector

# Initialize connector
lehd = LEHDConnector()

# Retrieve origin-destination employment flows
od_data = lehd.get_od_data(
    state='ri',
    year=2021,
    job_type='JT00',  # All jobs
    segment='S000'    # All workers
)

print(f"Retrieved {len(od_data)} origin-destination pairs")
print(od_data[['w_geocode', 'h_geocode', 'S000', 'SA01']].head())

FRED

from krl_data_connectors import FREDConnector

# Initialize connector (API key from FRED_API_KEY)
fred = FREDConnector()

# Fetch unemployment rate time series
unemployment = fred.get_series(
    series_id="UNRATE",
    observation_start="2020-01-01",
    observation_end="2023-12-31"
)

print(unemployment.head())

BLS

from krl_data_connectors import BLSConnector

# Initialize connector (API key from BLS_API_KEY)
bls = BLSConnector()

# Get unemployment rate for multiple states
unemployment = bls.get_series(
    series_ids=['LASST060000000000003', 'LASST440000000000003'],
    start_year=2020,
    end_year=2023
)

print(unemployment.head())

BEA

from krl_data_connectors import BEAConnector

# Initialize connector (API key from BEA_API_KEY)
bea = BEAConnector()

# Get GDP by state
gdp_data = bea.get_data(
    dataset='Regional',
    method='GetData',
    TableName='SAGDP2N',
    LineCode=1,
    Year='2021',
    GeoFips='STATE'
)

print(gdp_data.head())

Caching and Base Connector

All connectors inherit from BaseConnector, which provides standardized caching, configuration, and logging.

from krl_data_connectors import FREDConnector

# Enable automatic caching
fred = FREDConnector(
    api_key="your_api_key",
    cache_dir="/tmp/fred_cache",
    cache_ttl=3600  # 1 hour
)

# Cached responses are automatic
data1 = fred.get_series("UNRATE")  # Fetches from API
data2 = fred.get_series("UNRATE")  # Returns from cache

# Access cache statistics
stats = fred.cache.get_stats()
print(f"Hit rate: {stats['hit_rate']:.1f}%")

Architecture

KRL Data Connectors are engineered for extensibility and operational precision. Each connector extends a common BaseConnector, standardizing logging, configuration, caching, and request management.

BaseConnector Capabilities

The BaseConnector class implements:

  • Structured Logging: JSON logs with request and response metadata.
  • Configuration Management: Supports environment variables and YAML configuration.
  • Intelligent Caching: File-based and Redis caching with configurable TTL.
  • Error Handling: Automatic retries, API rate limiting, and timeouts.
  • Request Management: HTTP session pooling and connection reuse.
from abc import ABC, abstractmethod
from krl_core import get_logger, ConfigManager, FileCache

class BaseConnector(ABC):
    """Abstract base class for data connectors."""
    def __init__(self, api_key=None, cache_dir=None, cache_ttl=3600):
        self.logger = get_logger(self.__class__.__name__)
        self.config = ConfigManager()
        self.cache = FileCache(
            cache_dir=cache_dir,
            default_ttl=cache_ttl,
            namespace=self.__class__.__name__.lower()
        )
        # ... initialization

API Key Management

KRL Data Connectors resolve API credentials securely and automatically, supporting multiple strategies for institutional and development environments. For comprehensive details, see API_KEY_SETUP.md.

Credential Resolution Order

  1. Environment Variables (recommended for production)
  2. Configuration file at ~/.krl/apikeys (recommended for development)
  3. Direct assignment in code (not recommended for production)

Example: Environment Variables

export BEA_API_KEY="your_bea_key"
export FRED_API_KEY="your_fred_key"
export BLS_API_KEY="your_bls_key"
export CENSUS_API_KEY="your_census_key"

Example: Configuration File

mkdir -p ~/.krl
cat > ~/.krl/apikeys << EOF
BEA API KEY: your_bea_key
FRED API KEY: your_fred_key
BLS API KEY: your_bls_key
CENSUS API: your_census_key
EOF
chmod 600 ~/.krl/apikeys

Obtaining API Keys

Service Required? Registration URL
CBP/Census Optional https://api.census.gov/data/key_signup.html
FRED Yes https://fred.stlouisfed.org/docs/api/api_key.html
BLS Recommended* https://www.bls.gov/developers/home.htm
BEA Yes https://apps.bea.gov/api/signup/
LEHD No N/A

*BLS is accessible without a key but with reduced rate limits.

Configuration Utilities

KRL Data Connectors provide utilities for automatic discovery of configuration files:

from krl_data_connectors import find_config_file, BEAConnector

config_path = find_config_file('apikeys')
print(f"Config found at: {config_path}")

# Connectors use config file or environment variables automatically
bea = BEAConnector()

Configuration

KRL Data Connectors support flexible configuration via environment variables and YAML files, enabling precise control over credentials, caching, and logging.

Environment Variables

# API Keys
export CENSUS_API_KEY="your_census_key"
export FRED_API_KEY="your_fred_key"
export BLS_API_KEY="your_bls_key"
export BEA_API_KEY="your_bea_key"

# Cache settings
export KRL_CACHE_DIR="~/.krl_cache"
export KRL_CACHE_TTL="3600"

# Logging
export KRL_LOG_LEVEL="INFO"
export KRL_LOG_FORMAT="json"

YAML Configuration File

fred:
  api_key: "your_fred_key"
  base_url: "https://api.stlouisfed.org/fred"
  timeout: 30

census:
  api_key: "your_census_key"
  base_url: "https://api.census.gov/data"

cache:
  directory: "~/.krl_cache"
  ttl: 3600

logging:
  level: "INFO"
  format: "json"

Apply configuration in code:

from krl_core import ConfigManager

config = ConfigManager("config.yaml")
fred = FREDConnector(api_key=config.get("fred.api_key"))

Connector Catalog

KRL Data Connectors deliver reliable, scalable integration with the following data sources. All connectors are engineered for institutional-grade reliability and seamless analytics integration.

Production-Ready Connectors

In Development and Planned

  • CDC WONDER: Mortality and natality data (API non-functional; web interface recommended).
  • USDA Food Environment Atlas: Food access, insecurity, and local food systems.
  • OECD, World Bank, College Scorecard, IPEDS, Superfund Sites, and more: See ROADMAP.md for the full development roadmap.

Roadmap and Quality Standards

KRL Data Connectors are developed according to a structured roadmap, targeting 40 connectors across all major institutional domains. Connectors are prioritized by institutional demand, API availability, and domain coverage.

Quality controls:

  • Minimum 90% test coverage with comprehensive unit tests
  • Full type hints and validation on all public methods
  • Robust error handling and informative error messages
  • Intelligent, configurable caching
  • Structured JSON logging
  • Docstrings, usage examples, and quickstart notebooks
  • Secure API key management and input validation

For implementation schedules and API specifications, see ROADMAP.md.


Testing

KRL Data Connectors implement a 10-layer testing architecture following industry best practices from FAANG, fintech, and defense sectors. All tooling is open-source (OSS).

Testing Stack

Layer Purpose Tools Status
1. Unit Tests Individual function correctness pytest, hypothesis ✅ 408 tests, 73% coverage
2. Integration Component interactions pytest, requests-mock ✅ Implemented
3. E2E Tests Full workflow validation playwright 🔄 Planned
4. Performance Load & stress testing locust, pytest-benchmark 🔄 Planned
5. SAST Static security analysis bandit, safety, mypy ✅ Configured
6. DAST Runtime security testing OWASP ZAP 🔄 Planned
7. Mutation Test quality measurement mutmut, hypothesis 🔄 Planned
8. Contract Type & interface validation pydantic, mypy ✅ Configured
9. Penetration Ethical hacking assessment metasploit, burp 📅 Annual
10. Monitoring Continuous validation GitHub Actions, Snyk ✅ Active

Quick Test Commands

# Run all tests
make test

# Run with coverage
make coverage

# Run security scans
make security

# Run type checking
make type-check

# Full CI simulation
make ci

# See all available commands
make help

Coverage Goals

  • Current: 73.30% overall, 408 tests passing
  • Target: 90%+ line coverage, 85%+ branch coverage
  • Mutation Goal: 90%+ kill rate

For detailed testing guide, see docs/TESTING_GUIDE.md.


Development

Establish a reproducible development environment and contribute to KRL Data Connectors using the following workflow:

# Clone the repository
git clone https://github.com/KR-Labs/krl-data-connectors.git
cd krl-data-connectors

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate  # or `.venv\Scripts\activate` on Windows

# Install development and test dependencies
pip install -e ".[dev,test]"

# Run pre-commit hooks
pre-commit install
pre-commit run --all-files

# Execute tests
pytest

# Build documentation
cd docs && make html

Contributing

KR-Labs welcomes contributions that advance the scalability, reliability, and coverage of KRL Data Connectors. Review the CONTRIBUTING.md guidelines prior to submitting changes.

All contributors must sign the Contributor License Agreement (CLA) before code can be merged.


License

KRL Data Connectors are distributed under the Apache License 2.0. See the LICENSE file for full license text.

License highlights:

  • Permits commercial use, modification, and redistribution
  • Patent grant included
  • Compatible with proprietary software

Support

For technical support, institutional deployment, and community engagement:


Related Projects

KRL Data Connectors are part of the KR-Labs analytics infrastructure ecosystem:


Citation

To cite KRL Data Connectors in research or institutional documentation, use:

@software{krl_data_connectors,
  title = {KRL Data Connectors: Standardized Interfaces for Economic and Social Data},
  author = {KR-Labs},
  year = {2025},
  url = {https://github.com/KR-Labs/krl-data-connectors},
  license = {Apache-2.0}
}

Built for reproducibility, scalability, and institutional trust by KR-Labs

© 2025 KR-Labs. All rights reserved.
KR-Labs is a trademark of Quipu Research Labs, LLC, a subsidiary of Sudiata Giddasira, Inc.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

krl_data_connectors-0.3.0.tar.gz (158.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

krl_data_connectors-0.3.0-py3-none-any.whl (176.3 kB view details)

Uploaded Python 3

File details

Details for the file krl_data_connectors-0.3.0.tar.gz.

File metadata

  • Download URL: krl_data_connectors-0.3.0.tar.gz
  • Upload date:
  • Size: 158.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for krl_data_connectors-0.3.0.tar.gz
Algorithm Hash digest
SHA256 0ea6cd5cde066df1c70008d64473a6ab33cd95ef1b069ffc051ab09160125185
MD5 019502348b7610fa189955d543d7a58e
BLAKE2b-256 3d5ff2d95355aa58b4ce4924ae9cbcf5904f4ad49dbfc80e6310548eeb2e69b3

See more details on using hashes here.

File details

Details for the file krl_data_connectors-0.3.0-py3-none-any.whl.

File metadata

File hashes

Hashes for krl_data_connectors-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 49ef63e7303405906b1579a154586affb8ce5b959945215a2b39d878a092b5a5
MD5 1a15dfeac203f0e68034139699cffcc2
BLAKE2b-256 072f54cf964f94d6bf01e65de5ffef93049d1732a5d1d416d84dedce5c474d1c

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page