Skip to main content

Production-ready data connectors for 52 government and research data sources with 2,098 automated tests

Project description

KR-Labs Logo

KRL Data Connectors

Production-grade connectors for socioeconomic, demographic, and policy data infrastructure.

License Python 3.9+ Build Tests Comprehensive Tests Lint Security License Compliance Code style: black

InstallationQuick StartDocumentationExamplesContributing


Overview

KRL Data Connectors provide a unified, production-grade interface for institutional access to socioeconomic, demographic, health, and environmental datasets. Designed for operational discipline and engineered reliability, these connectors form the backbone of the KRL Analytics Suite, supporting robust economic analysis, causal inference, and policy evaluation at scale.

Built on a foundation of 73 source modules, 52 connector implementations across 14 domains, and validated by over 2,098 automated tests, this library represents a comprehensive data infrastructure solution for research and production environments.

Infrastructure Guarantees

KRL Data Connectors guarantee:

  • Unified API Surface: Consistent, type-safe interfaces across all 52 connector implementations spanning 14 domains.
  • Operational Reliability: Structured logging, resilient error handling, and automated retry mechanisms backed by comprehensive integration testing.
  • Type Safety and Validation: Complete type hints across 73 source modules with runtime input validation and contract testing.
  • Optimized Caching: Intelligent caching to reduce redundant data access and improve throughput, with configurable TTL and multiple backend support.
  • Automated Metadata Extraction: Standardized profiling of dataset metadata for downstream integration and data cataloging.
  • Comprehensive Testing: 2,098 automated tests across unit, integration, and contract layers, maintaining sustained coverage above 78%.
  • Secure Credential Management: Institutional-grade credential resolution with multiple configuration sources and environment isolation.
  • Production Documentation: 16 quickstart notebooks, comprehensive API documentation, and operational runbooks for deployment scenarios.

Supported Data Sources

KRL Data Connectors provide institutional-grade access to 52 production-ready connector implementations across 14 domains. Each domain is represented by connectors engineered for reliability, operational consistency, and seamless integration with analytical workflows. The library encompasses 73 source modules and is validated by 2,098 automated tests ensuring data integrity and API stability.

Economic & Financial Data (8 connectors)

Data Source Domain Auth Required Update Frequency Coverage Status
FRED Economics Yes Daily/Real-time 800K+ series Production
BLS Labor Recommended Monthly National/State Production
BEA Economics Yes Quarterly/Annual National/Regional Production
OECD International No Varies Country-level Production
World Bank International No Annual Country-level Production
SEC Financial No Real-time Public filings Production
Treasury Financial No Daily Federal finances Production
FDIC Banking No Quarterly Bank data Production

Demographic & Labor Data (3 connectors)

Institutional connectors for core demographic and labor datasets.

Data Source Domain Auth Required Update Frequency Coverage Status
Census ACS Demographics Optional Annual All US geographies Production
Census CBP Business Optional Annual County-level Production
Census LEHD Employment No Quarterly County-level Production

Health & Wellbeing Data (5 connectors)

Connectors for health outcomes, provider, and wellbeing datasets.

Data Source Domain Auth Required Update Frequency Coverage Status
HRSA Health No Annual HPSA/MUA/P Production
CDC WONDER Health No Varies County-level Production
County Health Rankings Health No Annual County-level Production
FDA Health No Real-time Drugs/devices Production
NIH Research No Daily Grants/projects Production

Environmental & Climate Data (5 connectors)

Production connectors for environmental, air, and climate data.

Data Source Domain Auth Required Update Frequency Coverage Status
EPA EJScreen Environment No Annual Block group Production
EPA Air Quality Environment No Hourly/Real-time Station-level Production
EPA Superfund Environment No Real-time Site-level Production
EPA Water Quality Environment No Real-time Facility-level Production
NOAA Climate Climate No Daily Station-level Production

Education Data (3 connectors)

Institutional connectors for education and postsecondary data.

Data Source Domain Auth Required Update Frequency Coverage Status
NCES Education No Annual School-level Production
College Scorecard Education Yes Annual Institution Production
IPEDS Education No Annual Institution Production

Housing & Urban Data (2 connectors)

Production-ready connectors for housing and urban indicators.

Data Source Domain Auth Required Update Frequency Coverage Status
HUD Fair Market Rent Housing Yes Annual Metro/County Production
Zillow Research Housing No Monthly Metro/ZIP Production

Agricultural Data (2 connectors)

Connectors for agricultural and food environment data.

Data Source Domain Auth Required Update Frequency Coverage Status
USDA Food Atlas Agricultural Yes Annual County-level Production
USDA NASS Agricultural Yes Varies National/State Production

Crime & Justice Data (3 connectors)

Production connectors for crime and justice data.

Data Source Domain Auth Required Update Frequency Coverage Status
FBI UCR Crime Recommended Annual Agency-level Production
Bureau of Justice Justice No Annual National Production
Victims of Crime Justice No Annual State-level Production

Energy Data (1 connector)

Institutional connector for energy datasets.

Data Source Domain Auth Required Update Frequency Coverage Status
EIA Energy Yes Real-time National/State Production

Science & Research Data (2 connectors)

Connectors for geoscience and research awards data.

Data Source Domain Auth Required Update Frequency Coverage Status
USGS Geoscience No Real-time National Production
NSF Research No Daily Awards/grants Production

Transportation Data (1 connector)

Production connector for aviation and transportation data.

Data Source Domain Auth Required Update Frequency Coverage Status
FAA Aviation No Real-time Airport/flight Production

Labor Safety Data (1 connector)

Connector for labor safety and inspection data.

Data Source Domain Auth Required Update Frequency Coverage Status
OSHA Safety No Real-time Inspections Production

Social Services Data (2 connectors)

Connectors for social services and benefits data.

Data Source Domain Auth Required Update Frequency Coverage Status
Social Security Admin Social No Annual National Production
ACF Social No Annual State/County Production

Veterans Services Data (1 connector)

Connector for veterans services and benefits data.

Data Source Domain Auth Required Update Frequency Coverage Status
VA Veterans No Real-time Facilities/benefits Production

Total: 52 Production-Ready Connector Implementations | All connectors maintained with continuous integration, automated testing, and security scanning.


Security & IP Protection

KRL Data Connectors are developed and maintained under a comprehensive, audit-grade security and intellectual property (IP) protection framework. The repository implements a 10-layer defense strategy with 6 automated CI/CD workflows, continuous security scanning, and zero-tolerance secret detection to ensure compliance, confidentiality, and operational integrity.

Compliance and Protection Layers

Layer Control Area Implementation Coverage
1 Legal Protection (Copyright, Trademark, License) Automated header validation 100% of 136 total files
2 Technical Protection (Secret Scanning) Gitleaks, GitHub Advanced Security, Pre-commit Repository-wide, all commits
3 Build Verification (CI/CD Security) 6 GitHub Actions workflows All pull requests and pushes
4 License Compliance Automated dependency scanning Apache 2.0, GPL/AGPL exclusion
5 Continuous Monitoring & Response Multi-scanner validation, SARIF reporting Real-time detection and alerts
6 Static Application Security Testing (SAST) Bandit, Safety, CodeQL All Python code
7 Dependency Security Trivy, pip-audit All dependencies, daily scans
8 Type & Contract Verification mypy strict mode, pydantic validation All public APIs
9 Code Quality Gates Black, isort, flake8, ruff Pre-commit and CI enforcement
10 Documentation & Auditability Structured logging, SBOM generation All operations logged

Automated Security Controls

All code contributions and repository changes are subject to automated validation through 6 continuous integration workflows:

  • Secret Detection: Gitleaks, GitHub Advanced Security, pre-commit hooks with zero false-negative tolerance
  • Vulnerability Assessment: Trivy container scanning, CodeQL semantic analysis, SARIF reporting
  • Dependency Security: CVE scanning, pip-audit, safety checks with automatic PR creation for updates
  • License Compliance: GPL/AGPL blocklisting, Apache 2.0 compatibility verification, SBOM generation
  • Static Analysis: Bandit security linting, Safety vulnerability scanning, mypy strict type checking
  • Code Quality: Black formatting (100% compliance), isort import ordering, flake8 style enforcement
  • Copyright Validation: Automated header verification across all source files with trademark protection

Repository Validation Status

Continuous validation across 2,098 automated tests confirms:

  • Zero secrets detected across 1,000+ commit history
  • 100% of 136 files with compliant copyright headers
  • Full adherence to Apache 2.0 licensing with no GPL/AGPL dependencies
  • 78%+ sustained test coverage across all connectors
  • All 6 CI/CD workflows passing with automated quality gates

Contributor Compliance

All contributions undergo rigorous automated validation:

  • Credentials and secrets automatically blocked at commit time via pre-commit hooks
  • All pull requests subject to 6 CI/CD workflows including security, testing, and compliance validation
  • Pre-commit hooks and automated tooling required for all contributors
  • Signed Contributor License Agreement (CLA) required for code contributions
  • Automated dependency updates via Dependabot with security patch prioritization

For detailed security practices, vulnerability reporting procedures, and compliance documentation, refer to SECURITY.md.


Installation

To ensure seamless integration with institutional environments, KRL Data Connectors support multiple installation profiles tailored for production, development, and extended use cases.

# Basic installation
pip install krl-data-connectors

# With all optional dependencies
pip install krl-data-connectors[all]

# Development installation
pip install krl-data-connectors[dev]

Quick Start

The following examples demonstrate reproducible initialization and usage of KRL Data Connectors for primary data sources. All connectors are designed for direct integration into institutional analytics pipelines.

County Business Patterns (CBP)

from krl_data_connectors import CountyBusinessPatternsConnector

# Initialize connector using environment variable for API key
cbp = CountyBusinessPatternsConnector()

# Retrieve retail trade data for Rhode Island (FIPS 44, NAICS 44)
retail_data = cbp.get_state_data(
    year=2021,
    state='44',
    naics='44'
)

print(f"Records: {len(retail_data)}")
print(retail_data[['NAICS2017', 'ESTAB', 'EMP', 'PAYANN']].head())

LEHD Origin-Destination

from krl_data_connectors import LEHDConnector

# Initialize connector
lehd = LEHDConnector()

# Retrieve origin-destination employment flows for RI, 2021
od_data = lehd.get_od_data(
    state='ri',
    year=2021,
    job_type='JT00',
    segment='S000'
)

print(f"Origin-destination pairs: {len(od_data)}")
print(od_data[['w_geocode', 'h_geocode', 'S000', 'SA01']].head())

FRED

from krl_data_connectors import FREDConnector

# Initialize connector using FRED_API_KEY environment variable
fred = FREDConnector()

# Fetch unemployment rate time series
unemployment = fred.get_series(
    series_id="UNRATE",
    observation_start="2020-01-01",
    observation_end="2023-12-31"
)

print(unemployment.head())

BLS

from krl_data_connectors import BLSConnector

# Initialize connector using BLS_API_KEY environment variable
bls = BLSConnector()

# Retrieve unemployment rate for selected states
unemployment = bls.get_series(
    series_ids=['LASST060000000000003', 'LASST440000000000003'],
    start_year=2020,
    end_year=2023
)

print(unemployment.head())

BEA

from krl_data_connectors import BEAConnector

# Initialize connector using BEA_API_KEY environment variable
bea = BEAConnector()

# Retrieve GDP by state for 2021
gdp_data = bea.get_data(
    dataset='Regional',
    method='GetData',
    TableName='SAGDP2N',
    LineCode=1,
    Year='2021',
    GeoFips='STATE'
)

print(gdp_data.head())

Caching and Base Connector

All connectors inherit from BaseConnector, which standardizes caching, configuration, and logging for operational consistency.

from krl_data_connectors import FREDConnector

# Enable file-based caching with 1-hour TTL
fred = FREDConnector(
    api_key="your_api_key",
    cache_dir="/tmp/fred_cache",
    cache_ttl=3600
)

# Cached responses are handled automatically
data1 = fred.get_series("UNRATE")  # API call
data2 = fred.get_series("UNRATE")  # Served from cache

# Retrieve cache statistics
stats = fred.cache.get_stats()
print(f"Cache hit rate: {stats['hit_rate']:.1f}%")

Architecture

KRL Data Connectors are built on a modular, auditable architecture supporting institutional-scale deployment. The codebase spans 73 source modules organized across 14 domain-specific directories, with all 52 connector implementations deriving from a unified BaseConnector that enforces interface consistency, operational controls, and standardized behavior.

BaseConnector Foundation

The BaseConnector class provides institutional-grade infrastructure for all 52 connector implementations:

  • Structured Logging: JSON-formatted logs with complete request/response metadata and audit trails
  • Consistent Configuration: Environment variable and YAML-based configuration for credentials, caching, and logging
  • Standardized Caching: File-based and memory caching with configurable TTL across all connectors
  • Error Handling: Uniform exception handling, retry logic, and rate limit management
  • Type Safety: Full mypy strict mode compliance with pydantic validation across all public APIs
  • Request Management: HTTP session pooling, connection reuse, and automated timeout handling
from abc import ABC, abstractmethod
from krl_core import get_logger, ConfigManager, FileCache

class BaseConnector(ABC):
    """Abstract base class for all data connectors. Enforces interface and operational consistency."""
    def __init__(self, api_key=None, cache_dir=None, cache_ttl=3600):
        self.logger = get_logger(self.__class__.__name__)
        self.config = ConfigManager()
        self.cache = FileCache(
            cache_dir=cache_dir,
            default_ttl=cache_ttl,
            namespace=self.__class__.__name__.lower()
        )
        # ... initialization

API Key Management

KRL Data Connectors implement secure, automated credential resolution for institutional and development use. The credential management system ensures operational consistency and auditability. See API_KEY_SETUP.md for procedural details.

Credential Resolution Order

API credentials are resolved in the following order:

  1. Environment Variables (preferred for production)
  2. Configuration file at ~/.krl/apikeys (recommended for development/testing)
  3. Direct assignment in code (permitted for testing only)

Example: Environment Variables

export BEA_API_KEY="your_bea_key"
export FRED_API_KEY="your_fred_key"
export BLS_API_KEY="your_bls_key"
export CENSUS_API_KEY="your_census_key"

Example: Configuration File

mkdir -p ~/.krl
cat > ~/.krl/apikeys << EOF
BEA API KEY: your_bea_key
FRED API KEY: your_fred_key
BLS API KEY: your_bls_key
CENSUS API: your_census_key
EOF
chmod 600 ~/.krl/apikeys

Obtaining API Keys

Service Required? Registration URL
CBP/Census Optional https://api.census.gov/data/key_signup.html
FRED Yes https://fred.stlouisfed.org/docs/api/api_key.html
BLS Recommended* https://www.bls.gov/developers/home.htm
BEA Yes https://apps.bea.gov/api/signup/
LEHD No N/A

*BLS is accessible without a key but with reduced rate limits.

Configuration Utilities

Automatic discovery and management of configuration files are supported:

from krl_data_connectors import find_config_file, BEAConnector

config_path = find_config_file('apikeys')
print(f"Config found at: {config_path}")

# Connectors resolve credentials from configuration or environment
bea = BEAConnector()

Configuration

KRL Data Connectors support robust, institution-ready configuration via environment variables and YAML files. This enables precise management of credentials, caching, and logging for reproducible deployments.

Environment Variables

# API Keys
export CENSUS_API_KEY="your_census_key"
export FRED_API_KEY="your_fred_key"
export BLS_API_KEY="your_bls_key"
export BEA_API_KEY="your_bea_key"

# Cache settings
export KRL_CACHE_DIR="~/.krl_cache"
export KRL_CACHE_TTL="3600"

# Logging
export KRL_LOG_LEVEL="INFO"
export KRL_LOG_FORMAT="json"

YAML Configuration File

fred:
  api_key: "your_fred_key"
  base_url: "https://api.stlouisfed.org/fred"
  timeout: 30

census:
  api_key: "your_census_key"
  base_url: "https://api.census.gov/data"

cache:
  directory: "~/.krl_cache"
  ttl: 3600

logging:
  level: "INFO"
  format: "json"

Apply configuration in code:

from krl_core import ConfigManager

config = ConfigManager("config.yaml")
fred = FREDConnector(api_key=config.get("fred.api_key"))

Connector Catalog

KRL Data Connectors enable reliable, scalable integration with the following data sources. All connectors are engineered to institutional standards for reliability and seamless analytics integration.

Production-Ready Connectors

In Development and Planned

  • CDC WONDER: Mortality and natality data (API non-functional; web interface recommended).
  • USDA Food Environment Atlas: Food access, insecurity, and local food systems.
  • OECD, World Bank, College Scorecard, IPEDS, Superfund Sites, and additional sources: Refer to ROADMAP.md for the complete development roadmap.

Roadmap and Quality Standards

KRL Data Connectors are developed in accordance with a structured roadmap, with 52 connector implementations currently in production across major institutional domains. Prioritization is driven by institutional requirements, API stability, and comprehensive domain coverage.

Quality Standards (Applied to All 52 Connectors):

  • ≥78% test coverage sustained across 2,098 automated tests
  • Full type hints with mypy strict mode compliance on all public methods
  • Pydantic validation for all API contracts and data models
  • Robust error handling with structured diagnostics and retry logic
  • Configurable caching (file-based and memory) with intelligent TTL management
  • Structured JSON logging for operational transparency and audit trails
  • Comprehensive documentation with 16 quickstart Jupyter notebooks
  • Secure API key management with automated validation and credential resolution

For implementation milestones and API specifications, consult ROADMAP.md.


Testing

KRL Data Connectors utilize a 10-layer, open-source testing architecture aligned with institutional best practices. All tests and validation tools are fully auditable and reproducible.

Testing Model

Layer Purpose Tools Status
1. Unit Tests Function correctness pytest, hypothesis Implemented (2,098 tests, 78%+ coverage)
2. Integration Component interaction pytest, requests-mock Implemented
3. E2E Tests Workflow validation playwright Planned
4. Performance Load/stress testing locust, pytest-benchmark Planned
5. SAST Static security analysis bandit, safety, mypy Active in CI
6. DAST Runtime security testing OWASP ZAP Planned
7. Mutation Test quality measurement mutmut, hypothesis Planned
8. Contract Type/interface validation pydantic, mypy strict Active in CI
9. Penetration Security assessment metasploit, burp Annual
10. Monitoring Continuous validation GitHub Actions (6 workflows), Snyk Active

Test Execution

# Run all tests
make test

# Run with coverage
make coverage

# Run security scans
make security

# Run type checking
make type-check

# Simulate full CI pipeline
make ci

# List available commands
make help

Coverage Objectives

  • Current: 78%+ overall test coverage sustained across 2,098 automated tests
  • Target: ≥90% line coverage, ≥85% branch coverage across all 52 connector implementations
  • Mutation: ≥90% kill rate for test quality validation

For detailed procedures, see docs/TESTING_GUIDE.md.


Development

All development must ensure reproducibility and full compliance with institutional contribution standards. Establish a local environment and follow the verified workflow:

# Clone the repository
git clone https://github.com/KR-Labs/krl-data-connectors.git
cd krl-data-connectors

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate  # or `.venv\Scripts\activate` on Windows

# Install development and test dependencies
pip install -e ".[dev,test]"

# Run pre-commit hooks for code compliance
pre-commit install
pre-commit run --all-files

# Execute tests
pytest

# Build documentation
cd docs && make html

Contributing

All contributions must maintain reproducibility, compliance, and operational reliability. Review CONTRIBUTING.md before submitting changes. A signed Contributor License Agreement (CLA) is required for all code contributions.


License

KRL Data Connectors are distributed under the Apache License 2.0. For the full license text, refer to the LICENSE file.

Key License Terms:

  • Permits commercial use, modification, and redistribution
  • Patent grant included
  • Compatible with proprietary software

Support

Use the following channels for verified support and communication:


Related Projects

KRL Data Connectors are a core component of the KR-Labs analytics infrastructure ecosystem:


Citation

For institutional or research citation of KRL Data Connectors, use the following BibTeX entry:

@software{krl_data_connectors,
  title = {KRL Data Connectors: Standardized Interfaces for Economic and Social Data},
  author = {KR-Labs},
  year = {2025},
  url = {https://github.com/KR-Labs/krl-data-connectors},
  license = {Apache-2.0}
}

Engineered for reproducibility, scalability, and institutional trust by KR-Labs

© 2025 KR-Labs. All rights reserved.
KR-Labs is a trademark of Quipu Research Labs, LLC, a subsidiary of Sudiata Giddasira, Inc.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

krl_data_connectors-0.3.2.tar.gz (237.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

krl_data_connectors-0.3.2-py3-none-any.whl (262.2 kB view details)

Uploaded Python 3

File details

Details for the file krl_data_connectors-0.3.2.tar.gz.

File metadata

  • Download URL: krl_data_connectors-0.3.2.tar.gz
  • Upload date:
  • Size: 237.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for krl_data_connectors-0.3.2.tar.gz
Algorithm Hash digest
SHA256 03b184b56ee593d9fe916190acb80b7435d79eedcf6abe6f2e074caf6edbd60a
MD5 735b2e66e89497b874ddcd7427e7646d
BLAKE2b-256 7a3af5a6a42e07513955552613a61a9d02c526c0dd3975952d094979426719df

See more details on using hashes here.

Provenance

The following attestation bundles were made for krl_data_connectors-0.3.2.tar.gz:

Publisher: publish-pypi.yml on KR-Labs/krl-data-connectors

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file krl_data_connectors-0.3.2-py3-none-any.whl.

File metadata

File hashes

Hashes for krl_data_connectors-0.3.2-py3-none-any.whl
Algorithm Hash digest
SHA256 127f62bf7d0e8894a8d489c567390de7b15fc497c7f16cce7f5cd02dfea8c1a0
MD5 5165e15b5096d09f8bd54d3c2ff76f38
BLAKE2b-256 8334229acd61fe1e0529e4a7cb72bb71c408e7cb9f65e7955f32dc32cef4d6d2

See more details on using hashes here.

Provenance

The following attestation bundles were made for krl_data_connectors-0.3.2-py3-none-any.whl:

Publisher: publish-pypi.yml on KR-Labs/krl-data-connectors

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page