Locaria Integrated Testing Framework

A lightweight, automated testing system for data pipelines and tools. Focuses on business-logic validation, data quality checks, and operational sanity tests rather than UI or cosmetic testing.

Features

Business Logic Validation - Test that time splits sum to 100%, financial ratios are within bounds, etc.
Data Quality Checks - Schema validation, null checks, row count sanity, data freshness
Configurable Thresholds - Firestore-based configuration for easy threshold updates
Integrated Logging - Sheet Logger integration for persistent test result storage
Email Alerts - Real-time failure notifications via existing email manager API
Pipeline-Specific Tests - Custom business logic validation for different data domains
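
To make the "time splits sum to 100%" rule concrete, here is a minimal sketch in plain Python. It does not use the framework's API: the function name `time_splits_sum_to_100` and the default tolerance are assumptions chosen for illustration (the tolerance mirrors the `time_split_tolerance.precision` default of 0.01).

```python
import math


def time_splits_sum_to_100(splits, tolerance=0.01):
    """Return True when percentage splits add up to 100 within `tolerance`."""
    total = math.fsum(splits)  # fsum limits floating-point accumulation error
    return abs(total - 100.0) <= tolerance


# Hypothetical per-employee time splits, in percent
print(time_splits_sum_to_100([40.0, 35.0, 25.0]))
print(time_splits_sum_to_100([40.0, 35.0, 20.0]))
```

Comparing against a tolerance rather than testing for exact equality avoids false failures caused by floating-point rounding.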

Quick Start

Basic Usage

from modules.integrated_tests import create_testkit, SchemaTests, DataQualityTests, FreshnessTests

# Initialize the testing framework for a repository and pipeline
testkit = create_testkit("locate_2_pulls", "daily_updates")

# Initialize test classes
schema_tests = SchemaTests(testkit)
data_quality_tests = DataQualityTests(testkit)
freshness_tests = FreshnessTests(testkit)

try:
    # Your data pipeline code (extract_data, transform_data, and load_to_bq are your own functions)
    df = extract_data()

    # Stage 1: Intake tests
    schema_tests.check_required_columns(df, ["employee_id", "date", "hours"])
    schema_tests.check_data_types(df, {"employee_id": "object", "hours": "float64"})

    # Stage 2: Transform tests
    df_transformed = transform_data(df)
    # Row count check; check_row_count_change() is documented under DataQualityTests
    data_quality_tests.check_row_count_change(df_transformed, "table_name", "append")

    # Stage 3: Load tests
    load_to_bq(df_transformed, table="finance.time_splits")
    freshness_tests.check_data_freshness(df_transformed, "timestamp")

finally:
    # Always finalize the test run
    testkit.finalize_run()

Environment Setup

The framework automatically integrates with the existing locate_2_pulls configuration store. No additional environment variables are required for basic functionality.

Optional Environment Variables

For advanced usage or when running outside the locate_2_pulls environment:

# Email API configuration (fallback)
export EMAIL_API_URL="https://your-app.appspot.com/api/tools/send_email_direct"

# Sheet Logger configuration (fallback)
export TEST_LOGS_SPREADSHEET_ID="your-google-sheets-id"
export GOOGLE_CREDENTIALS_PATH="/path/to/credentials.json"
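
As a rough illustration of how these fallback variables might be read, the sketch below collects them with `os.environ` and reports which ones are unset. The helper names `load_fallback_settings` and `missing_settings` are hypothetical and not part of the framework.

```python
import os


def load_fallback_settings(env=None):
    """Collect the optional fallback settings from an environment mapping."""
    env = os.environ if env is None else env
    return {
        "email_api_url": env.get("EMAIL_API_URL"),
        "spreadsheet_id": env.get("TEST_LOGS_SPREADSHEET_ID"),
        "credentials_path": env.get("GOOGLE_CREDENTIALS_PATH"),
    }


def missing_settings(settings):
    """Return the names of settings that are unset or empty, sorted."""
    return sorted(name for name, value in settings.items() if not value)


settings = load_fallback_settings()
print("unset fallback settings:", missing_settings(settings))
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.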

Automatic Configuration

The framework automatically uses:

Firestore Project: locaria-dev-config-store (from config store)
Sheet Logger: Existing sheet logger instance from config store
Email API: Tool URL from config store (tool_URL + api/tools/send_email_direct)
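
The Email API address is described as the tool URL with `api/tools/send_email_direct` appended. A minimal sketch of that concatenation, assuming the tool URL may or may not end with a slash (the function name `build_email_api_url` is hypothetical):

```python
def build_email_api_url(tool_url, endpoint="api/tools/send_email_direct"):
    """Join a base tool URL and the email endpoint with exactly one slash."""
    return tool_url.rstrip("/") + "/" + endpoint.lstrip("/")


print(build_email_api_url("https://your-app.appspot.com"))
```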

Test Classes

SchemaTests

Schema validation tests for data quality assurance:

check_required_columns() - Validate required columns exist
check_data_types() - Validate column data types
check_null_constraints() - Check for nulls in critical fields
check_unique_constraints() - Validate unique key constraints
check_column_values() - Check values within expected ranges or sets
check_schema_completeness() - Comprehensive schema validation
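
The kind of logic behind required-column and null checks can be sketched without the framework. The example below works on a list of dictionaries instead of a DataFrame so that it runs with the standard library only; `missing_columns` and `null_counts` are hypothetical helpers, not the SchemaTests implementation.

```python
def missing_columns(rows, required):
    """Return required column names that are absent from at least one row."""
    present = set.intersection(*(set(row) for row in rows)) if rows else set()
    return sorted(set(required) - present)


def null_counts(rows, columns):
    """Count None values per column across all rows."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


# Hypothetical intake data
rows = [
    {"employee_id": "e1", "date": "2025-11-01", "hours": 7.5},
    {"employee_id": "e2", "date": "2025-11-01", "hours": None},
]
print(missing_columns(rows, ["employee_id", "date", "hours", "team"]))
print(null_counts(rows, ["employee_id", "hours"]))
```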

DataQualityTests

Data quality tests for common validation scenarios:

check_row_count_change() - Compare the row count against Firestore history (append or truncate mode)
check_numeric_ranges() - Values within expected ranges
check_duplicate_records() - Detect duplicate entries
check_data_completeness() - Data completeness above threshold
check_date_ranges() - Date values within reasonable bounds
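
Duplicate detection and completeness checks can be illustrated in a few lines of plain Python. This is a sketch under stated assumptions, not the DataQualityTests code: the helpers `duplicate_keys` and `completeness` are hypothetical, and rows are modeled as dictionaries.

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return key tuples that occur more than once, sorted."""
    counts = Counter(tuple(row.get(col) for col in key_columns) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


def completeness(rows, column):
    """Return the fraction of rows where `column` is not None."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) is not None)
    return filled / len(rows)


# Hypothetical data with one duplicated (employee_id, date) pair
rows = [
    {"employee_id": "e1", "date": "2025-11-01", "hours": 8.0},
    {"employee_id": "e1", "date": "2025-11-01", "hours": 8.0},
    {"employee_id": "e2", "date": "2025-11-01", "hours": None},
    {"employee_id": "e3", "date": "2025-11-01", "hours": 6.0},
]
print(duplicate_keys(rows, ["employee_id", "date"]))
print(completeness(rows, "hours"))
```

A completeness value like this could then be compared against a configured threshold to decide between PASS, WARN, and FAIL.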

FreshnessTests

Data freshness tests for ensuring data is up-to-date:

check_data_freshness() - Verify data is up-to-date
check_timestamp_progression() - Timestamps moving forward
check_data_consistency() - Data frequency and gap validation
check_data_age_distribution() - Data age distribution analysis
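
A freshness check boils down to comparing the newest timestamp against age limits. The sketch below assumes the `data_freshness` defaults (`warn_age_hours` of 12 and `max_age_hours` of 24) and a strict "older than" comparison; the function `freshness_status` is hypothetical and may differ from what FreshnessTests actually does.

```python
from datetime import datetime, timedelta, timezone


def freshness_status(latest, now, warn_age_hours=12, max_age_hours=24):
    """Classify data age as PASS, WARN, or FAIL from the newest timestamp."""
    age = now - latest
    if age > timedelta(hours=max_age_hours):
        return "FAIL"
    if age > timedelta(hours=warn_age_hours):
        return "WARN"
    return "PASS"


# Fixed reference time so the example is reproducible
now = datetime(2025, 11, 22, 12, 0, tzinfo=timezone.utc)
for hours_old in (2, 18, 30):
    latest = now - timedelta(hours=hours_old)
    print(hours_old, freshness_status(latest, now))
```

Using timezone-aware UTC datetimes on both sides avoids errors from mixing naive and aware values.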

Configuration

Configuration is stored in Firestore in the locaria-dev-config-store project (automatically detected from the existing config store) under the integrated_testing_config collection.

Default Configuration

{
  "thresholds": {
    "row_count_change": {
      "warn_percentage": 20,
      "fail_percentage": 50
    },
    "out_of_office_percentage": {
      "warn_threshold": 25,
      "fail_threshold": 35
    },
    "time_split_tolerance": {
      "precision": 0.01
    },
    "data_freshness": {
      "max_age_hours": 24,
      "warn_age_hours": 12
    }
  },
  "test_switches": {
    "enable_schema_validation": true,
    "enable_business_logic_checks": true,
    "enable_freshness_checks": true,
    "enable_row_count_validation": true
  },
  "email_alerts": {
    "failure_recipients": ["data_team@locaria.com"],
    "warning_recipients": ["data_team@locaria.com"],
    "digest_frequency": "daily"
  }
}
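
To show how warn and fail thresholds such as `row_count_change` might translate into a test result, here is a minimal sketch. It assumes the percentages apply to the absolute relative change and that reaching a threshold triggers it; `classify_row_count_change` is a hypothetical helper, not the framework's implementation.

```python
def classify_row_count_change(previous, current, thresholds):
    """Map the relative row count change to PASS, WARN, or FAIL."""
    if previous <= 0:
        raise ValueError("previous row count must be positive")
    change_pct = abs(current - previous) / previous * 100
    if change_pct >= thresholds["fail_percentage"]:
        return "FAIL"
    if change_pct >= thresholds["warn_percentage"]:
        return "WARN"
    return "PASS"


# Values taken from the default configuration
thresholds = {"warn_percentage": 20, "fail_percentage": 50}
for current in (1100, 1250, 400):
    print(current, classify_row_count_change(1000, current, thresholds))
```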

Managing Configuration

from modules.integrated_tests import ConfigManager

# Initialize config manager
config_manager = ConfigManager()

# Create default configuration for a repository
config_manager.create_default_config_for_repository("locate_2_pulls")

# Update thresholds
config_manager.update_thresholds(
    "locate_2_pulls",
    "row_count_change",
    {"warn_percentage": 15, "fail_percentage": 40}
)

# Update test switches
config_manager.update_test_switches(
    "locate_2_pulls",
    {"enable_schema_validation": False}
)

Test Severity Levels

FAIL - Stops pipeline execution, logs error, sends immediate email alert
WARN - Continues pipeline execution, logs warning, sends digest email
PASS - Test passed, logs success
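
The three severity levels can be pictured as a small dispatch: FAIL interrupts the pipeline, WARN is collected for a later digest, and PASS does nothing beyond logging. The sketch below is an illustration only; the exception `PipelineTestFailure` and the function `handle_result` are hypothetical names, and logging and email delivery are left out.

```python
class PipelineTestFailure(Exception):
    """Raised when a test result has FAIL severity (hypothetical name)."""


def handle_result(status, message, digest):
    """Apply the severity rules: FAIL raises, WARN is collected, PASS is a no-op."""
    if status == "FAIL":
        raise PipelineTestFailure(message)
    if status == "WARN":
        digest.append(message)
    elif status != "PASS":
        raise ValueError(f"unknown status: {status!r}")


digest = []
handle_result("PASS", "schema ok", digest)
handle_result("WARN", "row count changed by 25%", digest)
try:
    handle_result("FAIL", "data older than 24 hours", digest)
except PipelineTestFailure as exc:
    print("pipeline stopped:", exc)
print("digest:", digest)
```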

Email Templates

The framework uses pre-configured email templates in the email manager:

Test Failure Alert - Immediate notification for FAIL results
Test Warning Digest - Grouped notification for WARN results

Acknowledgment System

The acknowledgment system prevents email spam by letting users acknowledge known issues, which mutes alerts for those issues for a configurable period. Both warnings and failures can be acknowledged, and the acknowledgment records are stored in Firestore.

How It Works

Issue Detection: Tests detect issues and log them with acknowledgment metadata
Firestore Storage: Both acknowledgeable warnings and failures are stored in Firestore during finalize_run()
Email Filtering: Email system checks acknowledgment status before sending
User Acknowledgment: Users can acknowledge issues through web interface
Mute Period: Acknowledged issues are muted for 7 days (configurable)
Automatic Expiry: Mute periods expire automatically and issues are archived
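
The email filtering step in the flow above can be sketched as a pure function over an issue record. This assumes the record carries `acknowledged` and `muted_until` fields as in the Firestore structure, and that an alert is sent again once the mute period has passed; `mute_until` and `should_send_alert` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone


def mute_until(acknowledged_at, mute_days=7):
    """Return the UTC time at which an acknowledgment stops muting alerts."""
    return acknowledged_at + timedelta(days=mute_days)


def should_send_alert(issue, now):
    """Send unless the issue is acknowledged and its mute period is still active."""
    if not issue.get("acknowledged"):
        return True
    muted_until = issue.get("muted_until")
    return muted_until is None or now >= muted_until


acked_at = datetime(2025, 11, 20, 9, 0, tzinfo=timezone.utc)
issue = {"acknowledged": True, "muted_until": mute_until(acked_at)}
print(should_send_alert(issue, acked_at + timedelta(days=3)))
print(should_send_alert(issue, acked_at + timedelta(days=8)))
```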

Firestore Structure

Collection: pipeline_acknowledgments
└── Document: {repo}%{pipeline}%{test_name}
    ├── Subcollection: issues
    │   └── Document: {issue_key_simple}
    │       - acknowledged: bool
    │       - muted_until: timestamp (UTC)
    │       - status: "WARN" or "FAIL"
    │       - identifier: str
    │       - details: str
    │       - issue_first_occurrence: timestamp (UTC)
    │       - issue_last_occurrence: timestamp (UTC)
    │       - issue_owner: str
    │       - acknowledged_by / acknowledged_at / acknowledgment_reason
    └── Subcollection: archives
        └── Document: {issue_key_simple}
            - Archived issues (expired mutes or manually deleted)
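
As an illustration of the naming scheme, the sketch below builds the `{repo}%{pipeline}%{test_name}` document ID and the slash-separated path to an issue document. The helper names are hypothetical, and rejecting `%` and `/` inside the parts is an assumption made here to keep IDs unambiguous, not documented framework behavior.

```python
def acknowledgment_doc_id(repo, pipeline, test_name):
    """Build the parent document ID in the form {repo}%{pipeline}%{test_name}."""
    parts = (repo, pipeline, test_name)
    if any("%" in part or "/" in part for part in parts):
        raise ValueError("parts must not contain '%' or '/'")
    return "%".join(parts)


def issue_path(repo, pipeline, test_name, issue_key):
    """Return the slash-separated path to an issue document."""
    doc_id = acknowledgment_doc_id(repo, pipeline, test_name)
    return "/".join(["pipeline_acknowledgments", doc_id, "issues", issue_key])


print(issue_path("locate_2_pulls", "daily_updates", "check_data_freshness", "emp_123"))
```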

Web Interface

The acknowledgment system includes a modern web interface accessible at /tools/acknowledgment-manager in the Analytics Hub.

Features

Real-time Filtering: Filter by repository, pipeline, test type, and issue status (Warning/Failure)
Issue Management: Acknowledge/unacknowledge issues with configurable mute periods
Bulk Operations: Select multiple issues and acknowledge or archive them in batch
Issue Details: View comprehensive issue information including first/last occurrence and ownership
Archive Management: Expired mutes are automatically archived; manual deletions are also archived

Examples

See the examples/ directory for complete pipeline implementations:

sample_pipeline.py - Complete pipeline with integrated testing
config_store_integration_example.py - Demonstrates automatic config store integration

Run the examples:

cd modules/integrated_tests/examples
python sample_pipeline.py
python config_store_integration_example.py

Architecture

integrated_tests/
├── __init__.py                 # Main module exports
├── main/
│   └── testkit.py             # Core framework and orchestration
├── utils/
│   └── config_manager.py      # Firestore configuration management
├── generic_tests/
│   ├── __init__.py
│   ├── schema_tests.py        # Schema validation tests
│   ├── data_quality_tests.py  # Data quality tests
│   └── freshness_tests.py     # Data freshness tests
├── pipeline_specific_tests/   # Business logic tests per domain
│   └── __init__.py
├── examples/
│   └── sample_pipeline.py     # Usage examples
└── README.md

Best Practices

Test Design

Focus on business logic and data quality
Use descriptive test names that explain the business rule
Test at multiple stages: intake, transform, load, post-load
Include both positive and negative test cases

Error Handling

Always use try/finally blocks to ensure test finalization
Handle missing data gracefully
Provide meaningful error messages
Log sufficient context for debugging

Performance

Batch test operations when possible
Use efficient pandas operations
Avoid unnecessary data copies
Cache configuration when appropriate
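
One way to cache configuration is to memoize the loader so that repeated lookups within a run do not trigger repeated remote reads. The sketch below uses `functools.lru_cache` from the standard library; `fetch_config` is a hypothetical stand-in for a Firestore read and returns static values.

```python
from functools import lru_cache

fetch_calls = []  # records each simulated remote read


def fetch_config(repository):
    """Hypothetical stand-in for a Firestore read; returns static thresholds."""
    fetch_calls.append(repository)
    return {"row_count_change": {"warn_percentage": 20, "fail_percentage": 50}}


@lru_cache(maxsize=32)
def get_config(repository):
    """Return the configuration, fetching it at most once per repository."""
    return fetch_config(repository)


get_config("locate_2_pulls")
get_config("locate_2_pulls")
print(len(fetch_calls))
```

The cached dictionary is shared between callers, so it should be treated as read-only, and `get_config.cache_clear()` can be called after thresholds are updated.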

Configuration

Use Firestore for dynamic configuration
Provide sensible defaults
Document all thresholds and switches
Version control configuration changes

Troubleshooting

Common Issues

Sheet Logger Not Working

- Check TEST_LOGS_SPREADSHEET_ID environment variable
- Verify Google credentials path
- Ensure spreadsheet exists and is accessible

Email Alerts Not Sending

- Check EMAIL_API_URL environment variable
- Verify email templates are configured in email manager
- Check network connectivity

Firestore Configuration Issues

- Verify locaria-dev-config-store project access
- Check collection and document permissions
- Ensure configuration document exists

Test Failures

- Check test thresholds in Firestore
- Verify data quality and schema
- Review test logic and business rules

Debug Mode

Enable debug logging by setting the log level in configuration:

from modules.integrated_tests import ConfigManager

config_manager = ConfigManager()
config_manager.update_repository_config(
    "locate_2_pulls",
    {"logging": {"log_level": "DEBUG"}}
)

Contributing

When adding new tests:

Follow the existing naming conventions
Include comprehensive error handling
Add configuration options for thresholds
Update documentation
Add examples for new functionality

Support

For questions or issues, contact the Data Team at data_team@locaria.com.

