Enterprise data quality layer for AI agents - Validates data quality with Verodat cloud integration. Requires Verodat API key.
verodat-adri – Enterprise Data Quality for Production AI
Enterprise-grade data quality protection for production AI workflows.
Built on the proven open source ADRI framework with enterprise extensions for team collaboration, compliance, and production reliability.
# Same code works in both editions - zero migration effort
from adri import adri_protected

@adri_protected(contract="customer_data")
def process_customers(data):
    return enhanced_data
Why Upgrade to Enterprise?
For Production Teams
- Centralized Monitoring: Optional upload of assessment logs to Verodat (requires API key)
- Audit Logs: Local JSONL audit logs (Verodat-compatible schema)
- Environment Separation: Dev/test/prod config via ADRI/config.yaml + ADRI_ENV
- Same API: Same @adri_protected usage as open source
For AI/ML Engineers
- Workflow Context: Accepts workflow_context metadata (logged when verbose)
- Reasoning Mode: reasoning_mode=True enables reasoning-oriented behavior + optional local JSONL prompt/response logs
- Data Provenance: Accepts data_provenance metadata (logged when verbose)
- Performance Options: Fast-path assessment manifest storage (memory/file/redis backends)
For Engineering Managers
- License Gate: API key validation + 24h cache on first use
- Operational Visibility: Centralized logs when Verodat upload is enabled
- Developer Productivity: No code changes needed beyond installing the enterprise package and setting VERODAT_API_KEY
- Team Adoption: Same decorator + CLI patterns
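The 24h license cache mentioned above can be pictured as a small time-to-live wrapper around a validation call. The sketch below is illustrative only: `CachedLicenseGate`, `validate_remote`, and the in-memory storage are hypothetical names and choices, not verodat-adri's actual implementation.

```python
import time
from typing import Callable, Optional

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24h, matching the cache window described above


class CachedLicenseGate:
    """Hypothetical TTL cache around an API key validation call."""

    def __init__(self, validate_remote: Callable[[str], bool],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._validate_remote = validate_remote  # assumed remote check, injected
        self._clock = clock
        self._validated_at: Optional[float] = None
        self._cached_key: Optional[str] = None

    def is_valid(self, api_key: str) -> bool:
        now = self._clock()
        fresh = (
            self._validated_at is not None
            and self._cached_key == api_key
            and now - self._validated_at < CACHE_TTL_SECONDS
        )
        if fresh:
            return True  # served from cache, no remote call
        if self._validate_remote(api_key):
            self._validated_at = now
            self._cached_key = api_key
            return True
        return False
```

Injecting the clock and the validation callable keeps the sketch testable without network access or waiting for the cache to expire.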
Enterprise vs Open Source Comparison
| Feature Category | Open Source ADRI | verodat-adri Enterprise | Enterprise Value |
|---|---|---|---|
| Core Data Quality | ✅ | ✅ | Identical reliability |
| Data contracts & validation | ✅ | ✅ | Same proven algorithms |
| 5-dimension quality scoring | ✅ | ✅ | Same assessment engine |
| Local JSONL audit logs | ✅ | ✅ | File-based development logging |
| CLI tools and contract generation | ✅ | ✅ | Full feature compatibility |
| Enterprise Add-ons | | | |
| API key (license) validation | ❌ | ✅ | Enforces enterprise access on first use |
| Optional Verodat API upload | ❌ | ✅ | Centralize assessments outside local files |
| Configuration | | | |
| Environment-aware config (ADRI_ENV) | ✅ | ✅ | Dev/test/prod behavior via config + env var |
| Workflow & LLM Context | | | |
| workflow_context parameter | ✅ | ✅ | Attach workflow metadata to assessments |
| data_provenance parameter | ❌ | ✅ | Attach provenance metadata (enterprise wrapper) |
| Local reasoning JSONL logs | ❌ | ✅ | Stores prompt/response JSONL when enabled |
| Performance | | | |
| Fast-path manifest store (memory/file/redis) | ✅ | ✅ | Faster assessment-id availability via local store |
Zero-Effort Migration
Step 1: Install Enterprise Edition
pip install verodat-adri # Replaces 'adri' package
Step 2: Add API Key
export VERODAT_API_KEY="your-api-key-here"
Step 3: Your Code Already Works!
# This exact code works in both editions - no changes needed
from adri import adri_protected
import pandas as pd

@adri_protected(
    contract="customer_data",
    min_score=85,
    on_failure="raise"
)
def process_customer_data(data):
    # Your existing data processing logic
    enhanced_data = data.copy()
    enhanced_data['processed_at'] = pd.Timestamp.now()
    return enhanced_data

# Optional enterprise add-ons:
# ✅ License validation and access control
# ✅ Optional Verodat upload (when configured)
# ✅ Environment-aware contract resolution
# ✅ Optional reasoning logs (when enabled)
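To make the `min_score` and `on_failure` parameters concrete, here is a minimal sketch of a quality-gate decorator with the same shape. It is not ADRI's implementation: `quality_gate`, `QualityGateError`, and the toy `completeness_score` are hypothetical, ADRI derives its scores from contracts rather than a function you pass in, and the "warn" mode is borrowed from the `default_failure_mode` value used in the configuration examples.

```python
import functools
import warnings
from typing import Any, Callable


class QualityGateError(Exception):
    """Raised when data scores below the required minimum."""


def quality_gate(score_fn: Callable[[Any], float], min_score: float = 85,
                 on_failure: str = "raise") -> Callable:
    """Hypothetical stand-in for a decorator with min_score/on_failure semantics."""
    if on_failure not in ("raise", "warn"):
        raise ValueError(f"unsupported on_failure mode: {on_failure!r}")

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(data: Any, *args: Any, **kwargs: Any) -> Any:
            score = score_fn(data)
            if score < min_score:
                message = f"quality score {score:.1f} is below min_score {min_score}"
                if on_failure == "raise":
                    raise QualityGateError(message)  # block the wrapped function
                warnings.warn(message)  # non-blocking: warn and continue
            return func(data, *args, **kwargs)
        return wrapper
    return decorator


def completeness_score(rows: list) -> float:
    """Toy score: percentage of rows that contain no None values."""
    if not rows:
        return 0.0
    complete = sum(1 for row in rows if None not in row.values())
    return 100.0 * complete / len(rows)


@quality_gate(completeness_score, min_score=85, on_failure="raise")
def process_customer_data(rows: list) -> int:
    return len(rows)
```

With `on_failure="raise"` the wrapped function never runs on failing data; with "warn" it runs and a warning is emitted.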
Enterprise Feature Deep Dive
1. Centralized Logging (Optional)
Problem Solved: Local-only logs are hard to aggregate across projects/environments.
Enterprise Solution: When Verodat upload is enabled, assessments can be sent to Verodat via API.
from adri import adri_protected

# Optional centralized logging (when Verodat upload is configured)
@adri_protected(contract="sales_data")
def analyze_sales_performance(data):
    return analysis_results

# The assessment can be uploaded to Verodat (depending on your logging configuration)
Customer Value:
- Visibility: Aggregate quality signals across environments
- Operational debugging: Access logs outside local machines/containers
- Auditability: Preserve verifiable assessment artifacts
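As a rough illustration of the aggregation problem described above, the sketch below averages scores across several local JSONL audit logs. The field names `environment` and `overall_score` are assumptions made for this example; check the actual audit log schema before relying on them.

```python
import json
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable


def average_score_by_environment(log_files: Iterable[Path]) -> Dict[str, float]:
    """Average an assumed `overall_score` field per assumed `environment` field."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for log_file in log_files:
        with log_file.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue  # skip blank lines between records
                record = json.loads(line)
                env = record.get("environment", "unknown")
                totals[env] += float(record["overall_score"])
                counts[env] += 1
    return {env: totals[env] / counts[env] for env in totals}
```

Collecting files from every machine or container is the part that centralized upload is meant to replace; the function only covers the local half.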
2. Environment-Aware Configuration
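Problem Solved: Applying the same data quality rules in development and production can cause issues, for example thresholds that are too strict for experimentation or too relaxed for live data.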
Enterprise Solution: Environment-specific configurations with automatic environment detection.
# ADRI/config.yaml - Enterprise environment configuration
adri:
  project_name: "customer_analytics"
  default_environment: "development"
  environments:
    development:
      paths:
        contracts: "./dev/contracts"
        assessments: "./dev/assessments"
      protection:
        default_min_score: 70          # Relaxed for development
        default_failure_mode: "warn"   # Non-blocking for development
    production:
      paths:
        contracts: "./prod/contracts"
        assessments: "./prod/assessments"
      protection:
        default_min_score: 85          # Strict for production
        default_failure_mode: "raise"  # Blocking for production

# Runtime environment switching
export ADRI_ENV=production  # Automatically uses prod config

@adri_protected(
    contract="customer_data",
    environment="production"  # Explicit environment override
)
def process_production_data(data):
    return processed_data
Customer Value:
- Safety: Prevents accidentally deploying development-grade quality rules to production
- Consistency: Standardized quality requirements across environments
- DevOps Integration: Works with GitOps and infrastructure-as-code workflows
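The environment selection described above can be sketched as a small lookup. This is an illustration under assumptions, not the library's resolver: `resolve_protection` is a hypothetical helper, the config is inlined as a dict instead of being read from ADRI/config.yaml, and the precedence order (explicit argument, then ADRI_ENV, then `default_environment`) is assumed.

```python
import os
from typing import Any, Dict, Mapping, Optional

# Mirrors the structure of the ADRI/config.yaml example above, as a plain dict.
CONFIG: Dict[str, Any] = {
    "adri": {
        "default_environment": "development",
        "environments": {
            "development": {"protection": {"default_min_score": 70,
                                           "default_failure_mode": "warn"}},
            "production": {"protection": {"default_min_score": 85,
                                          "default_failure_mode": "raise"}},
        },
    }
}


def resolve_protection(config: Mapping[str, Any],
                       explicit_env: Optional[str] = None,
                       environ: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Pick protection settings; the precedence order here is an assumption."""
    environ = os.environ if environ is None else environ
    adri = config["adri"]
    env_name = explicit_env or environ.get("ADRI_ENV") or adri["default_environment"]
    try:
        env = adri["environments"][env_name]
    except KeyError:
        raise ValueError(f"unknown environment: {env_name!r}") from None
    return {"environment": env_name, **env["protection"]}
```

Failing loudly on an unknown environment name avoids silently falling back to development thresholds in a production deployment.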
3. Reasoning Logs (Local JSONL)
Problem Solved: AI decision-making processes are often black boxes with no audit trail.
Enterprise Solution: Optional local JSONL logging of prompt/response metadata.
from adri_enterprise import adri_protected

@adri_protected(
    contract="credit_decisions",
    reasoning_mode=True,   # Enable AI reasoning logging
    store_prompt=True,     # Log AI prompts
    store_response=True,   # Log AI responses
    llm_config={
        "model": "claude-3-5-sonnet",
        "temperature": 0.1,
        "seed": 42
    }
)
def assess_credit_risk(customer_data):
    # AI reasoning steps are automatically logged
    risk_assessment = ai_model.analyze_credit_risk(customer_data)
    return risk_assessment

# Optional JSONL audit logs:
# - adri_reasoning_prompts.jsonl
# - adri_reasoning_responses.jsonl
Customer Value:
- Debugging: Retain prompts/responses alongside quality results
- Traceability: Persist per-run artifacts in JSONL for later review
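For readers unfamiliar with JSONL, the sketch below shows the general pattern of appending one JSON object per line to the two files named above. The record fields (`run_id`, `timestamp`, `prompt`, `response`) and the helper names are assumptions for illustration; the real log schema may differ.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict


def append_jsonl(path: Path, record: Dict[str, Any]) -> None:
    """Append one JSON object per line, creating the file if needed."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


def log_reasoning_step(log_dir: Path, run_id: str, prompt: str, response: str) -> None:
    """Write a prompt record and a response record that share the same run_id."""
    timestamp = datetime.now(timezone.utc).isoformat()
    append_jsonl(log_dir / "adri_reasoning_prompts.jsonl",
                 {"run_id": run_id, "timestamp": timestamp, "prompt": prompt})
    append_jsonl(log_dir / "adri_reasoning_responses.jsonl",
                 {"run_id": run_id, "timestamp": timestamp, "response": response})
```

Sharing a `run_id` between the two files is what lets a reviewer pair each prompt with its response later.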
4. Workflow Context Metadata
Problem Solved: Data quality assessments are isolated from broader workflow context.
Enterprise Solution: Attach workflow context metadata to an assessment.
from adri_enterprise import adri_protected

# Example workflow context
@adri_protected(
    contract="transaction_processing",
    workflow_context={
        "run_id": "run_20250116_143022_abc123",
        "workflow_id": "payment_processing_pipeline",
        "workflow_version": "2.1.0",
        "step_id": "fraud_detection",
        "step_sequence": 3,
        "run_at_utc": "2025-01-16T14:30:22Z"
    },
    data_provenance={
        "source_type": "database",
        "database": "transactions_prod",
        "table": "payment_events",
        "extracted_at": "2025-01-16T14:25:00Z",
        "record_count": 15000
    }
)
def detect_transaction_fraud(transaction_data):
    fraud_scores = ml_model.predict_fraud(transaction_data)
    return fraud_scores

# Metadata is available to logging / callbacks for downstream correlation.
Customer Value:
- Correlation: Link assessments to workflow runs and steps
- Debugging: Carry “what run/step produced this data?” through logs
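The example above hard-codes its context values. A minimal sketch of generating them at runtime follows; `build_workflow_context` is a hypothetical helper, and the `run_id` layout simply imitates the sample value shown above rather than any format the library requires.

```python
import uuid
from datetime import datetime, timezone
from typing import Any, Dict


def build_workflow_context(workflow_id: str, workflow_version: str,
                           step_id: str, step_sequence: int) -> Dict[str, Any]:
    """Build a dict with the same keys as the workflow_context example."""
    now = datetime.now(timezone.utc)
    # Imitates the sample value: run_<date>_<time>_<short random suffix>
    run_id = f"run_{now:%Y%m%d_%H%M%S}_{uuid.uuid4().hex[:6]}"
    return {
        "run_id": run_id,
        "workflow_id": workflow_id,
        "workflow_version": workflow_version,
        "step_id": step_id,
        "step_sequence": step_sequence,
        "run_at_utc": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

Decorator arguments are evaluated once, when the function is defined, so a context built this way describes a single run only if the decorated function is also defined or wrapped per run.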
5. Production-Grade Performance
Problem Solved: Open source logging can be too slow for high-throughput production systems.
Enterprise Solution: Optimized performance with batching, caching, and async processing.
from adri import adri_protected

# Performance optimizations are automatic:
@adri_protected(contract="high_volume_processing")
def process_high_volume_data(data):
    return processed_data

# Behind the scenes:
# ✅ License validation cached for 24 hours
# ✅ Fast-path manifest storage provides faster assessment-id availability
Performance Benchmarks: See test suite for performance checks and thresholds.
Enterprise Deployment Patterns
Pattern 1: Team Development Workflow
# Development team using shared Verodat workspace
from adri import adri_protected

@adri_protected(
    contract="user_behavior_analysis",
    environment="development",  # Uses dev quality thresholds
    reasoning_mode=True         # Enables reasoning-mode behavior
)
def analyze_user_behavior(user_data):
    insights = ai_model.analyze_behavior(user_data)
    return insights

# Benefits:
# - Consistent dev environment config
# - Optional reasoning logs
Pattern 2: Production AI Pipeline
# Production deployment with strict quality controls
from adri_enterprise import adri_protected

@adri_protected(
    contract="risk_assessment",
    environment="production",   # Uses prod quality thresholds
    reasoning_mode=True,        # Required for compliance
    min_score=95,               # Strict production requirements
    on_failure="raise",         # Block processing on quality failure
    workflow_context=workflow_metadata,
    data_provenance=data_source_info
)
def assess_financial_risk(customer_portfolio_data):
    risk_scores = risk_model.predict_risk(customer_portfolio_data)
    return risk_scores

# Benefits:
# - Strict quality enforcement protects production systems
# - Complete audit trail for regulatory compliance
# - Workflow integration provides operational visibility
# - Data lineage tracking for compliance and debugging
Pattern 3: Regulated Industry Compliance
# Healthcare/Finance with full audit requirements
from adri_enterprise import adri_protected

@adri_protected(
    contract="patient_data_processing",
    reasoning_mode=True,    # Required: AI reasoning audit trail
    store_prompt=True,      # Required: Store all AI prompts
    store_response=True,    # Required: Store all AI responses
    data_provenance={
        "source_type": "ehr_system",
        "patient_consent_id": "consent_12345",
        "data_classification": "phi",
        "retention_policy": "7_years"
    },
    min_score=98,           # Very strict for patient data
    on_failure="raise"      # Block processing on any quality issue
)
def process_patient_diagnosis_data(patient_data):
    diagnosis_insights = medical_ai.analyze_symptoms(patient_data)
    return diagnosis_insights

# Notes:
# This shows parameterization for stricter controls; any compliance posture depends on your
# own policies and how you retain/secure these logs.
Installation & Setup
Requirements
- Python: 3.10+ (same as open source)
- Verodat API Key: Required for enterprise features
- Dependencies: Same as open source + requests for API integration
Get Your API Key
- Create account at verodat.com
- Navigate to Account Settings → API Keys
- Generate new API key for your organization
- Set environment variable:
export VERODAT_API_KEY="your-key"
Install Enterprise Edition
# Remove open source version (if installed)
pip uninstall adri
# Install enterprise edition
pip install verodat-adri
# With optional backends
pip install "verodat-adri[redis]"  # Redis backend for fast-path logging (quoted so shells like zsh do not expand the brackets)
Verify Installation
from adri import adri_protected
import pandas as pd

# Test with sample data
test_data = pd.DataFrame({"id": [1, 2, 3], "value": [10, 20, 30]})

@adri_protected(contract="installation_test")
def test_installation(data):
    return len(data)

result = test_installation(test_data)
print(f"✅ Enterprise ADRI working! Processed {result} records")
Configuration
Environment Variables
# Required
export VERODAT_API_KEY="your-enterprise-api-key"
# Optional
export VERODAT_API_URL="https://api.verodat.com/api/v1" # Custom API endpoint
export ADRI_ENV="production" # Environment selection
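A deployment script may want to check these variables before any decorated function runs. The following is a minimal sketch of such a preflight check; `VerodatSettings` and `load_settings` are hypothetical names, and treating the URL shown above as the default endpoint is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: the URL from the example above is used as the fallback endpoint.
DEFAULT_API_URL = "https://api.verodat.com/api/v1"


@dataclass(frozen=True)
class VerodatSettings:
    api_key: str
    api_url: str
    environment: Optional[str]  # None means "let the config file decide"


def load_settings(environ: Optional[Mapping[str, str]] = None) -> VerodatSettings:
    """Read the three variables listed above and fail early if the key is missing."""
    environ = os.environ if environ is None else environ
    api_key = environ.get("VERODAT_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("VERODAT_API_KEY is not set")
    return VerodatSettings(
        api_key=api_key,
        api_url=environ.get("VERODAT_API_URL", DEFAULT_API_URL),
        environment=environ.get("ADRI_ENV"),
    )
```

Calling `load_settings()` with no argument reads the real process environment; passing a dict makes the check easy to unit test.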
Enterprise Configuration File
ADRI/config.yaml (Environment-aware structure):
adri:
  project_name: "customer_analytics_platform"
  default_environment: "development"
  environments:
    development:
      paths:
        contracts: "./dev/contracts"
        assessments: "./dev/assessments"
        training_data: "./dev/training-data"
        audit_logs: "./dev/audit-logs"
      protection:
        default_min_score: 70
        default_failure_mode: "warn"
    production:
      paths:
        contracts: "./prod/contracts"
        assessments: "./prod/assessments"
        training_data: "./prod/training-data"
        audit_logs: "./prod/audit-logs"
      protection:
        default_min_score: 85
        default_failure_mode: "raise"
Usage with Configuration
from adri_enterprise import adri_protected

# Automatically uses production config when ADRI_ENV=production
@adri_protected(
    contract="customer_data",
    environment="production"  # Explicit production environment
)
def process_production_customers(data):
    return enhanced_data
Advanced Enterprise Features
1. AI Reasoning Validation and Logging
Track every AI decision for compliance and debugging:
from adri_enterprise import adri_protected

@adri_protected(
    contract="ai_credit_decisions",
    reasoning_mode=True,    # Enable AI reasoning tracking
    store_prompt=True,      # Log all AI prompts for audit
    store_response=True,    # Log all AI responses for audit
    llm_config={
        "model": "claude-3-5-sonnet",
        "temperature": 0.1,
        "seed": 42,         # Reproducible AI decisions
        "max_tokens": 2000
    }
)
def ai_credit_assessment(applicant_data):
    # Optional logs (depending on configuration):
    # - Local JSONL files
    credit_decision = ai_model.assess_creditworthiness(applicant_data)
    return credit_decision
Compliance Notes: This package provides logging hooks and audit artifacts, but regulatory compliance is program/process dependent and must be validated by your organization.
2. Data Provenance and Lineage Tracking
Track data sources through complex processing pipelines:
from adri_enterprise import adri_protected

@adri_protected(
    contract="cross_platform_analytics",
    data_provenance={
        "source_type": "verodat_query",          # Data source type
        "verodat_query_id": 12345,               # Source query ID
        "verodat_account_id": 91,                # Source account
        "verodat_workspace_id": 161,             # Source workspace
        "dataset_id": 4203,                      # Source dataset
        "record_count": 15000,                   # Expected record count
        "extracted_at": "2025-01-16T14:30:00Z",  # Extraction timestamp
        "data_classification": "confidential"    # Security classification
    }
)
def cross_platform_analysis(multi_source_data):
    # Data lineage automatically tracked to Verodat
    combined_insights = complex_analytics.process(multi_source_data)
    return combined_insights
Operational Benefits:
- Root Cause Analysis: Trace quality issues back to specific data sources
- Impact Assessment: Understand downstream effects of data source changes
- Compliance Reporting: Automated data lineage reports for auditors
- Quality Attribution: Identify which data sources contribute to quality issues
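One simple use of provenance metadata for root cause analysis is comparing the declared `record_count` with what actually arrived. The sketch below is a hypothetical helper written for illustration; it is not part of the package, and the `tolerance` parameter is an invented convenience.

```python
from typing import Any, Dict, List, Sequence


def check_provenance(data: Sequence[Any], provenance: Dict[str, Any],
                     tolerance: float = 0.0) -> List[str]:
    """Return human-readable mismatches; an empty list means consistent."""
    problems: List[str] = []
    expected = provenance.get("record_count")
    if expected is None:
        problems.append("provenance has no record_count")
    else:
        allowed = expected * tolerance  # permitted absolute deviation
        if abs(len(data) - expected) > allowed:
            source = provenance.get("source_type", "unknown source")
            problems.append(f"expected {expected} records from {source}, got {len(data)}")
    if "extracted_at" not in provenance:
        problems.append("provenance has no extracted_at timestamp")
    return problems
```

A mismatch points at the extraction step rather than at the processing logic, which narrows down where to look first.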
3. Workflow Orchestration Context
Integrate seamlessly with enterprise workflow platforms:
# Workflow example (pseudo-code; assumes Prefect is installed and that
# fraud_model, extract_payment_data, and process_validated_payments exist)
from prefect import flow, task
from adri_enterprise import adri_protected

@task
@adri_protected(
    contract="payment_processing_step",
    workflow_context={
        "run_id": "{{run_id}}",               # Placeholder for the Prefect run ID
        "workflow_id": "payment_pipeline",    # Workflow identifier
        "workflow_version": "2.1.0",          # Version tracking
        "step_id": "fraud_detection",         # Current step
        "step_sequence": 3,                   # Step number in workflow
        "run_at_utc": "{{run_timestamp}}"     # Placeholder for the execution timestamp
    }
)
def detect_payment_fraud(payment_transactions):
    fraud_scores = fraud_model.predict(payment_transactions)
    return fraud_scores

@flow(name="payment_processing_pipeline")
def payment_pipeline():
    raw_data = extract_payment_data()
    validated_data = detect_payment_fraud(raw_data)  # Enterprise ADRI protection
    process_validated_payments(validated_data)
Workflow Benefits:
- End-to-End Visibility: See data quality in context of complete business process
- Failure Attribution: Identify whether failures are data quality or business logic issues
- Performance Analytics: Understand data quality impact on overall workflow performance
- Operational Debugging: Trace issues across complex multi-step workflows
Performance & Reliability
Performance (Tested)
| Metric | Open Source | Enterprise | Improvement |
|---|---|---|---|
| Assessment ID availability | 30-60 seconds | <10ms | Faster assessment-id availability |
| License validation | N/A | <5 seconds (cached: <0.1s) | Enterprise security |
| Reasoning log writes | N/A | <0.5 seconds per step | Real-time audit |
| Memory overhead | Baseline | +<100MB | Production acceptable |
Reliability Features
Fault Tolerance:
- Network failures don't block data processing
- API timeouts are handled gracefully with retries
- Local logging continues even if cloud integration fails
- License validation cached to handle temporary API outages
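The fault-tolerance behavior listed above follows a common pattern: write locally first, then attempt the upload with retries, and never let an upload failure stop the caller. The sketch below illustrates that pattern only; `upload_with_fallback`, its parameters, and the backoff values are hypothetical and say nothing about how the package implements it.

```python
import time
from typing import Any, Callable, Dict, List


def upload_with_fallback(record: Dict[str, Any],
                         upload: Callable[[Dict[str, Any]], None],
                         local_log: List[Dict[str, Any]],
                         retries: int = 3,
                         base_delay: float = 0.5,
                         sleep: Callable[[float], None] = time.sleep) -> bool:
    """Always log locally, then try the upload; never raise on upload failure."""
    local_log.append(record)  # local logging happens regardless of upload outcome
    for attempt in range(retries):
        try:
            upload(record)
            return True
        except (ConnectionError, TimeoutError):
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)  # exponential backoff between tries
    return False  # caller continues processing; the local copy is still available
```

Returning a boolean instead of raising lets data processing continue while still telling the caller whether the cloud copy exists.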
Scalability:
- Batch processing for high-volume operations
- Concurrent workflow support with shared license caching
- Memory-efficient logging with configurable retention
- Horizontal scaling with Redis backend support
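Batch processing for high-volume logging usually means buffering records and handing them off in groups. The class below is a generic sketch of that idea; `BatchingLogger`, `sink`, and the default batch size are hypothetical and are not taken from the package.

```python
from typing import Any, Callable, Dict, List


class BatchingLogger:
    """Buffer records and hand them to `sink` in fixed-size batches."""

    def __init__(self, sink: Callable[[List[Dict[str, Any]]], None],
                 batch_size: int = 100) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._sink = sink
        self._batch_size = batch_size
        self._buffer: List[Dict[str, Any]] = []

    def log(self, record: Dict[str, Any]) -> None:
        self._buffer.append(record)
        if len(self._buffer) >= self._batch_size:
            self.flush()  # buffer is full: emit one batch

    def flush(self) -> None:
        """Emit whatever is buffered; call this on shutdown to avoid losing records."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._sink(batch)
```

The trade-off is latency for throughput: records reach the sink later, but with far fewer calls.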
Getting Started
1. Quick Win: Centralized Logging
# Install and configure
pip install verodat-adri
export VERODAT_API_KEY="your-key"
# Your existing code automatically gets centralized logging
# No code changes needed!
2. Enable Environment Separation
# Create ADRI/config.yaml with dev/prod environments
# Set ADRI_ENV=production in production deployment
# Automatic environment-aware contract resolution
3. Add AI Reasoning Audit Trail
# Add reasoning_mode=True to AI processing functions
@adri_protected(contract="ai_decisions", reasoning_mode=True)
def ai_processing_function(data):
    return ai_results
4. Integrate with Workflows
# Add workflow_context to track end-to-end data flow
@adri_protected(
    contract="workflow_step",
    workflow_context={"run_id": run_id, "step": "validation"}
)
def workflow_step(data):
    return results
Support & Resources
Documentation
- Complete Documentation - Comprehensive feature documentation
- API Reference - Complete API documentation
- Enterprise Setup Guide - Detailed deployment guide
- Migration Guide - Step-by-step upgrade instructions
Community & Support
- Issues: GitHub Issues
- Enterprise Support: support@verodat.com
- Community: ADRI Standard Community
Training & Onboarding
- Free Onboarding Session: Schedule with Verodat Customer Success
- Best Practices Guide: docs/BEST_PRACTICES.md
- Workshop Materials: Contact enterprise support for team training
License & Pricing
- Enterprise License: Apache 2.0 + Valid Verodat API Key Required
- Open Source: Always free at github.com/adri-standard/adri
- Enterprise Pricing: Contact Verodat Sales for enterprise pricing
When to Choose Enterprise vs Open Source
Choose Open Source ADRI when:
- Individual developer or small team (≤3 people)
- Development/testing environments only
- Basic data quality validation needs
- Cost is primary concern
- No compliance requirements
Choose Enterprise verodat-adri when:
- Production AI workflows requiring reliability
- Team collaboration and centralized monitoring needs
- Regulated industry with compliance requirements
- Complex workflows requiring integration with orchestration platforms
- Need for audit trails and data provenance tracking
- Enterprise security and access control requirements
Ready to upgrade your data quality infrastructure?
Get started: verodat.com/adri-enterprise
Built by Verodat - Enterprise data infrastructure for AI workflows