A comprehensive medical data validation library for healthcare datasets
Medical Data Validator
A comprehensive Python library and web application for validating healthcare datasets with advanced compliance checking, data quality analysis, and interactive visualizations.
🌟 Features
Core Validation
- Multi-format Support: CSV, Excel, JSON, Parquet files
- Medical Standards Compliance: HIPAA, GDPR, FDA 21 CFR Part 11, ICD-10, LOINC, CPT
- Data Quality Checks: Completeness, accuracy, consistency, timeliness
- PHI/PII Detection: Automatic identification of sensitive health information
- Custom Validation Rules: Extensible rule system for domain-specific requirements
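As a rough illustration of what pattern-based PHI/PII detection involves, the sketch below scans record values for a few common identifier shapes using only the standard library. The pattern set and the `find_phi_columns` helper are assumptions made for this example; they are not the library's detection logic, and real detection needs much broader coverage.

```python
import re

# Hypothetical patterns for illustration only; not the library's rule set.
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "us_phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def find_phi_columns(rows):
    """Return {column: sorted pattern names} for columns whose values match."""
    hits = {}
    for row in rows:
        for column, value in row.items():
            text = str(value)
            for name, pattern in PHI_PATTERNS.items():
                if pattern.search(text):
                    hits.setdefault(column, set()).add(name)
    return {column: sorted(names) for column, names in hits.items()}


rows = [
    {"patient_id": "P001", "contact": "jane@example.org", "note": "SSN 123-45-6789"},
    {"patient_id": "P002", "contact": "555-867-5309", "note": "no identifiers"},
]
print(find_phi_columns(rows))
```

Reporting matches per column rather than per cell keeps the output small enough to review by hand before deciding how each flagged column should be handled.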
Advanced Analytics (v1.2)
- Real-time Monitoring: System health, performance metrics, alert management
- Risk Assessment: Automated risk scoring and recommendations
- Compliance Templates: Pre-configured templates for clinical trials, EHR, imaging, lab data
- Interactive Dashboards: Rich visualizations with Plotly charts
- API Versioning: Backward-compatible v1.2 endpoints with enhanced features
Web Interface
- Modern UI: Responsive design with Bootstrap 5
- Interactive Charts: Missing values, data types, issue severity distributions
- Real-time Validation: Instant feedback with progress indicators
- Compliance Reports: Detailed compliance summaries with actionable insights
- File Upload: Drag-and-drop interface with format validation
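Format validation on upload can be as simple as checking the file extension before any parsing is attempted. The sketch below assumes an extension-to-format mapping for the four formats listed above; the mapping and the `detect_format` helper are hypothetical and may not match how the web application decides.

```python
from pathlib import Path

# Assumed extension mapping for the formats named in this README.
SUPPORTED_EXTENSIONS = {
    ".csv": "CSV",
    ".xlsx": "Excel",
    ".xls": "Excel",
    ".json": "JSON",
    ".parquet": "Parquet",
}


def detect_format(filename):
    """Return the format name for a supported extension, or None otherwise."""
    return SUPPORTED_EXTENSIONS.get(Path(filename).suffix.lower())


print(detect_format("trial_data.CSV"))
print(detect_format("scan.dcm"))
```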
🚀 Quick Start
Live Demo
Try the Medical Data Validator online: https://medical-data-validator-production.up.railway.app/home
Installation
# Clone the repository
git clone https://github.com/RanaEhtashamAli/medical-data-validator.git
cd medical-data-validator
# Install dependencies
pip install -r requirements.txt
# Run the web application
python launch_medical_validator_web_ui.py
Usage
Web Interface
- Start the application: python launch_medical_validator_web_ui.py
- Open your browser and go to: https://medical-data-validator-production.up.railway.app/home
- Upload your medical dataset (CSV, Excel, JSON, Parquet)
- View results with interactive charts and compliance reports
Python Library
from medical_data_validator import MedicalDataValidator
import pandas as pd
# Create validator with v1.2 features
validator = MedicalDataValidator(
    enable_compliance=True,
    compliance_template='clinical_trials'
)
# Load your data
data = pd.read_csv('your_medical_data.csv')
# Validate with comprehensive checks
result = validator.validate(data)
# Check results
print(f"Valid: {result.is_valid}")
print(f"Issues: {len(result.issues)}")
# Access v1.2 compliance report
if 'compliance_report' in result.summary:
    compliance = result.summary['compliance_report']
    print(f"Overall Score: {compliance['overall_score']:.1f}%")
    print(f"Risk Level: {compliance['risk_level']}")
📊 API Endpoints
v1.2 Enhanced Endpoints
- Health Check: https://medical-data-validator-production.up.railway.app/api/v1.2/health
- File Validation: https://medical-data-validator-production.up.railway.app/api/v1.2/validate/file
- Data Validation: https://medical-data-validator-production.up.railway.app/api/v1.2/validate/data
- Compliance Check: https://medical-data-validator-production.up.railway.app/api/v1.2/compliance/check
- Compliance Templates: https://medical-data-validator-production.up.railway.app/api/v1.2/compliance/templates
- Analytics: https://medical-data-validator-production.up.railway.app/api/v1.2/analytics/quality
- Monitoring: https://medical-data-validator-production.up.railway.app/api/v1.2/monitoring/status
Legacy Endpoints (v1.0)
- Health Check: https://medical-data-validator-production.up.railway.app/api/health
- File Validation: https://medical-data-validator-production.up.railway.app/api/validate/file
- Data Validation: https://medical-data-validator-production.up.railway.app/api/validate/data
- Compliance Check: https://medical-data-validator-production.up.railway.app/api/compliance/check
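The two endpoint lists above differ only in the `/api/v1.2` versus `/api` prefix, so client code can build either form from one place. A minimal sketch, where the `endpoint_url` helper is a hypothetical name and not part of the library:

```python
BASE_URL = "https://medical-data-validator-production.up.railway.app"


def endpoint_url(path, version="v1.2"):
    """Build an API URL; pass version=None for the unversioned legacy routes."""
    prefix = f"/api/{version}" if version else "/api"
    return f"{BASE_URL}{prefix}/{path.lstrip('/')}"


print(endpoint_url("validate/file"))
print(endpoint_url("health", version=None))
```

Keeping the version in a single argument makes it easier to fall back to the legacy routes while a client migrates to v1.2.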
Example API Usage
import requests
# Validate file with v1.2 features
data = {
    'compliance_template': 'clinical_trials',
    'risk_assessment': 'true'
}
# Open the file in a context manager so the handle is closed after the upload
with open('medical_data.csv', 'rb') as f:
    response = requests.post(
        'https://medical-data-validator-production.up.railway.app/api/v1.2/validate/file',
        files={'file': f},
        data=data
    )
result = response.json()
print(f"Validation successful: {result['success']}")
print(f"Compliance score: {result['compliance_report']['overall_score']}%")
// JavaScript example
// Assumes formData is an existing FormData object holding the file to analyze
const response = await fetch('https://medical-data-validator-production.up.railway.app/api/v1.2/analytics/quality', {
  method: 'POST',
  body: formData
});
const analytics = await response.json();
console.log('Data quality score:', analytics.data_quality_score);
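The Python example above indexes `result['compliance_report']` directly, which would raise `KeyError` if that key were absent, for example on a failed validation. A defensive sketch, assuming the response is a JSON object with the `success` and `compliance_report` keys shown above; the `summarize_result` helper is hypothetical:

```python
def summarize_result(result):
    """Summarize a validation response dict without assuming optional keys exist."""
    success = bool(result.get("success", False))
    report = result.get("compliance_report") or {}
    score = report.get("overall_score")
    score_text = f"{score:.1f}%" if isinstance(score, (int, float)) else "n/a"
    return f"success={success}, compliance={score_text}"


print(summarize_result({"success": True, "compliance_report": {"overall_score": 87.5}}))
print(summarize_result({"success": False}))
```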
🏥 Supported Medical Standards
Compliance Standards
- HIPAA: Protected Health Information detection and handling
- GDPR: European data protection compliance
- FDA 21 CFR Part 11: Electronic records and signatures
- ICD-10: International Classification of Diseases
- LOINC: Logical Observation Identifiers Names and Codes
- CPT: Current Procedural Terminology
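Checks against coding standards usually begin with a shape test before any lookup in the code set itself. The sketch below tests a simplified ICD-10-CM shape (a letter, a digit, an alphanumeric character, then an optional decimal point followed by one to four alphanumerics). The regular expression and the `looks_like_icd10` helper are assumptions for illustration; they are not the library's validator and do not confirm that a code actually exists.

```python
import re

# Simplified ICD-10-CM shape: letter, digit, alphanumeric,
# then an optional "." followed by 1-4 alphanumerics.
ICD10_SHAPE = re.compile(r"^[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$")


def looks_like_icd10(code):
    """Check the shape only; this does not confirm the code exists in ICD-10."""
    return bool(ICD10_SHAPE.match(code.strip().upper()))


for code in ["E11.9", "S72.001A", "11.9", "E119X.1"]:
    print(code, looks_like_icd10(code))
```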
Data Quality Metrics
- Completeness: Missing value analysis
- Accuracy: Data validation and format checking
- Consistency: Cross-field validation and business rules
- Timeliness: Data freshness and update frequency
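Of these metrics, completeness is the simplest to make concrete: it is the share of records in which a field is populated. A minimal standard-library sketch follows; the `completeness` helper and the choice to treat `None` and empty strings as missing are assumptions for this example, not the library's definition.

```python
def completeness(rows, columns):
    """Return the fraction of non-missing values per column (1.0 = fully populated)."""
    missing = (None, "")  # assumed missing markers; NaN handling is not covered here
    total = len(rows)
    scores = {}
    for column in columns:
        present = sum(1 for row in rows if row.get(column) not in missing)
        scores[column] = present / total if total else 0.0
    return scores


rows = [
    {"patient_id": "P001", "age": 54, "diagnosis": "E11.9"},
    {"patient_id": "P002", "age": None, "diagnosis": ""},
    {"patient_id": "P003", "age": 61},
    {"patient_id": "P004", "age": 47, "diagnosis": "I10"},
]
print(completeness(rows, ["patient_id", "age", "diagnosis"]))
```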
🔧 Configuration
Environment Variables
# Enable v1.2 features
ENABLE_COMPLIANCE=true
COMPLIANCE_TEMPLATE=clinical_trials
ENABLE_MONITORING=true
ENABLE_ANALYTICS=true
# Security settings
ALLOWED_ORIGINS=https://medical-data-validator-production.up.railway.app
SECRET_KEY=your-secret-key
# Performance
MAX_FILE_SIZE=16777216 # 16MB
WORKER_PROCESSES=4
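For reference, one way an application might read these variables is sketched below. The `load_settings` helper, the accepted truthy strings, and the fallback values are assumptions for this example; the README does not state how the application parses its environment or what its defaults are.

```python
import os


def load_settings(env=os.environ):
    """Read the settings above from a mapping, applying assumed defaults."""

    def flag(name, default="false"):
        # Treat a few common spellings as true; anything else is false.
        return env.get(name, default).strip().lower() in {"1", "true", "yes"}

    return {
        "enable_compliance": flag("ENABLE_COMPLIANCE"),
        "compliance_template": env.get("COMPLIANCE_TEMPLATE"),
        "enable_monitoring": flag("ENABLE_MONITORING"),
        "enable_analytics": flag("ENABLE_ANALYTICS"),
        "max_file_size": int(env.get("MAX_FILE_SIZE", 16 * 1024 * 1024)),
        "worker_processes": int(env.get("WORKER_PROCESSES", 4)),
    }


settings = load_settings({"ENABLE_COMPLIANCE": "true", "MAX_FILE_SIZE": "16777216"})
print(settings["enable_compliance"], settings["max_file_size"])
```

Passing the mapping as an argument rather than reading `os.environ` inside the function keeps the parsing easy to test with a plain dictionary.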
Docker Deployment
# Build and run with Docker
docker-compose up -d
# Access the application
# - https://medical-data-validator-production.up.railway.app/home (Dashboard)
# - https://medical-data-validator-production.up.railway.app/api (API)
📈 Monitoring & Analytics
Real-time Monitoring
import requests
# Get system status
status = requests.get('https://medical-data-validator-production.up.railway.app/api/v1.2/monitoring/status').json()
print(f"System health: {status['health']}")
print(f"Active alerts: {status['active_alerts']}")
Quality Trends
# Get compliance score trends
trends = requests.get('https://medical-data-validator-production.up.railway.app/api/v1.2/monitoring/trends/compliance_score').json()
print(f"Average compliance: {trends['average_score']}%")
🤝 Contributing
We welcome contributions! Please see our Contributing Guide for details.
Development Setup
# Clone and setup
git clone https://github.com/RanaEhtashamAli/medical-data-validator.git
cd medical-data-validator
# Install development dependencies
pip install -r requirements-dev.txt
# Run tests
python -m pytest tests/
# Start development server
python launch_medical_validator_web_ui.py
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- Healthcare Data Standards: HL7, FHIR, OMOP
- Open Source Libraries: Pandas, Plotly, Flask, Bootstrap
- Community: Contributors and users who provide feedback
📞 Support
- Documentation: https://medical-data-validator-production.up.railway.app/docs
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Developed with ❤️ for the healthcare community
Medical Data Validator - Making healthcare data validation simple, secure, and compliant.