A modern, configuration-driven X12 EDI parser and validator for healthcare transactions with optional LLM-powered explanations and structured data extraction
ValidEDI
What is ValidEDI?
ValidEDI is a Python library for parsing and validating healthcare EDI (Electronic Data Interchange) files. It supports the most common X12 transaction types used in healthcare:
- 837P - Professional Health Care Claims
- 837I - Institutional Health Care Claims
- 835 - Health Care Claim Payment/Remittance Advice
- 834 - Benefit Enrollment and Maintenance
Key Features
✅ Parse EDI Files - Convert X12 EDI into structured Python objects
✅ Validate Transactions - 60+ validation rules with plain-English error messages
✅ Configuration-Driven - All rules and codes in YAML files, not hardcoded
✅ LLM-Powered Explanations - Optional AI-powered plain-English reports (works with ANY LLM)
✅ Loop Navigation - Hierarchical loop structure for easy data access
✅ Type-Safe - Full Pydantic v2 models with type hints
✅ Extensible - Add custom validation rules and code sets
✅ Production-Ready - Thread-safe, well-tested, comprehensive error handling
Quick Start
Installation
pip install validedi
Parse an EDI File
from validedi import parse
# Parse from file path
result = parse('claim.edi')
# Access parsed data
print(f"Transaction: {result.envelope.transaction_type}")
print(f"From: {result.envelope.sender_id}")
print(f"To: {result.envelope.receiver_id}")
print(f"Loops: {len(result.loops)}")
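For readers new to X12, it may help to see what the first parsing step looks like conceptually. The sketch below is not ValidEDI's implementation; `tokenize` is a hypothetical helper that assumes the common `*` element separator and `~` segment terminator (real files declare their delimiters in the ISA segment).

```python
from typing import List

def tokenize(raw: str, element_sep: str = "*", segment_term: str = "~") -> List[List[str]]:
    """Split raw X12 text into segments, each a list of element strings."""
    segments = []
    for chunk in raw.split(segment_term):
        chunk = chunk.strip("\r\n")  # drop line breaks that often follow the terminator
        if chunk:
            segments.append(chunk.split(element_sep))
    return segments

# Illustrative fragment only, not a complete transaction.
sample = "ST*837*0001~\nCLM*A123*150~\nSE*3*0001~"
for seg in tokenize(sample):
    print(seg[0], seg[1:])
```

Each inner list starts with the segment ID, so element N of a segment (for example CLM01) sits at index N.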
Validate an EDI File
from validedi import validate
# Validate the file
result = validate('claim.edi')
# Check results
if result.is_valid:
    print("✅ File is valid!")
else:
    print(f"❌ Found {result.error_count} errors")
    for error in result.errors:
        print(f"  • {error.message}")
Get Plain-English Explanations (Optional)
from validedi import parse, validate
from validedi.llm import explain
# Parse and validate
edi_result = parse('claim.edi')
val_result = validate(edi_result)
# Option 1: With your own LLM (OpenAI, Groq, Bedrock, Gemini, etc.)
from openai import OpenAI
client = OpenAI(api_key="your-key")
def my_llm(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content
result = explain(edi_result, val_result, llm=my_llm)
print(result.report)
# Option 2: Without LLM (rule-based templates)
result = explain(edi_result, val_result)
print(result.report)
Ask Questions About Your EDI File
from validedi.llm import ask_followup
# Ask questions in plain English
answer = ask_followup(
    "What is the total billed amount?",
    edi_result,
    val_result,
    llm=my_llm
)
print(answer)
Why ValidEDI?
vs. Other EDI Libraries
| Feature | ValidEDI | Others |
|---|---|---|
| Configuration-Driven | ✅ YAML files | ❌ Hardcoded |
| Plain-English Errors | ✅ Yes | ❌ Technical codes |
| LLM Integration | ✅ Any provider | ❌ None |
| Loop Navigation | ✅ Hierarchical | ⚠️ Flat segments |
| Type Safety | ✅ Pydantic v2 | ⚠️ Dicts |
| Extensible | ✅ Easy | ⚠️ Difficult |
| Modern Python | ✅ 3.8+ | ⚠️ Legacy |
Configuration-Driven Architecture
Unlike other libraries that hardcode validation rules, ValidEDI uses YAML configuration files:
# validedi/config/rules/rules_837.yaml
rules:
  - id: 'CLM01_REQUIRED'
    type: 'required_element'
    target: 'CLM01'
    severity: 'error'
    message: 'CLM01 (Patient Control Number) is blank'
    suggestion: 'CLM01 must be a unique identifier for the claim'
This means you can:
- Add new validation rules without changing code
- Customize rules for your specific needs
- Version control your validation logic
- Share configurations across teams
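To make the rule format above concrete, here is a minimal sketch of how a `required_element` rule could be evaluated against one tokenized segment. The function `check_required_element` and the shape of the returned finding are assumptions for illustration, not ValidEDI's internal API; the rule is written as the dict a YAML loader would produce.

```python
from typing import Dict, List, Optional

# The CLM01_REQUIRED rule from the YAML example, as a plain dict.
rule = {
    "id": "CLM01_REQUIRED",
    "type": "required_element",
    "target": "CLM01",
    "severity": "error",
    "message": "CLM01 (Patient Control Number) is blank",
    "suggestion": "CLM01 must be a unique identifier for the claim",
}

def check_required_element(rule: Dict[str, str], segment: List[str]) -> Optional[Dict[str, str]]:
    """Return a finding dict if the targeted element is blank, else None."""
    target = rule["target"]
    # 'CLM01' -> segment ID 'CLM', element position 1
    segment_id, position = target[:-2], int(target[-2:])
    if segment[0] != segment_id:
        return None  # rule does not apply to this segment
    value = segment[position] if position < len(segment) else ""
    if value.strip():
        return None  # element is present
    return {"id": rule["id"], "severity": rule["severity"], "message": rule["message"]}

print(check_required_element(rule, ["CLM", "", "150"]))
print(check_required_element(rule, ["CLM", "A123", "150"]))
```

Because the check reads everything it needs from the rule dict, adding a similar rule for another element would only require a new YAML entry.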
Supported Transaction Types
837P - Professional Claims
Outpatient and office-based healthcare services billed with CPT codes.
Use Cases:
- Doctor office visits
- Outpatient procedures
- Laboratory services
- Durable medical equipment
837I - Institutional Claims
Inpatient and facility-based healthcare services billed with revenue codes.
Use Cases:
- Hospital inpatient stays
- Emergency room visits
- Skilled nursing facilities
- Home health services
835 - Remittance Advice
Payer explanation of claim payments and adjustments.
Use Cases:
- Payment reconciliation
- Claim status tracking
- Adjustment reason analysis
- Accounts receivable management
834 - Benefit Enrollment
Member insurance enrollment, changes, and terminations.
Use Cases:
- New member enrollment
- Coverage changes
- Dependent additions
- Terminations
Documentation
Quick Start
- Quick Start Guide - Get started in 5 minutes
- Basic Usage Examples - Common use cases
Deep Dive
- Architecture Deep Dive - How ValidEDI works internally
- Configuration Guide - Customize validation rules
LLM Integration
- LLM Guide - AI-powered explanations
- LLM Examples - 8+ provider examples
- Interactive Chatbot - CLI chatbot
Advanced
- API Reference - Complete API documentation
- Custom Validation - Add your own rules
- Publishing Guide - Deploy to PyPI
Examples
Parse and Navigate Loops
from validedi import parse
result = parse('claim.edi')
# Navigate hierarchical loops
for loop in result.loops:
    if loop.loop_id == '2000A':  # Billing Provider
        for segment in loop.segments:
            if segment.segment_id == 'NM1':
                print(f"Provider: {segment.elements[2]}")
Custom Validation
from validedi import validate
# Validate using the rules loaded from the YAML configuration
result = validate('claim.edi')
# Access detailed validation results
for error in result.errors:
    print(f"[{error.code}] {error.segment_id}: {error.message}")
    if error.suggestion:
        print(f"  Fix: {error.suggestion}")
Batch Processing
from validedi import parse, validate
import glob
for filepath in glob.glob('*.edi'):
    edi_result = parse(filepath)
    val_result = validate(edi_result)
    print(f"{filepath}: {val_result.summary}")
Interactive Chatbot
python examples/llm_chatbot.py claim.edi
You: What is the total billed amount?
Bot: The total billed amount is $4,250.00 across 3 claims.
You: Are there any errors?
Bot: There is 1 warning: CLM06 is missing on Claim #2.
You: quit
👋 Goodbye!
Architecture Highlights
1. Configuration-Driven Design
All validation rules, code sets, and transaction definitions are in YAML files:
validedi/config/
├── transactions/ # Transaction definitions (837P, 837I, 835, 834)
├── rules/ # Validation rules
├── code_sets/ # Code value lists
└── registry.yaml # Transaction registry
2. Hierarchical Loop Structure
EDI segments are organized into hierarchical loops for easy navigation:
Loop 2000A (Billing Provider)
├── Segment NM1 (Name)
├── Segment N3 (Address)
└── Loop 2000B (Subscriber)
    ├── Segment NM1 (Name)
    └── Loop 2300 (Claim)
        ├── Segment CLM (Claim Info)
        └── Loop 2400 (Service Line)
            └── Segment SV1 (Service)
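The nesting shown above can be modeled as a simple tree. The sketch below uses a hypothetical `LoopNode` dataclass (not ValidEDI's `Loop` model, whose fields may differ) to show how a depth-first walk reaches any loop by ID.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

@dataclass
class LoopNode:
    loop_id: str
    segments: List[str] = field(default_factory=list)
    children: List["LoopNode"] = field(default_factory=list)

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "LoopNode"]]:
        """Yield (depth, node) pairs in depth-first order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)

    def find(self, loop_id: str) -> List["LoopNode"]:
        """Return every node in this subtree with the given loop ID."""
        return [node for _, node in self.walk() if node.loop_id == loop_id]

# Same shape as the diagram: 2000A > 2000B > 2300 > 2400
tree = LoopNode("2000A", ["NM1", "N3"], [
    LoopNode("2000B", ["NM1"], [
        LoopNode("2300", ["CLM"], [LoopNode("2400", ["SV1"])]),
    ]),
])

for depth, node in tree.walk():
    print("    " * depth + f"Loop {node.loop_id}: {', '.join(node.segments)}")
```

With this shape, finding all service lines is a single call such as `tree.find("2400")`, with no scanning of a flat segment list.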
3. Type-Safe Models
Full Pydantic v2 models with type hints:
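from typing import List
from pydantic import BaseModel

# EnvelopeMeta, Loop, Segment, and ValidationError are ValidEDI's own models.
class ParsedEDI(BaseModel):
    envelope: EnvelopeMeta
    loops: List[Loop]
    segments: List[Segment]
    raw_content: str

class ValidationResult(BaseModel):
    is_valid: bool
    error_count: int
    errors: List[ValidationError]
    summary: str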
4. LLM-Agnostic Integration
Works with ANY LLM through a simple callable interface:
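from validedi.llm import explain

# Works with OpenAI, Groq, Bedrock, Gemini, Anthropic, local models, etc.
def my_llm(prompt: str) -> str:
    # Placeholder: call your provider here and return its text reply.
    response = "model output goes here"
    return response

result = explain(edi_result, val_result, llm=my_llm)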
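The callable interface also makes the rule-based fallback easy to picture. The following is a rough sketch of that pattern, not the library's code: `explain_sketch` and `LLMCallable` are hypothetical names, and the prompt wording is invented for illustration.

```python
from typing import Callable, List, Optional

# Any function that maps a prompt string to a reply string qualifies.
LLMCallable = Callable[[str], str]

def explain_sketch(errors: List[str], llm: Optional[LLMCallable] = None) -> str:
    """Use the LLM when one is supplied, otherwise fall back to a fixed template."""
    if llm is None:
        if not errors:
            return "No validation errors were found."
        lines = "\n".join(f"- {e}" for e in errors)
        return f"Found {len(errors)} issue(s):\n{lines}"
    prompt = "Explain these EDI validation errors in plain English:\n" + "\n".join(errors)
    return llm(prompt)

def fake_llm(prompt: str) -> str:
    # Stand-in for a real provider call, so the sketch runs offline.
    return f"(stub reply, prompt was {len(prompt)} characters)"

print(explain_sketch(["CLM01 is blank"]))
print(explain_sketch(["CLM01 is blank"], llm=fake_llm))
```

Keeping the dependency down to a plain function means a provider SDK is only imported by the caller, never by the library itself.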
Validation Rules
ValidEDI includes 60+ validation rules across 4 categories:
Envelope Validation
- ISA/IEA segment presence and structure
- GS/GE functional group pairing
- ST/SE transaction set pairing
- Control number matching
- Segment count validation
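As an illustration of the pairing, control number, and segment count checks listed above, here is a small sketch for a single ST/SE transaction set. `check_transaction_set` is a hypothetical helper and its messages are invented; it relies only on the X12 convention that SE01 holds the segment count (ST and SE included) and SE02 repeats the ST02 control number.

```python
from typing import List

def check_transaction_set(segments: List[List[str]]) -> List[str]:
    """Check one ST..SE transaction set given as tokenized segments."""
    if not segments or segments[0][0] != "ST":
        return ["Transaction set does not start with ST"]
    if segments[-1][0] != "SE":
        return ["Transaction set does not end with SE"]
    st, se = segments[0], segments[-1]
    if len(st) < 3 or len(se) < 3:
        return ["ST or SE is missing required elements"]

    problems = []
    # SE01 counts every segment in the set, including ST and SE themselves.
    if not se[1].isdigit() or int(se[1]) != len(segments):
        problems.append(f"SE01 says {se[1]} segments, but {len(segments)} were found")
    # SE02 must repeat the control number given in ST02.
    if se[2] != st[2]:
        problems.append(f"Control number mismatch: ST02={st[2]}, SE02={se[2]}")
    return problems

good = [["ST", "837", "0001"], ["BHT", "0019"], ["SE", "3", "0001"]]
bad = [["ST", "837", "0001"], ["BHT", "0019"], ["SE", "5", "0002"]]
print(check_transaction_set(good))
print(check_transaction_set(bad))
```

The same pattern applies one level up to GS/GE and ISA/IEA, which carry their own counts and control numbers.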
Format Validation
- Date format (CCYYMMDD)
- NPI format and Luhn check
- ZIP code format
- Monetary amount format
- Element length validation
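Two of the format checks above are easy to sketch with the standard library. The function names are hypothetical and this is not ValidEDI's code; the NPI check follows the published check-digit scheme, which applies the Luhn algorithm to the 10-digit NPI prefixed with 80840.

```python
from datetime import datetime

def is_valid_ccyymmdd(value: str) -> bool:
    """True if value is exactly eight digits forming a real calendar date."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True

def luhn_ok(digits: str) -> bool:
    """Standard Luhn check: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def is_valid_npi(npi: str) -> bool:
    """True if npi is ten digits and passes Luhn with the 80840 prefix."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    return luhn_ok("80840" + npi)

print(is_valid_npi("1234567893"))     # widely used sample NPI
print(is_valid_ccyymmdd("20230229"))  # 2023 is not a leap year
```

Checking the length and digits before calling `strptime` keeps loosely formatted values from slipping through the date parser.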
Business Rules
- Required entity presence (submitter, billing provider, payer)
- Charge total consistency
- Service line validation
- Diagnosis code format
- Procedure code validation
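Charge total consistency is a good example of a business rule: on an 837P, the claim total in CLM02 should equal the sum of the line charges in SV102. The sketch below is a hypothetical helper, not ValidEDI's rule implementation, and it uses `Decimal` so that amounts like 0.10 are compared exactly.

```python
from decimal import Decimal, InvalidOperation
from typing import List, Optional

def check_charge_total(claim_total: str, line_charges: List[str]) -> Optional[str]:
    """Return an error message if CLM02 differs from the summed SV102 values."""
    try:
        total = Decimal(claim_total)
        line_sum = sum((Decimal(c) for c in line_charges), Decimal("0"))
    except InvalidOperation:
        return "A charge amount is not a valid number"
    if total != line_sum:
        return f"Claim total {total} does not equal the sum of service lines {line_sum}"
    return None

print(check_charge_total("150", ["100.00", "50"]))
print(check_charge_total("150", ["100", "40"]))
```

Returning a message or `None` keeps the check easy to plug into a list of findings.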
Code Set Validation
- Place of service codes
- Claim adjustment reason codes
- Entity identifier codes
- Relationship codes
- Coverage level codes
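Code set validation reduces to a membership test against a configured list. The sketch below shows the idea with a three-entry excerpt of place of service codes; the dict, the `check_code` helper, and the message text are assumptions for illustration, and the real code sets live in the library's YAML files.

```python
from typing import Dict, Optional

# Small excerpt for illustration; the full code set is much longer.
PLACE_OF_SERVICE: Dict[str, str] = {
    "11": "Office",
    "21": "Inpatient Hospital",
    "23": "Emergency Room - Hospital",
}

def check_code(value: str, code_set: Dict[str, str], element: str) -> Optional[str]:
    """Return an error message if value is not a key of code_set, else None."""
    if value in code_set:
        return None
    return f"{element} value '{value}' is not in the configured code set"

print(check_code("11", PLACE_OF_SERVICE, "Place of service"))
print(check_code("ZZ", PLACE_OF_SERVICE, "Place of service"))
```

Storing descriptions alongside the codes lets the same table drive both validation and plain-English reporting.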
Code Sets
ValidEDI includes comprehensive code sets:
- 200+ Adjustment Reason Codes (CARC)
- 50+ Segment Descriptions
- 30+ Entity Codes
- 35+ Date Qualifiers
- 13 Coverage Level Codes
- 10 Relationship Codes
- 8 Claim Status Codes
- Common ICD-10 Codes
- Common CPT Codes
All code sets are in YAML files and can be extended.
LLM Integration
ValidEDI's LLM integration is provider-agnostic: it works with any LLM you can wrap in a function that takes a prompt string and returns a string.
Supported Providers (Examples Included)
- OpenAI (GPT-4, GPT-3.5)
- Groq (Llama 3.1, Mixtral, Gemma) - Free tier available
- AWS Bedrock (Claude, Llama)
- Google Gemini - Free tier available
- Anthropic Claude
- Azure OpenAI
- Local Models (Ollama, LM Studio, etc.)
- Custom Implementations
Features
- Plain-English explanations of EDI files
- Interactive Q&A about your data
- Validation error explanations with fix instructions
- Rule-based fallback (works without LLM)
- No bundled LLM dependencies (bring your own LLM)
Requirements
- Python 3.8+
- pydantic >= 2.0
- pyyaml >= 6.0
Optional (for LLM features)
- openai (for OpenAI)
- groq (for Groq)
- anthropic (for Anthropic)
- google-generativeai (for Gemini)
- boto3 (for AWS Bedrock)
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
Areas for Contribution
- Additional transaction types (270/271, 276/277, 278, 997, 999)
- More validation rules
- Additional code sets
- Performance optimizations
- Documentation improvements
- Bug fixes
License
MIT License - see LICENSE file for details.
Support
- Documentation: docs/
- Examples: examples/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Roadmap
v0.2.0 (Planned)
- Additional transaction types (270/271, 276/277)
- Performance optimizations
- Streaming parser for large files
- Web UI for validation
v0.3.0 (Planned)
- Real-time validation API
- Batch processing optimizations
- Custom rule DSL
- Report generation
Acknowledgments
- Built with inspiration from Shaunak's X12-EDI-PARSER
- Validation rules based on X12 5010 implementation guides
- Code sets from HIPAA standards
Made with ❤️ for the healthcare tech community