A modern, configuration-driven X12 EDI parser and validator for healthcare transactions with optional LLM-powered explanations and structured data extraction

ValidEDI

A modern, configuration-driven X12 EDI parser and validator for healthcare transactions with optional LLM-powered explanations.

Python 3.8+ License: MIT


What is ValidEDI?

ValidEDI is a Python library for parsing and validating healthcare EDI (Electronic Data Interchange) files. It supports the most common X12 transaction types used in healthcare:

  • 837P - Professional Health Care Claims
  • 837I - Institutional Health Care Claims
  • 835 - Health Care Claim Payment/Remittance Advice
  • 834 - Benefit Enrollment and Maintenance

Key Features

  • Parse EDI Files - Convert X12 EDI into structured Python objects
  • Validate Transactions - 60+ validation rules with plain-English error messages
  • Configuration-Driven - All rules and codes in YAML files, not hardcoded
  • LLM-Powered Explanations - Optional AI-powered plain-English reports (works with ANY LLM)
  • Loop Navigation - Hierarchical loop structure for easy data access
  • Type-Safe - Full Pydantic v2 models with type hints
  • Extensible - Add custom validation rules and code sets
  • Production-Ready - Thread-safe, well-tested, comprehensive error handling


Quick Start

Installation

pip install validedi

Parse an EDI File

from validedi import parse

# Parse from file path
result = parse('claim.edi')

# Access parsed data
print(f"Transaction: {result.envelope.transaction_type}")
print(f"From: {result.envelope.sender_id}")
print(f"To: {result.envelope.receiver_id}")
print(f"Loops: {len(result.loops)}")

Validate an EDI File

from validedi import validate

# Validate the file
result = validate('claim.edi')

# Check results
if result.is_valid:
    print("✅ File is valid!")
else:
    print(f"❌ Found {result.error_count} errors")
    for error in result.errors:
        print(f"  • {error.message}")

Get Plain-English Explanations (Optional)

from validedi import parse, validate
from validedi.llm import explain

# Parse and validate
edi_result = parse('claim.edi')
val_result = validate(edi_result)

# Option 1: With your own LLM (OpenAI, Groq, Bedrock, Gemini, etc.)
from openai import OpenAI
client = OpenAI(api_key="your-key")

def my_llm(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

result = explain(edi_result, val_result, llm=my_llm)
print(result.report)

# Option 2: Without LLM (rule-based templates)
result = explain(edi_result, val_result)
print(result.report)

Ask Questions About Your EDI File

from validedi.llm import ask_followup

# Ask questions in plain English
answer = ask_followup(
    "What is the total billed amount?",
    edi_result,
    val_result,
    llm=my_llm
)
print(answer)

Why ValidEDI?

vs. Other EDI Libraries

Feature | ValidEDI | Others
Configuration-Driven | ✅ YAML files | ❌ Hardcoded
Plain-English Errors | ✅ Yes | ❌ Technical codes
LLM Integration | ✅ Any provider | ❌ None
Loop Navigation | ✅ Hierarchical | ⚠️ Flat segments
Type Safety | ✅ Pydantic v2 | ⚠️ Dicts
Extensible | ✅ Easy | ⚠️ Difficult
Modern Python | ✅ 3.8+ | ⚠️ Legacy

Configuration-Driven Architecture

Unlike other libraries that hardcode validation rules, ValidEDI uses YAML configuration files:

# validedi/config/rules/rules_837.yaml
rules:
  - id: 'CLM01_REQUIRED'
    type: 'required_element'
    target: 'CLM01'
    severity: 'error'
    message: 'CLM01 (Patient Control Number) is blank'
    suggestion: 'CLM01 must be a unique identifier for the claim'

This means you can:

  • Add new validation rules without changing code
  • Customize rules for your specific needs
  • Version control your validation logic
  • Share configurations across teams
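
To make the idea concrete, here is a minimal sketch of how a required_element rule like the one above could be evaluated once it has been loaded. The rule is written as the dict a YAML loader would produce. The function check_required_element and the segment representation (a list of strings with the segment ID first) are assumptions for illustration, not ValidEDI's internal API.

```python
from typing import Dict, List, Optional

# The rule as a YAML loader would return it (same fields as the example above).
RULE = {
    "id": "CLM01_REQUIRED",
    "type": "required_element",
    "target": "CLM01",
    "severity": "error",
    "message": "CLM01 (Patient Control Number) is blank",
    "suggestion": "CLM01 must be a unique identifier for the claim",
}


def check_required_element(rule: Dict[str, str], segment: List[str]) -> Optional[Dict[str, str]]:
    """Return a finding if the targeted element is blank, otherwise None.

    `segment` is a hypothetical representation: ["CLM", "A123", "100.00", ...].
    """
    # "CLM01" -> segment ID "CLM", element position 1 (last two digits).
    seg_id, position = rule["target"][:-2], int(rule["target"][-2:])
    if segment[0] != seg_id:
        return None  # the rule does not apply to this segment
    value = segment[position] if position < len(segment) else ""
    if value.strip():
        return None  # element is present
    return {key: rule[key] for key in ("id", "severity", "message", "suggestion")}


ok = check_required_element(RULE, ["CLM", "A123", "100.00"])
blank = check_required_element(RULE, ["CLM", "", "100.00"])
print(ok)
print(blank["message"] if blank else None)
```

Because the check only reads fields from the rule, adding a second required-element rule means adding another YAML entry rather than another function.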

Supported Transaction Types

837P - Professional Claims

Outpatient and office-based healthcare services billed with CPT codes.

Use Cases:

  • Doctor office visits
  • Outpatient procedures
  • Laboratory services
  • Durable medical equipment

837I - Institutional Claims

Inpatient and facility-based healthcare services billed with revenue codes.

Use Cases:

  • Hospital inpatient stays
  • Emergency room visits
  • Skilled nursing facilities
  • Home health services

835 - Remittance Advice

Payer explanation of claim payments and adjustments.

Use Cases:

  • Payment reconciliation
  • Claim status tracking
  • Adjustment reason analysis
  • Accounts receivable management

834 - Benefit Enrollment

Member insurance enrollment, changes, and terminations.

Use Cases:

  • New member enrollment
  • Coverage changes
  • Dependent additions
  • Terminations

Examples

Parse and Navigate Loops

from validedi import parse

result = parse('claim.edi')

# Navigate hierarchical loops
for loop in result.loops:
    if loop.loop_id == '2000A':  # Billing Provider
        for segment in loop.segments:
            if segment.segment_id == 'NM1':
                print(f"Provider: {segment.elements[2]}")

Custom Validation

from validedi import validate

# Validate against the rules defined in the YAML configuration, including any you add
result = validate('claim.edi')

# Access detailed validation results
for error in result.errors:
    print(f"[{error.code}] {error.segment_id}: {error.message}")
    if error.suggestion:
        print(f"  Fix: {error.suggestion}")

Batch Processing

from validedi import parse, validate
import glob

for filepath in glob.glob('*.edi'):
    edi_result = parse(filepath)
    val_result = validate(edi_result)
    
    print(f"{filepath}: {val_result.summary}")
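
In a real batch, one malformed file should not stop the run. The sketch below wraps the loop so that failures are recorded per file. The helper validate_many is hypothetical, and the validator is passed in as a parameter so the example runs without ValidEDI installed. Since this page does not say which exceptions ValidEDI raises, the sketch catches Exception broadly.

```python
from typing import Callable, Dict, Iterable, Tuple


def validate_many(
    paths: Iterable[str],
    validate_fn: Callable[[str], object],
) -> Tuple[Dict[str, object], Dict[str, str]]:
    """Run validate_fn on every path; return (results, failures) keyed by path."""
    results: Dict[str, object] = {}
    failures: Dict[str, str] = {}
    for path in paths:
        try:
            results[path] = validate_fn(path)
        except Exception as exc:  # broad on purpose: keep the batch going
            failures[path] = f"{type(exc).__name__}: {exc}"
    return results, failures


def fake_validate(path: str) -> str:
    """Stand-in validator used only for this demonstration."""
    if path.startswith("bad"):
        raise ValueError("missing ISA segment")
    return f"{path}: OK"


results, failures = validate_many(["a.edi", "bad.edi", "b.edi"], fake_validate)
print(results)
print(failures)
```

With ValidEDI installed, the same helper could be called as validate_many(glob.glob('*.edi'), validate).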

Interactive Chatbot

python examples/llm_chatbot.py claim.edi
You: What is the total billed amount?
Bot: The total billed amount is $4,250.00 across 3 claims.

You: Are there any errors?
Bot: There is 1 warning: CLM06 is missing on Claim #2.

You: quit
👋 Goodbye!

Architecture Highlights

1. Configuration-Driven Design

All validation rules, code sets, and transaction definitions are in YAML files:

validedi/config/
├── transactions/     # Transaction definitions (837P, 837I, 835, 834)
├── rules/           # Validation rules
├── code_sets/       # Code value lists
└── registry.yaml    # Transaction registry

2. Hierarchical Loop Structure

EDI segments are organized into hierarchical loops for easy navigation:

Loop 2000A (Billing Provider)
  ├── Segment NM1 (Name)
  ├── Segment N3 (Address)
  └── Loop 2000B (Subscriber)
      ├── Segment NM1 (Name)
      └── Loop 2300 (Claim)
          ├── Segment CLM (Claim Info)
          └── Loop 2400 (Service Line)
              └── Segment SV1 (Service)
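
A nested structure like this is naturally walked with a short recursive generator. The sketch below uses stand-in dataclasses that mirror the diagram. The attribute name children is an assumption made for illustration, and ValidEDI's actual Loop model may expose nested loops differently.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Segment:
    segment_id: str
    elements: List[str] = field(default_factory=list)


@dataclass
class Loop:
    loop_id: str
    segments: List[Segment] = field(default_factory=list)
    children: List["Loop"] = field(default_factory=list)  # assumed attribute name


def iter_segments(loop: Loop, segment_id: str, path: str = "") -> Iterator[Tuple[str, Segment]]:
    """Yield (loop path, segment) for every matching segment, depth first."""
    here = f"{path}/{loop.loop_id}" if path else loop.loop_id
    for seg in loop.segments:
        if seg.segment_id == segment_id:
            yield here, seg
    for child in loop.children:
        yield from iter_segments(child, segment_id, here)


# The tree from the diagram above, with made-up element values.
TREE = Loop("2000A", [Segment("NM1", ["85", "2", "ACME CLINIC"]), Segment("N3", ["1 MAIN ST"])], [
    Loop("2000B", [Segment("NM1", ["IL", "1", "DOE"])], [
        Loop("2300", [Segment("CLM", ["A123", "100"])], [
            Loop("2400", [Segment("SV1", ["HC:99213", "100"])]),
        ]),
    ]),
])

for loop_path, seg in iter_segments(TREE, "NM1"):
    print(loop_path, seg.elements[2])
```

Carrying the loop path along with each segment makes it possible to tell the billing provider NM1 from the subscriber NM1 without inspecting qualifier codes.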

3. Type-Safe Models

Full Pydantic v2 models with type hints:

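from typing import List

from pydantic import BaseModel

# Excerpt: EnvelopeMeta, Loop, Segment, and ValidationError are other ValidEDI models.
class ParsedEDI(BaseModel):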
    envelope: EnvelopeMeta
    loops: List[Loop]
    segments: List[Segment]
    raw_content: str

class ValidationResult(BaseModel):
    is_valid: bool
    error_count: int
    errors: List[ValidationError]
    summary: str

4. LLM-Agnostic Integration

Works with ANY LLM through a simple callable interface:

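# Works with OpenAI, Groq, Bedrock, Gemini, Anthropic, local models, etc.
def my_llm(prompt: str) -> str:
    # Placeholder body: call your provider here and return its reply as text.
    response = "(replace with your provider's reply)"
    return response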

result = explain(edi_result, val_result, llm=my_llm)
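
One practical consequence of the callable interface is that tests need no network access: any object with a matching call signature can stand in for the model. The sketch below shows a recording stub. RecordingLLM and summarize are hypothetical names for illustration. summarize stands in for any function that accepts an llm callable and is not part of ValidEDI.

```python
from typing import Callable, List


class RecordingLLM:
    """A stand-in 'model' that returns a canned reply and records its prompts."""

    def __init__(self, reply: str = "stub reply") -> None:
        self.reply = reply
        self.prompts: List[str] = []

    def __call__(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.reply


def summarize(errors: List[str], llm: Callable[[str], str]) -> str:
    """Hypothetical consumer: builds a prompt and delegates to the callable."""
    prompt = "Explain these EDI validation errors in plain English:\n" + "\n".join(errors)
    return llm(prompt)


stub = RecordingLLM("CLM01 must not be empty.")
print(summarize(["CLM01 (Patient Control Number) is blank"], stub))
print(len(stub.prompts))
```

The same stub can be passed wherever this README passes my_llm, which keeps unit tests deterministic and free of API keys.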

Validation Rules

ValidEDI includes 60+ validation rules across 4 categories:

Envelope Validation

  • ISA/IEA segment presence and structure
  • GS/GE functional group pairing
  • ST/SE transaction set pairing
  • Control number matching
  • Segment count validation
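
As an illustration of the last three checks, the sketch below verifies one ST/SE pair: SE01 must equal the number of segments from ST through SE inclusive, and SE02 must repeat the control number in ST02. The function names and the fixed delimiters (~ and *) are assumptions. In real interchanges the delimiters are declared in the ISA segment, and this is not ValidEDI's implementation.

```python
from typing import List

Segment = List[str]  # e.g. ["ST", "837", "0001", "005010X222A1"]


def split_segments(raw: str, seg_term: str = "~", elem_sep: str = "*") -> List[Segment]:
    """Split raw X12 text into segments, each a list of element strings."""
    return [s.strip().split(elem_sep) for s in raw.split(seg_term) if s.strip()]


def check_transaction_set(segments: List[Segment]) -> List[str]:
    """Return the problems found in the ST/SE envelope (empty list if none)."""
    ids = [seg[0] for seg in segments]
    if "ST" not in ids or "SE" not in ids or ids.index("SE") < ids.index("ST"):
        return ["ST/SE pair is missing or out of order"]
    st_at, se_at = ids.index("ST"), ids.index("SE")
    st, se = segments[st_at], segments[se_at]
    problems: List[str] = []

    # SE01 is the segment count for the set, counting ST and SE themselves.
    actual = se_at - st_at + 1
    declared = se[1] if len(se) > 1 else ""
    if not declared.isdigit() or int(declared) != actual:
        problems.append(f"SE01 declares {declared or 'nothing'} segments, found {actual}")

    # SE02 must repeat the control number given in ST02.
    st02 = st[2] if len(st) > 2 else ""
    se02 = se[2] if len(se) > 2 else ""
    if st02 != se02:
        problems.append(f"Control number mismatch: ST02={st02!r}, SE02={se02!r}")
    return problems


GOOD = "ST*837*0001*005010X222A1~BHT*0019*00*REF1*20240101*1200*CH~CLM*A123*100***11:B:1~SE*4*0001~"
BAD = GOOD.replace("SE*4*0001", "SE*5*0002")

print(check_transaction_set(split_segments(GOOD)))
print(check_transaction_set(split_segments(BAD)))
```

The same pattern applies one level up to GS/GE and ISA/IEA, which pair their own control numbers and counts.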

Format Validation

  • Date format (CCYYMMDD)
  • NPI format and Luhn check
  • ZIP code format
  • Monetary amount format
  • Element length validation
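
Two of these checks are easy to show in isolation. The sketch below validates a CCYYMMDD date and an NPI check digit. The NPI check applies the Luhn algorithm to the 10-digit NPI prefixed with 80840, which is the standard scheme for NPIs. The function names are illustrative and not taken from ValidEDI.

```python
from datetime import datetime


def is_valid_ccyymmdd(value: str) -> bool:
    """True if value is exactly eight ASCII digits forming a real calendar date."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True


def is_valid_npi(npi: str) -> bool:
    """True if npi is ten digits and passes the Luhn check with the 80840 prefix."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    digits = [int(ch) for ch in "80840" + npi]
    total = 0
    # Walk from the right; double every second digit and subtract 9 if above 9.
    for offset, digit in enumerate(reversed(digits)):
        if offset % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(is_valid_ccyymmdd("20240229"), is_valid_ccyymmdd("20230229"))
print(is_valid_npi("1234567893"), is_valid_npi("1234567890"))
```

The explicit length check matters because strptime accepts one-digit months and days, so a seven-digit string could otherwise slip through.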

Business Rules

  • Required entity presence (submitter, billing provider, payer)
  • Charge total consistency
  • Service line validation
  • Diagnosis code format
  • Procedure code validation
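
Charge total consistency is the most arithmetic of these rules: on a professional claim, the total in CLM02 should equal the sum of the service line charges in SV102. A minimal sketch follows, using Decimal so that amounts such as 0.10 compare exactly. The function name and the list-of-strings input are assumptions for illustration.

```python
from decimal import Decimal, InvalidOperation
from typing import List


def charge_total_matches(clm02: str, line_charges: List[str]) -> bool:
    """True if the claim total equals the sum of the service line charges."""
    try:
        total = Decimal(clm02)
        lines = sum((Decimal(charge) for charge in line_charges), Decimal("0"))
    except InvalidOperation:
        return False  # a blank or non-numeric amount cannot be consistent
    return total == lines


print(charge_total_matches("4250.00", ["1500", "2000.00", "750"]))
print(charge_total_matches("100", ["60", "30"]))
```

Comparing floats here would be fragile, since values like 0.1 have no exact binary representation.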

Code Set Validation

  • Place of service codes
  • Claim adjustment reason codes
  • Entity identifier codes
  • Relationship codes
  • Coverage level codes

Code Sets

ValidEDI includes comprehensive code sets:

  • 200+ Adjustment Reason Codes (CARC)
  • 50+ Segment Descriptions
  • 30+ Entity Codes
  • 35+ Date Qualifiers
  • 13 Coverage Level Codes
  • 10 Relationship Codes
  • 8 Claim Status Codes
  • Common ICD-10 Codes
  • Common CPT Codes

All code sets are in YAML files and can be extended.
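
Conceptually, extending a code set is a dictionary merge followed by a membership test. The sketch below shows that idea with a few place of service codes. The ZZ entry is made up, and extend_code_set and check_code are hypothetical helpers rather than ValidEDI functions.

```python
from typing import Dict, Optional

# A small built-in set as it might look after loading from YAML.
BUILT_IN_POS: Dict[str, str] = {
    "11": "Office",
    "21": "Inpatient Hospital",
    "23": "Emergency Room - Hospital",
}

# A site-specific extension; the code "ZZ" is hypothetical.
CUSTOM_POS: Dict[str, str] = {"ZZ": "Site-specific location (hypothetical)"}


def extend_code_set(base: Dict[str, str], extra: Dict[str, str]) -> Dict[str, str]:
    """Return a new code set; entries in `extra` override those in `base`."""
    merged = dict(base)
    merged.update(extra)
    return merged


def check_code(value: str, code_set: Dict[str, str], element: str) -> Optional[str]:
    """Return an error message if value is not in the code set, otherwise None."""
    if value in code_set:
        return None
    return f"{element}: '{value}' is not a recognized code"


pos_codes = extend_code_set(BUILT_IN_POS, CUSTOM_POS)
print(check_code("11", pos_codes, "CLM05-1"))
print(check_code("ZZ", BUILT_IN_POS, "CLM05-1"))
print(check_code("ZZ", pos_codes, "CLM05-1"))
```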


LLM Integration

ValidEDI's LLM integration is provider-agnostic: any model that can be wrapped in a function taking a prompt string and returning a reply string will work.

Supported Providers (Examples Included)

  1. OpenAI (GPT-4, GPT-3.5)
  2. Groq (Llama 3.1, Mixtral, Gemma) - Free tier available
  3. AWS Bedrock (Claude, Llama)
  4. Google Gemini - Free tier available
  5. Anthropic Claude
  6. Azure OpenAI
  7. Local Models (Ollama, LM Studio, etc.)
  8. Custom Implementations

Features

  • Plain-English explanations of EDI files
  • Interactive Q&A about your data
  • Validation error explanations with fix instructions
  • Rule-based fallback (works without LLM)
  • No LLM SDK dependencies (bring your own LLM)

Requirements

  • Python 3.8+
  • pydantic >= 2.0
  • pyyaml >= 6.0

Optional (for LLM features)

  • openai (for OpenAI)
  • groq (for Groq)
  • anthropic (for Anthropic)
  • google-generativeai (for Gemini)
  • boto3 (for AWS Bedrock)

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Areas for Contribution

  • Additional transaction types (270/271, 276/277, 278, 997, 999)
  • More validation rules
  • Additional code sets
  • Performance optimizations
  • Documentation improvements
  • Bug fixes

License

MIT License - see LICENSE file for details.


Roadmap

v0.2.0 (Planned)

  • Additional transaction types (270/271, 276/277)
  • Performance optimizations
  • Streaming parser for large files
  • Web UI for validation

v0.3.0 (Planned)

  • Real-time validation API
  • Batch processing optimizations
  • Custom rule DSL
  • Report generation

Acknowledgments

  • Built with inspiration from Shaunak's X12-EDI-PARSER
  • Validation rules based on X12 5010 implementation guides
  • Code sets from HIPAA standards

Made with ❤️ for the healthcare tech community
