A modern, configuration-driven X12 EDI parser and validator for healthcare transactions with optional LLM-powered explanations and structured data extraction

ValidEDI

A modern, configuration-driven X12 EDI parser and validator for healthcare transactions with optional LLM-powered explanations.

Python 3.8+ | License: MIT


What is ValidEDI?

ValidEDI is a Python library for parsing and validating healthcare EDI (Electronic Data Interchange) files. It supports the most common X12 transaction types used in healthcare:

  • 837P - Professional Health Care Claims
  • 837I - Institutional Health Care Claims
  • 835 - Health Care Claim Payment/Remittance Advice
  • 834 - Benefit Enrollment and Maintenance
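As a rough illustration of how these four types can be told apart in a raw file, the sketch below reads the implementation guide identifier from GS08. It is not ValidEDI's own detection logic: `detect_transaction_type` and `GUIDE_TO_TYPE` are hypothetical names, and the `*` and `~` delimiters are assumed rather than read from the ISA segment.

```python
from typing import Optional

# HIPAA 5010 implementation guide identifiers as they appear in GS08.
GUIDE_TO_TYPE = {
    "005010X222A1": "837P",
    "005010X223A2": "837I",
    "005010X221A1": "835",
    "005010X220A1": "834",
}


def detect_transaction_type(
    raw: str, element_sep: str = "*", segment_term: str = "~"
) -> Optional[str]:
    """Return the transaction type named by the first GS segment, if recognized."""
    for segment in raw.split(segment_term):
        elements = segment.strip().split(element_sep)
        # elements[0] is the segment ID, so GS08 sits at index 8.
        if elements[0] == "GS" and len(elements) > 8:
            return GUIDE_TO_TYPE.get(elements[8])
    return None


sample = (
    "GS*HC*SENDER*RECEIVER*20240101*1200*1*X*005010X222A1~"
    "ST*837*0001*005010X222A1~"
)
print(detect_transaction_type(sample))
```

ST01 alone reports `837` for both claim types, which is why the sketch looks at the guide identifier instead.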

Key Features

  • Parse EDI Files - Convert X12 EDI into structured Python objects
  • Validate Transactions - 60+ validation rules with plain-English error messages
  • Configuration-Driven - All rules and codes in YAML files, not hardcoded
  • LLM-Powered Explanations - Optional AI-powered plain-English reports (works with ANY LLM)
  • Loop Navigation - Hierarchical loop structure for easy data access
  • Type-Safe - Full Pydantic v2 models with type hints
  • Extensible - Add custom validation rules and code sets
  • Production-Ready - Thread-safe, well-tested, comprehensive error handling


Quick Start

Installation

pip install validedi

Parse an EDI File

from validedi import parse

# Parse from file path
result = parse('claim.edi')

# Access parsed data
print(f"Transaction: {result.envelope.transaction_type}")
print(f"From: {result.envelope.sender_id}")
print(f"To: {result.envelope.receiver_id}")
print(f"Loops: {len(result.loops)}")

Validate an EDI File

from validedi import validate

# Validate the file
result = validate('claim.edi')

# Check results
if result.is_valid:
    print("✅ File is valid!")
else:
    print(f"❌ Found {result.error_count} errors")
    for error in result.errors:
        print(f"  • {error.message}")

Get Plain-English Explanations (Optional)

from validedi import parse, validate
from validedi.llm import explain

# Parse and validate
edi_result = parse('claim.edi')
val_result = validate(edi_result)

# Option 1: With your own LLM (OpenAI, Groq, Bedrock, Gemini, etc.)
from openai import OpenAI
client = OpenAI(api_key="your-key")

def my_llm(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

result = explain(edi_result, val_result, llm=my_llm)
print(result.report)

# Option 2: Without LLM (rule-based templates)
result = explain(edi_result, val_result)
print(result.report)

Ask Questions About Your EDI File

from validedi.llm import ask_followup

# Ask questions in plain English
answer = ask_followup(
    "What is the total billed amount?",
    edi_result,
    val_result,
    llm=my_llm
)
print(answer)

Why ValidEDI?

vs. Other EDI Libraries

Feature                 ValidEDI           Others
Configuration-Driven    ✅ YAML files      ❌ Hardcoded
Plain-English Errors    ✅ Yes             ❌ Technical codes
LLM Integration         ✅ Any provider    ❌ None
Loop Navigation         ✅ Hierarchical    ⚠️ Flat segments
Type Safety             ✅ Pydantic v2     ⚠️ Dicts
Extensible              ✅ Easy            ⚠️ Difficult
Modern Python           ✅ 3.8+            ⚠️ Legacy

Configuration-Driven Architecture

Unlike other libraries that hardcode validation rules, ValidEDI uses YAML configuration files:

# validedi/config/rules/rules_837.yaml
rules:
  - id: 'CLM01_REQUIRED'
    type: 'required_element'
    target: 'CLM01'
    severity: 'error'
    message: 'CLM01 (Patient Control Number) is blank'
    suggestion: 'CLM01 must be a unique identifier for the claim'

This means you can:

  • Add new validation rules without changing code
  • Customize rules for your specific needs
  • Version control your validation logic
  • Share configurations across teams
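To make the idea concrete, here is a minimal sketch of how a `required_element` rule shaped like the YAML entry above could be evaluated. It is an assumption about the mechanism, not ValidEDI's internal engine: `check_required_element` is a hypothetical helper, the rule is written as a plain dict to avoid needing a YAML parser, and segments are modelled as lists of strings with the segment ID at index 0.

```python
from typing import Dict, List

# Same fields as the YAML rule above, expressed as a dict for illustration.
RULE: Dict[str, str] = {
    "id": "CLM01_REQUIRED",
    "type": "required_element",
    "target": "CLM01",
    "severity": "error",
    "message": "CLM01 (Patient Control Number) is blank",
}


def check_required_element(rule: Dict[str, str], segments: List[List[str]]) -> List[str]:
    """Return the rule message once per matching segment whose target element is blank."""
    # "CLM01" -> segment "CLM", element position 1; "NM109" -> "NM1", position 9.
    segment_id, position = rule["target"][:-2], int(rule["target"][-2:])
    findings = []
    for elements in segments:
        if elements[0] != segment_id:
            continue
        value = elements[position] if position < len(elements) else ""
        if not value.strip():
            findings.append(rule["message"])
    return findings


print(check_required_element(RULE, [["CLM", "", "150.00"]]))
```

Because the rule is plain data, changing the target or message is a configuration edit rather than a code change.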

Supported Transaction Types

837P - Professional Claims

Outpatient and office-based healthcare services billed with CPT codes.

Use Cases:

  • Doctor office visits
  • Outpatient procedures
  • Laboratory services
  • Durable medical equipment

837I - Institutional Claims

Inpatient and facility-based healthcare services billed with revenue codes.

Use Cases:

  • Hospital inpatient stays
  • Emergency room visits
  • Skilled nursing facilities
  • Home health services

835 - Remittance Advice

Payer explanation of claim payments and adjustments.

Use Cases:

  • Payment reconciliation
  • Claim status tracking
  • Adjustment reason analysis
  • Accounts receivable management

834 - Benefit Enrollment

Member insurance enrollment, changes, and terminations.

Use Cases:

  • New member enrollment
  • Coverage changes
  • Dependent additions
  • Terminations

Examples

Parse and Navigate Loops

from validedi import parse

result = parse('claim.edi')

# Navigate hierarchical loops
for loop in result.loops:
    if loop.loop_id == '2000A':  # Billing Provider
        for segment in loop.segments:
            if segment.segment_id == 'NM1':
                print(f"Provider: {segment.elements[2]}")

Custom Validation

from validedi import validate

# Validate using the configured rule set (custom rules live in the YAML rule files)
result = validate('claim.edi')

# Access detailed validation results
for error in result.errors:
    print(f"[{error.code}] {error.segment_id}: {error.message}")
    if error.suggestion:
        print(f"  Fix: {error.suggestion}")

Batch Processing

from validedi import parse, validate
import glob

for filepath in glob.glob('*.edi'):
    edi_result = parse(filepath)
    val_result = validate(edi_result)
    
    print(f"{filepath}: {val_result.summary}")

Interactive Chatbot

python examples/llm_chatbot.py claim.edi
You: What is the total billed amount?
Bot: The total billed amount is $4,250.00 across 3 claims.

You: Are there any errors?
Bot: There is 1 warning: CLM06 is missing on Claim #2.

You: quit
👋 Goodbye!

Architecture Highlights

1. Configuration-Driven Design

All validation rules, code sets, and transaction definitions are in YAML files:

validedi/config/
├── transactions/     # Transaction definitions (837P, 837I, 835, 834)
├── rules/           # Validation rules
├── code_sets/       # Code value lists
└── registry.yaml    # Transaction registry

2. Hierarchical Loop Structure

EDI segments are organized into hierarchical loops for easy navigation:

Loop 2000A (Billing Provider)
  ├── Segment NM1 (Name)
  ├── Segment N3 (Address)
  └── Loop 2000B (Subscriber)
      ├── Segment NM1 (Name)
      └── Loop 2300 (Claim)
          ├── Segment CLM (Claim Info)
          └── Loop 2400 (Service Line)
              └── Segment SV1 (Service)
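The nesting above lends itself to a recursive walk. The sketch below uses a stand-in `LoopNode` dataclass with a `children` list; that class and the `walk` helper are hypothetical and only mirror the diagram, since the actual shape of ValidEDI's `Loop` model may differ.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class LoopNode:
    """Stand-in for a parsed loop: an ID, its segment IDs, and nested loops."""
    loop_id: str
    segment_ids: List[str] = field(default_factory=list)
    children: List["LoopNode"] = field(default_factory=list)


def walk(node: LoopNode, depth: int = 0) -> Iterator[Tuple[int, LoopNode]]:
    """Yield every loop depth-first, paired with its nesting depth."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


# The same hierarchy as the diagram above.
tree = LoopNode("2000A", ["NM1", "N3"], [
    LoopNode("2000B", ["NM1"], [
        LoopNode("2300", ["CLM"], [
            LoopNode("2400", ["SV1"]),
        ]),
    ]),
])

for depth, loop in walk(tree):
    print("  " * depth + f"Loop {loop.loop_id}: {', '.join(loop.segment_ids)}")
```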

3. Type-Safe Models

Full Pydantic v2 models with type hints:

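from typing import List

from pydantic import BaseModel

# Simplified excerpt: EnvelopeMeta, Loop, Segment, and ValidationError are ValidEDI models defined elsewhere
class ParsedEDI(BaseModel):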
    envelope: EnvelopeMeta
    loops: List[Loop]
    segments: List[Segment]
    raw_content: str

class ValidationResult(BaseModel):
    is_valid: bool
    error_count: int
    errors: List[ValidationError]
    summary: str

4. LLM-Agnostic Integration

Works with ANY LLM through a simple callable interface:

# Works with OpenAI, Groq, Bedrock, Gemini, Anthropic, local models, etc.
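def my_llm(prompt: str) -> str:
    # Replace this body with a call to your provider; it must return the reply text
    raise NotImplementedError("call your LLM provider here")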

result = explain(edi_result, val_result, llm=my_llm)
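Since the interface is just a function from prompt string to reply string, ordinary Python wrappers compose with it. The sketch below shows two: a retry wrapper and an offline stub for tests. `with_retries` and `echo_llm` are hypothetical helpers written for this example and are not part of ValidEDI.

```python
import time
from typing import Callable

LLM = Callable[[str], str]


def with_retries(llm: LLM, attempts: int = 3, delay_seconds: float = 0.0) -> LLM:
    """Wrap an LLM callable so that failed calls are retried a fixed number of times."""
    def wrapped(prompt: str) -> str:
        last_error = None
        for attempt in range(attempts):
            try:
                return llm(prompt)
            except Exception as exc:  # provider SDKs raise many different error types
                last_error = exc
                if attempt < attempts - 1:
                    time.sleep(delay_seconds)
        raise RuntimeError(f"LLM call failed after {attempts} attempts") from last_error
    return wrapped


def echo_llm(prompt: str) -> str:
    """Offline stand-in that makes no network call; handy in unit tests."""
    return f"(stub) received {len(prompt)} characters"


safe_llm = with_retries(echo_llm)
print(safe_llm("Summarize this claim."))
```

Either `safe_llm` or `echo_llm` could then be passed wherever an `llm=` argument is accepted.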

Validation Rules

ValidEDI includes 60+ validation rules across 4 categories:

Envelope Validation

  • ISA/IEA segment presence and structure
  • GS/GE functional group pairing
  • ST/SE transaction set pairing
  • Control number matching
  • Segment count validation
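The envelope checks listed above can be sketched in a few lines over raw X12 text. This is an illustrative approximation, not ValidEDI's rule implementation: `check_envelope` is a hypothetical function, it assumes `*` and `~` delimiters and a single functional group, and it relies on the X12 conventions that ISA13 matches IEA02, GS06 matches GE02, ST02 matches SE02, and SE01 counts every segment from ST through SE inclusive.

```python
from typing import List


def _get(segment: List[str], position: int) -> str:
    """Return the element at `position` (0 is the segment ID), or '' if absent."""
    return segment[position].strip() if position < len(segment) else ""


def check_envelope(raw: str, element_sep: str = "*", segment_term: str = "~") -> List[str]:
    """Return a list of envelope problems; an empty list means none were found."""
    segments = [s.strip().split(element_sep) for s in raw.split(segment_term) if s.strip()]
    problems: List[str] = []

    def first(segment_id: str):
        return next((s for s in segments if s[0] == segment_id), None)

    # Control numbers: ISA13 vs IEA02, then GS06 vs GE02.
    for opener, open_pos, closer, close_pos in (("ISA", 13, "IEA", 2), ("GS", 6, "GE", 2)):
        start, end = first(opener), first(closer)
        if start is None or end is None:
            problems.append(f"{opener}/{closer} pair is incomplete")
        elif _get(start, open_pos) != _get(end, close_pos):
            problems.append(f"{opener}/{closer} control numbers differ")

    # Transaction sets: pair each SE with the most recent ST.
    st_index = None
    for index, segment in enumerate(segments):
        if segment[0] == "ST":
            st_index = index
        elif segment[0] == "SE":
            if st_index is None:
                problems.append("SE without a preceding ST")
                continue
            counted = index - st_index + 1
            declared = _get(segment, 1)
            if not declared.isdigit() or int(declared) != counted:
                problems.append(f"SE01 declares {declared or 'nothing'} segments, counted {counted}")
            if _get(segment, 2) != _get(segments[st_index], 2):
                problems.append("ST/SE control numbers differ")
            st_index = None
    if st_index is not None:
        problems.append("ST without a closing SE")
    return problems


# Made-up sample interchange with one transaction set of three segments.
SAMPLE = "~".join([
    "*".join(["ISA", "00", " " * 10, "00", " " * 10, "ZZ", "SENDER".ljust(15), "ZZ",
              "RECEIVER".ljust(15), "240101", "1200", "^", "00501", "000000001", "0", "P", ":"]),
    "GS*HC*SENDER*RECEIVER*20240101*1200*1*X*005010X222A1",
    "ST*837*0001*005010X222A1",
    "BHT*0019*00*123*20240101*1200*CH",
    "SE*3*0001",
    "GE*1*1",
    "IEA*1*000000001",
]) + "~"

print(check_envelope(SAMPLE))
```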

Format Validation

  • Date format (CCYYMMDD)
  • NPI format and Luhn check
  • ZIP code format
  • Monetary amount format
  • Element length validation
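Two of these format checks are easy to show in isolation. The sketch below validates a CCYYMMDD date and an NPI check digit; the function names are hypothetical and this is not ValidEDI's own code. The NPI check applies the Luhn algorithm to the 10-digit NPI prefixed with `80840`, as the NPI standard specifies.

```python
from datetime import datetime


def is_valid_ccyymmdd(value: str) -> bool:
    """True if `value` is exactly eight ASCII digits forming a real calendar date."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True


def is_valid_npi(npi: str) -> bool:
    """True if `npi` is ten ASCII digits whose last digit is a correct Luhn check digit."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    total = 0
    # Walk right to left; the check digit is position 0 and is not doubled.
    for position, char in enumerate(reversed("80840" + npi)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(is_valid_ccyymmdd("20240229"), is_valid_npi("1234567893"))
```

The length and digit checks run before `strptime` because `%m` and `%d` also accept single-digit values, which would let a seven-character string through.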

Business Rules

  • Required entity presence (submitter, billing provider, payer)
  • Charge total consistency
  • Service line validation
  • Diagnosis code format
  • Procedure code validation
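Charge total consistency is the most mechanical of these rules. For a professional claim, the total in CLM02 should equal the sum of the line charges in SV102. The sketch below assumes segments arrive as lists of strings with the segment ID at index 0; `charge_totals_match` is a hypothetical helper, not the library's rule.

```python
from decimal import Decimal
from typing import List


def charge_totals_match(claim_segments: List[List[str]]) -> bool:
    """True if CLM02 equals the sum of SV102 across the claim's service lines."""
    claimed = None
    line_total = Decimal("0")
    for elements in claim_segments:
        if elements[0] == "CLM":
            claimed = Decimal(elements[2])      # CLM02: total claim charge amount
        elif elements[0] == "SV1":
            line_total += Decimal(elements[2])  # SV102: line item charge amount
    return claimed is not None and claimed == line_total


# Made-up claim with two service lines.
claim = [
    ["CLM", "A1", "150.00"],
    ["SV1", "HC:99213", "100"],
    ["SV1", "HC:85025", "50.00"],
]
print(charge_totals_match(claim))
```

`Decimal` is used rather than `float` so that amounts such as `0.10` add up exactly.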

Code Set Validation

  • Place of service codes
  • Claim adjustment reason codes
  • Entity identifier codes
  • Relationship codes
  • Coverage level codes

Code Sets

ValidEDI includes comprehensive code sets:

  • 200+ Adjustment Reason Codes (CARC)
  • 50+ Segment Descriptions
  • 30+ Entity Codes
  • 35+ Date Qualifiers
  • 13 Coverage Level Codes
  • 10 Relationship Codes
  • 8 Claim Status Codes
  • Common ICD-10 Codes
  • Common CPT Codes

All code sets are in YAML files and can be extended.
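Extending a code set amounts to layering custom entries over the built-in ones. The sketch below shows that idea with a few claim adjustment reason codes; `build_code_set` and `describe` are hypothetical helpers, the `X1` entry is invented for the example, and the real YAML layout in ValidEDI may differ.

```python
from typing import Dict, Optional

# A small excerpt of claim adjustment reason codes, for illustration only.
BUILT_IN_CARC: Dict[str, str] = {
    "1": "Deductible Amount",
    "2": "Coinsurance Amount",
    "3": "Co-payment Amount",
}


def build_code_set(built_in: Dict[str, str],
                   custom: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a new mapping in which custom entries add to or override built-in ones."""
    merged = dict(built_in)
    merged.update(custom or {})
    return merged


def describe(code: str, code_set: Dict[str, str]) -> str:
    """Look up a code's description, with a readable fallback for unknown codes."""
    return code_set.get(code, f"Unknown code {code}")


# "X1" is a made-up local code used only to show the override mechanism.
codes = build_code_set(BUILT_IN_CARC, {"X1": "Local payer-specific adjustment"})
print(describe("2", codes))
print(describe("X1", codes))
print(describe("999", codes))
```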


LLM Integration

ValidEDI's LLM integration is provider-agnostic - works with ANY LLM:

Supported Providers (Examples Included)

  1. OpenAI (GPT-4, GPT-3.5)
  2. Groq (Llama 3.1, Mixtral, Gemma) - Free tier available
  3. AWS Bedrock (Claude, Llama)
  4. Google Gemini - Free tier available
  5. Anthropic Claude
  6. Azure OpenAI
  7. Local Models (Ollama, LM Studio, etc.)
  8. Custom Implementations

Features

  • Plain-English explanations of EDI files
  • Interactive Q&A about your data
  • Validation error explanations with fix instructions
  • Rule-based fallback (works without LLM)
  • No LLM SDK dependencies (bring your own LLM)

Requirements

  • Python 3.8+
  • pydantic >= 2.0
  • pyyaml >= 6.0

Optional (for LLM features)

  • openai (for OpenAI)
  • groq (for Groq)
  • anthropic (for Anthropic)
  • google-generativeai (for Gemini)
  • boto3 (for AWS Bedrock)

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Areas for Contribution

  • Additional transaction types (270/271, 276/277, 278, 997, 999)
  • More validation rules
  • Additional code sets
  • Performance optimizations
  • Documentation improvements
  • Bug fixes

License

MIT License - see LICENSE file for details.


Roadmap

v0.2.0 (Planned)

  • Additional transaction types (270/271, 276/277)
  • Performance optimizations
  • Streaming parser for large files
  • Web UI for validation

v0.3.0 (Planned)

  • Real-time validation API
  • Batch processing optimizations
  • Custom rule DSL
  • Report generation

Acknowledgments

  • Built with inspiration from Shaunak's X12-EDI-PARSER
  • Validation rules based on X12 5010 implementation guides
  • Code sets from HIPAA standards

Made with ❤️ for the healthcare tech community
