High-performance MS-DRG (Medicare Severity Diagnosis Related Groups) grouper

mz-drg

High-performance CMS claim processing tools written in Zig with Python bindings.

mz-drg provides open-source reimplementations of CMS tools:

  • MS-DRG Grouper — assigns Diagnosis Related Groups based on diagnoses, procedures, and demographics
  • Medicare Code Editor (MCE) — validates ICD diagnosis and procedure codes against CMS edit rules
  • ICD-10 Converter — maps codes between fiscal year versions using CMS conversion tables

All are written in Zig, callable from Python, and validated against the CMS reference Java implementations with a 100% match rate on 50,000+ claims.

Why mz-drg?

The official CMS tools are Java applications. While accurate, they come with practical limitations:

Aspect                      Java (CMS)                      mz-drg
Startup                     JVM warmup, seconds             Instant
Throughput (Ryzen 5 5600U)  ~500 claims/sec                 ~11,000+ claims/sec
Memory                      JVM heap overhead               Minimal, memory-mapped data
Dependencies                JRE 17+, classpath management   Single shared library
Python integration          JPype bridge (fragile)          Native ctypes (simple)
Embedding                   Requires JVM process            C ABI, any language

The MS-DRG Grouper and the MCE are both ported line-by-line from the decompiled Java source and validated claim-by-claim against the original.

Quick start

Install

pip install msdrg

MS-DRG Grouper

import msdrg

with msdrg.MsdrgGrouper() as grouper:
    result = grouper.group({
        "version": 431,
        "age": 65,
        "sex": 0,
        "discharge_status": 1,
        "pdx": {"code": "I5020"},
        "sdx": [{"code": "E1165"}],
        "procedures": []
    })

print(result["final_drg"])            # 293
print(result["final_mdc"])            # 5
print(result["final_drg_description"])  # "Heart Failure and Shock without CC/MCC"

Medicare Code Editor

import msdrg

with msdrg.MceEditor() as mce:
    result = mce.edit({
        "discharge_date": 20250101,
        "age": 65, "sex": 0, "discharge_status": 1,
        "pdx": {"code": "I5020"},
        "sdx": [{"code": "E1165"}],
        "procedures": []
    })

print(result["edit_type"])  # "NONE"
print(result["edits"])      # [] — no edits triggered

Unified claim — same dict for both

import msdrg

claim = {
    "version": 431,
    "discharge_date": 20250101,
    "age": 65, "sex": 0, "discharge_status": 1,
    "pdx": {"code": "I5020"},
    "sdx": [{"code": "E1165"}],
    "procedures": []
}

with msdrg.MsdrgGrouper() as g, msdrg.MceEditor() as mce:
    drg = g.group(claim)
    mce_result = mce.edit(claim)

ICD-10 Code Conversion

import msdrg

with msdrg.IcdConverter() as conv:
    # Convert a diagnosis code from FY2025 to FY2026
    new_code = conv.convert_dx("B880", source_year=2025, target_year=2026)
    print(new_code)  # "B8801"

    # Batch convert
    results = conv.convert_dx_batch(
        ["B880", "I5020", "A047"],
        source_year=2025, target_year=2026,
    )

Grouper with auto-conversion

import msdrg

with msdrg.MsdrgGrouper() as g:
    result = g.group({
        "version": 431,               # Target: FY2026
        "source_icd_version": 2025,   # Source: FY2025 codes
        "age": 65, "sex": 0, "discharge_status": 1,
        "pdx": {"code": "B880"},       # Auto-converted to B8801
    })

print(result["conversions"])
# [{"original": "B880", "converted": "B8801", "code_type": "dx", "field": "pdx"}]
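When auditing converted claims, it can help to turn the conversions list into a lookup keyed by field and original code. The following is a minimal sketch that assumes each entry has the shape shown in the comment above; conversion_map is a hypothetical helper, not part of the msdrg package.

```python
def conversion_map(result):
    """Map (field, original code) -> converted code for one grouper result."""
    return {
        (entry["field"], entry["original"]): entry["converted"]
        for entry in result.get("conversions", [])
    }


# Result fragment copied from the example output above.
sample = {
    "conversions": [
        {"original": "B880", "converted": "B8801", "code_type": "dx", "field": "pdx"}
    ]
}

print(conversion_map(sample))  # {('pdx', 'B880'): 'B8801'}
```

A result without a conversions key yields an empty dict, so the helper can be applied to every result regardless of whether source_icd_version was set.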

MS-DRG Grouper

Input format

{
    "version": 431,              # MS-DRG version (e.g. 400, 410, 421, 431)
    "age": 65,                   # Patient age in years
    "sex": 0,                    # 0=Male, 1=Female, 2=Unknown
    "discharge_status": 1,       # 1=Home/Self Care, 20=Died
    "hospital_status": "NOT_EXEMPT",  # "NOT_EXEMPT" (default), "EXEMPT", or "UNKNOWN"
    "tie_breaker": "CLINICAL_SIGNIFICANCE",  # "CLINICAL_SIGNIFICANCE" (default) or "ALPHABETICAL"
    "source_icd_version": 2025,  # Source ICD-10 year for code conversion (optional)
    "pdx": {                     # Principal diagnosis (required)
        "code": "I5020",
        "poa": "Y"               # Present on Admission: Y/N/U/W (optional)
    },
    "admit_dx": {                # Admission diagnosis (optional)
        "code": "R0602"
    },
    "sdx": [                     # Secondary diagnoses (optional)
        {"code": "E1165", "poa": "Y"},
        {"code": "I10", "poa": "Y"}
    ],
    "procedures": [              # Procedure codes (optional)
        {"code": "02703DZ"}
    ]
}
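Before sending a claim to the grouper, a lightweight pre-check can catch obvious input mistakes. The sketch below is based only on the field comments above (pdx is required; sex, poa, hospital_status, and tie_breaker have the listed values). check_claim is a hypothetical helper and does not reflect how the library itself validates input.

```python
VALID_SEX = {0, 1, 2}
VALID_POA = {"Y", "N", "U", "W"}
VALID_HOSPITAL_STATUS = {"NOT_EXEMPT", "EXEMPT", "UNKNOWN"}
VALID_TIE_BREAKER = {"CLINICAL_SIGNIFICANCE", "ALPHABETICAL"}


def check_claim(claim):
    """Return a list of problems found in a grouper input dict (empty if none)."""
    problems = []
    pdx = claim.get("pdx")
    if not isinstance(pdx, dict) or not pdx.get("code"):
        problems.append("pdx.code is required")
    if "sex" in claim and claim["sex"] not in VALID_SEX:
        problems.append("sex must be 0, 1, or 2")
    # Optional fields fall back to the documented defaults.
    if claim.get("hospital_status", "NOT_EXEMPT") not in VALID_HOSPITAL_STATUS:
        problems.append("unknown hospital_status")
    if claim.get("tie_breaker", "CLINICAL_SIGNIFICANCE") not in VALID_TIE_BREAKER:
        problems.append("unknown tie_breaker")
    # POA indicators are optional, but must be Y/N/U/W when present.
    diagnoses = [pdx] if isinstance(pdx, dict) else []
    diagnoses += claim.get("sdx", [])
    for dx in diagnoses:
        poa = dx.get("poa")
        if poa is not None and poa not in VALID_POA:
            problems.append(f"invalid poa {poa!r} for code {dx.get('code')}")
    return problems


print(check_claim({"version": 431, "sex": 0, "pdx": {"code": "I5020", "poa": "Y"}}))  # []
```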

Hospital status

The hospital_status field controls how Hospital-Acquired Condition (HAC) processing is applied, per CMS rules:

Value          Behavior
"NOT_EXEMPT"   Standard HAC processing. Default.
"EXEMPT"       Hospital is exempt from POA reporting. No HAC/POA ungroupable conditions.
"UNKNOWN"      Stricter POA validation with specific ungroupable return codes.

Tie breaker

The tie_breaker field controls how the grouper resolves attribute matches when multiple secondary diagnoses could match the same DRG formula attribute. This determines which diagnosis "wins" during the marking phase.

Value                    Behavior
"CLINICAL_SIGNIFICANCE"  MCC diagnoses get first pick over CC, then by ICD code string. Default, matches CMS Java grouper.
"ALPHABETICAL"           Sort by ICD code string only, ignoring severity.

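The two tie-breaker orderings can be illustrated as sort keys. This is a sketch of the ordering rule described above, not the grouper's marking code; order_candidates, the severity ranks, and the severity labels in the sample data are assumptions made for illustration.

```python
# Lower rank gets first pick. Any severity other than MCC or CC sorts last.
SEVERITY_RANK = {"MCC": 0, "CC": 1}


def order_candidates(diagnoses, tie_breaker="CLINICAL_SIGNIFICANCE"):
    """Order secondary diagnoses competing for the same formula attribute."""
    if tie_breaker == "ALPHABETICAL":
        return sorted(diagnoses, key=lambda dx: dx["code"])
    if tie_breaker == "CLINICAL_SIGNIFICANCE":
        return sorted(
            diagnoses,
            key=lambda dx: (SEVERITY_RANK.get(dx["severity"], 2), dx["code"]),
        )
    raise ValueError(f"unknown tie_breaker: {tie_breaker}")


# Illustrative input; the severity labels are assumed for this sketch.
candidates = [
    {"code": "E1165", "severity": "CC"},
    {"code": "J9601", "severity": "MCC"},
    {"code": "A419", "severity": "MCC"},
]

print([dx["code"] for dx in order_candidates(candidates)])
# ['A419', 'J9601', 'E1165']
print([dx["code"] for dx in order_candidates(candidates, "ALPHABETICAL")])
# ['A419', 'E1165', 'J9601']
```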
Output format

{
    "initial_drg": 293,
    "final_drg": 293,
    "initial_mdc": 5,
    "final_mdc": 5,
    "initial_drg_description": "Heart Failure and Shock without CC/MCC",
    "final_drg_description": "Heart Failure and Shock without CC/MCC",
    "initial_mdc_description": "Diseases and Disorders of the Circulatory System",
    "final_mdc_description": "Diseases and Disorders of the Circulatory System",
    "return_code": "OK",
    "pdx_output": {
        "code": "I5020",
        "mdc": 5,
        "severity": "CC",
        "drg_impact": "BOTH",
        "poa_error": "POA_NOT_CHECKED",
        "flags": ["VALID", "MARKED_FOR_INITIAL", "MARKED_FOR_FINAL"]
    },
    "sdx_output": [...],
    "proc_output": [...],
    "conversions": []  # ICD version conversions (empty if source_icd_version not set)
}
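For reporting, the output can be reduced to a few fields. The sketch below uses only keys shown in the output above; summarize_grouping is a hypothetical helper. Reading a difference between initial_drg and final_drg as the effect of later pipeline stages (for example HAC processing) is an inference, not something the output states directly.

```python
def summarize_grouping(result):
    """Condense a grouper result into the fields most reports need."""
    return {
        "drg": result["final_drg"],
        "mdc": result["final_mdc"],
        "ok": result["return_code"] == "OK",
        "drg_changed": result["initial_drg"] != result["final_drg"],
    }


# Values taken from the sample output above.
sample_result = {
    "initial_drg": 293,
    "final_drg": 293,
    "initial_mdc": 5,
    "final_mdc": 5,
    "return_code": "OK",
}

print(summarize_grouping(sample_result))
# {'drg': 293, 'mdc': 5, 'ok': True, 'drg_changed': False}
```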

Supported DRG versions

Version  CMS Fiscal Year
400      FY 2023 (Oct 2022 – Apr 2023)
401      FY 2023 (Apr 2023 – Sep 2023)
410      FY 2024 (Oct 2023 – Apr 2024)
411      FY 2024 (Apr 2024 – Sep 2024)
420      FY 2025 (Oct 2024 – Apr 2025)
421      FY 2025 (Apr 2025 – Sep 2025)
430      FY 2026 (Oct 2025 – Apr 2026)
431      FY 2026 (Apr 2026 – Sep 2026)
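If claims carry a discharge date, the version can be derived from the table above. The following sketch assumes the mid-year release takes effect on April 1 (the table lists April in both halves, so the exact boundary is an assumption) and covers only the fiscal years listed. msdrg_version_for is a hypothetical helper, not a function of the msdrg package.

```python
from datetime import datetime

FIRST_FY, LAST_FY = 2023, 2026  # fiscal years covered by the table above


def msdrg_version_for(discharge_date: int) -> int:
    """Pick an MS-DRG version for a YYYYMMDD discharge date."""
    day = datetime.strptime(str(discharge_date), "%Y%m%d").date()
    # The federal fiscal year starts on October 1 of the previous calendar year.
    fiscal_year = day.year + 1 if day.month >= 10 else day.year
    if not FIRST_FY <= fiscal_year <= LAST_FY:
        raise ValueError(f"no supported MS-DRG version for {discharge_date}")
    base = 400 + 10 * (fiscal_year - FIRST_FY)
    # Assumption: the x.1 release applies from April 1 through September 30.
    return base + 1 if 4 <= day.month <= 9 else base


print(msdrg_version_for(20250101))  # 420
print(msdrg_version_for(20260415))  # 431
```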

Medicare Code Editor (MCE)

The MCE validates ICD diagnosis and procedure codes against CMS edit rules. It checks for sex conflicts, age conflicts, unacceptable principal diagnoses, E-codes as PDX, non-covered procedures, bilateral procedures, and more.

Input format

{
    "discharge_date": 20250101,  # YYYYMMDD integer (required for MCE)
    "icd_version": 10,           # 9 or 10 (default: 10)
    "age": 65,
    "sex": 0,                    # 0=Male, 1=Female, 2=Unknown
    "discharge_status": 1,
    "pdx": {"code": "I5020"},
    "admit_dx": {"code": "R0602"},
    "sdx": [{"code": "E1165"}],
    "procedures": [{"code": "02703DZ"}]
}

Output format

{
    "version": 20260930,
    "edit_type": "PREPAYMENT",    # NONE, PREPAYMENT, POSTPAYMENT, or BOTH
    "edits": [                    # List of triggered edits (empty if NONE)
        {
            "name": "E_CODE_AS_PDX",
            "count": 1,
            "code_type": "DIAGNOSIS",
            "edit_type": "PREPAYMENT"
        }
    ]
}
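A result with several edits is easier to act on when the edit names are grouped by edit type. This is a small sketch over the output shape shown above; edits_by_type is a hypothetical helper.

```python
def edits_by_type(result):
    """Group triggered edit names by their edit_type."""
    grouped = {}
    for edit in result.get("edits", []):
        grouped.setdefault(edit["edit_type"], []).append(edit["name"])
    return grouped


# Sample copied from the output format above.
mce_result = {
    "version": 20260930,
    "edit_type": "PREPAYMENT",
    "edits": [
        {
            "name": "E_CODE_AS_PDX",
            "count": 1,
            "code_type": "DIAGNOSIS",
            "edit_type": "PREPAYMENT",
        }
    ],
}

print(edits_by_type(mce_result))  # {'PREPAYMENT': ['E_CODE_AS_PDX']}
```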

Example — E-code as principal diagnosis

import msdrg

with msdrg.MceEditor() as mce:
    result = mce.edit({
        "discharge_date": 20250101,
        "age": 65, "sex": 0, "discharge_status": 1,
        "pdx": {"code": "V0001XA"},  # E-code
        "sdx": [], "procedures": []
    })

print(result["edit_type"])  # "PREPAYMENT"
print(result["edits"][0]["name"])  # "E_CODE_AS_PDX"

Supported edit types

The MCE detects ~35 edit types including:

  • INVALID_CODE — code not in CMS master for date range
  • SEX_CONFLICT — code restricted by patient sex
  • AGE_CONFLICT — code restricted by patient age
  • E_CODE_AS_PDX — E-code used as principal diagnosis
  • MANIFESTATION_AS_PDX — manifestation code used as PDX
  • UNACCEPTABLE_PDX — code unacceptable as principal diagnosis
  • NON_COVERED — procedure not covered by Medicare
  • BILATERAL — bilateral procedure without bilateral PDX
  • OPEN_BIOPSY — open biopsy without prior biopsy

MCE validation

The MCE implementation is validated against the CMS Java MCE 2.0 v43.1 with a 100% match rate on 50,000 test claims.

Architecture

┌──────────────────────────────────────────────────────────────┐
│  Python (msdrg)                                              │
│  ctypes ──► C API (c_api.zig, mce_c_api.zig)                │
│                │                                             │
│    ┌───────────┼─────────────────┐                           │
│    ▼           ▼                 ▼                           │
│  MS-DRG     MCE Editor     ICD-10 Converter                 │
│  Grouper    (MceComponent)  (ConversionData)                │
│    │           │                 │                           │
│    ▼           ▼                 ▼                           │
│  Chain of    Validation     Code Lookup:                    │
│  Links:      Pipeline:      Binary search                   │
│  Preprocess  Code Check     on sorted                       │
│  → Group     → Edit Rules   conversion                      │
│  → HAC       → Output       entries                         │
│  → Final DRG   Counts                                        │
│    │           │                 │                           │
│    ▼           ▼                 ▼                           │
│  Memory-mapped LMDB database (msdrg.mdb)                    │
└──────────────────────────────────────────────────────────────┘

Both engines share the same shared library and data files. The grouping pipeline is a chain of composable processors; the MCE is a linear validation pipeline. Both mirror the original Java architecture for validation purposes.
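The "chain of composable processors" idea can be sketched in a few lines of Python. This is a toy illustration of the pattern only: the link functions, their names, and the placeholder rule inside group are invented for the example and do not reproduce the Zig implementation or any real grouping logic.

```python
from typing import Callable, Dict, Iterable

Link = Callable[[Dict], Dict]


def run_chain(links: Iterable[Link], context: Dict) -> Dict:
    """Pass the context through each link in order and return the final context."""
    for link in links:
        context = link(context)
    return context


def preprocess(ctx: Dict) -> Dict:
    # Normalize codes before any matching happens.
    return {**ctx, "codes": [code.strip().upper() for code in ctx["codes"]]}


def group(ctx: Dict) -> Dict:
    # Placeholder rule for the sketch, not real DRG formula matching.
    return {**ctx, "initial_drg": 293 if "I5020" in ctx["codes"] else None}


def finalize(ctx: Dict) -> Dict:
    # A real chain would apply further steps here before the final DRG.
    return {**ctx, "final_drg": ctx["initial_drg"]}


result = run_chain([preprocess, group, finalize], {"codes": [" i5020 "]})
print(result["final_drg"])  # 293
```

Because every link takes and returns the same context shape, links can be added, removed, or reordered without changing the others.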

Building from source

Prerequisites

  • Zig 0.16+ (download, or install via a package manager)
  • Python 3.11+
  • uv (recommended) or pip

Setup

git clone https://github.com/Bedrock-Billing/mz-drg.git
cd mz-drg

# Create venv and install
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

This compiles the Zig shared library and bundles all data files into the Python package.

Run tests

# Zig unit tests (60+ tests)
cd zig_src && zig build test

# Python tests (MS-DRG + MCE)
python -m pytest tests/

Data pipeline

The binary data files are prebuilt and included in the monolithic data/msdrg.mdb database. To regenerate it from the raw CMS CSVs:

bash scripts/setup_data.sh

This runs extract → import → compile → zig build in sequence. See scripts/ for individual steps.

Comparison testing

The tests/ directory contains tools for validating mz-drg against the reference Java implementations.

# Generate random test claims
python tests/generate_test_claims.py --count 1000 --out tests/claims.json

# Compare MS-DRG grouper
python tests/compare_groupers.py --file tests/claims.json

# Compare MCE editor
python tests/compare_mce.py --file tests/claims.json

# Benchmark
python tests/compare_groupers.py --file tests/claims.json --benchmark

The Java comparisons require JDK 17+ and the reference JARs in jars/. This is only needed for validation — the Python package itself has no Java dependency.
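The core of a claim-by-claim comparison is a match rate over paired results. The sketch below shows that calculation on plain dicts. It is not the logic of compare_groupers.py; match_rate and the default field list are assumptions for illustration.

```python
def match_rate(reference, candidate, fields=("final_drg", "final_mdc")):
    """Compare paired results; return (match rate, indexes of mismatched claims)."""
    if len(reference) != len(candidate):
        raise ValueError("result lists must have the same length")
    mismatches = [
        index
        for index, (ref, cand) in enumerate(zip(reference, candidate))
        if any(ref.get(field) != cand.get(field) for field in fields)
    ]
    rate = 1.0 if not reference else 1 - len(mismatches) / len(reference)
    return rate, mismatches


# Made-up results for two claims; the second one disagrees on the DRG.
java_results = [{"final_drg": 293, "final_mdc": 5}, {"final_drg": 291, "final_mdc": 5}]
zig_results = [{"final_drg": 293, "final_mdc": 5}, {"final_drg": 292, "final_mdc": 5}]

print(match_rate(java_results, zig_results))  # (0.5, [1])
```

Returning the mismatched indexes alongside the rate makes it straightforward to pull the offending claims out of the input file for inspection.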

C API

mz-drg exposes a C ABI for integration with any language. A complete header is auto-generated at zig-out/include/msdrg.h after building.

JSON API (simple — single call)

#include "msdrg.h"

void* ctx = msdrg_context_init("/path/to/data");
const char* result = msdrg_group_json(ctx, "{\"version\":431,...}");
msdrg_string_free(result);
msdrg_context_free(ctx);

MCE Editor

#include "msdrg.h"

MceContext mce = mce_context_init("/path/to/data");
const char* result = mce_edit_json(mce, "{\"discharge_date\":20250101,...}");
msdrg_string_free(result);
mce_context_free(mce);

ICD-10 Code Conversion

#include "msdrg.h"

MsdrgContext ctx = msdrg_context_init("/path/to/data");

// Convert a diagnosis code (FY2025 → FY2026)
const char* converted = msdrg_convert_dx(ctx, "B880", 2025, 2026);
// converted = "B8801"

// Convert a procedure code
const char* pr_conv = msdrg_convert_pr(ctx, "02703DZ", 2025, 2026);

msdrg_string_free(converted);
msdrg_string_free(pr_conv);
msdrg_context_free(ctx);

Functions are thread-safe after initialization. The context is immutable and can be shared across threads.
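From Python, a shared context could be used from a thread pool along these lines. The thread-safety statement above is made for the C API; whether a single MsdrgGrouper instance may be called from several threads is an assumption to verify before relying on it. group_concurrently is a hypothetical helper that accepts any callable, so the sketch runs here with a stand-in function.

```python
from concurrent.futures import ThreadPoolExecutor


def group_concurrently(group_fn, claims, max_workers=4):
    """Apply group_fn to every claim on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(group_fn, claims))


# Stand-in for grouper.group so the sketch runs without the shared library.
def fake_group(claim):
    return {"age_seen": claim["age"]}


results = group_concurrently(fake_group, [{"age": age} for age in (40, 65, 80)])
print(results)  # [{'age_seen': 40}, {'age_seen': 65}, {'age_seen': 80}]
```

With the real package, group_fn would be the bound group method of an open grouper, provided the assumption above holds.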

Project structure

mz-drg/
├── msdrg/                       # Python package
│   ├── __init__.py
│   ├── grouper.py               # MsdrgGrouper class
│   ├── mce.py                   # MceEditor class
│   └── converter.py             # IcdConverter class
├── zig_src/                     # Zig source
│   ├── build.zig
│   ├── main.zig
│   └── src/
│       ├── c_api.zig            # MS-DRG C ABI exports
│       ├── json_api.zig         # MS-DRG JSON in/out
│       ├── msdrg.zig            # GrouperChain + version routing
│       ├── chain.zig            # Composable processor chain
│       ├── models.zig           # Data models
│       ├── preprocess.zig       # Exclusion & attribute handling
│       ├── grouping.zig         # DRG formula matching
│       ├── marking.zig          # Code marking logic
│       ├── hac.zig              # Hospital-Acquired Conditions
│       ├── conversion.zig       # ICD-10 code conversion
│       ├── mce.zig              # MCE main editor
│       ├── mce_c_api.zig        # MCE C ABI exports
│       ├── mce_json_api.zig     # MCE JSON in/out
│       ├── mce_data.zig         # MCE data loading
│       ├── mce_enums.zig        # MCE attributes & edits
│       ├── mce_editing.zig      # MCE edit rules
│       └── mce_validation.zig   # MCE validation logic
├── data/                        # Consolidated LMDB database (msdrg.mdb)
├── scripts/                     # Data extraction & compilation
│   ├── compile_icd_conversions.py  # ICD conversion table compiler
│   └── ...
├── tests/                       # Tests & comparison tools
│   ├── example.py               # All-components example
│   └── ...
├── pyproject.toml
└── setup.py

License

MIT — see LICENSE.

Documentation

Full documentation is available at Bedrock-Billing.github.io/mz-drg.

Acknowledgments

This project is intended for healthcare IT professionals who need fast, embeddable, and auditable claim processing tools.
