High-performance MS-DRG (Medicare Severity Diagnosis Related Groups) grouper

mz-drg

A high-performance MS-DRG grouper written in Zig with Python bindings.


mz-drg is an open-source reimplementation of the CMS MS-DRG (Medicare Severity Diagnosis Related Groups) classification engine, written in Zig and callable from Python. It takes patient claim data — diagnoses, procedures, demographics — and assigns the appropriate DRG, MDC, severity, and return codes.

It has been validated on 50,000+ claims against the reference Java grouper, with a 100% match rate.

Why mz-drg?

The official CMS MS-DRG grouper is a Java application. While accurate, it comes with practical limitations:

| | Java Grouper | mz-drg |
|---|---|---|
| Startup | JVM warmup, seconds | Instant |
| Throughput (tested on a Ryzen 5 5600U laptop) | ~500 claims/sec | ~7,000+ claims/sec |
| Memory | JVM heap overhead | Minimal, memory-mapped data |
| Dependencies | JRE 17+, classpath management | Single shared library |
| Python integration | JPype bridge (fragile) | Native ctypes (simple) |
| Embedding | Requires JVM process | C ABI, any language |

mz-drg is not a black-box reimplementation. The grouping logic — preprocessing, exclusion handling, diagnosis clustering, severity assignment, formula evaluation, rerouting, marking, and final grouping — is ported line-by-line from the decompiled Java source and validated claim-by-claim against the original.

Quick start

Install

pip install msdrg

Use

import msdrg

with msdrg.MsdrgGrouper() as grouper:
    result = grouper.group({
        "version": 431,
        "age": 65,
        "sex": 0,
        "discharge_status": 1,
        "pdx": {"code": "I5020"},
        "sdx": [{"code": "E1165"}],
        "procedures": []
    })

print(result["final_drg"])            # 293
print(result["final_mdc"])            # 5
print(result["final_drg_description"])  # "Heart Failure and Shock without CC/MCC"
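When there are many claims, one grouper instance can be reused for every call. The sketch below is a hypothetical helper (`group_many` is not part of msdrg). It relies only on what is shown above: a `group()` method that takes a claim dict and returns a dict with a `return_code` field. Treating `"OK"` as the only clean outcome is an assumption.

```python
def group_many(grouper, claims):
    """Group each claim and split the results by return code.

    `grouper` is any object with a group(dict) -> dict method, such as the
    MsdrgGrouper shown above. Illustrative helper, not part of msdrg.
    """
    grouped, flagged = [], []
    for index, claim in enumerate(claims):
        result = grouper.group(claim)
        # Assumption: "OK" is the only return code that means a clean grouping.
        if result.get("return_code") == "OK":
            grouped.append((index, result))
        else:
            flagged.append((index, result))
    return grouped, flagged
```

Inside a `with msdrg.MsdrgGrouper() as grouper:` block this would be called as `grouped, flagged = group_many(grouper, claims)`. Keeping the original index next to each result makes it easy to trace a flagged result back to its input claim.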

Helper function

import msdrg

claim = msdrg.create_claim(
    version=431,
    age=65,
    sex=0,
    discharge_status=1,
    pdx="I5020",
    sdx=["E1165", "I10"],
    procedures=["02703DZ"],
)

with msdrg.MsdrgGrouper() as g:
    result = g.group(claim)
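To make the relationship between the flat arguments and the nested input dict concrete, here is a minimal sketch of how such a helper could expand code strings into the layout used in the first example. `build_claim` is a hypothetical stand-in written for illustration; the actual `msdrg.create_claim` may differ, for example in how it handles POA indicators.

```python
def build_claim(version, age, sex, discharge_status, pdx, sdx=(), procedures=()):
    """Expand flat code strings into the nested dict layout passed to group().

    Hypothetical stand-in for illustration, not msdrg.create_claim itself.
    """
    return {
        "version": version,
        "age": age,
        "sex": sex,
        "discharge_status": discharge_status,
        "pdx": {"code": pdx},                       # principal diagnosis
        "sdx": [{"code": code} for code in sdx],    # secondary diagnoses
        "procedures": [{"code": code} for code in procedures],
    }
```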

Input format

The group() method accepts a dictionary:

{
    "version": 431,              # MS-DRG version (e.g. 400, 410, 421, 431)
    "age": 65,                   # Patient age in years
    "sex": 0,                    # 0=Male, 1=Female, 2=Unknown
    "discharge_status": 1,       # 1=Home/Self Care, 20=Died
    "hospital_status": "NOT_EXEMPT",  # "NOT_EXEMPT" (default), "EXEMPT", or "UNKNOWN"
    "pdx": {                     # Principal diagnosis (required)
        "code": "I5020",
        "poa": "Y"               # Present on Admission: Y/N/U/W (optional)
    },
    "admit_dx": {                # Admission diagnosis (optional)
        "code": "R0602"
    },
    "sdx": [                     # Secondary diagnoses (optional)
        {"code": "E1165", "poa": "Y"},
        {"code": "I10", "poa": "Y"}
    ],
    "procedures": [              # Procedure codes (optional)
        {"code": "02703DZ"}
    ]
}
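A light pre-flight check can catch malformed input before it reaches the grouper. The following is a sketch under stated assumptions: `check_claim` is a hypothetical helper, the allowed values are taken from the comments in the dictionary above, and treating `version` as required is an assumption.

```python
VALID_SEX = {0, 1, 2}
VALID_POA = {"Y", "N", "U", "W"}
VALID_HOSPITAL_STATUS = {"NOT_EXEMPT", "EXEMPT", "UNKNOWN"}


def check_claim(claim):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    # Assumption: version is required and is an integer such as 431.
    if not isinstance(claim.get("version"), int):
        problems.append("version must be an integer such as 431")
    if "sex" in claim and claim["sex"] not in VALID_SEX:
        problems.append("sex must be 0, 1, or 2")
    pdx = claim.get("pdx")
    if not isinstance(pdx, dict) or not pdx.get("code"):
        problems.append("pdx with a code is required")
    status = claim.get("hospital_status", "NOT_EXEMPT")
    if status not in VALID_HOSPITAL_STATUS:
        problems.append(f"unknown hospital_status: {status!r}")
    # POA is optional, so only entries that carry one are checked.
    diagnoses = [pdx] if isinstance(pdx, dict) else []
    diagnoses += claim.get("sdx", [])
    for dx in diagnoses:
        poa = dx.get("poa")
        if poa is not None and poa not in VALID_POA:
            problems.append(f"invalid poa {poa!r} on {dx.get('code')}")
    return problems
```

The grouper has its own handling for invalid POA values as part of HAC processing, so whether to reject such claims up front or pass them through is a decision for the caller.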

Hospital status

The hospital_status field controls how Hospital-Acquired Condition (HAC) processing is applied, per CMS rules:

| Value | Behavior |
|---|---|
| "NOT_EXEMPT" (default) | Standard HAC processing. Codes with invalid POA on HAC-eligible diagnoses may mark the claim ungroupable. Claims meeting HAC criteria may have DRG assignment impacted. |
| "EXEMPT" | Hospital is exempt from POA reporting. All HACs are set to HAC_NOT_APPLICABLE_EXEMPT with POA error HOSPITAL_EXEMPT. No ungroupable conditions from HAC/POA. |
| "UNKNOWN" | Stricter POA validation. Multiple codes with non-Y/W POA, or individual codes with N/U or invalid POA, trigger specific ungroupable return codes. |

This is a per-request setting — each call to group() can use a different hospital_status without reconfiguring the grouper.
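Because the setting travels with each request, the same claim can be grouped under all three values with one grouper instance. A minimal sketch, assuming only the `group()` interface described above (`group_with_each_status` is a hypothetical helper):

```python
HOSPITAL_STATUSES = ("NOT_EXEMPT", "EXEMPT", "UNKNOWN")


def group_with_each_status(grouper, claim):
    """Group one claim under every hospital_status and key results by status."""
    results = {}
    for status in HOSPITAL_STATUSES:
        # Build a shallow copy so the caller's claim dict is left unchanged.
        request = dict(claim, hospital_status=status)
        results[status] = grouper.group(request)
    return results
```

Comparing `results["NOT_EXEMPT"]["final_drg"]` with `results["EXEMPT"]["final_drg"]` is one way to see whether HAC processing affected a particular claim.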

Output format

group() returns a dictionary:

{
    "initial_drg": 293,
    "final_drg": 293,
    "initial_mdc": 5,
    "final_mdc": 5,
    "initial_drg_description": "Heart Failure and Shock without CC/MCC",
    "final_drg_description": "Heart Failure and Shock without CC/MCC",
    "initial_mdc_description": "Diseases and Disorders of the Circulatory System",
    "final_mdc_description": "Diseases and Disorders of the Circulatory System",
    "return_code": "OK",
    "pdx_output": {
        "code": "I5020",
        "mdc": 5,
        "severity": "CC",
        "drg_impact": "BOTH",
        "poa_error": "POA_NOT_CHECKED",
        "flags": ["VALID", "MARKED_FOR_INITIAL", "MARKED_FOR_FINAL"]
    },
    "sdx_output": [...],
    "proc_output": [...]
}
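Downstream code usually needs only a few of these fields. The sketch below pulls them into a flat summary; `summarize` is a hypothetical helper and uses only the keys shown in the example output above.

```python
def summarize(result):
    """Reduce a group() result to the fields most callers act on."""
    pdx = result.get("pdx_output", {})
    return {
        "drg": result["final_drg"],
        "mdc": result["final_mdc"],
        "description": result["final_drg_description"],
        "ok": result["return_code"] == "OK",
        # True when stages after the initial grouping changed the assignment.
        "drg_changed": result["initial_drg"] != result["final_drg"],
        "pdx_flags": list(pdx.get("flags", [])),
    }
```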

Supported DRG versions

| Version | CMS Fiscal Year |
|---|---|
| 400 | FY 2023 (Oct 2022 – Apr 2023) |
| 401 | FY 2023 (Apr 2023 – Sep 2023) |
| 410 | FY 2024 (Oct 2023 – Apr 2024) |
| 411 | FY 2024 (Apr 2024 – Sep 2024) |
| 420 | FY 2025 (Oct 2024 – Apr 2025) |
| 421 | FY 2025 (Apr 2025 – Sep 2025) |
| 430 | FY 2026 (Oct 2025 – Apr 2026) |
| 431 | FY 2026 (Apr 2026 – Sep 2026) |

Pass the version number in the claim's version field.
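If claims carry a discharge date rather than a version number, the table can be turned into a small lookup. This is a sketch, not part of msdrg: `version_for_discharge` is a hypothetical helper, and it assumes the switchover dates are 1 October and 1 April, which the table's month ranges suggest but do not state exactly.

```python
from datetime import date


def version_for_discharge(discharge: date) -> int:
    """Map a discharge date to a version number from the table above."""
    # CMS fiscal year N runs from October of year N-1 through September of N.
    fiscal_year = discharge.year + 1 if discharge.month >= 10 else discharge.year
    if not 2023 <= fiscal_year <= 2026:
        raise ValueError(f"no supported version for discharge date {discharge}")
    # Assumption: the mid-year release applies from 1 April through 30 September.
    mid_year = 4 <= discharge.month <= 9
    return 400 + 10 * (fiscal_year - 2023) + (1 if mid_year else 0)
```

For example, `version_for_discharge(date(2025, 5, 12))` falls in FY 2025 after the April update and returns 421.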

Architecture

┌─────────────────────────────────────────────────┐
│  Python (msdrg)                                 │
│  ctypes ──► C API (c_api.zig)                   │
│                │                                │
│                ▼                                │
│  GrouperChain  (data loader + version router)   │
│       │                                         │
│       ▼                                         │
│  Chain of Links:                                │
│  ┌──────────────────────────────────────────┐   │
│  │ Preprocess  →  Exclusions  →  Grouping   │   │
│  │    ↓              ↓           (initial)  │   │
│  │ Attributes    Cluster Map        ↓       │   │
│  │                              Marking     │   │
│  │                                 ↓        │   │
│  │              HAC Processing ◄────────    │   │
│  │              (EXEMPT/NOT_EXEMPT/UNKNOWN) │   │
│  │                                 ↓        │   │
│  │                           Grouping       │   │
│  │                           (final)        │   │
│  │                                 ↓        │   │
│  │                           Final DRG      │   │
│  └──────────────────────────────────────────┘   │
│       │                                         │
│       ▼                                         │
│  Memory-mapped binary data (16 .bin files)      │
└─────────────────────────────────────────────────┘

The grouper loads 16 precompiled binary data files at startup (diagnosis definitions, DRG formulas, cluster maps, exclusion groups, etc.) via memory mapping. The grouping pipeline is a chain of composable processors, each transforming the claim context. This design mirrors the original Java architecture for validation purposes.
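The chain-of-links idea can be illustrated in a few lines of Python. This is a toy model of the pattern only: the stage names follow the diagram above, but the `Context` fields and the placeholder links are assumptions, and none of the real grouping logic from the Zig source is represented.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Context:
    """State handed from one link to the next (fields are illustrative)."""
    claim: dict
    trace: List[str] = field(default_factory=list)


Link = Callable[[Context], Context]


def stage(name: str) -> Link:
    """Build a placeholder link that only records that it ran."""
    def link(ctx: Context) -> Context:
        ctx.trace.append(name)
        return ctx
    return link


def run_chain(links: List[Link], ctx: Context) -> Context:
    """Pass the context through each link in order."""
    for link in links:
        ctx = link(ctx)
    return ctx


# Stage order taken from the architecture diagram.
PIPELINE = [stage(name) for name in (
    "preprocess", "exclusions", "initial_grouping",
    "marking", "hac_processing", "final_grouping",
)]

final = run_chain(PIPELINE, Context(claim={"version": 431}))
print(" -> ".join(final.trace))
```

Because every link has the same signature, a version-specific pipeline is just a different list of links, which is what makes a per-version router straightforward to build.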

Building from source

Prerequisites

  • Zig 0.16+ (download, or install via a package manager)
  • Python 3.11+
  • uv (recommended) or pip

Setup

git clone https://github.com/Bedrock-Billing/mz-drg.git
cd mz-drg

# Create venv and install
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

This compiles the Zig shared library and bundles the data files into the Python package.

Run tests

# Zig unit tests (27+ tests)
cd zig_src && zig build test

# Python tests
python -m pytest tests/test_grouper.py

Data pipeline

The binary data files (data/bin/*.bin) are prebuilt and included in the repository. To regenerate them from the raw CMS CSVs:

bash scripts/setup_data.sh

This runs extract → import → compile → zig build in sequence. See scripts/ for individual steps.

Comparison testing

The tests/ directory contains tools for validating mz-drg against the reference Java grouper.

# Generate random test claims
python tests/generate_test_claims.py --count 1000 --out tests/claims.json

# Compare Java vs Zig output
python tests/compare_groupers.py --file tests/claims.json

# Benchmark both
python tests/compare_groupers.py --file tests/claims.json --benchmark

The Java comparison requires JDK 17+ and the reference JARs in jars/. This is only needed for validation — the Python package itself has no Java dependency.
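The core of a claim-by-claim comparison is a field-level diff of two result lists. The sketch below is not the repository's `compare_groupers.py`; `find_mismatches` and `match_rate` are hypothetical helpers, and they assume the Java output has already been normalized into dicts with the same keys as the `group()` result.

```python
def find_mismatches(reference, candidate,
                    fields=("final_drg", "final_mdc", "return_code")):
    """Return (index, {field: (reference_value, candidate_value)}) per mismatch."""
    if len(reference) != len(candidate):
        raise ValueError("result lists must have the same length")
    mismatches = []
    for index, (ref, cand) in enumerate(zip(reference, candidate)):
        diff = {
            name: (ref.get(name), cand.get(name))
            for name in fields
            if ref.get(name) != cand.get(name)
        }
        if diff:
            mismatches.append((index, diff))
    return mismatches


def match_rate(reference, candidate):
    """Fraction of claims whose compared fields all agree (1.0 if empty)."""
    if not reference:
        return 1.0
    return 1 - len(find_mismatches(reference, candidate)) / len(reference)
```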

C API

mz-drg exposes a C ABI for integration with any language. See zig_src/src/c_api.zig for the full API.

// Initialize (loads all data, pre-builds chains)
void* ctx = msdrg_context_init("/path/to/data/bin");

// Group a claim via JSON
const char* result_json = msdrg_group_json(ctx, "{\"version\":431,...}");

// Free
msdrg_string_free(result_json);
msdrg_context_free(ctx);
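The same four calls can be reached from Python with ctypes. The sketch below is an assumption-heavy illustration: the argument and return types are inferred from the C snippet above rather than taken from `c_api.zig`, and `bind_msdrg` and `group_via_c_api` are hypothetical names. The `lib` argument would be the object returned by `ctypes.CDLL` for the shared library.

```python
import ctypes
import json


def bind_msdrg(lib):
    """Declare argument and return types for the four functions shown above."""
    lib.msdrg_context_init.argtypes = [ctypes.c_char_p]
    lib.msdrg_context_init.restype = ctypes.c_void_p
    lib.msdrg_group_json.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
    # c_void_p rather than c_char_p: ctypes converts a c_char_p result to
    # bytes and drops the pointer, which would make it impossible to free.
    lib.msdrg_group_json.restype = ctypes.c_void_p
    lib.msdrg_string_free.argtypes = [ctypes.c_void_p]
    lib.msdrg_string_free.restype = None
    lib.msdrg_context_free.argtypes = [ctypes.c_void_p]
    lib.msdrg_context_free.restype = None
    return lib


def group_via_c_api(lib, ctx, claim):
    """Send one claim as JSON, parse the reply, and always free the string."""
    ptr = lib.msdrg_group_json(ctx, json.dumps(claim).encode("utf-8"))
    if not ptr:
        raise RuntimeError("msdrg_group_json returned NULL")
    try:
        return json.loads(ctypes.string_at(ptr).decode("utf-8"))
    finally:
        lib.msdrg_string_free(ptr)
```

The `try`/`finally` ensures the result string is released even if JSON parsing fails, mirroring the explicit free in the C example.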

Functions are thread-safe after initialization. The context is immutable and can be shared across threads.
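On the Python side, a shared context suggests that one grouper could serve several worker threads. The sketch below assumes that the `MsdrgGrouper` wrapper is as safe to share as the underlying C context, which the text above states only for the C functions. `group_concurrently` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor


def group_concurrently(grouper, claims, max_workers=4):
    """Group claims on a thread pool, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which worker finishes first.
        return list(pool.map(grouper.group, claims))
```

Whether this improves throughput depends on how much time each call spends in native code compared with Python-side JSON handling, so it is worth measuring on real data before relying on it.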

Project structure

mz-drg/
├── msdrg/                    # Python package
│   ├── __init__.py
│   └── grouper.py            # MsdrgGrouper class
├── zig_src/                  # Zig source
│   ├── build.zig
│   ├── main.zig
│   └── src/
│       ├── c_api.zig         # C ABI exports
│       ├── json_api.zig      # JSON in/out
│       ├── msdrg.zig         # GrouperChain + version routing
│       ├── chain.zig         # Composable processor chain
│       ├── models.zig        # Data models
│       ├── preprocess.zig    # Exclusion & attribute handling
│       ├── grouping.zig      # DRG formula matching
│       ├── marking.zig       # Code marking logic
│       ├── hac.zig           # Hospital-Acquired Conditions
│       └── ...               # 20+ modules, ~8,500 lines
├── data/bin/                 # Prebuilt binary data (16 files)
├── scripts/                  # Data extraction & compilation
├── tests/                    # Comparison & benchmark tools
├── python_client/            # Legacy Python wrapper
├── pyproject.toml
└── setup.py

License

MIT — see LICENSE.

Acknowledgments

This project is intended for healthcare IT professionals who need a fast, embeddable, and auditable DRG classification engine.
