High-performance MS-DRG (Medicare Severity Diagnosis Related Groups) grouper
mz-drg
A high-performance MS-DRG grouper written in Zig with Python bindings.
mz-drg is an open-source reimplementation of the CMS MS-DRG (Medicare Severity Diagnosis Related Groups) classification engine, written in Zig and callable from Python. It takes patient claim data — diagnoses, procedures, demographics — and assigns the appropriate DRG, MDC, severity, and return codes.
It has been validated on more than 50,000 claims against the reference Java grouper, with a 100% match rate.
Why mz-drg?
The official CMS MS-DRG grouper is a Java application. While accurate, it comes with practical limitations:
| | Java Grouper | mz-drg |
|---|---|---|
| Startup | JVM warmup, seconds | Instant |
| Throughput (tested on a Ryzen 5 5600U laptop) | ~500 claims/sec | ~7,000+ claims/sec |
| Memory | JVM heap overhead | Minimal, memory-mapped data |
| Dependencies | JRE 17+, classpath management | Single shared library |
| Python integration | JPype bridge (fragile) | Native ctypes (simple) |
| Embedding | Requires JVM process | C ABI, any language |
mz-drg is not a black-box reimplementation. The grouping logic — preprocessing, exclusion handling, diagnosis clustering, severity assignment, formula evaluation, rerouting, marking, and final grouping — is ported line-by-line from the decompiled Java source and validated claim-by-claim against the original.
Quick start
Install
pip install msdrg
Use
import msdrg
with msdrg.MsdrgGrouper() as grouper:
    result = grouper.group({
        "version": 431,
        "age": 65,
        "sex": 0,
        "discharge_status": 1,
        "pdx": {"code": "I5020"},
        "sdx": [{"code": "E1165"}],
        "procedures": []
    })
print(result["final_drg"]) # 293
print(result["final_mdc"]) # 5
print(result["final_drg_description"]) # "Heart Failure and Shock without CC/MCC"
Helper function
import msdrg
claim = msdrg.create_claim(
    version=431,
    age=65,
    sex=0,
    discharge_status=1,
    pdx="I5020",
    sdx=["E1165", "I10"],
    procedures=["02703DZ"],
)
with msdrg.MsdrgGrouper() as g:
    result = g.group(claim)
Input format
The group() method accepts a dictionary:
{
    "version": 431,                   # MS-DRG version (e.g. 400, 410, 421, 431)
    "age": 65,                        # Patient age in years
    "sex": 0,                         # 0=Male, 1=Female, 2=Unknown
    "discharge_status": 1,            # 1=Home/Self Care, 20=Died
    "hospital_status": "NOT_EXEMPT",  # "NOT_EXEMPT" (default), "EXEMPT", or "UNKNOWN"
    "pdx": {                          # Principal diagnosis (required)
        "code": "I5020",
        "poa": "Y"                    # Present on Admission: Y/N/U/W (optional)
    },
    "admit_dx": {                     # Admission diagnosis (optional)
        "code": "R0602"
    },
    "sdx": [                          # Secondary diagnoses (optional)
        {"code": "E1165", "poa": "Y"},
        {"code": "I10", "poa": "Y"}
    ],
    "procedures": [                   # Procedure codes (optional)
        {"code": "02703DZ"}
    ]
}
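Before handing a dictionary to group(), it can help to lint it against the fields documented above. The following is a minimal sketch and not part of the msdrg package: `check_claim` and `_check_code` are hypothetical helpers, and treating only `pdx` as required is an assumption drawn from the comments in the example. It also flags POA values outside Y/N/U/W, which the grouper itself may accept and report through return codes instead.

```python
# Hypothetical pre-flight lint for a claim dict; not part of msdrg.
VALID_SEX = (0, 1, 2)                       # 0=Male, 1=Female, 2=Unknown
VALID_POA = ("Y", "N", "U", "W")
VALID_HOSPITAL_STATUS = ("NOT_EXEMPT", "EXEMPT", "UNKNOWN")


def _check_code(entry, where, problems, allow_poa=True):
    """Check one {"code": ..., "poa": ...} entry and record any problems."""
    if not isinstance(entry, dict) or not isinstance(entry.get("code"), str):
        problems.append(f"{where}: expected a dict with a string 'code'")
        return
    poa = entry.get("poa")
    if allow_poa and poa is not None and poa not in VALID_POA:
        problems.append(f"{where}: poa must be one of Y/N/U/W")


def check_claim(claim):
    """Return a list of readable problems; an empty list means none found."""
    problems = []
    if "pdx" not in claim:
        problems.append("pdx: principal diagnosis is required")
    else:
        _check_code(claim["pdx"], "pdx", problems)
    if "sex" in claim and claim["sex"] not in VALID_SEX:
        problems.append("sex: must be 0, 1, or 2")
    # The documented default is NOT_EXEMPT when the field is omitted.
    if claim.get("hospital_status", "NOT_EXEMPT") not in VALID_HOSPITAL_STATUS:
        problems.append("hospital_status: unknown value")
    for i, entry in enumerate(claim.get("sdx", [])):
        _check_code(entry, f"sdx[{i}]", problems)
    for i, entry in enumerate(claim.get("procedures", [])):
        _check_code(entry, f"procedures[{i}]", problems, allow_poa=False)
    return problems
```

An empty list means the claim passed this lint; it does not guarantee that the grouper will assign a DRG.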
Hospital status
The hospital_status field controls how Hospital-Acquired Condition (HAC) processing is applied, per CMS rules:
| Value | Behavior |
|---|---|
| "NOT_EXEMPT" | Standard HAC processing. Codes with invalid POA on HAC-eligible diagnoses may mark the claim ungroupable. Claims meeting HAC criteria may have DRG assignment impacted. Default. |
| "EXEMPT" | Hospital is exempt from POA reporting. All HACs are set to HAC_NOT_APPLICABLE_EXEMPT with POA error HOSPITAL_EXEMPT. No ungroupable conditions from HAC/POA. |
| "UNKNOWN" | Stricter POA validation. Multiple codes with non-Y/W POA, or individual codes with N/U or invalid POA, trigger specific ungroupable return codes. |
This is a per-request setting — each call to group() can use a different hospital_status without reconfiguring the grouper.
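Because the setting travels with each request, one way to see its effect is to group the same claim under all three values. This is a small sketch: `group_under_each_status` is a hypothetical helper, and it only assumes that the grouper object has the group() method described in this document.

```python
# Sketch: run one claim under each documented hospital_status value.
HOSPITAL_STATUSES = ("NOT_EXEMPT", "EXEMPT", "UNKNOWN")


def group_under_each_status(grouper, claim):
    """Return {status: result} for the same claim under every hospital_status.

    `grouper` is any object with a group(dict) method, such as an open
    msdrg.MsdrgGrouper. The input claim is copied, never modified.
    """
    results = {}
    for status in HOSPITAL_STATUSES:
        request = {**claim, "hospital_status": status}  # shallow copy per call
        results[status] = grouper.group(request)
    return results
```

Inside a `with msdrg.MsdrgGrouper() as g:` block, calling `group_under_each_status(g, claim)` would then give three results whose `return_code` and `final_drg` fields can be compared side by side.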
Output format
group() returns a dictionary:
{
    "initial_drg": 293,
    "final_drg": 293,
    "initial_mdc": 5,
    "final_mdc": 5,
    "initial_drg_description": "Heart Failure and Shock without CC/MCC",
    "final_drg_description": "Heart Failure and Shock without CC/MCC",
    "initial_mdc_description": "Diseases and Disorders of the Circulatory System",
    "final_mdc_description": "Diseases and Disorders of the Circulatory System",
    "return_code": "OK",
    "pdx_output": {
        "code": "I5020",
        "mdc": 5,
        "severity": "CC",
        "drg_impact": "BOTH",
        "poa_error": "POA_NOT_CHECKED",
        "flags": ["VALID", "MARKED_FOR_INITIAL", "MARKED_FOR_FINAL"]
    },
    "sdx_output": [...],
    "proc_output": [...]
}
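As an illustration of reading this dictionary, the sketch below builds a one-line summary from the top-level fields and checks a flag on the principal diagnosis. `summarize_result` and `pdx_marked_for_final` are hypothetical helpers, not part of msdrg; they use only the keys shown in the example above.

```python
def summarize_result(result):
    """Build a one-line summary from the documented top-level output fields."""
    line = (
        f"DRG {result['final_drg']} ({result['final_drg_description']}), "
        f"MDC {result['final_mdc']}, return code {result['return_code']}"
    )
    # Mention the initial DRG only when it differs from the final one.
    if result["initial_drg"] != result["final_drg"]:
        line += f" [initial DRG was {result['initial_drg']}]"
    return line


def pdx_marked_for_final(result):
    """True if the principal diagnosis carries the MARKED_FOR_FINAL flag."""
    return "MARKED_FOR_FINAL" in result["pdx_output"]["flags"]
```

For the sample output above, `summarize_result` would produce "DRG 293 (Heart Failure and Shock without CC/MCC), MDC 5, return code OK".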
Supported DRG versions
| Version | CMS Fiscal Year |
|---|---|
| 400 | FY 2023 (Oct 2022 – Apr 2023) |
| 401 | FY 2023 (Apr 2023 – Sep 2023) |
| 410 | FY 2024 (Oct 2023 – Apr 2024) |
| 411 | FY 2024 (Apr 2024 – Sep 2024) |
| 420 | FY 2025 (Oct 2024 – Apr 2025) |
| 421 | FY 2025 (Apr 2025 – Sep 2025) |
| 430 | FY 2026 (Oct 2025 – Apr 2026) |
| 431 | FY 2026 (Apr 2026 – Sep 2026) |
Pass the version number in the claim's version field.
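If claims carry a discharge date rather than a version number, the table can be turned into a lookup. This is a sketch under stated assumptions: `version_for_discharge` is a hypothetical helper, and the table gives only months, so the exact effective dates of October 1 and April 1 are assumed here.

```python
from bisect import bisect_right
from datetime import date

# (assumed effective date, version) pairs taken from the table above.
_EFFECTIVE = [
    (date(2022, 10, 1), 400),
    (date(2023, 4, 1), 401),
    (date(2023, 10, 1), 410),
    (date(2024, 4, 1), 411),
    (date(2024, 10, 1), 420),
    (date(2025, 4, 1), 421),
    (date(2025, 10, 1), 430),
    (date(2026, 4, 1), 431),
]
_STARTS = [start for start, _ in _EFFECTIVE]
_END = date(2026, 10, 1)  # first day after the last listed period


def version_for_discharge(discharge_date):
    """Return the version whose period contains discharge_date."""
    if discharge_date < _STARTS[0] or discharge_date >= _END:
        raise ValueError(f"no supported version for {discharge_date}")
    # bisect_right finds the first start date after discharge_date;
    # the entry just before it is the period in effect.
    index = bisect_right(_STARTS, discharge_date) - 1
    return _EFFECTIVE[index][1]
```

Raising on out-of-range dates, rather than falling back to the nearest version, keeps a claim from being silently grouped under rules that were not in effect.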
Architecture
┌─────────────────────────────────────────────────┐
│  Python (msdrg)                                 │
│  ctypes ──► C API (c_api.zig)                   │
│                    │                            │
│                    ▼                            │
│  GrouperChain (data loader + version router)    │
│                    │                            │
│                    ▼                            │
│  Chain of Links:                                │
│  ┌──────────────────────────────────────────┐   │
│  │ Preprocess → Exclusions → Grouping       │   │
│  │     ↓            ↓        (initial)      │   │
│  │ Attributes  Cluster Map       ↓          │   │
│  │                            Marking       │   │
│  │                               ↓          │   │
│  │        HAC Processing ◄────────          │   │
│  │ (EXEMPT/NOT_EXEMPT/UNKNOWN)              │   │
│  │              ↓                           │   │
│  │           Grouping                       │   │
│  │           (final)                        │   │
│  │              ↓                           │   │
│  │          Final DRG                       │   │
│  └──────────────────────────────────────────┘   │
│                    │                            │
│                    ▼                            │
│  Memory-mapped binary data (16 .bin files)      │
└─────────────────────────────────────────────────┘
The grouper loads 16 precompiled binary data files at startup (diagnosis definitions, DRG formulas, cluster maps, exclusion groups, etc.) via memory mapping. The grouping pipeline is a chain of composable processors, each transforming the claim context. This design mirrors the original Java architecture for validation purposes.
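The "chain of composable processors" idea can be illustrated in a few lines of Python. This is a toy sketch only: `ClaimContext`, `build_chain`, and the two links are hypothetical names, and the transformations are placeholders rather than the grouper's actual preprocessing or grouping logic, which lives in the Zig source.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ClaimContext:
    """Toy stand-in for the claim context passed between links."""
    codes: List[str]
    trace: List[str] = field(default_factory=list)


Link = Callable[[ClaimContext], ClaimContext]


def build_chain(*links: Link) -> Link:
    """Compose links left to right into a single callable."""
    def run(ctx: ClaimContext) -> ClaimContext:
        for link in links:
            ctx = link(ctx)
        return ctx
    return run


def normalize(ctx: ClaimContext) -> ClaimContext:
    """Toy link: strip dots and upper-case each code."""
    codes = [c.replace(".", "").upper() for c in ctx.codes]
    return ClaimContext(codes, ctx.trace + ["normalize"])


def drop_duplicates(ctx: ClaimContext) -> ClaimContext:
    """Toy link: keep the first occurrence of each code, preserving order."""
    unique = list(dict.fromkeys(ctx.codes))
    return ClaimContext(unique, ctx.trace + ["drop_duplicates"])


pipeline = build_chain(normalize, drop_duplicates)
out = pipeline(ClaimContext(["i50.20", "E11.65", "I5020"]))
print(out.codes)  # ['I5020', 'E1165']
print(out.trace)  # ['normalize', 'drop_duplicates']
```

Each link takes a context and returns a new one, so links can be reordered, tested in isolation, or swapped per version without touching the others.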
Building from source
Prerequisites
- Zig 0.16+ — download or via package manager
- Python 3.11+
- uv (recommended) or pip
Setup
git clone https://github.com/Bedrock-Billing/mz-drg.git
cd mz-drg
# Create venv and install
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
This compiles the Zig shared library and bundles the data files into the Python package.
Run tests
# Zig unit tests (27+ tests)
cd zig_src && zig build test
# Python tests
python -m pytest tests/test_grouper.py
Data pipeline
The binary data files (data/bin/*.bin) are prebuilt and included in the repository. To regenerate them from the raw CMS CSVs:
bash scripts/setup_data.sh
This runs extract → import → compile → zig build in sequence. See scripts/ for individual steps.
Comparison testing
The tests/ directory contains tools for validating mz-drg against the reference Java grouper.
# Generate random test claims
python tests/generate_test_claims.py --count 1000 --out tests/claims.json
# Compare Java vs Zig output
python tests/compare_groupers.py --file tests/claims.json
# Benchmark both
python tests/compare_groupers.py --file tests/claims.json --benchmark
The Java comparison requires JDK 17+ and the reference JARs in jars/. This is only needed for validation; the Python package itself has no Java dependency.
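The idea behind a claim-by-claim comparison can be sketched as follows. This is not the repository's compare_groupers.py: `compare_results` is a hypothetical helper, and the choice of fields to compare is an assumption based on the output format documented above.

```python
def compare_results(reference, candidate,
                    fields=("final_drg", "final_mdc", "return_code")):
    """Compare two equally long lists of result dicts field by field.

    Returns (match_rate, mismatches), where mismatches is a list of
    (index, field, reference_value, candidate_value) tuples.
    """
    if len(reference) != len(candidate):
        raise ValueError("result lists must have the same length")
    mismatches = []
    matched = 0
    for i, (ref, cand) in enumerate(zip(reference, candidate)):
        diffs = [(i, f, ref.get(f), cand.get(f))
                 for f in fields if ref.get(f) != cand.get(f)]
        if diffs:
            mismatches.extend(diffs)
        else:
            matched += 1  # a claim matches only if every field agrees
    rate = matched / len(reference) if reference else 1.0
    return rate, mismatches
```

A claim counts as matched only when every compared field agrees, so the returned rate is per claim rather than per field.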
C API
mz-drg exposes a C ABI for integration with any language. See zig_src/src/c_api.zig for the full API.
// Initialize (loads all data, pre-builds chains)
void* ctx = msdrg_context_init("/path/to/data/bin");
// Group a claim via JSON
const char* result_json = msdrg_group_json(ctx, "{\"version\":431,...}");
// Free
msdrg_string_free(result_json);
msdrg_context_free(ctx);
Functions are thread-safe after initialization. The context is immutable and can be shared across threads.
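From Python, a shared context suggests that one grouper could serve several worker threads. The sketch below assumes that the Python MsdrgGrouper wrapper inherits the thread-safety stated for the C functions, which this document does not confirm; `group_many` is a hypothetical helper. Calls made through ctypes release the GIL while the foreign function runs, so threads can overlap inside the native grouper.

```python
from concurrent.futures import ThreadPoolExecutor


def group_many(grouper, claims, workers=4):
    """Group claims concurrently, returning results in input order.

    Assumption: grouper.group may be called from several threads at once.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which worker finishes first.
        return list(pool.map(grouper.group, claims))
```

With an open grouper `g`, `group_many(g, claims)` would return one result per claim; measuring throughput on real data is the only way to tell whether the extra threads help.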
Project structure
mz-drg/
├── msdrg/                  # Python package
│   ├── __init__.py
│   └── grouper.py          # MsdrgGrouper class
├── zig_src/                # Zig source
│   ├── build.zig
│   ├── main.zig
│   └── src/
│       ├── c_api.zig       # C ABI exports
│       ├── json_api.zig    # JSON in/out
│       ├── msdrg.zig       # GrouperChain + version routing
│       ├── chain.zig       # Composable processor chain
│       ├── models.zig      # Data models
│       ├── preprocess.zig  # Exclusion & attribute handling
│       ├── grouping.zig    # DRG formula matching
│       ├── marking.zig     # Code marking logic
│       ├── hac.zig         # Hospital-Acquired Conditions
│       └── ...             # 20+ modules, ~8,500 lines
├── data/bin/               # Prebuilt binary data (16 files)
├── scripts/                # Data extraction & compilation
├── tests/                  # Comparison & benchmark tools
├── python_client/          # Legacy Python wrapper
├── pyproject.toml
└── setup.py
License
MIT — see LICENSE.
Acknowledgments
This project is intended for healthcare IT professionals who need a fast, embeddable, and auditable DRG classification engine.