A complete parser for Microsoft Interface Definition Language (MIDL/IDL) files
Classic MIDL Parser
A complete Python parser for classic Microsoft Interface Definition Language (MIDL/IDL) files. Parses IDL files into rich, structured Python objects suitable for code generation, analysis, and documentation.
Scope: This parser targets classic MIDL (used for COM, RPC, and DirectX headers). MIDL 3.0 / MIDLRT (the WinRT dialect) is not supported.
Installation
No external dependencies are required. Requires Python 3.10+ (the code uses the X | Y union syntax).
pip install midl-classic
Or for development:
# Just copy or symlink midl_classic/ into your project, or:
pip install -e .
Quick Start
from midl_classic import parse_file, parse_string

# Parse an IDL file
midl = parse_file("path/to/file.idl")

# Or parse from a string
midl = parse_string('''
typedef enum Color { RED = 0, GREEN = 1, BLUE = 2 } COLOR;
''')

# Access parsed elements by type
for iface in midl.interfaces:
    print(f"Interface: {iface.name} (UUID: {iface.uuid})")
    for method in iface.methods:
        print(f" {method.return_type.format()} {method.name}()")
        for param in method.params:
            print(f" [{param.direction_str()}] {param.type_spec.format()} {param.name}")

for enum in midl.enums:
    print(f"Enum: {enum.name} ({len(enum.members)} values)")

for struct in midl.structs:
    print(f"Struct: {struct.name} ({len(struct.members)} fields)")
API Reference
Top-Level Functions
parse_file(path: str) -> MidlFile
Parse an IDL file from disk. Handles UTF-8 with BOM.
parse_string(source: str, filename: str = "<string>") -> MidlFile
Parse MIDL source text from a string.
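As a rough illustration of what "handles UTF-8 with BOM" means in practice, the sketch below reads a BOM-prefixed file using Python's standard `utf-8-sig` codec. The helper name `read_idl_text` is hypothetical, and whether `parse_file` uses this codec internally is an assumption; the snippet only shows one common way to get the same effect.

```python
import codecs
import tempfile
from pathlib import Path


def read_idl_text(path: str) -> str:
    """Hypothetical helper: read a file as UTF-8, dropping a leading BOM if present."""
    # "utf-8-sig" also decodes plain UTF-8; it strips the BOM only when one exists.
    return Path(path).read_text(encoding="utf-8-sig")


# Demo: write a BOM-prefixed IDL snippet to a temporary file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.idl"
    sample.write_bytes(codecs.BOM_UTF8 + b"typedef enum Color { RED = 0 } COLOR;\n")
    text = read_idl_text(str(sample))

print(repr(text))
```

Text read this way could then be handed to `parse_string` when the source does not come straight from a file on disk.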
MidlFile
The root node returned by both parse functions. Contains all parsed elements.
| Property | Type | Description |
|---|---|---|
| filename | str | Source filename |
| elements | list[MidlElement] | All parsed elements in order |
| imports | list[ImportStatement] | import statements |
| constants | list[Constant] | const declarations |
| enums | list[EnumDef] | Enum definitions |
| structs | list[StructDef] | Struct definitions |
| unions | list[UnionDef] | Union definitions |
| interfaces | list[InterfaceDef] | Interface definitions |
| typedefs | list[TypeAlias] | Simple type aliases |
| libraries | list[LibraryDef] | Type library definitions |
| coclasses | list[CoclassDef] | Coclass definitions |
| forward_decls | list[ForwardDecl] | Forward declarations |
| cpp_quotes | list[CppQuote] | cpp_quote() directives |
TypeSpec — Structured Type Representation
Types are not stored as strings. Each type is a TypeSpec with:
@dataclass
class TypeSpec:
    base_name: str                      # "UINT", "void", "ID3D11Device"
    is_const: bool                      # leading 'const'
    is_signed: bool | None              # 'signed' modifier
    is_unsigned: bool                   # 'unsigned' modifier
    is_volatile: bool                   # 'volatile' modifier
    pointer_levels: list[PointerLevel]  # each * indirection
    calling_convention: str | None      # __stdcall, __cdecl, etc.
Each PointerLevel tracks whether it's a const pointer (*const):
# const void* -> is_const=True, base="void", ptrs=[PL(False)]
# IUnknown*const* -> base="IUnknown", ptrs=[PL(is_const=True), PL(False)]
# unsigned long -> is_unsigned=True, base="long"
Use ts.format() to get a readable string, or inspect fields directly:
if param.type_spec.is_pointer:
    print(f"Pointer depth: {param.type_spec.pointer_depth}")
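To make the structured representation concrete, here is a minimal, self-contained sketch of how a type string could be rebuilt from such fields. The dataclasses are simplified stand-ins that reuse the field names documented above, and `render` is a hypothetical function; the library's own `format()` may produce different spacing or handle more modifiers.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PointerLevel:
    # Stand-in for the documented PointerLevel: True means '*const'.
    is_const: bool = False


@dataclass
class TypeSpec:
    # Simplified stand-in carrying only a subset of the documented fields.
    base_name: str
    is_const: bool = False
    is_unsigned: bool = False
    pointer_levels: list[PointerLevel] = field(default_factory=list)


def render(ts: TypeSpec) -> str:
    """Hypothetical renderer: qualifiers, then base name, then one '*' per level."""
    parts = []
    if ts.is_const:
        parts.append("const")
    if ts.is_unsigned:
        parts.append("unsigned")
    parts.append(ts.base_name)
    text = " ".join(parts)
    for level in ts.pointer_levels:
        text += "*const" if level.is_const else "*"
    return text


const_void_ptr = TypeSpec("void", is_const=True, pointer_levels=[PointerLevel()])
iunknown_ptrs = TypeSpec(
    "IUnknown", pointer_levels=[PointerLevel(is_const=True), PointerLevel()]
)
unsigned_long = TypeSpec("long", is_unsigned=True)

for ts in (const_void_ptr, iunknown_ptrs, unsigned_long):
    print(render(ts))
```

Keeping pointer levels as a list rather than a count is what lets a per-level qualifier such as `*const` survive a round trip.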
Attributes
MIDL attributes ([in], [out], [uuid(...)], etc.) are parsed into Attribute objects:
@dataclass
class Attribute:
    name: AttributeName  # Enum value (IN, OUT, UUID, SIZE_IS, ...)
    raw_name: str        # Original text
    value: str | None    # Parenthesized content if any
Over 100 well-known attributes are recognized via the AttributeName enum. Unknown attributes get AttributeName.CUSTOM.
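The known-versus-custom split can be pictured with the following sketch. It defines a tiny stand-in enum with only four recognized names plus `CUSTOM`, and a hypothetical `classify` helper; the real `AttributeName` enum is much larger, and its member values and lookup logic are assumptions here.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class AttributeName(Enum):
    # Stand-in enum: a handful of members instead of the library's 100+.
    IN = "in"
    OUT = "out"
    UUID = "uuid"
    SIZE_IS = "size_is"
    CUSTOM = "<custom>"


@dataclass
class Attribute:
    name: AttributeName
    raw_name: str
    value: str | None = None


# Lookup table from attribute spelling to enum member, excluding the fallback.
_KNOWN = {m.value: m for m in AttributeName if m is not AttributeName.CUSTOM}


def classify(raw_name: str, value: str | None = None) -> Attribute:
    """Hypothetical helper: map a raw attribute name to an enum member or CUSTOM."""
    name = _KNOWN.get(raw_name, AttributeName.CUSTOM)
    return Attribute(name=name, raw_name=raw_name, value=value)


size_attr = classify("size_is", "count")
odd_attr = classify("my_vendor_attr", "42")
print(size_attr)
print(odd_attr)
```

Because `raw_name` is kept alongside the enum value, an attribute that falls back to `CUSTOM` can still be told apart by its original spelling.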
Expressions
Enum values, constants, and array sizes are parsed into expression trees:
IntegerLiteral(value=255, base=16, suffix="", raw="0xff")
IdentifierRef(name="D3D_PRIMITIVE_TOPOLOGY_UNDEFINED")
BinaryOp(op="|", left=IdentifierRef("A"), right=IdentifierRef("B"))
UnaryOp(op="-", operand=IntegerLiteral(value=10))
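One thing an expression tree enables is constant folding. The sketch below defines stand-in node classes with the field names shown above and a hypothetical `evaluate` function that resolves identifiers from a caller-supplied symbol table. The set of supported operators is an assumption, and the library may or may not ship a comparable evaluator.

```python
from __future__ import annotations

import operator
from dataclasses import dataclass
from typing import Union


@dataclass
class IntegerLiteral:
    value: int
    base: int = 10
    suffix: str = ""
    raw: str = ""


@dataclass
class IdentifierRef:
    name: str


@dataclass
class BinaryOp:
    op: str
    left: Expr
    right: Expr


@dataclass
class UnaryOp:
    op: str
    operand: Expr


Expr = Union[IntegerLiteral, IdentifierRef, BinaryOp, UnaryOp]

# Operator tables (an assumed subset of what IDL constant expressions use).
_BINARY = {
    "|": operator.or_, "&": operator.and_, "^": operator.xor,
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "<<": operator.lshift, ">>": operator.rshift,
}
_UNARY = {"-": operator.neg, "+": operator.pos, "~": operator.invert}


def evaluate(expr: Expr, symbols: dict[str, int]) -> int:
    """Recursively fold an expression tree to an integer."""
    if isinstance(expr, IntegerLiteral):
        return expr.value
    if isinstance(expr, IdentifierRef):
        return symbols[expr.name]  # KeyError if the name is not defined yet
    if isinstance(expr, BinaryOp):
        return _BINARY[expr.op](evaluate(expr.left, symbols),
                                evaluate(expr.right, symbols))
    if isinstance(expr, UnaryOp):
        return _UNARY[expr.op](evaluate(expr.operand, symbols))
    raise TypeError(f"unsupported node: {expr!r}")


flags = BinaryOp(op="|", left=IdentifierRef("A"), right=IdentifierRef("B"))
print(evaluate(flags, {"A": 0x1, "B": 0x4}))
print(evaluate(UnaryOp(op="-", operand=IntegerLiteral(value=10)), {}))
```

Passing the symbol table explicitly keeps the evaluator independent of where names come from, for example earlier enum members or `const` declarations.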
Method Parameters — Rich Annotations
MethodParam provides convenience properties:
param.is_in # True if [in]
param.is_out # True if [out]
param.is_retval # True if [retval]
param.is_optional # True if [optional] or annotation contains "_opt_"
param.is_string # True if [string]
param.size_is # "count" from [size_is(count)]
param.max_is # value from [max_is(...)]
param.length_is # value from [length_is(...)]
param.iid_is # "riid" from [iid_is(riid)]
param.annotation # SAL annotation string
param.direction_str() # "in", "out", "in, out", "out, retval", etc.
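The direction strings listed above suggest a fixed ordering of the directional attributes. A minimal sketch of that idea follows, assuming the label is built from `in`, `out`, and `retval` in that order; `direction_label` is a hypothetical function, not the library's `direction_str()`.

```python
from __future__ import annotations

# Directional attributes in the order they appear in the label.
_DIRECTION_ORDER = ("in", "out", "retval")


def direction_label(attribute_names: list[str]) -> str:
    """Join the directional attributes that are present, in canonical order."""
    present = [name for name in _DIRECTION_ORDER if name in attribute_names]
    return ", ".join(present)


# Non-directional attributes such as size_is are ignored by the label.
print(direction_label(["in"]))
print(direction_label(["out", "in", "size_is"]))
print(direction_label(["retval", "out"]))
```

A code generator could branch on such a label, for instance to turn `out, retval` parameters into return values in the target language.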
Struct Members
Struct fields support:
- Attributes: [annotation("_Field_size_(n)")]
- Array dimensions: WCHAR Name[128], FLOAT Transform[3][4]
- Bitfields: UINT Flags : 8
- Anonymous unions/structs: Nested unnamed aggregates
for member in struct_def.members:
    if isinstance(member, StructField):
        print(f"{member.type_spec.format()} {member.name}")
        if member.bitfield_width:
            print(f" bitfield: {member.bitfield_width} bits")
        for dim in member.array_dimensions:
            print(f" array dim: {dim.size}")
    elif isinstance(member, AnonymousUnion):
        print(" anonymous union { ... }")
    elif isinstance(member, AnonymousStruct):
        print(" anonymous struct { ... }")
Enum Members
for member in enum_def.members:
    if member.value is None:
        print(f"{member.name} (auto)")
    elif isinstance(member.value, IntegerLiteral):
        print(f"{member.name} = {member.value.value}")
    elif isinstance(member.value, IdentifierRef):
        print(f"{member.name} = {member.value.name}")
    elif isinstance(member.value, BinaryOp):
        print(f"{member.name} = (expression)")
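Members reported as "(auto)" carry no explicit value, so a consumer that needs numbers has to assign them. The sketch below applies the usual C-style rule (start at 0, otherwise previous value plus 1) to plain `(name, value)` pairs. The function name and the tuple input are illustrative assumptions; with the real objects, the explicit values would first be taken from `member.value`.

```python
from __future__ import annotations


def resolve_enum_values(members: list[tuple[str, int | None]]) -> dict[str, int]:
    """Assign concrete values, auto-numbering members whose value is None."""
    resolved: dict[str, int] = {}
    next_value = 0
    for name, explicit in members:
        value = next_value if explicit is None else explicit
        resolved[name] = value
        next_value = value + 1  # the next auto member continues from here
    return resolved


# RED is auto (0), GREEN is explicit, BLUE continues after GREEN.
colors = resolve_enum_values([("RED", None), ("GREEN", 5), ("BLUE", None)])
print(colors)
```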
Discriminated Unions
Discriminated unions with [case()] / [default] arms:
for union_def in midl.unions:
    for case in union_def.cases:
        if case.is_default:
            print("default:")
        else:
            values = [str(v.value) for v in case.case_values
                      if isinstance(v, IntegerLiteral)]
            print(f"case({', '.join(values)}):")
        if case.member:
            print(f" {case.member.type_spec.format()} {case.member.name}")
Supported Constructs
| Construct | Example |
|---|---|
| Import | import "oaidl.idl"; |
| cpp_quote | cpp_quote("#include <windows.h>") |
| Preprocessor | #define, #pragma, #ifdef/#endif |
| Constants | const UINT VAL = 0xff; with hex/decimal/suffixed integers |
| Enums | typedef enum { A=0, B=1 } E; with expressions, cross-refs |
| Structs | Nested unions, bitfields, arrays, multi-dim, annotations |
| Unions | Simple and discriminated ([case]/[default]) |
| Typedefs | Simple aliases, attributed, pipe types |
| Function pointers | typedef void(__stdcall *PFN)(void*); |
| Interfaces | With inheritance, UUID, methods, inline typedefs |
| Methods | With [in]/[out]/[retval], [size_is], [annotation] |
| Forward decls | interface IFoo; |
| Libraries | library Name { importlib; coclass; } |
| Coclasses | coclass Name { [default] interface IFoo; } |
| Property methods | [propget], [propput] |
| RPC attributes | [idempotent], [maybe], [broadcast], [callback] |
| Context handles | [context_handle] |
CLI Tool
The included midl_dump.py prints all elements with rich annotations:
# Dump everything
python midl_dump.py examples/dxgi.idl
# Filter by type
python midl_dump.py examples/d3d11.idl --filter interfaces
# Verbose mode (enum values, struct fields, etc.)
python midl_dump.py examples/d3d12.idl --filter enums --verbose
# Available filters: all, imports, constants, enums, structs, unions,
# typedefs, interfaces, forward_decls, libraries, coclasses, cpp_quotes
Sample output:
=== examples/dxgi.idl ===
Total elements: 106

--- Interfaces (14) ---
interface IDXGIObject : IUnknown
  uuid: aec22fb8-76f3-4639-9be0-28eb43a67a2e [object, local, pointer_default(unique)]
  methods: 4
    HRESULT SetPrivateData(3 params)
      [in] REFGUID Name {annotation("_In_")}
      [in] UINT DataSize
      [in] const void* pData {annotation("_In_reads_bytes_(DataSize)")}
    HRESULT GetParent(2 params)
      [in] REFIID riid {annotation("_In_")}
      [out, retval] void** ppParent {annotation("_COM_Outptr_")}
Tested On
Successfully parses all DirectX IDL headers:
- d3d11.idl (5,480 lines, 41 interfaces, 133 structs, 72 enums)
- d3d12.idl (6,528 lines, 73 interfaces, 266 structs, 156 enums)
- dxgi.idl through dxgi1_6.idl
- d3dcommon.idl, d3d11_1.idl through d3d11_4.idl
- COM type library examples with library/coclass
- RPC examples with discriminated unions, pipes, context handles