A complete parser for Microsoft Interface Definition Language (MIDL/IDL) files

Classic MIDL Parser

A complete Python parser for classic Microsoft Interface Definition Language (MIDL/IDL) files. Parses IDL files into rich, structured Python objects suitable for code generation, analysis, and documentation.

Scope: This parser targets classic MIDL (used for COM, RPC, and DirectX headers). MIDL 3.0 / MIDLRT (the WinRT dialect) is not supported.

Installation

No external dependencies. Requires Python 3.10+ (the code uses the X | Y union syntax in type hints).

pip install midl-classic

Or for development:

# Just copy or symlink midl_classic/ into your project, or:
pip install -e .

Quick Start

from midl_classic import parse_file, parse_string

# Parse an IDL file
midl = parse_file("path/to/file.idl")

# Or parse from a string
midl = parse_string('''
    typedef enum Color { RED = 0, GREEN = 1, BLUE = 2 } COLOR;
''')

# Access parsed elements by type
for iface in midl.interfaces:
    print(f"Interface: {iface.name} (UUID: {iface.uuid})")
    for method in iface.methods:
        print(f"  {method.return_type.format()} {method.name}()")
        for param in method.params:
            print(f"    [{param.direction_str()}] {param.type_spec.format()} {param.name}")

for enum in midl.enums:
    print(f"Enum: {enum.name} ({len(enum.members)} values)")

for struct in midl.structs:
    print(f"Struct: {struct.name} ({len(struct.members)} fields)")

API Reference

Top-Level Functions

parse_file(path: str) -> MidlFile

Parse an IDL file from disk. Handles UTF-8 with BOM.
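
IDL files saved by Windows tools may begin with a UTF-8 byte order mark, which would otherwise show up as a stray character before the first token. The sketch below shows one way to read such a file using Python's standard utf-8-sig codec. read_idl_text is a hypothetical helper written for illustration; it is not part of midl_classic, and the library's own implementation may differ.

```python
import os
import tempfile


def read_idl_text(path: str) -> str:
    """Read IDL source, dropping a leading UTF-8 BOM if one is present.

    Hypothetical helper for illustration; not part of midl_classic.
    """
    # "utf-8-sig" decodes plain UTF-8 and silently strips a leading BOM.
    with open(path, "r", encoding="utf-8-sig") as handle:
        return handle.read()


# Demonstration: write a file that starts with a BOM, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.idl")
    with open(sample_path, "wb") as handle:
        handle.write(b"\xef\xbb\xbfinterface IFoo;\n")
    text = read_idl_text(sample_path)

print(repr(text))
```

Text read this way could then be handed to parse_string(text, filename=path).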

parse_string(source: str, filename: str = "<string>") -> MidlFile

Parse MIDL source text from a string.

MidlFile

The root node returned by both parse functions. Contains all parsed elements.

Property        Type                    Description
filename        str                     Source filename
elements        list[MidlElement]       All parsed elements in order
imports         list[ImportStatement]   import statements
constants       list[Constant]          const declarations
enums           list[EnumDef]           Enum definitions
structs         list[StructDef]         Struct definitions
unions          list[UnionDef]          Union definitions
interfaces      list[InterfaceDef]      Interface definitions
typedefs        list[TypeAlias]         Simple type aliases
libraries       list[LibraryDef]        Type library definitions
coclasses       list[CoclassDef]        Coclass definitions
forward_decls   list[ForwardDecl]       Forward declarations
cpp_quotes      list[CppQuote]          cpp_quote() directives

TypeSpec — Structured Type Representation

Types are not stored as strings. Each type is a TypeSpec with:

@dataclass
class TypeSpec:
    base_name: str              # "UINT", "void", "ID3D11Device"
    is_const: bool              # leading 'const'
    is_signed: bool | None      # 'signed' modifier
    is_unsigned: bool           # 'unsigned' modifier
    is_volatile: bool           # 'volatile' modifier
    pointer_levels: list[PointerLevel]  # each * indirection
    calling_convention: str | None      # __stdcall, __cdecl, etc.

Each PointerLevel tracks whether it's a const pointer (*const):

# const void*         -> is_const=True, base="void", ptrs=[PL(False)]
# IUnknown*const*     -> base="IUnknown", ptrs=[PL(is_const=True), PL(False)]
# unsigned long       -> is_unsigned=True, base="long"

Use ts.format() to get a readable string, or inspect fields directly:

if param.type_spec.is_pointer:
    print(f"Pointer depth: {param.type_spec.pointer_depth}")
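
To make the field layout concrete, here is a minimal stand-in that mirrors a subset of the documented TypeSpec fields and shows how the examples above could be rendered back to text. The classes are redefined locally so the snippet runs on its own; the format() logic is an assumption about a reasonable rendering, not the library's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class PointerLevel:
    is_const: bool = False  # True for a "*const" level


@dataclass
class TypeSpec:
    """Stand-in with a subset of the documented fields; not the real class."""
    base_name: str
    is_const: bool = False
    is_unsigned: bool = False
    pointer_levels: list[PointerLevel] = field(default_factory=list)

    @property
    def is_pointer(self) -> bool:
        return bool(self.pointer_levels)

    @property
    def pointer_depth(self) -> int:
        return len(self.pointer_levels)

    def format(self) -> str:
        # Assumed rendering: qualifiers, base name, then one "*" per level.
        parts = []
        if self.is_const:
            parts.append("const")
        if self.is_unsigned:
            parts.append("unsigned")
        parts.append(self.base_name)
        stars = "".join("*const" if level.is_const else "*"
                        for level in self.pointer_levels)
        return " ".join(parts) + stars


const_void_ptr = TypeSpec("void", is_const=True, pointer_levels=[PointerLevel()])
iunknown_pp = TypeSpec("IUnknown",
                       pointer_levels=[PointerLevel(is_const=True), PointerLevel()])
unsigned_long = TypeSpec("long", is_unsigned=True)

print(const_void_ptr.format())
print(iunknown_pp.format(), iunknown_pp.pointer_depth)
print(unsigned_long.format(), unsigned_long.is_pointer)
```

Keeping each indirection as its own PointerLevel is what lets a consumer distinguish void*const* from void**const without re-parsing a string.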

Attributes

MIDL attributes ([in], [out], [uuid(...)], etc.) are parsed into Attribute objects:

@dataclass
class Attribute:
    name: AttributeName   # Enum value (IN, OUT, UUID, SIZE_IS, ...)
    raw_name: str         # Original text
    value: str | None     # Parenthesized content if any

Over 100 well-known attributes are recognized via the AttributeName enum. Unknown attributes get AttributeName.CUSTOM.

Expressions

Enum values, constants, and array sizes are parsed into expression trees:

IntegerLiteral(value=255, base=16, suffix="", raw="0xff")
IdentifierRef(name="D3D_PRIMITIVE_TOPOLOGY_UNDEFINED")
BinaryOp(op="|", left=IdentifierRef("A"), right=IdentifierRef("B"))
UnaryOp(op="-", operand=IntegerLiteral(value=10))
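
One benefit of expression trees over raw strings is that they can be evaluated once the referenced names are known. The following sketch uses simplified stand-ins for the node classes (only the fields needed here) and a hypothetical evaluate function; the operator tables cover just a few operators and are an assumption, not a statement of what the parser supports.

```python
import operator
from dataclasses import dataclass


@dataclass
class IntegerLiteral:
    value: int


@dataclass
class IdentifierRef:
    name: str


@dataclass
class UnaryOp:
    op: str
    operand: object


@dataclass
class BinaryOp:
    op: str
    left: object
    right: object


# Operators handled by this sketch; real IDL files may use others.
BINARY_OPS = {"|": operator.or_, "&": operator.and_, "+": operator.add,
              "-": operator.sub, "<<": operator.lshift}
UNARY_OPS = {"-": operator.neg, "~": operator.invert, "+": operator.pos}


def evaluate(expr: object, env: dict[str, int]) -> int:
    """Reduce an expression tree to an int, resolving names through env."""
    if isinstance(expr, IntegerLiteral):
        return expr.value
    if isinstance(expr, IdentifierRef):
        return env[expr.name]  # raises KeyError if the name is unknown
    if isinstance(expr, UnaryOp):
        return UNARY_OPS[expr.op](evaluate(expr.operand, env))
    if isinstance(expr, BinaryOp):
        return BINARY_OPS[expr.op](evaluate(expr.left, env),
                                   evaluate(expr.right, env))
    raise TypeError(f"unsupported expression node: {expr!r}")


flags = BinaryOp("|", IdentifierRef("A"), IdentifierRef("B"))
print(evaluate(flags, {"A": 0x1, "B": 0x4}))  # prints 5
print(evaluate(UnaryOp("-", IntegerLiteral(10)), {}))  # prints -10
```

The env mapping would typically be built up from previously resolved constants and enum members.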

Method Parameters — Rich Annotations

MethodParam provides convenience properties:

param.is_in          # True if [in]
param.is_out         # True if [out]
param.is_retval      # True if [retval]
param.is_optional    # True if [optional] or annotation contains "_opt_"
param.is_string      # True if [string]
param.size_is        # "count" from [size_is(count)]
param.max_is         # value from [max_is(...)]
param.length_is      # value from [length_is(...)]
param.iid_is         # "riid" from [iid_is(riid)]
param.annotation     # SAL annotation string
param.direction_str() # "in", "out", "in, out", "out, retval", etc.
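
As a rough illustration of how the direction string relates to the individual flags, the stand-in below stores attribute names as plain strings and joins the directional ones in a fixed order. The class is a simplified assumption written for this example; the real MethodParam is derived from parsed Attribute objects and may order or format things differently.

```python
from dataclasses import dataclass, field


@dataclass
class MethodParam:
    """Stand-in that keeps attribute names as plain strings for brevity."""
    name: str
    attribute_names: list[str] = field(default_factory=list)

    @property
    def is_in(self) -> bool:
        return "in" in self.attribute_names

    @property
    def is_out(self) -> bool:
        return "out" in self.attribute_names

    @property
    def is_retval(self) -> bool:
        return "retval" in self.attribute_names

    def direction_str(self) -> str:
        # Fixed order so "[retval, out]" and "[out, retval]" render the same.
        order = ("in", "out", "retval")
        return ", ".join(flag for flag in order if flag in self.attribute_names)


riid = MethodParam("riid", ["in"])
pp_parent = MethodParam("ppParent", ["retval", "out"])
print(riid.direction_str())
print(pp_parent.direction_str())
```

A code generator could use is_retval to decide which parameter becomes the return value of a wrapper function.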

Struct Members

Struct fields support:

  • Attributes: [annotation("_Field_size_(n)")]
  • Array dimensions: WCHAR Name[128], FLOAT Transform[3][4]
  • Bitfields: UINT Flags : 8
  • Anonymous unions/structs: Nested unnamed aggregates

for member in struct_def.members:
    if isinstance(member, StructField):
        print(f"{member.type_spec.format()} {member.name}")
        if member.bitfield_width:
            print(f"  bitfield: {member.bitfield_width} bits")
        for dim in member.array_dimensions:
            print(f"  array dim: {dim.size}")
    elif isinstance(member, AnonymousUnion):
        print("  anonymous union { ... }")
    elif isinstance(member, AnonymousStruct):
        print("  anonymous struct { ... }")

Enum Members

for member in enum_def.members:
    if member.value is None:
        print(f"{member.name} (auto)")
    elif isinstance(member.value, IntegerLiteral):
        print(f"{member.name} = {member.value.value}")
    elif isinstance(member.value, IdentifierRef):
        print(f"{member.name} = {member.value.name}")
    elif isinstance(member.value, BinaryOp):
        print(f"{member.name} = (expression)")

Discriminated Unions

Discriminated unions expose their [case()] / [default] arms through the cases list of each UnionDef:

for union_def in midl.unions:
    for case in union_def.cases:
        if case.is_default:
            print("default:")
        else:
            values = [str(v.value) for v in case.case_values
                      if isinstance(v, IntegerLiteral)]
            print(f"case({', '.join(values)}):")
        if case.member:
            print(f"  {case.member.type_spec.format()} {case.member.name}")

Supported Constructs

Construct           Example
Import              import "oaidl.idl";
cpp_quote           cpp_quote("#include <windows.h>")
Preprocessor        #define, #pragma, #ifdef/#endif
Constants           const UINT VAL = 0xff; with hex/decimal/suffixed integers
Enums               typedef enum { A=0, B=1 } E; with expressions, cross-refs
Structs             Nested unions, bitfields, arrays, multi-dim, annotations
Unions              Simple and discriminated ([case]/[default])
Typedefs            Simple aliases, attributed, pipe types
Function pointers   typedef void(__stdcall *PFN)(void*);
Interfaces          With inheritance, UUID, methods, inline typedefs
Methods             With [in]/[out]/[retval], [size_is], [annotation]
Forward decls       interface IFoo;
Libraries           library Name { importlib; coclass; }
Coclasses           coclass Name { [default] interface IFoo; }
Property methods    [propget], [propput]
RPC attributes      [idempotent], [maybe], [broadcast], [callback]
Context handles     [context_handle]

CLI Tool

The included midl_dump.py prints all elements with rich annotations:

# Dump everything
python midl_dump.py examples/dxgi.idl

# Filter by type
python midl_dump.py examples/d3d11.idl --filter interfaces

# Verbose mode (enum values, struct fields, etc.)
python midl_dump.py examples/d3d12.idl --filter enums --verbose

# Available filters: all, imports, constants, enums, structs, unions,
#   typedefs, interfaces, forward_decls, libraries, coclasses, cpp_quotes

Sample output:

=== examples/dxgi.idl ===
Total elements: 106

--- Interfaces (14) ---
  interface IDXGIObject : IUnknown
    uuid: aec22fb8-76f3-4639-9be0-28eb43a67a2e  [object, local, pointer_default(unique)]
    methods: 4
      HRESULT SetPrivateData(3 params)
        [in] REFGUID Name  {annotation("_In_")}
        [in] UINT DataSize
        [in] const void* pData  {annotation("_In_reads_bytes_(DataSize)")}
      HRESULT GetParent(2 params)
        [in] REFIID riid  {annotation("_In_")}
        [out, retval] void** ppParent  {annotation("_COM_Outptr_")}

Tested On

Successfully parses the following DirectX IDL headers, plus additional COM and RPC samples:

  • d3d11.idl (5,480 lines, 41 interfaces, 133 structs, 72 enums)
  • d3d12.idl (6,528 lines, 73 interfaces, 266 structs, 156 enums)
  • dxgi.idl through dxgi1_6.idl
  • d3dcommon.idl, d3d11_1.idl through d3d11_4.idl
  • COM type library examples with library/coclass
  • RPC examples with discriminated unions, pipes, context handles
