dwarffi is a Python library for parsing ISF files and providing an interface to access kernel symbols and types.

dwarffi

A debug-symbol-powered type interface for Python.

dwarffi lets you interact with real-world memory layouts using Intermediate Symbol Files (ISF)—portable JSON representations of compiled type information. It provides a CFFI-like experience without requiring header files: instead, it uses the structures as they exist in the compiled binary’s debug data.

ISF files encode the exact memory layout (offsets, padding, alignment, bitfields, pointer size, endianness) as defined by the toolchain and target architecture.
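To make this concrete, here is a small hand-written fragment in the general shape of a Volatility3-style ISF, read with nothing but the standard library. It is a sketch only: the values are illustrative, the fragment was not produced by dwarf2json, and a real file carries many more keys.

```python
import json

# Hand-written, trimmed fragment in the general shape of an ISF document.
# Values are illustrative; a real file produced by dwarf2json is far larger.
ISF_FRAGMENT = """
{
  "base_types": {
    "pointer": {"kind": "int", "size": 8, "signed": false, "endian": "little"}
  },
  "user_types": {
    "list_head": {
      "kind": "struct",
      "size": 16,
      "fields": {
        "next": {"offset": 0, "type": {"kind": "pointer",
                 "subtype": {"kind": "struct", "name": "list_head"}}},
        "prev": {"offset": 8, "type": {"kind": "pointer",
                 "subtype": {"kind": "struct", "name": "list_head"}}}
      }
    }
  }
}
"""

isf = json.loads(ISF_FRAGMENT)
list_head = isf["user_types"]["list_head"]

# Offsets and sizes are read straight from the file; nothing is recomputed
# from header files or compiler rules.
field_offsets = {name: f["offset"] for name, f in list_head["fields"].items()}
print(list_head["size"], field_offsets)
```

Because the offsets are stored rather than derived, the same Python code can describe a layout from any toolchain that can emit an ISF.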

  • Linux / embedded workflows: ISF generated from DWARF in ELF binaries (e.g., via dwarf2json).
  • Windows workflows: ISF generated from PDB symbols (e.g., Volatility3-style Windows ISFs generated from PDBs).
  • macOS / Mach-O workflows: ISF generated from DWARF in Mach-O binaries.

Read more about dwarf2json and ISF in the dwarf2json README.

For Windows ISF context, see Volatility3.

You can also find many symbol tables in ISF format on Volatility3's symbols repository.

This project builds on a tremendous amount of prior work in the Volatility community and takes inspiration from projects like ctypes, cffi, pyelftools, and Volatility3's symbol handling. Its main contribution is treating ISF as a first-class type system in Python, with support for live memory access and dynamic type generation.

On dwarf2json

We've forked dwarf2json to add support for function signatures and to generate more complete ISF files that include function metadata. This allows dwarffi to provide a richer experience when working with functions, including parameter types and return types. See the fork here

You may still use the original Volatility version of dwarf2json if you don't need function signatures, but we recommend our fork for the best experience with dwarffi.


🚀 Features

  • ISF-Native: No header rewriting. Point to a .json or .json.xz ISF file and use types immediately.
  • Cross-Platform Symbols:
    • DWARF-backed ISFs (common on Linux/ELF, embedded targets)
    • PDB-backed ISFs (common for Windows kernel/user-mode analysis toolchains)
  • Architecture & ABI Aware: Handles big-endian and little-endian layouts transparently, and respects pointer width and packing.
  • Dynamic cdef: Compile C code on the fly to generate types, with automatic debug-type retention to prevent compilers from stripping unused definitions.
  • Recursive Typedef Handling: Automatic resolution and decay of typedef chains.
  • C-Style Magic:
    • Pointer arithmetic (ptr + 5)
    • Pointer subtraction (ptr2 - ptr1)
    • Array slicing (arr[1:5])
    • Deep struct initialization via nested dictionaries
  • Safety Semantics:
    • Automatic bit-masking
    • Sign extension
    • C-style integer overflow/underflow behavior
  • Anonymous Struct/Union Flattening: Access anonymous union members directly (ideal for register maps).
  • ISF Export Support: Save dynamically generated ISFs to .json or .json.xz.
  • Introspection Utilities:
    • inspect_layout() for pahole-style field offsets/padding
    • pretty_print() and to_dict() for human-readable / JSON-friendly inspection of instances
  • Live Memory Backends: Interface directly with QEMU, GDB, Volatility, or raw firmware dumps using a simple read/write API.
  • Pointer Chaining: Recursively dereference pointers (ptr.deref()) and stride through remote memory using C-style array indexing (ptr[5]).
  • Zero-Copy Performance: High-performance handler binding that automatically switches between zero-copy native buffer access and backend proxying.
  • Fuzzy Search: Find symbols and types across massive ISFs using glob or regex patterns.
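The Safety Semantics items in the list above (bit-masking, sign extension, C-style wraparound) can be pictured with plain Python integers. The helper below is a hypothetical illustration of the arithmetic, not dwarffi's implementation; `wrap_int` is a name invented for this sketch.

```python
def wrap_int(value: int, bits: int, signed: bool) -> int:
    """Reduce a Python int to what a C integer of `bits` width would hold."""
    mask = (1 << bits) - 1
    value &= mask                        # bit-masking: drop bits above the width
    if signed and value >> (bits - 1):   # sign bit set: reinterpret as negative
        value -= 1 << bits
    return value

print(wrap_int(256, 8, signed=False))    # 0      (uint8_t overflow wraps)
print(wrap_int(200, 8, signed=True))     # -56    (int8_t sign extension)
print(wrap_int(-1, 16, signed=False))    # 65535  (uint16_t underflow wraps)
```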

📦 Installation

pip install dwarffi

Requirements for cdef()

To use dynamic compilation:

  • A C compiler (gcc, clang, or cross-compiler)
  • dwarf2json available in your PATH

NOTE: Some compilers may optimize away unused debug types. For example, with gcc, use: -fno-eliminate-unused-debug-types.


🛠️ Quick Start

Load an ISF (Linux/ELF DWARF)

from dwarffi import DFFI

# Accepts .json or .json.xz
ffi = DFFI("ubuntu:5.4.0-26-generic:64.json.xz")


list_head_type = ffi.typeof("list_head")
print("list_head sizeof:", ffi.sizeof(list_head_type))
print(list_head_type)

''' prints out:
struct list_head (size: 16 bytes) {
  [+0  ] pointer next;
  [+8  ] pointer prev;
}
'''

# make a new complex type
proc = ffi.new("struct task_struct", init={"pid": 1234, "comm": b"my_process"})


print(proc.pid)              # 1234
print(bytes(proc.comm))      # b'my_process\x00\x00\x00\x00\x00\x00'
print(ffi.string(proc.comm)) # b'my_process'

Download this example .json.xz here.
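The difference between bytes(proc.comm) and ffi.string(proc.comm) in the example above is the usual C-string convention: a fixed-size char array is read up to its first NUL byte. A minimal standard-library sketch of that convention (the helper name `c_string` is made up for illustration):

```python
def c_string(buf: bytes) -> bytes:
    """Return the bytes before the first NUL, as when reading a C char array."""
    return bytes(buf).split(b"\x00", 1)[0]

# task_struct.comm is a 16-byte field, so a short name is zero padded
comm = b"my_process".ljust(16, b"\x00")
print(comm)            # the full 16-byte field, padding included
print(c_string(comm))  # only the name
```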


Load an ISF (Windows PDB-derived / Volatility-style)

from dwarffi import DFFI

# Volatility-style Windows symbols are typically .json.xz ISFs
ffi = DFFI("ntkrnlmp.pdb/<GUID>-<AGE>.json.xz")

le = ffi.typeof("struct _LIST_ENTRY")
buf = bytearray(ffi.sizeof(le))
inst = ffi.from_buffer("struct _LIST_ENTRY", buf)

inst.Flink = 0x1122334455667788
inst.Blink = 0x8877665544332211

print(ffi.pretty_print(inst))
print(ffi.to_dict(inst))

ffi.inspect_layout("struct _UNICODE_STRING")
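A pahole-style report is essentially a walk over field offsets that notes any gaps. The sketch below shows that idea with the standard library only; it is not how inspect_layout() is implemented, and `find_holes` is a hypothetical helper. The field offsets are the well-known 64-bit layout of _UNICODE_STRING.

```python
def find_holes(fields, total_size):
    """Return (offset, length) gaps given (name, offset, size) field tuples."""
    holes, cursor = [], 0
    for _name, offset, size in sorted(fields, key=lambda f: f[1]):
        if offset > cursor:
            holes.append((cursor, offset - cursor))   # padding before this field
        cursor = max(cursor, offset + size)
    if cursor < total_size:
        holes.append((cursor, total_size - cursor))   # tail padding
    return holes

# _UNICODE_STRING on 64-bit Windows: two USHORTs, then an 8-byte pointer
unicode_string = [("Length", 0, 2), ("MaximumLength", 2, 2), ("Buffer", 8, 8)]
print(find_holes(unicode_string, 16))
```

The single 4-byte hole at offset 4 is the alignment padding the compiler inserts before the pointer.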

CFFI-style cdef

dwarffi supports inline C definitions that are compiled to DWARF and converted to ISF on the fly. This is ideal for quick prototyping, or when a small struct definition isn't already in your ISF.

from dwarffi import DFFI

ffi = DFFI()
ffi.cdef("""
    struct sensor_data {
        uint32_t timestamp;
        int16_t  readings[3];
        uint8_t  status;
    };
""")

sensor = ffi.new("struct sensor_data", {
    "timestamp": 1234567,
    "readings": [10, -5, 20],
    "status": 0x01
})

print(f"Bytes: {ffi.to_bytes(sensor).hex()}")
print(f"Reading[1]: {sensor.readings[1]}")  # -5
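For comparison, the same struct can be laid out by hand with Python's struct module. This sketch assumes a little-endian target with natural alignment, which gives one byte of tail padding and a 12-byte struct; the layout dwarffi actually reports depends on the compiler and target used by cdef().

```python
import struct

# Assumed layout: little-endian, natural alignment. "<" turns off automatic
# padding in the struct module, so the tail pad byte ("x") is spelled out.
SENSOR_DATA = struct.Struct("<I3hBx")

raw = SENSOR_DATA.pack(1234567, 10, -5, 20, 0x01)
print(SENSOR_DATA.size)   # 12
print(raw.hex())          # 87d612000a00fbff14000100

timestamp, r0, r1, r2, status = SENSOR_DATA.unpack(raw)
print(r1)                 # -5
```

Writing the format string by hand is exactly the step that debug-derived type information makes unnecessary.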

🧩 Advanced Usage

Anonymous Unions

ffi.cdef("""
struct reg_map {
    union {
        uint32_t ALL;
        struct {
            uint16_t LOW;
            uint16_t HIGH;
        };
    };
};
""")

reg = ffi.new("struct reg_map")
reg.ALL = 0x12345678
print(hex(reg.HIGH))  # 0x1234 (on a little-endian target)
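Which half of ALL ends up in HIGH depends on the byte order of the target, because the union members share the same bytes. The standard-library sketch below illustrates this; `split_halves` is a hypothetical helper and not part of dwarffi.

```python
import struct

def split_halves(value: int, byte_order: str) -> tuple[int, int]:
    """View a 32-bit value as (LOW, HIGH) 16-bit members sharing its storage."""
    raw = struct.pack(byte_order + "I", value)      # the bytes of ALL
    return struct.unpack(byte_order + "HH", raw)    # LOW at +0, HIGH at +2

low, high = split_halves(0x12345678, "<")
print(hex(low), hex(high))   # 0x5678 0x1234 on little-endian

low_be, high_be = split_halves(0x12345678, ">")
print(hex(low_be), hex(high_be))   # 0x1234 0x5678 on big-endian
```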

Pointer Arithmetic

ptr = ffi.cast("int *", 0x4000)
next_ptr = ptr + 1
print(hex(next_ptr.address))  # 0x4004 if int is 4 bytes in the loaded ISF
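The rule behind this is the C one: adding n to a pointer advances it by n elements, and subtracting two pointers yields an element count. A toy model of that rule, assuming a 4-byte int (`TypedPtr` is invented for this sketch and is unrelated to dwarffi's Ptr class):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TypedPtr:
    address: int
    elem_size: int  # sizeof the pointed-to type, e.g. 4 for a 32-bit int

    def __add__(self, count: int) -> "TypedPtr":
        # ptr + n moves forward by n elements, not n bytes
        return TypedPtr(self.address + count * self.elem_size, self.elem_size)

    def __sub__(self, other: "TypedPtr") -> int:
        # Pointer difference is measured in elements
        return (self.address - other.address) // self.elem_size

ptr = TypedPtr(0x4000, 4)
next_ptr = ptr + 1
print(hex(next_ptr.address))   # 0x4004
print((ptr + 5) - ptr)         # 5
```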

🧠 Memory Backends & Live Data

dwarffi can bind to live external memory (debuggers, emulators, or remote targets). Instead of snapshotting memory into a local bytearray, you can use from_address() to interact with the target in real-time.

Using Raw Bytes (Mapping at Address 0)

If you provide raw bytes or a bytearray as a backend, dwarffi treats it as a physical memory map starting at address 0x0.

from dwarffi import DFFI

# firmware.bin is 1MB; `isf` is the ISF (path or dict) describing its types
with open("firmware.bin", "rb") as f:
    ffi = DFFI(isf, backend=f.read())

# Map a header at its specific physical address
header = ffi.from_address("struct fw_header", 0x4000)
print(f"Magic: {hex(header.magic)}")
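Mapping at address 0 means address N is simply byte N of the image. The sketch below shows that equivalence without dwarffi; the buffer stands in for firmware.bin, and the magic value and the 32-bit little-endian field are assumptions made for illustration.

```python
import struct

firmware = bytearray(0x5000)                            # stand-in for firmware.bin
struct.pack_into("<I", firmware, 0x4000, 0xFEEDFACE)    # hypothetical magic value

def read(address: int, size: int) -> bytes:
    """With the map starting at 0x0, an address is just an index into the image."""
    return bytes(firmware[address:address + size])

(magic,) = struct.unpack("<I", read(0x4000, 4))
print(f"Magic: {hex(magic)}")
```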

Implementing a Custom Backend (e.g., GDB)

You can wrap any memory access API by implementing the MemoryBackend interface.

import gdb  # only available inside GDB's embedded Python interpreter

from dwarffi import DFFI
from dwarffi.backend import MemoryBackend

class GDBBackend(MemoryBackend):
    def read(self, address: int, size: int) -> bytes:
        return gdb.selected_inferior().read_memory(address, size).tobytes()

    def write(self, address: int, data: bytes) -> None:
        gdb.selected_inferior().write_memory(address, data)

ffi = DFFI(isf, backend=GDBBackend())

# Now 'task' reads memory from GDB on-demand when you access fields
task = ffi.from_address("struct task_struct", 0xffff888000000000)
print(f"Current PID: {task.pid}")
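When no debugger is at hand, the same read/write shape can be backed by a dictionary, which is convenient for unit tests. This is a sketch under assumptions: `DictBackend` is a made-up name, and it deliberately does not subclass dwarffi's MemoryBackend so that it runs without dwarffi installed.

```python
class DictBackend:
    """Sparse in-memory stand-in with the same read/write shape as above."""

    def __init__(self) -> None:
        self._mem: dict[int, int] = {}   # address -> byte value

    def read(self, address: int, size: int) -> bytes:
        # Bytes that were never written read back as zero
        return bytes(self._mem.get(address + i, 0) for i in range(size))

    def write(self, address: int, data: bytes) -> None:
        for i, byte in enumerate(data):
            self._mem[address + i] = byte

backend = DictBackend()
backend.write(0xFFFF888000000000, (1234).to_bytes(4, "little"))
pid = int.from_bytes(backend.read(0xFFFF888000000000, 4), "little")
print(pid)   # 1234
```

A sparse store avoids allocating a buffer that spans a 64-bit address space just to place a few bytes at a kernel address.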

Live Pointer Traversal

When a MemoryBackend is configured, Ptr objects become "live." Calling .deref() or using array indexing fetches the target memory from the backend automatically.

# Get a pointer to an array of nodes in kernel memory
list_ptr = ffi.from_address("struct node *", 0x2000)

# Chained dereferencing (node->next->next)
# Each .deref() triggers a backend read
third_node = list_ptr.deref().next.deref().next.deref()

# C-style array indexing on the backend
fifth_node = list_ptr[4]
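Each dereference amounts to one read at the address held by the previous pointer. The sketch below walks a three-node list stored in a bytearray to show the sequence of reads; the node layout (u64 value followed by a u64 next pointer) is hypothetical, and `deref` here is a plain function, not dwarffi's Ptr.deref().

```python
import struct

NODE = struct.Struct("<QQ")   # hypothetical node: u64 value, u64 next pointer
memory = bytearray(0x100)

# Three nodes at 0x10 -> 0x40 -> 0x80, terminated by a NULL next pointer
for addr, value, nxt in [(0x10, 111, 0x40), (0x40, 222, 0x80), (0x80, 333, 0)]:
    NODE.pack_into(memory, addr, value, nxt)

def deref(address: int) -> tuple[int, int]:
    """One 'backend read': fetch (value, next) for the node at `address`."""
    return NODE.unpack_from(memory, address)

_, second_addr = deref(0x10)          # node->next
_, third_addr = deref(second_addr)    # node->next->next
third_value, _ = deref(third_addr)
print(hex(third_addr), third_value)   # 0x80 333
```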

Fuzzy Symbol Discovery

# Find all x86-64 syscall handler symbols (names starting with __x64_sys_)
syscalls = ffi.search_symbols("__x64_sys_*")

for name, sym in syscalls.items():
    print(f"Found {name} at {hex(sym.address)}")
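Glob and regex matching over a symbol table can be sketched with the standard library's fnmatch and re modules. The symbol names and addresses below are invented for illustration, and `search` only imitates the call shape of search_symbols(); it says nothing about how dwarffi implements it.

```python
import fnmatch
import re

# Hypothetical symbol table: name -> address
symbols = {
    "__x64_sys_read": 0xFFFFFFFF81200000,
    "__x64_sys_write": 0xFFFFFFFF81200100,
    "init_task": 0xFFFFFFFF82A00000,
}

def search(pattern: str, use_regex: bool = False) -> dict[str, int]:
    """Return the symbols whose names match a glob or a regular expression."""
    if use_regex:
        rx = re.compile(pattern)
        return {n: a for n, a in symbols.items() if rx.search(n)}
    return {n: a for n, a in symbols.items() if fnmatch.fnmatchcase(n, pattern)}

print(sorted(search("__x64_sys_*")))
print(sorted(search(r"^init_", use_regex=True)))
```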

Walking a Process List

# Simulating a container_of walk through a kernel task list
init_task = ffi.from_address("struct task_struct", ffi.get_symbol("init_task").address)

# Walk the circular 'tasks' list_head
curr_list = init_task.tasks.next.deref()

while curr_list.address != init_task.tasks.address:
    # Use cast with address arithmetic to get the parent task_struct
    task_addr = curr_list.address - ffi.offsetof("struct task_struct", "tasks")
    task = ffi.cast("struct task_struct", task_addr)
    
    print(f"Process: {ffi.string(task.comm)} [PID: {task.pid}]")
    curr_list = curr_list.next.deref()
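The container_of step is plain address arithmetic: subtract the offset of the embedded list_head from its address to get the enclosing struct. The self-contained sketch below builds a tiny circular list in a bytearray and walks it the same way; the miniature "task" layout (u32 pid at +0, list_head at +8) is hypothetical and far smaller than a real task_struct.

```python
import struct

# Hypothetical miniature "task": u32 pid at +0, list_head {next, prev} at +8
TASKS_OFFSET = 8
memory = bytearray(0x400)
task_addrs = [0x100, 0x200, 0x300]           # the first entry plays init_task

for i, addr in enumerate(task_addrs):
    nxt = task_addrs[(i + 1) % 3] + TASKS_OFFSET   # pointers target the list_head,
    prv = task_addrs[(i - 1) % 3] + TASKS_OFFSET   # not the start of the struct
    struct.pack_into("<I", memory, addr, i)                        # pid
    struct.pack_into("<QQ", memory, addr + TASKS_OFFSET, nxt, prv)

def next_of(list_addr: int) -> int:
    """Read list_head.next at the given address."""
    return struct.unpack_from("<Q", memory, list_addr)[0]

head = task_addrs[0] + TASKS_OFFSET
pids, cur = [], next_of(head)
while cur != head:                            # stop when the circle closes
    task_addr = cur - TASKS_OFFSET            # container_of()
    pids.append(struct.unpack_from("<I", memory, task_addr)[0])
    cur = next_of(cur)
print(pids)   # [1, 2]
```

As in the kernel, the head entry itself is skipped: the loop starts at head.next and ends when it returns to the head.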

⚙️ How It Works

dwarffi operates in three phases:

1. Parsing

Loads one or more ISF files (.json or .json.xz) that represent a compiled type tree (e.g., derived from DWARF or PDB symbols).

2. Type Synthesis

Builds Python representations of:

  • Base types
  • Structs / unions
  • Enums
  • Typedef chains
  • Arrays
  • Pointers
  • Bitfields

3. Memory Mapping

Uses Python’s struct.pack / struct.unpack to:

  • Convert Python integers into architecture-accurate byte layouts
  • Apply endianness rules and pointer size
  • Respect alignment, padding, and bitfield masks

Instances are bound views into bytearray buffers. Field access directly reads/writes into the underlying buffer.
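As a rough picture of the memory-mapping phase, the sketch below reads a word from a bytearray, updates a bitfield inside it with a mask and shift, and writes it back. It assumes a little-endian 32-bit storage unit with bits numbered from the least significant end; real bitfield placement depends on the compiler and target, and `get_bits` and `set_bits` are hypothetical helpers.

```python
import struct

def get_bits(word: int, bit_offset: int, bit_length: int) -> int:
    """Extract a bitfield from its storage word."""
    return (word >> bit_offset) & ((1 << bit_length) - 1)

def set_bits(word: int, bit_offset: int, bit_length: int, value: int) -> int:
    """Return `word` with the bitfield replaced; excess bits of `value` are masked off."""
    mask = ((1 << bit_length) - 1) << bit_offset
    return (word & ~mask) | ((value << bit_offset) & mask)

buf = bytearray(4)                         # backing storage for one u32
word = struct.unpack_from("<I", buf)[0]    # read the storage unit
word = set_bits(word, 4, 3, 0b101)         # 3-bit field starting at bit 4
struct.pack_into("<I", buf, 0, word)       # write it back into the buffer
print(buf.hex(), get_bits(word, 4, 3))     # 50000000 5
```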


📚 Core API Reference

DFFI(path: str | Path | dict | None = None)

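Create an empty instance or load an ISF (from a dict, .json, or .json.xz). The memory-backend examples above additionally pass a backend= keyword argument to attach a data source.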

cdef(source, compiler="gcc", save_isf_to=None)

Compile C source → DWARF → ISF → load into current FFI.

new(ctype, init=None)

Allocate a new instance of a C type.

from_buffer(ctype, buffer)

Bind a type to existing memory.

sizeof(ctype)

Return size in bytes.

offsetof(ctype, field)

Return byte offset of a field.

cast(ctype, value)

Reinterpret memory or create pointer instances.

inspect_layout(ctype)

Print pahole-style offsets and padding.

pretty_print(cdata)

Recursively format bound instances/arrays/pointers as a readable tree.

to_dict(cdata)

Convert bound instances/arrays/pointers to Python-native structures.

from_address(ctype, address)

Binds a type to a specific address in the configured MemoryBackend. Returns a BoundTypeInstance or a Ptr.

search_symbols(pattern, use_regex=False)

Searches for symbols matching a glob (e.g., __x64_sys_*) or regex pattern across all loaded ISFs.

addressof(instance, *fields)

Returns a Ptr to an instance or a nested field. If the instance is backend-backed, the pointer address will be the absolute address in that backend.

Ptr.deref()

The core dereference operator. If the pointer targets another pointer, it actively resolves the chain by reading from the backend.


🤝 Contributing

Contributions are welcome!

python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest

