C/C++ header analysis toolkit with pluggable backends and writers for ctypes, Cython, CFFI, LuaJIT FFI, and more

headerkit

headerkit: A CLI tool and Python library for parsing C/C++ headers.

Generates:

  • Bindings: ctypes modules, CFFI definitions, Cython .pxd files, and LuaJIT FFI.
  • Data: JSON Intermediate Representation (IR) and API diffs.
  • LLMs: Token-optimized header summaries for prompt windows.
  • Builds: PEP 517 backend for standard Python packaging.

Parse once. Build anywhere.

Quick examples

Every example below assumes this input header:

// mylib.h
typedef struct { int x, y; } Point;
int distance(const Point *a, const Point *b);

ctypes -- drop-in Python module, no build step:

headerkit mylib.h -w ctypes -o ctypes:bindings.py
# generated bindings.py
class Point(ctypes.Structure):
    _fields_ = [
        ("x", ctypes.c_int),
        ("y", ctypes.c_int),
    ]

_lib.distance.argtypes = [ctypes.POINTER(Point), ctypes.POINTER(Point)]
_lib.distance.restype = ctypes.c_int
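
As a rough illustration of how a structure like the generated Point behaves at runtime, the sketch below redefines it by hand with the standard ctypes module and inspects it without loading any shared library. The variable names a, b, and ptr_to_b are illustrative, and the commented-out call assumes a compiled mylib exists, which this example does not require.

```python
import ctypes


class Point(ctypes.Structure):
    # Mirrors the generated class: two C ints, laid out like the C struct.
    _fields_ = [
        ("x", ctypes.c_int),
        ("y", ctypes.c_int),
    ]


a = Point(0, 0)
b = Point(3, 4)

# Fields read back as plain Python ints.
print(a.x, a.y, b.x, b.y)

# Two ints of equal alignment need no padding, so the sizes match.
print(ctypes.sizeof(Point) == 2 * ctypes.sizeof(ctypes.c_int))

# ctypes.byref() produces the pointer a POINTER(Point) argument expects.
# With a real shared library loaded, a call would look like:
#     _lib.distance(ctypes.byref(a), ctypes.byref(b))
ptr_to_b = ctypes.pointer(b)
print(ptr_to_b.contents.y)
```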

CFFI -- declarations for ffibuilder.cdef():

headerkit mylib.h -w cffi -o cffi:_defs.py
/* generated _defs.py */
typedef struct Point {
    int x;
    int y;
} Point;
int distance(const Point *a, const Point *b);

Cython -- .pxd for compiled C/C++ interop:

headerkit mylib.h -w cython -o cython:mylib.pxd
# generated mylib.pxd
cdef extern from "mylib.h":

    ctypedef struct Point:
        int x
        int y

    int distance(const Point *a, const Point *b)

LuaJIT FFI -- ffi.cdef bindings for LuaJIT:

headerkit mylib.h -w lua -o lua:mylib_ffi.lua
-- generated mylib_ffi.lua
local ffi = require("ffi")

ffi.cdef[[

/* Structs */
typedef struct {
    int x;
    int y;
} Point;

/* Functions */
int distance(const Point *a, const Point *b);

]]

JSON -- full IR for custom tooling:

headerkit mylib.h -w json -o json:mylib.json
{
  "path": "mylib.h",
  "declarations": [
    {"kind": "struct", "name": "Point", "fields": [
      {"name": "x", "type": {"kind": "ctype", "name": "int"}},
      {"name": "y", "type": {"kind": "ctype", "name": "int"}}
    ]},
    {"kind": "function", "name": "distance", ...}
  ]
}
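
For custom tooling, the JSON output can be read with nothing more than the standard json module. The sketch below walks a trimmed copy of the sample above and collects struct fields. It assumes field types carry a "name" key as in the sample, which may not hold for pointer or array types, and the helper struct_fields is hypothetical rather than part of headerkit.

```python
import json

# Trimmed copy of the sample above; the function entry carries only the keys
# shown there because the rest is elided in the README.
IR_TEXT = """
{
  "path": "mylib.h",
  "declarations": [
    {"kind": "struct", "name": "Point", "fields": [
      {"name": "x", "type": {"kind": "ctype", "name": "int"}},
      {"name": "y", "type": {"kind": "ctype", "name": "int"}}
    ]},
    {"kind": "function", "name": "distance"}
  ]
}
"""


def struct_fields(ir: dict) -> dict:
    """Map each struct name to its (field name, C type name) pairs."""
    result = {}
    for decl in ir["declarations"]:
        if decl["kind"] != "struct":
            continue
        result[decl["name"]] = [
            (field["name"], field["type"]["name"]) for field in decl["fields"]
        ]
    return result


ir = json.loads(IR_TEXT)
fields = struct_fields(ir)
functions = [d["name"] for d in ir["declarations"] if d["kind"] == "function"]
print(fields)
print(functions)
```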

Prompt -- token-optimized summary for LLM context windows:

headerkit mylib.h -w prompt
// mylib.h (headerkit compact)
STRUCT Point {x:int, y:int}
FUNC distance(a:const Point*, b:const Point*) -> int

Diff -- API compatibility reports between header versions:

from headerkit.backends import get_backend
from headerkit.writers.diff import DiffWriter

backend = get_backend("libclang")
old = backend.parse('#include "mylib_v1.h"', "v1.h")
new = backend.parse('#include "mylib_v2.h"', "v2.h")
print(DiffWriter(baseline=old, format="markdown").write(new))
## Breaking Changes
### function_signature_changed
- **distance**: parameter 0 type changed from 'const Point *' to 'const Point3D *'

Build backend -- generate cacheable bindings at pip install time:

# In your project's pyproject.toml:
[build-system]
requires = ["headerkit", "hatchling"]
build-backend = "headerkit.build_backend"

Python API -- parse and generate from code:

from headerkit import generate

output = generate("mylib.h", "cffi")

Data flow: C/C++ headers -> backend -> IR -> writer -> output

Features

  • One parse, many outputs: generate multiple bindings in a single pass with -w ctypes -w cython -o ctypes:lib.py -o cython:lib.pxd
  • Config file support: .headerkit.toml or [tool.headerkit] in pyproject.toml
  • Multi-header merging: pass multiple .h files and they are merged into a single umbrella header
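
One way to picture multi-header merging is an umbrella source made of #include lines, the same shape the Python API examples pass to backend.parse(). The helper below is a hypothetical sketch: umbrella_source is not a headerkit function, mylib_extra.h is a made-up file name, and headerkit's actual merge strategy may differ.

```python
def umbrella_source(headers: list) -> str:
    """Build a source string that includes each header once, in order."""
    seen = set()
    lines = []
    for header in headers:
        if header in seen:
            continue  # skip repeats so no header is included twice
        seen.add(header)
        lines.append(f'#include "{header}"')
    return "\n".join(lines) + "\n"


source = umbrella_source(["mylib.h", "mylib_extra.h", "mylib.h"])
print(source)
```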

Installation

pip install headerkit

Requires Python 3.10+.

Then install libclang (if not already present):

headerkit install-libclang

Or install it manually:

Platform  Command
macOS     brew install llvm or Xcode Command Line Tools
Ubuntu    sudo apt install libclang-dev
Fedora    sudo dnf install clang-devel
Windows   winget install LLVM.LLVM or LLVM installer

Supports LLVM 18, 19, 20, and 21.

CLI reference

headerkit [options] FILE [FILE ...]

Flags

Flag                            Description
-b NAME, --backend NAME         Parser backend (default: libclang)
-I DIR                          Add include directory (repeatable)
-D MACRO[=VALUE]                Define preprocessor macro (repeatable)
--backend-arg ARG               Pass extra argument to the backend (repeatable)
-w WRITER                       Writer to use (repeatable)
-o WRITER:TEMPLATE              Output path template for a writer (repeatable)
--exclude PATTERN               Exclude headers matching glob pattern (repeatable)
--store-dir DIR                 Store directory (default: .headerkit/; env: HEADERKIT_STORE_DIR)
--writer-opt WRITER:KEY=VALUE   Pass an option to a writer (repeatable)
--config PATH                   Load config from PATH instead of searching
--no-config                     Skip all config file loading
--version                       Print version and exit

When no -o flag is given for a writer, output goes to stdout. At most one writer may write to stdout.
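
When driving the CLI from a script, the -w and -o pairing and the stdout rule can be checked before the command is launched. The sketch below builds an argument list using only the flags documented above. build_command is a hypothetical helper, and it assumes the parser accepts -w and -o interleaved per writer, as in the quick examples.

```python
import shlex


def build_command(headers, outputs):
    """Build a headerkit argv list.

    outputs maps a writer name to an output path, or to None for stdout.
    """
    to_stdout = [w for w, path in outputs.items() if path is None]
    if len(to_stdout) > 1:
        # Mirrors the documented rule: at most one writer may use stdout.
        raise ValueError(f"more than one writer targets stdout: {to_stdout}")

    argv = ["headerkit", *headers]
    for writer, path in outputs.items():
        argv += ["-w", writer]
        if path is not None:
            argv += ["-o", f"{writer}:{path}"]
    return argv


argv = build_command(["mylib.h"], {"ctypes": "lib.py", "cython": "lib.pxd"})
print(shlex.join(argv))
```

The resulting list can be handed to subprocess.run(argv, check=True) on a machine where headerkit is installed.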

Writers

Writer  Output               Notes
cffi    CFFI cdef strings    Declarations for ffibuilder.cdef()
ctypes  Python module        Complete ctypes binding module
cython  .pxd file            Cython declaration file with C++ support
diff    JSON or Markdown     API compatibility report between two header versions
json    JSON                 Full IR serialization
lua     LuaJIT FFI bindings  ffi.cdef() declarations for LuaJIT
prompt  Compact text         Token-optimized IR for LLM context windows

Pass writer options with --writer-opt:

headerkit mylib.h -w cffi --writer-opt cffi:exclude_patterns=^__
headerkit mylib.h -w ctypes -o ctypes:mylib.py --writer-opt ctypes:lib_name=mylib

Config file

headerkit searches from the current directory upward for .headerkit.toml, or for a [tool.headerkit] section in pyproject.toml. Use --no-config to skip this.

# .headerkit.toml
backend = "libclang"
writers = ["cffi"]
include_dirs = ["/usr/local/include"]
plugins = ["mypkg.headerkit_plugin"]

[writer.cffi]
exclude_patterns = ["^__", "^_internal"]

[writer.ctypes]
lib_name = "mylib"

Config string values support ${VAR} environment variable expansion at load time, which is useful for build-time paths injected by CMake or similar tools (e.g., include_dirs = ["${MY_INCLUDE_DIR}"]).
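
To make the expansion rule concrete, here is a minimal sketch of ${VAR} substitution applied to a loaded config dictionary. The function expand is hypothetical, and leaving unset variables untouched is an assumption; headerkit may instead raise an error or substitute an empty string.

```python
import os
import re

_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand(value, env=os.environ):
    """Recursively expand ${VAR} in strings nested inside dicts and lists."""
    if isinstance(value, str):
        # Unset variables are left as-is here; that choice is an assumption.
        return _VAR.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, list):
        return [expand(item, env) for item in value]
    if isinstance(value, dict):
        return {key: expand(item, env) for key, item in value.items()}
    return value


config = {
    "include_dirs": ["${MY_INCLUDE_DIR}", "/usr/local/include"],
    "writers": ["cffi"],
}
expanded = expand(config, env={"MY_INCLUDE_DIR": "/opt/mylib/include"})
print(expanded["include_dirs"])
```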

Command-line flags override config file values.

For projects using CFFI with cffi_buildtool, see the CFFI Integration Guide.

Plugins

Register third-party backends and writers via Python entry points:

# In your package's pyproject.toml
[project.entry-points."headerkit.backends"]
mybackend = "mypkg.backend:MyBackend"

[project.entry-points."headerkit.writers"]
mywriter = "mypkg.writer:MyWriter"

Or load plugins explicitly from the config file:

# .headerkit.toml
plugins = ["mypkg.headerkit_plugin"]

Cache and build backend

headerkit includes a two-layer cache that stores parsed IR and generated output in .headerkit/. Commit the cache to version control and downstream consumers can build without libclang installed.

from headerkit import generate

# First run: parses with libclang, caches result
output = generate("mylib.h", "cffi")

# Second run: loads from cache, no libclang needed
output = generate("mylib.h", "cffi")

# CLI: generate with caching (on by default)
headerkit mylib.h -w cffi -o cffi:bindings.py --store-dir .headerkit
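
The idea behind a two-layer cache can be shown with a toy in-memory model: one layer remembers parsed IR, the other remembers writer output, so the parser runs only on a full miss. Everything here is hypothetical, including the class name, the SHA-256 keys, and the stand-in parse and render callables; the real on-disk layout is described in the Cache Strategy Guide.

```python
import hashlib


class TwoLayerCache:
    """Toy model: layer 1 caches parsed IR, layer 2 caches writer output."""

    def __init__(self, parse, render):
        self._parse = parse      # stand-in for a libclang-backed parser
        self._render = render    # stand-in for a writer
        self._ir = {}            # layer 1: header digest -> IR
        self._out = {}           # layer 2: (header digest, writer) -> text
        self.parse_calls = 0

    def generate(self, header_text: str, writer: str) -> str:
        digest = hashlib.sha256(header_text.encode("utf-8")).hexdigest()
        out_key = (digest, writer)
        if out_key in self._out:
            return self._out[out_key]        # output hit: nothing to recompute
        if digest not in self._ir:
            self.parse_calls += 1            # only a full miss needs the parser
            self._ir[digest] = self._parse(header_text)
        self._out[out_key] = self._render(self._ir[digest], writer)
        return self._out[out_key]


cache = TwoLayerCache(
    parse=lambda text: text.split(),
    render=lambda ir, writer: f"{writer}:{len(ir)} tokens",
)
header = "int distance(const Point *a, const Point *b);"
first = cache.generate(header, "cffi")
second = cache.generate(header, "cffi")    # served from the output layer
third = cache.generate(header, "ctypes")   # reuses the cached IR
print(first, third, cache.parse_calls)
```

In this model a second writer for the same header costs one render and no parse, which is the property that lets a committed cache stand in for libclang.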

headerkit also ships a PEP 517 build backend. Consumer projects declare it in pyproject.toml and get bindings generated automatically during pip install or python -m build, with no libclang required when the cache is committed:

[build-system]
requires = ["headerkit", "hatchling"]
build-backend = "headerkit.build_backend"

Multi-platform cache population

Generate cache entries for multiple platforms using Docker:

# Populate for common Linux targets
headerkit cache populate mylib.h -w cffi \
    --platform linux/amd64 --platform linux/arm64

# Auto-detect platforms from cibuildwheel config
headerkit cache populate mylib.h -w cffi --cibuildwheel

# Commit the populated cache
git add .headerkit/
git commit -m "cache: populate for linux amd64 + arm64"

When .headerkit/ contains entries for all target platforms, downstream builds never need libclang installed.

See the Cache Strategy Guide for cache layout, bypass flags, and CI integration, and the Build Backend Guide for full setup instructions.

Python API

from headerkit.backends import get_backend
from headerkit.writers import get_writer

backend = get_backend("libclang")
header = backend.parse('#include "mylib.h"', "wrapper.h", include_dirs=["/path/to/include"])

writer = get_writer("cffi")
print(writer.write(header))
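
Because parsing and writing are separate steps, one parsed header can feed several writers. The sketch below shows that pattern with the writer lookup passed in as a parameter so it runs without headerkit installed. write_all and FakeWriter are hypothetical names used only for illustration.

```python
def write_all(header, writer_names, get_writer):
    """Render one parsed header with several writers, keyed by writer name."""
    return {name: get_writer(name).write(header) for name in writer_names}


# Stand-in so the sketch runs without headerkit or libclang installed.
class FakeWriter:
    def __init__(self, name):
        self.name = name

    def write(self, header):
        return f"[{self.name}] {header}"


outputs = write_all("parsed-header", ["cffi", "json"], get_writer=FakeWriter)
print(outputs)
```

With headerkit installed, passing headerkit.writers.get_writer and the object returned by backend.parse() should work the same way, assuming the chosen writers need no extra constructor arguments.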

Full documentation, guides, and API reference: axiomantic.github.io/headerkit

CI Store Population

headerkit's build backend populates .headerkit/ during wheel builds. To keep the store updated across platforms in CI, see the CI Store Population guide. For projects using cibuildwheel, see the cibuildwheel Integration guide for Linux Docker volume mount configuration.

Development

git clone https://github.com/axiomantic/headerkit.git
cd headerkit
pip install -e '.[dev]'
pytest

License

This project is licensed under the MIT License.

The vendored clang Python bindings in headerkit/_clang/v*/ are from the LLVM Project and are licensed under the Apache License v2.0 with LLVM Exceptions.
