mojo-bindgen is a tool for generating Mojo bindings directly from C headers using libclang, similar in spirit to rust-bindgen but targeting Mojo.
It parses header files, lowers them into an IR, and maps that IR to Mojo code that tries to preserve layout and ABI.
Why this exists
This project started out of a practical need to generate accurate bindings for Cairo, beyond what LLMs and bespoke bindgen scripts could deliver. In the end, after going down the rabbit hole of C quirks, it evolved into its own project, which aims to provide reliable binding generation.
Usage
Install from PyPI:
pip install mojo-bindgen
Note: make sure you also install libclang (see the README for details).
The repository also includes working examples and functional tests for moderately complex C headers:
SQLite
Cairo
libpng
What works today
mojo-bindgen is still alpha and evolves quickly, but it already supports a useful slice of real C headers and is practical today as a starting point for generating bindings.
Core pipeline & Types
libclang-based parsing (full C frontend)
configurable compile args
primitives, typedef chains
pointers with const-aware mutability
arrays (fixed and incomplete)
typedefs and typedef chains are preserved via comptime declarations
complex numbers and vector types (mapped to ComplexSIMD and SIMD[dtype, n] respectively)
atomics (mapped to std.atomic.Atomic)
Records
structs (including mixed layouts)
incomplete records are mapped to opaque structs
bitfields (backing storage + generated accessors)
unions → UnsafeUnion[...] or opaque fallback
layout-preserving opaque storage as a fallback.
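To make the record categories above concrete, here is a small C sketch of the kinds of declarations they refer to. All type and function names are invented for illustration, and nothing here shows mojo-bindgen's actual output; it only shows the C side that a binding generator has to describe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Incomplete record: only the tag is visible, so it can only be used
   behind a pointer. A binding has to treat it as opaque. */
struct db_handle;

/* Struct with mixed field sizes; the compiler may insert padding. */
struct header {
    uint8_t  kind;
    uint32_t length;
    uint16_t flags;
};

/* Bitfields: several members packed into shared storage units. */
struct perms {
    unsigned int read  : 1;
    unsigned int write : 1;
    unsigned int level : 6;
};

/* Union: every member starts at offset 0 and shares the same bytes. */
union value {
    int32_t i;
    float   f;
    uint8_t bytes[8];
};

/* Facts the C standard guarantees regardless of target. */
_Static_assert(offsetof(struct header, kind) == 0, "first member is at offset 0");
_Static_assert(offsetof(union value, f) == 0, "union members start at offset 0");
_Static_assert(sizeof(union value) >= 8, "union is at least as large as its largest member");

/* Padding is target-dependent, so it is computed rather than hard-coded. */
size_t header_padding_bytes(void) {
    return sizeof(struct header)
         - (sizeof(uint8_t) + sizeof(uint32_t) + sizeof(uint16_t));
}

/* Round-trips a value through the 6-bit bitfield member. */
unsigned int perms_level_roundtrip(unsigned int level) {
    struct perms p = {0};
    p.level = level;
    return p.level;
}
```

The padding and the bitfield placement are exactly the parts the compiler decides, which is why a generator needs clang's view of the layout rather than a guess from the field list.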
Functions & FFI
non-variadic functions
function pointers and callbacks
two linkage modes:
external_call
owned_dl_handle
Globals & Macros
globals via wrapper structs (with load() / store())
constants and foldable macros → comptime
literals (int/float/string/char)
foldable expressions and sizeof
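As a rough illustration of the list above, these are the kinds of C globals and macros that can be folded to a constant. The names are hypothetical, and the sketch only covers the C input, not the emitted Mojo.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NAME_LEN 64                      /* integer literal */
#define SCALE        1.5                     /* float literal */
#define LIB_NAME     "demo"                  /* string literal */
#define SEPARATOR    ':'                     /* char literal */
#define BUFFER_BYTES (MAX_NAME_LEN * 4 + 1)  /* foldable expression */
#define WORD_BYTES   sizeof(long)            /* sizeof expression */

/* A global variable, as opposed to a compile-time constant. */
int demo_counter = 0;

/* The folded values can be checked at compile time. */
_Static_assert(BUFFER_BYTES == 257, "64 * 4 + 1");
_Static_assert(WORD_BYTES >= 4, "long is at least 32 bits (assuming 8-bit bytes)");

/* Reads and writes the global, which is what load/store wrappers expose. */
int bump_counter(void) {
    demo_counter += 1;
    return demo_counter;
}
```

Note that WORD_BYTES folds to a different value on different targets, so a folded sizeof is only valid for the target the bindings were generated for.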
Current limitations
Macros: function-like macros, predefined macros, and more complex preprocessor behavior are preserved and emitted as comments for the end user to review.
Variadics: variadic C functions are not supported yet. Mojo itself has recently made progress on variadic functions, so this may change soon.
Non-prototype / K&R-style functions: older C function declarations are partially modeled, but honestly, don't use those.
Records with hostile layouts: some packed, ABI-sensitive, or otherwise difficult record layouts cannot be emitted as fully typed Mojo structs. They fall back to opaque storage or, in some edge cases, may be emitted with the wrong layout.
Anonymous members: anonymous struct/union members are preserved, but they are not automatically promoted into a flattened parent record.
Linkage edge cases: inline and compiler-extension hints are stripped for now, which may result in linking against absent symbols and requires manual review.
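For readers less familiar with the C constructs named in these limitations, the sketch below shows three of them: a function-like macro, a variadic function, and an anonymous union member. The names are invented for illustration and are not taken from any header the project tests against.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>

/* Function-like macro: expands at the call site, so there is no symbol
   for a binding to link against. */
#define SQUARE(x) ((x) * (x))

/* Variadic function: the argument count and types are only known at
   each call site. */
int sum_ints(int count, ...) {
    va_list args;
    int total = 0;
    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}

/* Anonymous union member (C11): in C, `code` and `ratio` are accessed
   as if they were direct members of struct event. */
struct event {
    uint32_t kind;
    union {
        int32_t code;
        float   ratio;
    };
};

int32_t event_code(const struct event *e) {
    return e->code; /* no intermediate member name in C */
}
```

In C the anonymous union's fields are reachable directly on the parent. Since the limitation above says members are not promoted into a flattened parent record, the generated binding presumably requires going through the preserved inner member instead.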
This is fantastic! I’ve been poking at it a bit and I found a few quick issues:
It doesn’t handle libraries that import arch-specific headers well. Those headers are almost always compiler-specific, and because of the way cc is handled, it defaults to gcc’s compiler-specific headers on Linux, where gcc is the default compiler.
Could it be made to respect the CFLAGS and CC environment variables?
It looks like you don’t handle newlines in macros.
You might need some form of forward declaration handling. I got a LOT of duplicate definitions when dealing with libraries which heavily use PIMPL (many ABI stable C libraries).
I did throw it at a “use the whole language” library (DPDK), so that probably surfaces a lot of weird stuff.
You might need some form of forward declaration handling. I got a LOT of duplicate definitions when dealing with libraries which heavily use PIMPL (many ABI stable C libraries).
I have now added a small canonicalization pass that deduplicates multiple forward declarations, so they should resolve to one opaque Mojo struct. If you install the package again from source, this should be solved (it should also be available in the next release).
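A minimal sketch of the PIMPL pattern being discussed, with hypothetical names: the same struct tag is forward-declared more than once across public headers, while the definition stays private to the implementation. In C all of these declarations name one type, which is the behavior a deduplication pass needs to reproduce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* --- public header 1 (hypothetical) --- */
struct engine;                  /* forward declaration */
typedef struct engine engine;

/* --- public header 2 (hypothetical) repeats the declaration --- */
struct engine;                  /* same type, declared again */

engine *engine_create(int id);
int     engine_id(const engine *e);
void    engine_destroy(engine *e);

/* --- private implementation: the only place the layout is known --- */
struct engine {
    int id;
};

engine *engine_create(int id) {
    engine *e = malloc(sizeof *e);
    if (e != NULL) {
        e->id = id;
    }
    return e;
}

int engine_id(const engine *e) {
    return e->id;
}

void engine_destroy(engine *e) {
    free(e);
}
```

Callers that only see the public headers never learn the size of struct engine, so an opaque type behind a pointer is the most a binding can express for it.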
It looks like you don’t handle newlines in macros.
Could you give me an example?
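For reference, one common reading of "newlines in macros" is backslash line continuation, where a single macro definition spans several source lines. That this is the case meant in the report is an assumption, and the macro names below are made up.

```c
#include <assert.h>

#define FLAG_READ  0x1
#define FLAG_WRITE 0x2

/* Object-like macro continued across lines with a trailing backslash.
   After preprocessing it is one logical line: (FLAG_READ | FLAG_WRITE). */
#define FLAG_ALL \
    (FLAG_READ | \
     FLAG_WRITE)

/* Function-like macro whose body spans several lines. */
#define CLAMP(value, low, high)       \
    ((value) < (low) ? (low)          \
     : (value) > (high) ? (high)      \
     : (value))

/* Wraps the macro in a function so it has a callable symbol. */
int clamp_percent(int value) {
    return CLAMP(value, 0, 100);
}
```

FLAG_ALL is still a foldable constant once the continuation lines are joined, so a macro parser that works line by line would miss it unless it splices the lines first.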
Also, it would be great if you could point me to a DPDK header file that can be used as a north star to benchmark progress. Currently, I think I will work on stabilizing the internals first, and then I can start focusing on supporting more use cases. It will be great to be guided by practical examples.
Sadly different parts of DPDK are using different parts of C, so I don’t have one there. However, DPDK does use a very good subset of C so I like to use it to test binding generators.
mojo-bindgen v0.2 is out with new features and fixes for ABI correctness
Install with: pip install mojo-bindgen
Highlights:
Automated generation of layout tests (rust-bindgen style)
In addition to bindings, mojo-bindgen now generates a *_layout_test.mojo file that verifies clang-derived struct sizes and field offsets. The layout tests use Mojo’s new reflection API (requires mojo-nightly) and should catch silent ABI mismatches between Mojo and C.
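The idea behind a layout test can be sketched on the C side: record the size and field offsets the compiler reports, then compare them against expected numbers. This is not the generated Mojo test, only an analogous check. The struct is hypothetical, and the expected values assume a typical target where uint32_t is 4-byte aligned (for example x86-64 or AArch64).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct sample {
    uint8_t  tag;
    uint32_t value;
    uint16_t extra;
};

struct layout_check {
    const char *what;
    size_t      actual;
    size_t      expected; /* assumption: 4-byte alignment for uint32_t */
};

/* Returns how many size/offset facts differ from the expected layout. */
int count_layout_mismatches(void) {
    const struct layout_check checks[] = {
        {"sizeof(struct sample)",          sizeof(struct sample),          12},
        {"offsetof(struct sample, tag)",   offsetof(struct sample, tag),   0},
        {"offsetof(struct sample, value)", offsetof(struct sample, value), 4},
        {"offsetof(struct sample, extra)", offsetof(struct sample, extra), 8},
    };
    int mismatches = 0;
    for (size_t i = 0; i < sizeof checks / sizeof checks[0]; i++) {
        if (checks[i].actual != checks[i].expected) {
            fprintf(stderr, "%s: expected %zu, got %zu\n",
                    checks[i].what, checks[i].expected, checks[i].actual);
            mismatches++;
        }
    }
    return mismatches;
}
```

A mismatch here would mean the two sides disagree about where a field lives, which otherwise shows up only as corrupted data at runtime.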
Better integer type mapping
Fixed-width integer typedefs from stdint.h, along with size_t and ssize_t, now map directly to Mojo builtin integers (e.g. int32_t → Int32 instead of ffi.c_long).
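The reason fixed-width typedefs can map to fixed-width builtins is that their width is the same on every target that provides them, while plain C types such as long are not. A short sketch of that difference, using only standard headers (the function names are hypothetical):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Exact-width types have the same width wherever they exist. */
_Static_assert(sizeof(int8_t)  * CHAR_BIT == 8,  "int8_t is 8 bits");
_Static_assert(sizeof(int16_t) * CHAR_BIT == 16, "int16_t is 16 bits");
_Static_assert(sizeof(int32_t) * CHAR_BIT == 32, "int32_t is 32 bits");
_Static_assert(sizeof(int64_t) * CHAR_BIT == 64, "int64_t is 64 bits");

/* long is target-dependent: 64 bits on LP64 systems (64-bit Linux and
   macOS), 32 bits on LLP64 systems (64-bit Windows). */
unsigned long_width_bits(void) {
    return (unsigned)(sizeof(long) * CHAR_BIT);
}

/* size_t is also target-dependent; it usually tracks the pointer width. */
unsigned size_t_width_bits(void) {
    return (unsigned)(sizeof(size_t) * CHAR_BIT);
}
```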
Forward declaration unification
Multiple C forward declarations of the same struct are now collapsed into a single Mojo struct, eliminating duplicate definitions in Mojo.
UnsafePointer is now emitted as Optional[UnsafePointer], following recent changes to the UnsafePointer model.
Bitfield accessors now respect byte order
Generated bitfield accessors now respect the target byte order via sys.info.is_little_endian() / is_big_endian().
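To show why an accessor needs to know the byte order, here is a hedged sketch of reading a bitfield out of a 32-bit storage unit by hand. It assumes the common ABI convention that little-endian targets allocate bitfields starting from the least significant bit and big-endian targets from the most significant bit; the C standard leaves this implementation-defined. The helper names are hypothetical and this is not the code mojo-bindgen emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Detects the byte order of the machine running this code. */
int is_little_endian(void) {
    const uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Reads `width` bits that start `offset` bits into a 32-bit storage unit.
   lsb_first != 0: offset counts up from the least significant bit.
   lsb_first == 0: offset counts down from the most significant bit. */
uint32_t read_bits(uint32_t unit, unsigned offset, unsigned width, int lsb_first) {
    assert(width >= 1 && width <= 32 && offset + width <= 32);
    const uint32_t mask =
        (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1u);
    const unsigned shift = lsb_first ? offset : 32u - offset - width;
    return (unit >> shift) & mask;
}

/* Picks the allocation order from the host byte order (assumed convention). */
uint32_t read_bits_native(uint32_t unit, unsigned offset, unsigned width) {
    return read_bits(unit, offset, width, is_little_endian());
}
```

The same offset and width select different bits depending on the allocation order, so an accessor generated for one byte order reads the wrong field on the other.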
mojo-bindgen v0.3 is out, bringing many improvements to macro parsing, unions, and enum code emission.
Bindings for Vulkan via vulkan.h now work out of the box and pass 1407/1408 layout tests.
Highlights
Bindings are now generated for the primary header and all of its included headers recursively.
On one hand, parsing the whole translation unit results in a much larger bindgen surface than needed. On the other hand, it allows generating bindings from wrapper header files:
// wrapper.h
#include "lib1.h"
#include "lib2.h"
Unions with multiple fields sharing the same type are aggregated, and only the unique types are emitted (by @WolfDan#9)
Enumerants are now emitted as comptime aliases, and the enum tag is handled following C name scoping rules: typedef aliases are retained if present, while the enum tag is renamed to avoid Mojo naming collisions.
Flexible array members (FAMs) are now recognized and emitted as InlineArray[T, 0]. This also works when the containing struct is embedded by value in another record as the last member. (started by @WolfDan for io_uring bindgen)
Code comments are now captured and, in some cases, emitted as Mojo docstrings.
Literal macro parsing is improved, with an optional fallback to clang when the macro fails to parse.
mojo-bindgen is now pinned to Mojo 1.0.0b1 and will follow major Mojo releases.
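Two of the highlights above are easier to follow with the C side in view: enum name scoping and flexible array members. The sketch below uses invented names and shows only standard C behavior, not the generated Mojo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Enumerants live in the enclosing scope, not under the tag, and the tag
   `color` lives in a separate namespace from the typedef `color`. */
enum color { COLOR_RED, COLOR_GREEN = 5, COLOR_BLUE };
typedef enum color color;

int color_is_blue(color c) {
    return c == COLOR_BLUE; /* COLOR_BLUE follows COLOR_GREEN, so it is 6 */
}

/* Flexible array member: `payload` has no size of its own, and the bytes
   are allocated past the end of the fixed part of the struct. */
struct packet {
    uint32_t length;
    uint8_t  payload[];
};

/* Allocates a packet with room for `length` payload bytes. */
struct packet *packet_new(const uint8_t *data, uint32_t length) {
    struct packet *p = malloc(sizeof *p + length);
    if (p == NULL) {
        return NULL;
    }
    p->length = length;
    if (length > 0) {
        memcpy(p->payload, data, length);
    }
    return p;
}
```

Because the payload contributes nothing to sizeof(struct packet), a zero-length array is a natural stand-in for it in a fixed-layout binding.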
mojo-bindgen v0.3 is a big step up: layout tests for ABI sanity, recursive header parsing, better enum/union handling, and FAM → InlineArray mapping all make C interop way more reliable in Mojo now. The main concern is still macro-heavy headers and translation unit bloat, but the clang fallback should help patch those edge cases.