HDF5 bindings in Mojo!

Photon · March 31, 2026, 5:18pm

hdf5-mojo: High-level HDF5 bindings for Mojo (first release)

I’ve been porting much of my scientific stack, along with well-known computational and physics libraries, to Mojo (look forward to some more library spam from me xD). HDF5 is a core dependency in that workflow, so this happened.

It’s honestly been really cool to see how much Mojo has evolved over the past ~2 years. Early on, porting scientific libraries was… painful. Now it’s almost smooth sailing, with a few hiccups here and there. Super excited for what’s coming next!!!

This is an early release of hdf5-mojo, a set of high-level bindings to the HDF5 C library with a Mojo-friendly API. It’s usable today, but still evolving.

What this is right now

A simple two-layer design:

Low-level (hdf5/bindings.mojo): thin FFI wrapper over the HDF5 C API (HDF5Lib), for cases where you need full control.
High-level (hdf5/api.mojo): ergonomic interface (H5File, NDArray) for most use cases.

In practice, you’ll almost always use the high-level API.

What works now

Read 1D / 2D datasets without knowing shapes ahead of time
Read scalar attributes from groups/datasets
Create files and write datasets (row-major)
Safe group creation via require_group
Automatic HDF5 library discovery via $CONDA_PREFIX (pixi-friendly)
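
A quick note on the "row-major" point above, for anyone coming from Fortran or Julia: in a row-major buffer the last index varies fastest, so element (i, j) of a matrix with ncols columns sits at offset i * ncols + j. Here is a minimal sketch in plain Python (not Mojo, and not the hdf5-mojo API); flat_index and to_row_major are hypothetical helper names used only for illustration.

```python
# Illustrative only: plain Python, not the hdf5-mojo API.
# `flat_index` and `to_row_major` are hypothetical helper names.

def flat_index(i, j, ncols):
    """Offset of element (i, j) in a row-major buffer with `ncols` columns."""
    return i * ncols + j

def to_row_major(rows):
    """Flatten a list of equal-length rows into one row-major list."""
    ncols = len(rows[0])
    if any(len(r) != ncols for r in rows):
        raise ValueError("all rows must have the same length")
    return [value for row in rows for value in row]

matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
flat = to_row_major(matrix)

# Element (1, 2) sits at offset 1 * 3 + 2 = 5 in the flat buffer.
assert flat[flat_index(1, 2, ncols=3)] == matrix[1][2]
```

If your data is laid out column-major, transpose it before writing so the file ends up with the layout you expect.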

Example

Check out the examples for a more complete walkthrough with sample data.

from hdf5 import H5File

def main() raises:
    var f = H5File("data.h5")

    # Read scalar attributes
    var emin   = f.read_scalar_attr[DType.float64]("/group", "min_energy")
    var nnodes = f.read_scalar_attr[DType.int32]("/group", "number_energy_nodes")

    # Read 1-D and 2-D datasets (sizes discovered automatically)
    var xs  = f.read_1d[DType.float64]("/group/dataset")
    var mat = f.read_2d[DType.float64]("/group/matrix")

    print(xs[0], mat[0, 0])

    xs.free()
    mat.free()
    f.close()

clattner · March 31, 2026, 5:29pm

Wow, this is really cool! I’m thrilled to see integration into the HDF world. I suspect Mojo is really useful to many folks in the space. Do you have an idea of what a cool demo built on top of HDF support could be?

Photon · March 31, 2026, 5:54pm

Thanks! Really appreciate it! Mojo has made this a lot more practical than it used to be.

I’m mainly using HDF5 right now to port some particle physics simulation libraries and other scientific libraries that depend heavily on it (I will be posting updates on this soon! Some cool GSL stuff!). On top of that, we just added some foundational layers for GPU support in NuMojo! So I think a pretty compelling demo would be:

loading HDF5 datasets directly in Mojo,
reading them straight into a NuMojo NDArray buffer,
running processing on CPU/GPU seamlessly, and writing results back to HDF5.

HDF5 still has to materialize data from disk, but since it reads into a user-provided buffer, we can skip intermediate representations and write directly into our array memory. That keeps the pipeline pretty tight after I/O. Another direction I’m interested in is streaming large datasets (very common in particle physics) in chunks and applying GPU ops on the fly.
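
To make the chunked-streaming idea a bit more concrete, here is a rough sketch of the bookkeeping in plain Python (not Mojo, and not the hdf5-mojo or HDF5 API). It splits a dataset's leading axis into (start, count) ranges, which is the same kind of information a partial read along one axis needs, and processes one chunk at a time. The names chunk_ranges, streaming_sum and read_rows are hypothetical, and read_rows is just a stand-in for a real partial read.

```python
# Illustrative only: plain Python, not the hdf5-mojo or HDF5 API.
# `chunk_ranges`, `streaming_sum` and `read_rows` are hypothetical names.

def chunk_ranges(n_rows, chunk_rows):
    """Yield (start, count) pairs covering rows 0..n_rows-1 in order."""
    if chunk_rows <= 0:
        raise ValueError("chunk_rows must be positive")
    start = 0
    while start < n_rows:
        count = min(chunk_rows, n_rows - start)
        yield start, count
        start += count

def streaming_sum(read_rows, n_rows, chunk_rows):
    """Sum all values while holding only one chunk in memory at a time.

    `read_rows(start, count)` stands in for a partial dataset read.
    """
    total = 0.0
    for start, count in chunk_ranges(n_rows, chunk_rows):
        for row in read_rows(start, count):
            total += sum(row)
    return total

# Stand-in "dataset": 10 rows of 2 values, read via list slicing.
data = [[float(i), 1.0] for i in range(10)]

def read_rows(start, count):
    return data[start:start + count]

print(list(chunk_ranges(10, 4)))
print(streaming_sum(read_rows, n_rows=10, chunk_rows=4))
```

In a real pipeline the per-chunk work would be the GPU op, and the chunk size would be tuned to the dataset's on-disk chunking and the available device memory.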

Still exploring, but definitely feels like there’s some fun stuff to build here!

Curious · April 1, 2026, 10:04pm

Ideally, one day we would have an xarray-like API so that calculations like mine, written in Mojo, could take advantage of parallel I/O as well.
