Supporting New Accelerators in Mojo: The Case of the AMD MI300X

Hi Maximilian,

The work to support the MI300X is ongoing, but I would summarize it as falling into these general categories for any given piece of hardware:

  1. Runtime/compiler integration: we need to talk to a code generator (e.g. an LLVM backend) and a low-level driver (e.g. to boot the device, enumerate devices, submit kernels, and copy data). The first sketch after this list gives a feel for what this layer looks like from user-level Mojo.

  2. Kernel library adaptation: we need to adapt our kernel library to work with the compiler. Much of this is standardized, so getting things working generally goes pretty smoothly with a high-quality LLVM backend.

  3. Enable tools: debuggers, profilers, platform features like printf, etc. are all optional (but important) and take work. In the case of AMD, for example, we reimplemented printing in pure Mojo, going all the way down to the low-level interfaces, to make sure we didn’t bring in OpenCL dependencies. We have a whitepaper on this and can share it with the world if there’s interest.

  4. Performance: the biggest piece is unlocking the power of the hardware (e.g. novel tensor cores) and figuring out the performance characteristics of the chip. This varies widely based on the target silicon and how similar it is to what we already support. There is a lot of convergence in the design of many chips, but performance is never “done”. Mojo’s support for advanced parametric programming is a huge superpower for this work; the second sketch after this list shows the kind of thing I mean.
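To make items 1 and 3 a bit more concrete, here is a minimal sketch of submitting a kernel and printing from it in user-level Mojo, roughly along the lines of the public Mojo GPU programming examples. Treat the exact `DeviceContext` / `gpu.id` calls as assumptions of this illustration rather than a description of our internals; the point is that all of the driver plumbing from item 1 (and the printf work from item 3) sits underneath a few lines like these:

```mojo
from sys import has_accelerator
from gpu.host import DeviceContext
from gpu.id import block_idx, thread_idx

fn hello_kernel():
    # Device-side print: the kind of platform feature item 3 is about.
    print("block", block_idx.x, "thread", thread_idx.x)

def main():
    @parameter
    if not has_accelerator():
        print("no compatible accelerator found")
    else:
        # DeviceContext wraps the low-level driver work from item 1:
        # pick a device, submit the kernel, and synchronize.
        var ctx = DeviceContext()
        ctx.enqueue_function[hello_kernel](grid_dim=2, block_dim=4)
        ctx.synchronize()
```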

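And to show what I mean by parametric programming in item 4, here is a tiny, hypothetical example (not code from our kernel library): the element type and vector width are compile-time parameters, so one function body specializes per target with no runtime dispatch, and per-chip tuning decisions become parameter choices.

```mojo
# Hypothetical illustration; the names and the chosen width are
# assumptions for the example, not tuned values for any real chip.
fn scaled_add[dtype: DType, width: Int](
    a: SIMD[dtype, width], b: SIMD[dtype, width], scale: Scalar[dtype]
) -> SIMD[dtype, width]:
    # Specialized at compile time for the chosen element type and vector
    # width, so a per-chip width can be picked without touching the body.
    return a * SIMD[dtype, width](scale) + b

fn main():
    # Pretend one target prefers width 8 and another 16; the kernel
    # source is identical either way.
    alias width = 8
    var x = SIMD[DType.float32, width](1.0)
    var y = SIMD[DType.float32, width](2.0)
    print(scaled_add[DType.float32, width](x, y, 3.0))
```
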
What I can tell you is that all of the above is many orders of magnitude less work than building an entire AI solution from scratch. #4 is generally the most work, and Modular doesn’t want to have to do all of it for all use cases :-). I’m excited about us open-sourcing the kernels “real soon now”, because then many folks can see what this looks like.

I hope this helps. Mojo and MAX aren’t “magic”, so the cost above isn’t zero, but it is a major step forward for hardware enablement in my opinion. The cost is proportional to “how weird” the chip is (compared to other things MAX already supports), so the cost slowly goes down over time.

-Chris
