Hi, I’m curious about the implementation details of the MAX engine in Mojo. In your compilation pipeline, do you rely on pre-written libraries or hand-written kernels (like Torch Inductor or TensorRT), or is the approach entirely focused on code generation and transformation, using MLIR passes to ensure performance and portability across hardware?
Hi Sarthak,
If you’re interested in implementation details, I’d check out some of the technical talks on the Modular blog, e.g. the LLVM DevMtg talk from a year ago.
To answer the question briefly: our approach is based on hand-written kernels (more similar to the Triton language) combined with a very fancy set of compiler and runtime technologies, including automatic fusion, memory planning, etc., which benefit from being able to see into the IR representation of those kernels. This is a novel approach that moves beyond both the “put everything into the compiler” and the “use vendor libraries” approaches, and it is uniquely extensible.
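For readers less familiar with what automatic fusion over kernel IR buys you, here is a minimal, purely conceptual sketch in plain Python. None of these names or functions come from MAX or Mojo; it only illustrates why a compiler that can see inside kernel bodies can merge them, instead of treating each one as an opaque library call.

```python
# Conceptual illustration only -- not MAX/Mojo APIs. The kernel names below
# (mul_kernel, add_kernel) are made up for this example.
import numpy as np

def mul_kernel(a, b):
    return a * b          # elementwise multiply

def add_kernel(a, b):
    return a + b          # elementwise add

# Opaque-library style: each kernel runs separately, so a full intermediate
# array is written to and read back from memory between the two calls.
def unfused(x, y, z):
    tmp = mul_kernel(x, y)      # materialized intermediate buffer
    return add_kernel(tmp, z)

# Fusion style: a compiler that can see both kernel bodies can emit a single
# loop, keeping the intermediate value in a register instead of memory.
def fused(x, y, z):
    out = np.empty_like(x)
    for i in range(x.size):
        out[i] = x[i] * y[i] + z[i]   # one pass, no intermediate buffer
    return out
```

The point of the sketch is the trade-off Chris describes: a pure vendor-library approach can't perform the second transformation because it can't look inside the kernels, while a pure “everything in the compiler” approach has to generate the kernels themselves from scratch.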
We are working on releasing our GPU support soon, and that does NOT use NVIDIA libraries like cuDNN and cuBLAS. We (intentionally) haven’t explained all of this yet, but will be sharing more as it becomes generally available.
-Chris
I’m rooting for y’all; we can’t wait!