Three years ago, we began reimagining AI development by rebuilding its infrastructure to be more performant, programmable, and portable. Today, we’re introducing MAX 24.6, featuring MAX GPU—a technology preview of the first vertically integrated generative AI serving stack that eliminates dependency on vendor-specific libraries like NVIDIA CUDA. MAX GPU is built on two groundbreaking technologies:
- MAX Engine: A high-performance AI model compiler and runtime supporting vendor-agnostic Mojo GPU kernels for NVIDIA GPUs.
- MAX Serve: A Python-native serving layer engineered for LLMs, handling complex request batching and scheduling for reliable performance under heavy workloads.
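Since MAX Serve speaks an OpenAI-compatible REST API, you can hit it with nothing but the standard library. The sketch below assumes a local instance on the default `http://localhost:8000` and an illustrative Llama 3.1 model name; adjust both to your deployment.

```python
import json
import urllib.request

# Build a chat-completions request against a local MAX Serve instance.
# Host, port, and model name below are assumptions for illustration.
payload = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",  # assumed model id
    "messages": [{"role": "user", "content": "What is MAX GPU?"}],
    "max_tokens": 64,
}
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With a running server, uncomment to send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(req.full_url, req.get_method())
```

Because the API surface matches OpenAI's, existing client SDKs and tooling should work against MAX Serve by swapping the base URL.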
Run `magic self-update` to get started, then check out the technology preview of MAX GPU.
Don’t miss:
- Our release announcement, including hints about what’s in store for 2025
- Our tutorial on building a continuous chat app using Llama 3.1 and MAX Serve
- Our benchmarking deep dive to help you understand how MAX Serve stacks up
Drop your thoughts, questions, and hype in the official MAX 24.6 forum thread: MAX 24.6 and MAX GPU feedback.