NVIDIA hardware support in MAX 24.6

With support for NVIDIA GPUs launching in MAX 24.6, a number of people have already asked which hardware is compatible with MAX in this release. This thread is intended to be a reference for our officially-supported hardware, as well as a place for community discussions about hardware compatibility.

Our officially-supported NVIDIA GPUs are listed in the Linux tab of our system requirements. At present, these are the datacenter-class NVIDIA A10, A100, L4, and L40. These are the GPUs that Modular regularly tests MAX against; if you encounter issues with any of them, please let us know.

Unofficially, we've had reports that other Ampere or Ada Lovelace family GPUs may work with MAX in 24.6, but only the models listed above have been verified. Note that a GPU with 24 GB of RAM is required to serve our flagship Llama 3.1 8B pipeline using bfloat16 weights.
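As a rough sanity check on that figure (a back-of-envelope estimate only, not an official sizing guide; the 24 GB requirement above is the number to go by):

# Approximate memory for Llama 3.1 8B with bfloat16 weights.
params = 8e9              # ~8 billion parameters
bytes_per_param = 2       # bfloat16 stores each parameter in 2 bytes
weights_gb = params * bytes_per_param / 1e9
print(f"weights alone: ~{weights_gb:.0f} GB")  # ~16 GB
# KV cache, activations, and runtime overhead add to this, which is why a
# 24 GB card is the practical minimum for serving this pipeline.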


Older NVIDIA drivers may be an issue; make sure you are on a current version! I have driver 560 working on my 4090, and the current "production level" release is 550. Driver 535 has caused issues for at least one person.

For others reading this, I have an NVIDIA A2000 in my laptop. MAX recognizes the GPU (i.e., accelerator_count() == 1), but the calculations take slightly longer than running on the CPU (EDIT: using the nightly build to run the addition sample in custom_ops). Whether that's because MAX isn't optimized for the processor or it's falling back to the CPU, I'll leave that up to the Modular team members to answer, if they wish to do so.

That aside, I'm shocked, in a good way, about how easy it was to write a custom kernel that could so easily be used for either device and how smoothly it just ran. :tada: :tada: :tada:

:pray: Thank you for the great work!

If you pass the --use-gpu flag to magic run, MAX will execute on the GPU. That code doesn't try to make a decision about which is "better", CPU or GPU, at this point.


Does anyone know when Apple GPU support is coming?


That is definitely something on our minds, but we don't have an ETA for it yet.

I get the following error if I try that:
error: unexpected argument '--use-gpu' found

FWIW, it's a laptop GPU, not a datacenter-grade GPU on the list of supported GPUs, so I'm not expecting it to be supported.

What is your full command line? It sounds like there is an argument parsing error, which might be unexpected.

magic run --use-gpu addition

where addition is set in the mojoproject.toml file as:

[tasks]
addition = { cmd="mojo package kernels/ -o kernels.mojopkg && python addition.py", env={ MODULAR_ONLY_USE_NEW_EXTENSIBILITY_API="true" }}

Ahh, thanks! Sorry, I've confused things here. The --use-gpu option is a command-line option specific to the llama3 pipeline, as documented here: Deploy Llama 3 on GPU with MAX Serve | Modular Docs.

However, it is not a command-line option for magic, which is why this error occurs (correctly). I was getting mixed up!

The custom op example chooses the device based on this code:

devices=[CPU() if accelerator_count() == 0 else CUDA()],

To be sure it is running on the GPU, you could remove the if completely and just use CUDA().

You could also print the result of the accelerator_count() call to see if your GPU is correctly detected by MAX.
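Something along these lines should work for that check (a minimal sketch; the import paths follow the 24.6 custom_ops example and may change between releases):

# Sketch only; module paths are assumed from the 24.6 custom_ops example.
from max.driver import CPU, CUDA, accelerator_count

n = accelerator_count()
print(f"accelerators detected: {n}")

# Pick the device explicitly instead of letting the example decide:
device = CUDA() if n > 0 else CPU()
print(f"using device: {device}")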

And one more note - we will be renaming CUDA() to the more general Accelerator in a nightly build soon!


Yeah, I did that. I'm responding to you and Brad in Discord. Let's pick up the conversation there.


Is this setting an environment variable somewhere that we can use without magic?
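If it's just the MODULAR_ONLY_USE_NEW_EXTENSIBILITY_API variable from the mojoproject.toml task above, I'd guess something like this is equivalent (untested guess on my part, either exported in the shell or set in Python before the MAX imports):

# Assumption: mirrors the env var set by the magic task; equivalent to running
# `export MODULAR_ONLY_USE_NEW_EXTENSIBILITY_API=true` in the shell first.
import os
os.environ["MODULAR_ONLY_USE_NEW_EXTENSIBILITY_API"] = "true"
# ...then run the rest of addition.py as usual.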

I've created a new thread about the custom Mojo GPU operation examples, if people want to carry on discussions about those in that topic: [Experimental] Examples of custom CPU / GPU operations in Mojo
