NVIDIA hardware support in MAX 24.6

With support for NVIDIA GPUs launching in MAX 24.6, a number of people have already asked which hardware is compatible with MAX in this release. This thread is intended to be a reference for our officially-supported hardware, as well as a place for community discussions about hardware compatibility.

Our officially-supported NVIDIA GPUs are listed in the Linux tab of our system requirements. At present, these are the datacenter-class NVIDIA A10, A100, L4, and L40. These are the GPUs that Modular regularly tests MAX against; if you encounter issues with any of them, please let us know.

Unofficially, we've had reports that other Ampere or Ada Lovelace family GPUs may work with MAX in 24.6, but only the models listed above have been verified. Note that a GPU with 24 GB of RAM is required to serve our flagship Llama 3.1 8B pipeline using bfloat16 weights.
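As a rough sanity check on that figure (a back-of-envelope estimate only, not an official sizing guide; the 24 GB requirement above is the number to go by):

# Approximate memory for Llama 3.1 8B with bfloat16 weights.
params = 8e9              # ~8 billion parameters
bytes_per_param = 2       # bfloat16 stores each parameter in 2 bytes
weights_gb = params * bytes_per_param / 1e9
print(f"weights alone: ~{weights_gb:.0f} GB")  # ~16 GB
# KV cache, activations, and runtime overhead add to this, which is why a
# 24 GB card is the practical minimum for serving this pipeline.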


Older NVIDIA drivers may be an issue; make sure you are on a current version! I have driver 560 working on my 4090, and the current "production level" release is 550. Driver 535 has caused issues for at least one person.

For others reading this, I have an NVIDIA A2000 in my laptop. MAX recognizes the GPU (i.e., accelerator_count() == 1), but the calculations take slightly longer than running on the CPU (EDIT: using the nightly build to run the addition sample in custom_ops). Whether that's because MAX isn't optimized for the processor or it's falling back to the CPU, I'll leave that up to the Modular team members to answer, if they wish to do so.

That aside, I'm shocked, in a good way, about how easy it was to write a custom kernel that could so easily be used for either device and how smoothly it just ran. :tada: :tada: :tada:

:pray: Thank you for the great work!

If you pass the --use-gpu flag to magic run, MAX will execute on the GPU. That code doesn't try to make a decision about which is "better", CPU or GPU, at this point.


Does anyone know when Apple GPU support is coming?


That is definitely something on our minds, but we don't have an ETA for it yet.

I get the following error if I try that:
error: unexpected argument '--use-gpu' found

FWIW, it's a laptop GPU, not a datacenter-grade GPU on the list of supported GPUs, so I'm not expecting it to be supported.

What is your full command line? It sounds like there is an argument parsing error, which might be unexpected.

magic run --use-gpu addition

where addition is set in the mojoproject.toml file as:

[tasks]
addition = { cmd="mojo package kernels/ -o kernels.mojopkg && python addition.py", env={ MODULAR_ONLY_USE_NEW_EXTENSIBILITY_API="true" }}

Ahh, thanks! Sorry, I've confused things here. The --use-gpu option is a command-line option specific to the llama3 pipeline, as documented here: Deploy Llama 3 on GPU with MAX Serve | Modular Docs.

However, it is not a command-line option for magic, which is why this error occurs (correctly). I was getting mixed up!

The custom op example chooses the device based on this code:

devices=[CPU() if accelerator_count() == 0 else CUDA()],

To be sure it is running on the GPU, you could remove the if completely and just use CUDA().

You could also print the result of the accelerator_count() call to see if your GPU is correctly detected by MAX.
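Something along these lines should work for that check (a minimal sketch; the import paths follow the 24.6 custom_ops example and may change between releases):

# Sketch only; module paths are assumed from the 24.6 custom_ops example.
from max.driver import CPU, CUDA, accelerator_count

n = accelerator_count()
print(f"accelerators detected: {n}")

# Pick the device explicitly instead of letting the example decide:
device = CUDA() if n > 0 else CPU()
print(f"using device: {device}")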

And one more note - we will be renaming CUDA() to the more general Accelerator in a nightly build soon!


Yeah, I did that. I'm responding to you and Brad in Discord. Let's pick up the conversation there.


Is this setting an environment variable somewhere that we can use without magic?
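If it's just the MODULAR_ONLY_USE_NEW_EXTENSIBILITY_API variable from the mojoproject.toml task above, I'd guess something like this is equivalent (untested guess on my part, either exported in the shell or set in Python before the MAX imports):

# Assumption: mirrors the env var set by the magic task; equivalent to running
# `export MODULAR_ONLY_USE_NEW_EXTENSIBILITY_API=true` in the shell first.
import os
os.environ["MODULAR_ONLY_USE_NEW_EXTENSIBILITY_API"] = "true"
# ...then run the rest of addition.py as usual.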

I've created a new thread about the custom Mojo GPU operation examples, if people want to carry on discussions about those in that topic: [Experimental] Examples of custom CPU / GPU operations in Mojo
