With the support for NVIDIA GPUs launching in MAX 24.6, a number of people have already asked what hardware is compatible with MAX in this release. This thread is intended to be a reference for our officially-supported hardware, as well as a place for community discussions about hardware compatibility.
Our officially supported NVIDIA GPUs are listed in the Linux tab of our system requirements. At present, these are the datacenter-class A10, A100, L4, and L40. These are the GPUs that Modular regularly tests MAX against, so if you encounter issues with them, please let us know.
Unofficially, we've had reports that other Ampere or Ada Lovelace family GPUs may work with MAX in 24.6, but only the ones listed above are confirmed to work. Note that a GPU with 24 GB of RAM is required to serve our flagship Llama 3.1 8B pipeline using bfloat16 weights.
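For anyone wondering where a number like 24 GB might come from, here is a rough back-of-the-envelope sketch in Python. It only counts the weights: bfloat16 uses 2 bytes per parameter, and the parameter count of roughly 8.03 billion is an approximation I'm assuming for Llama 3.1 8B. The function name is made up for illustration, and this is not Modular's sizing method; whatever memory is left over has to cover the KV cache, activations, and runtime overhead, none of which are modeled here.

```python
# Rough estimate of GPU memory needed just to hold model weights.
# Assumption: ~8.03e9 parameters for Llama 3.1 8B (approximate figure).
# Assumption: bfloat16 storage, i.e. 2 bytes per parameter.

APPROX_LLAMA_8B_PARAMS = 8.03e9
BYTES_PER_BF16 = 2


def weight_memory_gib(num_params: float, bytes_per_param: int = BYTES_PER_BF16) -> float:
    """Return the memory taken by the weights alone, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


if __name__ == "__main__":
    weights = weight_memory_gib(APPROX_LLAMA_8B_PARAMS)
    print(f"Weights alone: about {weights:.1f} GiB")
    # The rest of the card's memory has to hold the KV cache, activations,
    # and runtime overhead, which this sketch does not estimate.
```

The weights alone come to roughly 15 GiB under these assumptions, which is why a 16 GB card leaves very little headroom for serving.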
Older NVIDIA drivers may be an issue, so make sure you are on a current version! I have 560 working on my 4090, and the current "production level" release is 550. 535 has caused issues for at least one person.
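If you want to check your driver quickly, something like the following Python sketch should do it. It shells out to `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and compares the major version against a threshold. The threshold of 550 is only my assumption based on the versions mentioned in this thread, not an official MAX requirement, and the helper names are made up for illustration.

```python
import shutil
import subprocess

# Assumption: treat 550 (the "production level" release mentioned in this
# thread) as the minimum. This is NOT an official MAX requirement.
MIN_DRIVER_MAJOR = 550


def driver_major(version: str) -> int:
    """Extract the major version from a string such as '560.35.03'."""
    return int(version.strip().split(".")[0])


def installed_driver_versions() -> list[str]:
    """Return one driver version string per GPU, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    versions = installed_driver_versions()
    if not versions:
        print("Could not query nvidia-smi; is the NVIDIA driver installed?")
    for version in versions:
        status = "ok" if driver_major(version) >= MIN_DRIVER_MAJOR else "older than assumed minimum"
        print(f"Driver {version}: {status}")
```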
For others reading this, I have an NVIDIA A2000 in my laptop. MAX recognizes the GPU (i.e. accelerator_count() == 1), but the calculations take slightly longer than running on the CPU (EDIT: using the nightly build to run the addition sample in custom_ops). Whether that's because MAX isn't optimized for the processor or it's falling back to the CPU, I'll leave that up to the Modular team members to answer, if they wish to do so.
That aside, I'm shocked, in a good way, about how easy it was to write a custom kernel that could so easily be used for either device and how smoothly it just ran.
If you pass the --use-gpu flag to magic run, MAX will execute on the GPU. That code doesn't try to make a decision about which is "better", CPU or GPU, at this point.