New MAX Docker Containers in 24.6

hogepodge · December 20, 2024, 10:43pm

We’re excited to announce a major addition in the 24.6 release: new Docker containers that provide optimized MAX installations across cloud platforms. While these containers are featured in our Deploy Llama3 on GPU tutorial, they deserve special attention as standalone packages.

You can find two distinct images in the Modular Docker Hub repository:

Both containers come pre-configured with everything needed to run MAX on NVIDIA GPUs locally and in cloud environments like AWS, GCP, and Azure. You can learn more about how to use the containers directly from the Docker Hub README files. We welcome your feedback on these containers here. Looking ahead to 2025, we’re excited to bring further optimizations, not just for LLM performance, but also to reduce container size and streamline deployment times in your infrastructure.

owenhilyard · December 21, 2024, 5:24pm

Is there any chance of a quay.io mirror? I think as long as you’re only using it with public repositories it’s free, but the main customer benefit is that they don’t rate limit. Academic institutions often both have restrictive access to an organizational Docker Hub account and get rate limited by IP address, since it only takes a few students pulling down containers to hit the rate limits.

Secondly, is there a planned image where we can hand the container a volume mount with a checkout of the repo, so that we can avoid sending model weights to huggingface?

hogepodge · December 21, 2024, 7:48pm

Thanks for the feedback Owen! I’ll share this with the product team in charge of containers. Rate limiting from Docker Hub is something that’s been on our mind. Have you experienced this firsthand?

It’ll be a while before we can fully respond. The company is out on holiday leave right now, so there won’t be any new containers or mirrors coming in the next few weeks.

owenhilyard · December 21, 2024, 7:55pm

Yes, my university is more or less always at the rate limit unless it’s between 10 pm and ~4 am. A lot of universities use NAT, so everyone on campus shows up as one IP address, and Docker Hub allows that address 100 pulls per 6 hours. One person playing with Kubernetes is enough to use a substantial share of the rate limit on their own. Combine that with classes that teach containers and require you to use them for homework, and you end up with very limited access. This is probably less of a concern for companies, which can set up a caching proxy like Artifactory and tell everyone to use it, but it is an issue for academia.
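To put rough numbers on the shared-IP problem, here is a back-of-the-envelope sketch in Python. Only the 100 pulls per 6 hours figure comes from this post; the function names and the user counts are hypothetical and purely illustrative.

```python
import math

# From the post: 100 pulls per 6-hour window for one IP address.
PULL_LIMIT = 100
WINDOW_HOURS = 6


def pulls_per_user(users_behind_nat: int) -> float:
    """Average pulls available to each user per window when all share one IP."""
    if users_behind_nat <= 0:
        raise ValueError("users_behind_nat must be positive")
    return PULL_LIMIT / users_behind_nat


def users_to_exhaust(pulls_each: int) -> int:
    """Smallest number of users who, pulling `pulls_each` images each,
    reach the shared limit within one window."""
    if pulls_each <= 0:
        raise ValueError("pulls_each must be positive")
    return math.ceil(PULL_LIMIT / pulls_each)


if __name__ == "__main__":
    # Hypothetical class of 50 students behind one NAT address.
    print(f"Pulls per student per {WINDOW_HOURS}h: {pulls_per_user(50)}")
    # Hypothetical homework needing 10 image pulls per student.
    print(f"Students needed to hit the limit: {users_to_exhaust(10)}")
```

With those assumed numbers, a single class already leaves each student only a couple of pulls per window, before counting anyone else on campus.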

This is a “when it’s convenient” request, no rush.

system · June 19, 2025, 7:55pm

This topic was automatically closed 180 days after the last reply. New replies are no longer allowed.
