MAX Nightly 26.4.0.dev2026051506 (Mojo 1.0.0b2.dev2026051506) Released

:astronaut: A new nightly version has been released! :astronaut:

See the quickstart guide for installation instructions: Quickstart | Modular

MAX changelog updates:

  • Increased the default allreduce signal buffer size from 513 MiB to 1025 MiB
    per GPU (max.nn.comm.allreduce.Signals.NUM_BYTES and the matching constant
    in max.experimental.realization_context). The previous 512 MiB scratch
    could not hold the per-peer allgather intermediate for models with large
    hidden dimensions (for example, Kimi-K2.5 at hidden_dim=20480 with
    max-batch-input-tokens=16384 needs 640 MiB in bf16). This adds ~512 MiB
    of per-GPU memory use for any multi-GPU model.

  • KV cache management has moved from max.kv_cache to max.pipelines.kv_cache.
    Update imports accordingly:

    # Before
    from max.kv_cache import PagedKVCacheManager, DummyKVCache
    
    # After
    from max.pipelines.kv_cache import PagedKVCacheManager, DummyKVCache
    

    Deprecation shims that emit a DeprecationWarning remain at the old path
    (max.kv_cache), so existing imports keep working for now.
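
For code that has to run against nightlies on both sides of the KV cache move, one option is to try the new path first and fall back to the old one. The sketch below is an assumption about how a caller might handle this, not part of the MAX API: `resolve_module` is a hypothetical helper built only on the standard library's `importlib`.

```python
import importlib


def resolve_module(*candidates):
    """Return the first importable module among `candidates`.

    Hypothetical helper (not part of MAX): tries each dotted module
    path in order and returns the first one that imports cleanly.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Covers ModuleNotFoundError too; move on to the next path.
            continue
    raise ImportError(f"none of these modules could be imported: {candidates}")


# Intended usage with the paths from the changelog entry (new path first):
#   kv_cache = resolve_module("max.pipelines.kv_cache", "max.kv_cache")
#   PagedKVCacheManager = kv_cache.PagedKVCacheManager
```

Listing the new path first means the old shim, and its DeprecationWarning, is only touched on older nightlies where the new module does not exist yet.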

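The 640 MiB figure in the allreduce entry above can be reproduced with a short calculation. This is a minimal sketch under two assumptions that the changelog does not state: that the per-peer allgather intermediate is sized as tokens × hidden_dim × bytes per element, and that the 1 MiB difference between the total buffer (513 MiB, 1025 MiB) and the scratch region is a fixed overhead, so the new scratch region is 1024 MiB. The function name is hypothetical.

```python
MIB = 1024 * 1024


def allgather_intermediate_bytes(max_batch_input_tokens, hidden_dim,
                                 bytes_per_element=2):
    """Assumed sizing: one element per (token, hidden unit) pair.

    bytes_per_element defaults to 2 for bf16.
    """
    return max_batch_input_tokens * hidden_dim * bytes_per_element


# Values from the changelog example.
needed = allgather_intermediate_bytes(16384, 20480)
print(f"needed: {needed // MIB} MiB")  # needed: 640 MiB

# Previous scratch size (from the changelog) and assumed new scratch size.
for scratch_mib in (512, 1024):
    fits = needed <= scratch_mib * MIB
    print(f"{scratch_mib} MiB scratch: {'fits' if fits else 'too small'}")
```

Under these assumptions the intermediate overflows the old 512 MiB scratch by 128 MiB and fits in the larger buffer with room to spare.
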
Mojo changelog updates:

Raw MAX diff: https://github.com/modular/modular/compare/029c62231217a4995365e103899018816c08f6ef...9c8162262e80ee94586803812bf7d45e083a3f83
Current Mojo changelog: https://github.com/modular/modular/blob/main/mojo/docs/nightly-changelog.md