A new nightly version has been released!
See the quickstart guide for installation instructions: Quickstart | Modular
MAX changelog updates:
- Increased the default allreduce signal buffer size from 512 MiB to 1024 MiB
  per GPU (`max.nn.comm.allreduce.Signals.NUM_BYTES` and the matching constant
  in `max.experimental.realization_context`). The previous 512 MiB scratch
  could not hold the per-peer allgather intermediate for models with large
  hidden dimensions (for example, Kimi-K2.5 at `hidden_dim=20480` with
  `max-batch-input-tokens=16384` needs 640 MiB in bf16). This adds ~512 MiB
  of per-GPU memory use for any multi-GPU model.
- KV cache management has moved from `max.kv_cache` to
  `max.pipelines.kv_cache`. Update imports accordingly:

  ```python
  # Before
  from max.kv_cache import PagedKVCacheManager, DummyKVCache

  # After
  from max.pipelines.kv_cache import PagedKVCacheManager, DummyKVCache
  ```

  Deprecation shims with `DeprecationWarning` remain at the old path.
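The 640 MiB figure in the allreduce entry can be checked with quick arithmetic. The sketch below assumes the per-peer allgather intermediate is a single `[max-batch-input-tokens, hidden_dim]` tensor at 2 bytes per element (bf16). That shape is an assumption that happens to match the numbers quoted above, not something taken from the MAX source, and the helper name `allgather_scratch_bytes` is hypothetical.

```python
MIB = 1024 * 1024


def allgather_scratch_bytes(
    hidden_dim: int,
    max_batch_input_tokens: int,
    bytes_per_element: int = 2,  # bf16 stores each element in 2 bytes
) -> int:
    """Estimate the scratch size for one [tokens, hidden_dim] intermediate.

    Hypothetical helper: the tensor shape is an assumption chosen to
    reproduce the figure in the changelog entry.
    """
    return hidden_dim * max_batch_input_tokens * bytes_per_element


# Values quoted in the changelog entry for Kimi-K2.5.
needed = allgather_scratch_bytes(hidden_dim=20480, max_batch_input_tokens=16384)

old_buffer = 512 * MIB
new_buffer = 1024 * MIB

print(f"needed: {needed / MIB:.0f} MiB")
print(f"fits in old buffer: {needed <= old_buffer}")
print(f"fits in new buffer: {needed <= new_buffer}")
```

Under this assumption the requirement lands between the old and new defaults, which is consistent with the reason given for the increase.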
Mojo changelog updates:
Raw MAX diff: https://github.com/modular/modular/compare/029c62231217a4995365e103899018816c08f6ef...9c8162262e80ee94586803812bf7d45e083a3f83
Current Mojo changelog: https://github.com/modular/modular/blob/main/mojo/docs/nightly-changelog.md