Help compiling MLP in MAX

Hi,

I went through the really good content in the MAX LLM book, but I can’t get the model to compile. To narrow down the problem, right now I’m just trying to compile the MLP part of the model, without any success.

Here is the code to reproduce the error:

import max.functional as F
from max.graph import DeviceRef
from max.nn import Linear, Module
from max.tensor import Tensor, TensorType, defaults


class MLP(Module):
    def __init__(self, emb: int, factor: int) -> None:
        super().__init__()
        self.up = Linear(in_dim=emb, out_dim=factor * emb, bias=True)
        self.down = Linear(in_dim=factor * emb, out_dim=emb, bias=True)

    def forward(self, x: Tensor) -> Tensor:
        # x: [B,S,D]
        x = self.up(x)
        x = F.gelu(x, approximate="tanh")
        x = self.down(x)
        return x


def main() -> None:
    dtype, device = defaults()
    print(f"{dtype=} | {device=}")
    model = MLP(emb=10, factor=2)
    print(model)
    x_type = TensorType(dtype, (5, 10), device=DeviceRef.from_device(device))
    print("Compiling")
    compiled_model = model.compile(x_type)


if __name__ == "__main__":
    main()

I get the following error message:

dtype=bfloat16 | device=Device(type=gpu,id=0)
MLP(
    up=Linear(in_dim=Dim(10), out_dim=Dim(20)),
    down=Linear(in_dim=Dim(20), out_dim=Dim(10))
)
Compiling
Traceback (most recent call last):
  File "/scratch/ap6604/.pixi/envs/default/lib/python3.13/site-packages/max/engine/api.py", line 424, in load
    _model = self._impl.compile_from_object(
        model._module._CAPIPtr,  # type: ignore
        custom_extensions_final,
        model.name,
    )
ValueError: Failed to run the MOToMGP pass manager:
-:1:1: error: failed to lower module to LLVM IR for archive compilation, run LowerToLLVMPipeline failed
error: invalid rebind between two unequal types: !pop.array<2, scalar<si32>> to !pop.array<2, scalar<si64>>
error: failed to legalize operation 'kgen.rebind' that was explicitly marked illegal
-:1:1: error: The graph compiler could not elaborate the generated KGEN


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ap6604/code_examples/modular/models/main.py", line 32, in <module>
    main()
    ~~~~^^
  File "/home/ap6604/code_examples/modular/models/main.py", line 28, in main
    compiled_model = model.compile(x_type)
  File "/scratch/ap6604/.pixi/envs/default/lib/python3.13/site-packages/max/nn/module.py", line 654, in compile
    compiled = F.functional(session.load(graph, weights_registry=weights))
                            ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/ap6604/.pixi/envs/default/lib/python3.13/site-packages/max/engine/api.py", line 430, in load
    raise RuntimeError(
    ...<5 lines>...
    ) from e
RuntimeError: Failed to compile the model. Please file an issue, all models should be correct by construction and this error should have been caught during construction.
For more detailed failure information run with the environment variable `MODULAR_MAX_DEBUG=True`.
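(The error message itself suggests re-running with `MODULAR_MAX_DEBUG=True`. In case it helps anyone reproduce this, one way to set it from inside the script, assuming the engine reads the variable at import/compile time, is:

```python
import os

# Must be set before `max` is imported so the engine picks it up.
os.environ["MODULAR_MAX_DEBUG"] = "True"
print(os.environ["MODULAR_MAX_DEBUG"])
```

The extra output from the debug flag didn’t make the failure any clearer to me.)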

I’m currently using:

  • NVIDIA H200 GPU
  • Mojo 0.26.2.0.dev2026021105 (1982f891)
  • Ubuntu 24.04.3 LTS

It would be great if someone could try to compile the code above and share their pixi.toml!

It’s possible that the particular nightly you tried had a temporary regression. We’ll be updating the pinned max version in the MAX LLM book soon; we know at least one issue has been fixed since the version currently used.

Here is a pixi.toml using the latest nightly that builds your example correctly:

[workspace]
authors = ["Modular <hello@modular.com>"]
channels = ["https://conda.modular.com/max-nightly", "conda-forge"]
name = "mlp-test"
platforms = ["linux-64"]
version = "0.1.0"

[tasks]

[dependencies]
max = "==26.2.0.dev2026022305"

Hi Brad, I appreciate you taking the time to run my example!

I tried the max nightly version you suggested, and a few others, without success. Could you help me understand the environment (software/hardware) you use to test MAX? With that information I can replicate your exact setup: which Linux distro, CUDA version, NVIDIA driver version, and NVIDIA GPU? I can spin up an AWS server with those specifications and try to get everything to compile :smiley:

My latest run was on Ubuntu 24.04 with an RTX 4060 on NVIDIA driver version 565, but this same code was also working on an M3 Pro Apple silicon Mac running macOS 15.7. The error is a bit odd, pointing to a mismatch between a 32-bit integer and a 64-bit one somewhere. I wonder if there’s a Hopper-specific kernel bug.

Yes, it does seem related to the 32-bit vs 64-bit mismatch, based on `two unequal types: !pop.array<2, scalar<si32>> to !pop.array<2, scalar<si64>>` in the error message. But I don’t think it’s necessarily Hopper-related, since I cannot get the example to compile even on my CPU-only Ubuntu 22.04.5 LTS desktop.

Which Python 3 version are you using? I don’t think this is the main problem, but I do see a difference in runtime when executing the example code.
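To make the comparison easier, here is a small stdlib-only snippet (nothing MAX-specific) that prints the interpreter version and platform string, so we can line up our environments side by side:

```python
import platform
import sys

# Exact interpreter version, e.g. "3.13.2"
print(sys.version.split()[0])
# OS / kernel / architecture string
print(platform.platform())
```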