Build an LLM in MAX from scratch 📖

The MAX Experimental API is a model-building framework we've been developing to help you create custom models in the MAX Framework. You can now get hands-on with this experimental API by following our MAX LLM tutorial.

In this tutorial, you'll build each component of the model yourself: embeddings, attention mechanisms, and feed-forward layers. You'll see how they fit together into a complete language model by working through the sequential coding challenges in the accompanying GitHub repository. At the end of the tutorial, you'll be able to generate text with your model and compare it to GPT-2 on Hugging Face.
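As a framework-agnostic preview of the attention math the tutorial has you implement, here is a plain NumPy sketch of a single causal attention head. This is not the MAX API; the shapes and the use of the same tensor for queries, keys, and values are purely illustrative:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Single-head causal attention: softmax(q @ k^T / sqrt(d)) @ v."""
    d = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d)  # (seq, seq) similarity scores
    # Causal mask: each position may only attend to itself and earlier tokens.
    mask = np.triu(np.ones(scores.shape[-2:], dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    # Numerically stable softmax over the last axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

seq, d = 4, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(seq, d))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```

Because of the causal mask, the first output position attends only to itself, so `out[0]` equals `v[0]` exactly. The tutorial builds this up inside the MAX graph API rather than NumPy, but the math is the same.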

The MAX Experimental API will change over time and expand to include new features and functionality. As it evolves, we plan to update the tutorial accordingly. If you try the tutorial, please share your thoughts and feedback below. If you encounter errors, log them as GitHub issues or submit a pull request with improvements. We hope you enjoy the tutorial and being among the first to use this exciting new API.

12 Likes

Thanks for the book, pretty cool. Running through it now.

For step07 the test has:

```python
test_input = Tensor.randn(
    batch_size, seq_length, config.n_embd, dtype=DType.float32, device=CPU()
)
```

Should it be?

```python
from max.experimental import random

test_input = random.normal(
    (batch_size, seq_length, config.n_embd), dtype=DType.float32, device=CPU()
)
```

I was guessing at this `random` module. There are some other minor things, like missing `sys` and `path` imports; I can open a pull request later.

Hey, for `test.step_10` I didn't see any `randint`. I filled the input with ones; what should we use on step 10?
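For context, here is a plain NumPy sketch of the kind of placeholder input I mean. The vocab size and shapes here are guesses on my part, not taken from the tests, and converting the array into a MAX tensor would depend on the API version:

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, seq_length = 2, 4
vocab_size = 50257  # GPT-2's vocabulary size

# Random integer token IDs in [0, vocab_size). Filling with ones also
# passes shapes through, but random IDs exercise more of the embedding table.
token_ids = rng.integers(0, vocab_size, size=(batch_size, seq_length), dtype=np.int64)
print(token_ids.shape)  # (2, 4)
```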

I am running into the same issue.

Here you can find some pointers on how to solve some of these issues, for instance the `randn` issue from testing step 07: Many issues · Issue #7 · modular/max-llm-book · GitHub

1 Like

I have submitted a PR which allows the tests to complete: Fix issues with Tensor.rand* not being available and other by winding-lines · Pull Request #15 · modular/max-llm-book · GitHub

Let me know if this works for you,

Marius

1 Like

Thanks for your contribution @mseritan! PR #15 does resolve these issues and is now merged.

1 Like

I have a basic NVIDIA A4000. Although all the tests are passing, on steps 5 and 6 I get a test-failed message at the end that says:

```
❌ Functional test failed: Failed to compile and execute graph! Please file an issue. This error should have been caught at op creation time.
Failed to compile and execute graph! Please file an issue. This error should have been caught at op creation time.
```

I am seeing this same issue, Marius. Several people have reported issues past this point, so I am not certain why we are blocked.

Here is the output I'm seeing on an Intel i9 box with an NVIDIA 4090 video card:

```
✅ Output shape is correct: (2, 4, 768)
❌ Functional test failed: Failed to compile and execute graph! Please file an issue. This error should have been caught at op creation time.
   Failed to compile and execute graph! Please file an issue. This error should have been caught at op creation time.
```

Here is information about the system:

```
(Modular) toddb@fidfast max-llm-book % lscpu
Architecture:                x86_64
  CPU op-mode(s):            32-bit, 64-bit
  Address sizes:             39 bits physical, 48 bits virtual
  Byte Order:                Little Endian
CPU(s):                      32
  On-line CPU(s) list:       0-31
Vendor ID:                   GenuineIntel
  Model name:                Intel(R) Core(TM) i9-14900KF
    CPU family:              6
    Model:                   183
    Thread(s) per core:      2
    Core(s) per socket:      24
    Socket(s):               1
    Stepping:                1
    CPU(s) scaling MHz:      16%
    CPU max MHz:             6000.0000
    CPU min MHz:             800.0000
    BogoMIPS:                6374.40
```

The OS:

```
(Modular) toddb@fidfast max-llm-book % uname -a
Linux fidfast 6.14.0-37-generic #37~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 20 10:25:38 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
```

And the graphics card:

```
(Modular) toddb@fidfast max-llm-book % nvidia-smi
Wed Dec 17 08:05:01 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.57.08              Driver Version: 575.57.08      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        On  |   00000000:01:00.0 Off |                  Off |
|  0%   30C    P8              7W / 450W  |     40MiB / 24564MiB   |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                              Usage       |
|=========================================================================================|
|    0   N/A  N/A      1742      G   /usr/lib/xorg/Xorg                              9MiB |
|    0   N/A  N/A      1860      G   /usr/bin/gnome-shell                           10MiB |
+-----------------------------------------------------------------------------------------+
```

– end –

If anyone can help, please share anything you can think of.

Thanks,
Todd B.

Thanks for the detailed report. I will dive into the steps and tests (this may be an issue in the try/except block in the test suite) and work to improve the functionality and error messaging here.