Model Correctness Check

as0909024 · January 5, 2026, 10:10am

I am currently exploring the Modular repository and have understood several aspects of the MAX pipeline so far. However, I am stuck at one specific point and would appreciate some clarification.
Integration test models directory

In the integration test models directory, there are model-specific tests along with supporting files. One such file is verify_pipelines.py. From my understanding, this script verifies model outputs against a reference implementation using predefined tolerance thresholds (for example, cos_dist_threshold and kl_div_threshold).
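To make the two thresholds concrete, here is a minimal sketch of how a cosine distance and a KL divergence between a reference logit vector and a MAX logit vector could be computed and compared against tolerances. This is not the code in verify_pipelines.py: the function names, the direction of the KL divergence (reference relative to test), and the example threshold values are all assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_norm for x in logits]


def cosine_distance(ref, test):
    """1 - cosine similarity between two logit vectors."""
    dot = sum(r * t for r, t in zip(ref, test))
    norm_ref = math.sqrt(sum(r * r for r in ref))
    norm_test = math.sqrt(sum(t * t for t in test))
    return 1.0 - dot / (norm_ref * norm_test)


def kl_divergence(ref_logits, test_logits):
    """KL(P_ref || P_test) over the softmax distributions (assumed direction)."""
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(test_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def within_tolerance(ref, test, cos_dist_threshold, kl_div_threshold):
    """True when both metrics stay at or below their thresholds."""
    return (
        cosine_distance(ref, test) <= cos_dist_threshold
        and kl_divergence(ref, test) <= kl_div_threshold
    )


# Hypothetical logits for one token position: reference vs. slightly perturbed.
reference = [2.0, 1.0, 0.1, -1.5]
candidate = [2.01, 0.99, 0.12, -1.48]
print(within_tolerance(reference, candidate, 1e-3, 1e-3))
```

In a real check, these metrics would be evaluated per generated token across a set of prompts, and the worst case compared against the threshold.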

Basically, we are the ones who calculate these tolerances, using this command:

./bazelw run //max/tests/integration/pipelines/python:verify_pipelines -- --pipeline "Qwen/Qwen2.5-7B-Instruct-bfloat16" --devices='gpu' --find-tolerances --print-suggested-tolerances

I took the Qwen model, which is already supported in MAX, as an example. The command prints suggested tolerances like those in the image below:

These tolerance values are not computed or validated by verify_pipelines.py itself. Instead, they are predefined based on prior analysis and are enforced to ensure that the numerical behavior of a model does not regress within the MAX pipeline over time. In practice, it seems that tolerances are chosen based on observed divergence from a reference implementation, often with an additional margin (for example, ~20–30%) to account for variability.
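The margin heuristic described above can be sketched as follows. This only illustrates the idea of "worst observed divergence plus a safety margin"; it is not how `--find-tolerances` is implemented, and the function name, the default margin of 25%, and the sample values are hypothetical.

```python
def suggest_tolerance(observed_divergences, margin=0.25):
    """Return the worst observed divergence scaled up by a safety margin.

    observed_divergences: per-prompt (or per-token) divergence values measured
    against the reference implementation. margin=0.25 means 25% headroom.
    """
    if not observed_divergences:
        raise ValueError("need at least one observed divergence")
    if margin < 0:
        raise ValueError("margin must be non-negative")
    return max(observed_divergences) * (1.0 + margin)


# Hypothetical KL divergences measured over three prompts.
observed_kl = [0.001, 0.004, 0.002]
print(suggest_tolerance(observed_kl))
```

A tolerance chosen this way passes on the runs it was derived from and fails only when divergence grows beyond the margin, which is what makes it useful as a regression guard.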

Please let me know if this understanding is correct.

If it is, my follow-up question is: when adding support for a new model in MAX, how should one verify that the model integration is actually correct? I have run inference using the added model and observed that it produces the expected output for a given prompt. Is this sufficient to consider the model integration correct within MAX, or are there additional recommended validation steps?

Thank you in advance for your guidance.

sbrunk · January 14, 2026, 4:16pm

It looks like the code to actually create a golden test output to validate against is in verify.py, which also explains how to run the creation with torch/max and then the verification:

github.com/modular/modular

max/tests/integration/pipelines/python/verify.py

0702a9721


      
          To get the logit files, use `generate_llm_logits` once with `torch` and once
          with `max`:
          
          ```
          ./bazelw run \
            //max/tests/integration/pipelines/python:generate_llm_logits -- \
            --device cpu \
            --framework max \
            --pipeline llama \
            --version llama3_1 \
            --output /tmp/max-goldens.json
          ```
          
          Remember to change both the framework and the output path when generating the
          logit files.
          
          Then, run `verify` with the logit files:
          ```
          ./bazelw run \
            //max/tests/integration/pipelines/python:verify -- \

This file has been truncated.
