Error running offline_inference.py

Hi, I’m new to Modular and I’m trying to run the offline inference code from Quickstart | Modular.

from max.entrypoints.llm import LLM
from max.pipelines import PipelineConfig


def main():
    model_path = "modularai/Llama-3.1-8B-Instruct-GGUF"
    pipeline_config = PipelineConfig(model_path=model_path)
    llm = LLM(pipeline_config)

    prompts = [
        "In the beginning, there was",
        "I believe the meaning of life is",
        "The fastest way to learn python is",
    ]

    print("Generating responses...")
    responses = llm.generate(prompts, max_new_tokens=50)
    for i, (prompt, response) in enumerate(zip(prompts, responses)):
        print(f"========== Response {i} ==========")
        print(prompt + response)
        print()


if __name__ == "__main__":
    main()

But I got an error with this log:

[2025-05-29 09:06:53] WARNING memory_estimation.py:142: Truncated model's default max_length from 131072 to 94767 to fit in memory.
[2025-05-29 09:06:53] INFO memory_estimation.py:190: 

	Estimated memory consumption:
	    Weights:                4.58 GiB
	    KVCache allocation:     23.14 GiB
	    Total estimated:        27.72 GiB used / 30.80 GiB free
	Auto-inferred max sequence length: 94767
	Auto-inferred max batch size: 1

Exception ignored in: <function LLM.__del__ at 0x734964a084c0>
Traceback (most recent call last):
  File "/home/zetwhite/.local/lib/python3.10/site-packages/max/entrypoints/llm.py", line 72, in __del__
    self._pc.set_canceled()
AttributeError: 'LLM' object has no attribute '_pc'
Traceback (most recent call last):
  File "/home/zetwhite/quickstart/offline.py", line 25, in <module>
    main()
  File "/home/zetwhite/quickstart/offline.py", line 8, in main
    llm = LLM(pipeline_config)
TypeError: LLM.__init__() missing 1 required positional argument: 'pipeline_config'

I’m using Ubuntu 22.04 / Python 3.10 / modular == 25.3.0.
What causes this error and how can I fix it?

Ah, I found this post, Quickstart Docs, and passing a settings object to LLM fixed my problem!
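
For reference, here is a minimal sketch of the working script. The root cause is that in 25.3.0 LLM.__init__() expects a settings object as its first positional argument before the pipeline config, so LLM(pipeline_config) raises the TypeError above (the earlier __del__ AttributeError is just fallout: the constructor fails before self._pc is assigned, so the destructor can't find it). I'm assuming Settings is imported from max.serve.config, as in the linked post; verify the import path against your installed version:

from max.entrypoints.llm import LLM
from max.pipelines import PipelineConfig
from max.serve.config import Settings  # assumed import path in 25.3; verify locally


def main():
    pipeline_config = PipelineConfig(
        model_path="modularai/Llama-3.1-8B-Instruct-GGUF"
    )
    # LLM.__init__ takes the settings object first, then the pipeline config.
    llm = LLM(Settings(), pipeline_config)

    prompts = ["In the beginning, there was"]
    responses = llm.generate(prompts, max_new_tokens=50)
    for prompt, response in zip(prompts, responses):
        print(prompt + response)


if __name__ == "__main__":
    main()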


Glad it was helpful.

Considering we’re seeing multiple people report the same problem, I’m working to get the docs and code example updated.
