Hi, I’m new to Modular and I’m trying to run the offline inference code from the Quickstart (Quickstart | Modular):
from max.entrypoints.llm import LLM
from max.pipelines import PipelineConfig


def main():
    model_path = "modularai/Llama-3.1-8B-Instruct-GGUF"
    pipeline_config = PipelineConfig(model_path=model_path)
    llm = LLM(pipeline_config)

    prompts = [
        "In the beginning, there was",
        "I believe the meaning of life is",
        "The fastest way to learn python is",
    ]

    print("Generating responses...")
    responses = llm.generate(prompts, max_new_tokens=50)
    for i, (prompt, response) in enumerate(zip(prompts, responses)):
        print(f"========== Response {i} ==========")
        print(prompt + response)
        print()


if __name__ == "__main__":
    main()
But I got an error, with this log:
[2025-05-29 09:06:53] WARNING memory_estimation.py:142: Truncated model's default max_length from 131072 to 94767 to fit in memory.
[2025-05-29 09:06:53] INFO memory_estimation.py:190:
    Estimated memory consumption:
        Weights: 4.58 GiB
        KVCache allocation: 23.14 GiB
        Total estimated: 27.72 GiB used / 30.80 GiB free
    Auto-inferred max sequence length: 94767
    Auto-inferred max batch size: 1
Exception ignored in: <function LLM.__del__ at 0x734964a084c0>
Traceback (most recent call last):
  File "/home/zetwhite/.local/lib/python3.10/site-packages/max/entrypoints/llm.py", line 72, in __del__
    self._pc.set_canceled()
AttributeError: 'LLM' object has no attribute '_pc'
Traceback (most recent call last):
  File "/home/zetwhite/quickstart/offline.py", line 25, in <module>
    main()
  File "/home/zetwhite/quickstart/offline.py", line 8, in main
    llm = LLM(pipeline_config)
TypeError: LLM.__init__() missing 1 required positional argument: 'pipeline_config'
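(I assume the AttributeError from __del__ is just fallout from __init__ failing before self._pc was assigned, so the TypeError looks like the real problem.) The message suggests that LLM.__init__ in my installed version expects another positional argument before pipeline_config, which doesn't match the Quickstart snippet. One way to see what the installed constructor actually takes (a plain introspection sketch, nothing Modular-specific assumed):

    import inspect
    from max.entrypoints.llm import LLM

    # Show the installed constructor's parameters to see which
    # positional arguments this version of LLM really expects.
    print(inspect.signature(LLM.__init__))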
I’m using Ubuntu 22.04 / Python 3.10 / modular == 25.3.0.
What causes this error and how can I fix it?
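P.S. The max_length truncation warning at the top seems unrelated to the crash. If it matters, I'm guessing (parameter name taken from the warning text, not verified) that capping the context length in the config would avoid it:

    from max.pipelines import PipelineConfig

    # Assumption: PipelineConfig accepts a max_length cap; the name is
    # guessed from the "Truncated model's default max_length" warning.
    pipeline_config = PipelineConfig(
        model_path="modularai/Llama-3.1-8B-Instruct-GGUF",
        max_length=8192,  # smaller context -> smaller KV cache allocation
    )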