Model/TensorMap to dynamically handle MANY DriverTensors as inputs?

For my research to make real use of MAX's GPU programming capabilities from the Mojo API, I need to specify the number of driver tensors dynamically at runtime. As I read the MAX Mojo API, this is still not possible (see here: Model | Modular). It clearly works with the old (non-driver) Tensor: we could declare a TensorMap and then borrow as many Tensors at runtime as we needed, each under its own name.
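For reference, this is roughly the old pattern I mean, as a minimal sketch from memory (so `session`, `model`, `tensors`, and the input names are placeholders, and exact method names may be slightly off):

# The number of named inputs here is a pure runtime decision.
var tensor_map = session.new_tensor_map()
for i in range(len(tensors)):
    tensor_map.borrow("input" + str(i), tensors[i])
var results = model.execute(tensor_map)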
Will this be possible with DeviceTensors living on the GPU any time soon? I think the Python API already supports this. I'd be incredibly glad to have this feature; it is currently the only thing holding me back from further research. :slight_smile:
If I am misinterpreting something about the MAX Mojo API and this is in fact already possible, please let me know!

Unless you need to associate names with the input tensors, it's my understanding that the

execute(self, owned *inputs: AnyMemory) -> List[AnyMemory]

interface will let you provide a variable number of max.driver Tensors as inputs to a graph from the Mojo API. Am I misinterpreting your use case?
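For example, a call like this should work when the number of inputs is fixed at the call site (a sketch; `model` is a loaded graph model, and `input0`/`input1` stand in for max.driver Tensors created on the host):

# Two inputs here -- the arity is written into the source,
# not decided at runtime.
var results = model.execute(
    input0^.move_to(gpu_device),
    input1^.move_to(gpu_device),
)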


OK, but a variadic list can't be created at runtime, can it?

And if variadic lists do work here, then taking a regular List should be possible too!?
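To illustrate what I mean outside of MAX: a Mojo variadic's arity is fixed at each call site, and as far as I can tell there is no way to splat a runtime List into one:

fn takes_variadic(*xs: Int):
    for i in range(len(xs)):
        print(xs[i])

fn main():
    takes_variadic(1, 2, 3)   # fine: three arguments written in the source
    var xs = List[Int](1, 2, 3)
    # takes_variadic(xs)      # error: no List -> variadic unpacking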

Hi Tille, you should be able to use the second overload of execute, which takes a Python dict[str, Tensor], to achieve this dynamism.
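Something like this is what I have in mind, as an untested sketch (it assumes the overload accepts host-side NumPy arrays, and the input names "input0", "input1", … are placeholders for whatever your graph actually calls them):

from python import Python

def build_inputs(num_inputs: Int) -> PythonObject:
    var np = Python.import_module("numpy")
    var inputs = Python.dict()
    # The number of named entries is decided at runtime.
    for i in range(num_inputs):
        inputs["input" + str(i)] = np.full(6, 1.25).astype(np.float32)
    return inputs

# results = model.execute(build_inputs(4))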


OK, really :eyes:? How would I apply this to the following example? Converting back and forth between Python and Mojo objects feels weird, but why not??

from max.driver import accelerator_device, cpu_device, Tensor, Device
from max.engine import InferenceSession, SessionOptions
from max.graph import Graph, TensorType, ops


def main():
    alias VECTOR_WIDTH = 6

    host_device = cpu_device()
    gpu_device = accelerator_device()

    # Build a one-input graph that computes sin(x) elementwise.
    graph = Graph(TensorType(DType.float32, VECTOR_WIDTH))
    result = ops.sin(graph[0])
    graph.output(result)

    # Compile the graph for the accelerator.
    options = SessionOptions(gpu_device)
    session = InferenceSession(options)
    model = session.load(graph)

    # Fill a rank-1 host tensor, then move it to the GPU for execution.
    input_tensor = Tensor[DType.float32, 1]((VECTOR_WIDTH), host_device)
    for i in range(VECTOR_WIDTH):
        input_tensor[i] = 1.25

    # Variadic execute: the number of inputs is fixed at this call site.
    results = model.execute(input_tensor^.move_to(gpu_device))

    # Move the result back to the host and view it as a typed tensor.
    output = results[0].take().to_device_tensor().move_to(host_device).to_tensor[DType.float32, 1]()
    print(output)

Again, the goal is NOT to hardcode the number of input tensors, but to start from a runtime list of (Mojo) driver Tensors.
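Concretely, I want to be able to write something like this (wishful pseudocode, not working today; `my_tensors` is a placeholder for a runtime-sized collection):

var inputs = List[AnyMemory]()
for i in range(len(my_tensors)):           # count known only at runtime
    inputs.append(my_tensors[i]^.move_to(gpu_device))
var results = model.execute(inputs)        # <- no such overload, it seems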

Hi, I can't make this work. :frowning: Can somebody please help me with this? The tensor above has to live on an accelerator device, which is why I don't see a way to use the execute overload that takes a Python dict of NumPy arrays. Am I missing something?
If dynamically passing device tensors (on the GPU) to a Model's execute method is not actually possible yet, let me know, and I can just stop trying.

Unfortunately, I think this is a missing element in the execute() implementation for the Mojo Model. Under the hood, we treat the variadic list of AnyMemory inputs as a List, but currently provide no means of specifying a dynamic List as an input. Mojo also doesn’t yet let you unpack a List into a variadic set of arguments.
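To make that concrete, the missing piece would be an overload along these lines (purely hypothetical, not part of the current API):

execute(self, owned inputs: List[AnyMemory]) -> List[AnyMemory]

With something like that, you could assemble the inputs in a runtime loop and hand the List over directly.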

Sorry for sending you chasing after this; it does look like a hole in the API. The Python APIs for graph execution are a bit more comprehensive here.


OK, thank you very much for the clarification. Can I expect this feature anytime soon, or is it intentionally not being worked on? I would be extremely grateful. :fire::folded_hands:

*As soon as this is possible, I will be able to release a new Mojo library…