I noticed that the element-wise operations are being removed from Tensor, and stumbled upon a commit message indicating that we should move to LayoutTensor. I was wondering how best to migrate to this new representation. The example in the docs
```mojo
alias f32 = DType.float32
var tensor_5x4 = LayoutTensor[f32, Layout.row_major(5, 4)].stack_allocation()
```
appears to commit to a static shape. How can the shape be specified dynamically? For example, after reading the training data from disk?
```mojo
from layout import Layout, LayoutTensor
from layout.int_tuple import UNKNOWN_VALUE
from layout.runtime_layout import RuntimeLayout
from layout.runtime_tuple import RuntimeTuple
from layout.tensor_builder import LayoutTensorBuild as tb
from layout.tensor_builder import static
from memory import UnsafePointer
from utils import Index

# Statically shaped 2x3 tensor whose storage backs the dynamic views below.
var t_2d_static = tb[DType.float32]().row_major[2, 3]().alloc()

# Create a 2x3 tensor where both dimensions are dynamic.
var t_2d_dynamic = tb[DType.float32]().row_major(2, 3).view(t_2d_static.ptr)

# Create a 2x3 tensor where the second dimension is known at compile time:
# the first dimension is dynamic, the second is static.
var t_2d_dynamic_static = tb[DType.float32]().row_major(2, static[3]()).view(
    t_2d_static.ptr
)

# Custom layout: shape (2, 3) with strides (1, 2).
var t_dynamic = tb[DType.float32]().layout(Index(2, 3), Index(1, 2)).view(
    t_2d_static.ptr
)

# Way too verbose, but highly customizable: build a RuntimeLayout by hand.
alias layout_t = Layout.row_major(UNKNOWN_VALUE, UNKNOWN_VALUE)
var layout = RuntimeLayout[layout_t](
    RuntimeTuple[layout_t.shape, unsigned=True](8, 4),
    RuntimeTuple[layout_t.stride, unsigned=True](4, 1),
)
var storage = UnsafePointer[Float32].alloc(layout.size())
var tensor = LayoutTensor[DType.float32, layout_t](storage, layout)
```
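Whichever construction you use, the result indexes like any other LayoutTensor; here's a quick hypothetical check on the builder-based view from above:

```mojo
# Hypothetical follow-on: the dynamic view reads and writes like a static one.
t_2d_dynamic[0, 1] = 42.0
print(t_2d_dynamic[0, 1])  # => 42.0
```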
As additional context to the commit message you see there: `max.tensor`’s `Tensor` type is incompatible with use on GPUs, and we’ve been de-emphasizing it for a while now in favor of types that work well across both CPUs and GPUs. For allocating tensors and moving them to and from accelerators (or just working with them on CPU), `max.driver`’s `Tensor` is what you want to use. We plan to expand the Mojo interface to it with helpers that bring it more in line with what the prior `max.tensor` `Tensor` type could do.
Within the functions that run on a GPU, you’ll start to see more use of `LayoutTensor`, which provides powerful capabilities for hardware-optimized use of multidimensional data structures. It is more of a view into tensor memory: it lives inside the GPU / CPU computation functions, while something like `max.driver`’s `Tensor` allocates and controls the memory outside of the computation. Don’t worry, we’ll be expanding the documentation and examples around the use of layouts very soon; that’s all in active progress (Joe just added a bunch of API docs over the last few days for parts of this).
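To make that split concrete, here's a minimal sketch in the style of the direct GPU function examples: a `DeviceContext` owns the device allocation, and the kernel only ever touches a `LayoutTensor` view over it. The kernel name `double_kernel` and the 5x4 shape are just illustrative, and exact signatures may shift while these APIs are in flux.

```mojo
from gpu import block_idx, thread_idx
from gpu.host import DeviceContext
from layout import Layout, LayoutTensor

alias dtype = DType.float32
alias layout = Layout.row_major(5, 4)

# The kernel sees only a view into device memory; it neither
# allocates nor frees anything.
fn double_kernel(t: LayoutTensor[mut=True, dtype, layout]):
    t[block_idx.x, thread_idx.x] = t[block_idx.x, thread_idx.x] * 2

def main():
    var ctx = DeviceContext()
    # Allocation happens outside the computation function...
    var buf = ctx.enqueue_create_buffer[dtype](5 * 4).enqueue_fill(1)
    # ...and the LayoutTensor wraps it as a typed, shaped view.
    var t = LayoutTensor[mut=True, dtype, layout](buf.unsafe_ptr())
    # One thread per element: 5 blocks of 4 threads.
    ctx.enqueue_function[double_kernel](t, grid_dim=5, block_dim=4)
    ctx.synchronize()
    with buf.map_to_host() as host:
        print(host)  # 20 elements, all 2.0
```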
In addition to the example that Caroline provided above, we’re moving many of our public examples to use `LayoutTensor`, such as all of the direct GPU function examples and several of the custom ops examples, like the progressive optimization of a matrix multiplication. Again, this use is within the GPU functions / operations themselves, where the prior `max.tensor` `Tensor` type would not be viable.