Hi!
I’m trying to better understand how `RuntimeLayout` works. To avoid boilerplate, I’m writing a function that just converts a buffer into a row-major `LayoutTensor`. I have this code:
```mojo
from gpu.host import DeviceContext, DeviceBuffer, HostBuffer
from layout import Layout, LayoutTensor, RuntimeLayout, RuntimeTuple
from layout.int_tuple import UNKNOWN_VALUE

alias fp16 = DType.float16

fn row_major_of_buffer[dtype: DType](mut buffer: DeviceBuffer, M: Int, N: Int):
    alias layout = Layout.row_major(UNKNOWN_VALUE, UNKNOWN_VALUE)
    var dyn = RuntimeLayout[layout](
        RuntimeTuple[layout.shape](M, N),
        RuntimeTuple[layout.stride](N, 1),
    )
    return LayoutTensor[dtype, layout](buffer, dyn)

def main():
    ctx = DeviceContext()
    M = 128
    K = 32
    A_buffer = ctx.enqueue_create_buffer[fp16](M * K)
    var device_A = row_major_of_buffer[fp16](A_buffer, M, K)
```
but it breaks with several errors about not finding a matching constructor for `LayoutTensor`.
Looking through the `LayoutTensor` API docs, I see several constructor options:

- `span`, `runtime_layout`
- `ptr`, `runtime_layout`
I have tried converting my buffer to both, but haven’t had any success either. I feel like I’m probably doing something fundamentally wrong, but I’m not sure what. Any thoughts?
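
For reference, here is roughly what my `ptr`-based attempt looked like. I’m guessing here that `buffer.unsafe_ptr()` is the right way to get the raw device pointer out of the `DeviceBuffer`, and that the origin parameter needs to be spelled out explicitly — both of those are assumptions on my part, not things I’ve confirmed:

```mojo
from gpu.host import DeviceBuffer
from layout import Layout, LayoutTensor, RuntimeLayout, RuntimeTuple
from layout.int_tuple import UNKNOWN_VALUE

fn row_major_of_buffer[
    dtype: DType
](buffer: DeviceBuffer[dtype], M: Int, N: Int) -> LayoutTensor[
    dtype, Layout.row_major(UNKNOWN_VALUE, UNKNOWN_VALUE), MutableAnyOrigin
]:
    alias layout = Layout.row_major(UNKNOWN_VALUE, UNKNOWN_VALUE)
    # Fill in the dynamic shape (M, N) and row-major strides (N, 1) at runtime.
    var dyn = RuntimeLayout[layout](
        RuntimeTuple[layout.shape](M, N),
        RuntimeTuple[layout.stride](N, 1),
    )
    # My attempt at the (ptr, runtime_layout) constructor overload — not sure
    # if unsafe_ptr() and MutableAnyOrigin are what it actually expects.
    return LayoutTensor[dtype, layout, MutableAnyOrigin](
        buffer.unsafe_ptr(), dyn
    )
```

The main differences from my first version are the explicit `-> LayoutTensor[...]` return type on the `fn` and passing a pointer instead of the buffer itself, but this variant still fails to compile for me.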