Custom Multi-Head Self-Attention Transformer Training Phase using AMD RX 9070 XT 16GB: Python/PyTorch vs. Mojo

Welcome to the forum. @BradLarson recently posted "Calling all AMD RDNA users: help us bring full MAX support to your GPUs!" In other words, the RDNA experience is currently functional, but the performance is still being tuned.

When you say that Mojo will get 2.5 TFLOPS with a tiled kernel and PyTorch will likely get 30-50 TFLOPS, I'm curious where you got those numbers from.
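For context, here is roughly how I would turn a timed run into a TFLOPS figure, so we can compare like with like. This is only a sketch: it assumes the kernel in question is a dense matmul of shape (M, K) x (K, N), counted the usual way as 2 * M * N * K floating-point operations. The function name `matmul_tflops` and the timing in the usage line are hypothetical placeholders, not measurements from your card.

```python
def matmul_tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Achieved TFLOPS for one (m x k) @ (k x n) matmul that took `seconds`.

    Assumes the conventional count of 2 * m * n * k FLOPs
    (one multiply and one add per inner-product term).
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    flops = 2 * m * n * k
    return flops / seconds / 1e12


if __name__ == "__main__":
    # Hypothetical timing for illustration only, not a real measurement.
    elapsed = 0.055  # seconds for a single 4096 x 4096 x 4096 matmul
    print(f"{matmul_tflops(4096, 4096, 4096, elapsed):.2f} TFLOPS")
```

If you can share the matrix shapes, the dtype, and the wall-clock time per call (after a warm-up and a device sync) for both the Mojo and the PyTorch runs, it becomes much easier to see whether the gap is real or an artifact of how each was timed.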