I have been running Mojo successfully on Hopper and Apple Silicon, but my production workloads run on a Cray EX AMD MI250X system. I tried running it there out of the box, but I get “no compatible GPU detected.” Are there any environment variables or other settings I can change to try to force it to compile on CDNA2?
Is this something that’s feasible for an external user to add once Mojo is open-sourced?
You should be able to add this support today, assuming you can run a HIP driver of version 6.3 or newer on your system. The first place you'd need to add support is the info.mojo file within gpu.host. In that file, you'll see how we registered a number of other similar GPU families. Once you've added your GPU there, you can use Bazel to build and run tests against your customized Mojo standard library. For example, you can verify that your new support detects your GPU and can do very basic vector addition with:
./bazelw run //mojo/examples/gpu-functions:vector_addition
which will build everything in the Mojo standard library and then try to compile and run a basic vector addition example.
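Before editing info.mojo, it may help to confirm the two prerequisites mentioned above: which architecture name the system reports for the GPU, and whether the installed HIP version is 6.3 or newer. The following is a minimal sketch, not an official Modular tool. It assumes the standard ROCm utilities rocminfo and hipconfig are on your PATH, and that the version printed by hipconfig is a reasonable proxy for the driver requirement. The helper names gfx_targets and version_at_least are made up for this example. MI250X devices normally report as gfx90a.

```shell
#!/usr/bin/env bash
# Sketch: check GPU architecture and HIP version before adding CDNA2 support.
# Assumes ROCm's rocminfo and hipconfig are installed; helper names are hypothetical.

# Extract the unique gfx architecture names from rocminfo-style text on stdin.
gfx_targets() {
  grep -oE 'gfx[0-9a-f]+' | sort -u
}

# Succeed if version $1 is at or above version $2 (compares major.minor only).
# Any "-suffix" on $1 (such as a build hash) is ignored.
version_at_least() {
  local have="${1%%-*}" want="$2"
  local have_major have_minor want_major want_minor
  IFS=. read -r have_major have_minor _ <<< "$have"
  IFS=. read -r want_major want_minor _ <<< "$want"
  (( have_major > want_major )) ||
    (( have_major == want_major && ${have_minor:-0} >= ${want_minor:-0} ))
}

if command -v rocminfo >/dev/null 2>&1; then
  echo "GPU targets: $(rocminfo | gfx_targets | tr '\n' ' ')"
else
  echo "rocminfo not found on PATH"
fi

if command -v hipconfig >/dev/null 2>&1; then
  hip_version="$(hipconfig --version)"
  if version_at_least "$hip_version" "6.3"; then
    echo "HIP $hip_version meets the 6.3 minimum"
  else
    echo "HIP $hip_version is older than 6.3"
  fi
else
  echo "hipconfig not found on PATH"
fi
```

If the reported target is gfx90a and the HIP version check passes, the remaining work is on the Mojo side, starting with the GPU registration in info.mojo.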
For more complex models, you may need to add code paths in the relevant kernels that account for the differences between CDNA2 and our current CDNA3 / CDNA4 paths. Again, you can use Bazel to build and test models on your platform to see where changes need to be made.
None of this should require internal compiler access, and you should be fully capable of implementing it yourself using the OSS modular repository.