Launch_bounds support for GPU code

Hello, I am porting some CUDA code to Mojo.
Does Mojo support a mechanism like CUDA's __launch_bounds__ to give the compiler more information on how to manage SM resources?
Are there plans to support it in the future?
Thank you

Hi, you can check out this file for reference.

Hi @massimim, thank you for sharing this question! We do support this via the @__llvm_metadata(MAX_THREADS_PER_BLOCK_METADATA=StaticTuple[Int32, 1](256)) decorator on the kernel function. Here is an example: https://github.com/modularml/modular/blob/main/max/examples/custom_ops/kernels/histogram.mojo#L46-L50
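For anyone comparing the two side by side, here is a minimal sketch of the CUDA form being ported. The kernel name `scale`, its parameters, and the launch configuration are made up for illustration and are not taken from the histogram example. As far as I understand, the 256 in the decorator above plays the same role as the first argument to `__launch_bounds__` here (the maximum threads per block), but treat that mapping as an assumption rather than documented behavior.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, for illustration only.
// __launch_bounds__(256) tells the compiler this kernel will never be
// launched with more than 256 threads per block, so it can budget
// registers per thread accordingly.
__global__ void __launch_bounds__(256) scale(float* data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

int main() {
    const int n = 1024;
    const int threads = 256;  // must not exceed the declared bound
    const int blocks = (n + threads - 1) / threads;

    float* d_data = nullptr;
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    scale<<<blocks, threads>>>(d_data, 2.0f, n);

    // Report any launch or execution error.
    cudaError_t err = cudaDeviceSynchronize();
    printf("%s\n", cudaGetErrorString(err));

    cudaFree(d_data);
    return 0;
}
```

Launching this kernel with more than 256 threads per block would fail at launch time, so the block size used on the host side has to stay within whatever bound is declared on the kernel.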

Thank you, Emil.

Thank you, Caroline.
