GPU Programming Manual

Content:

Source Code:

Please leave any questions, feedback, or suggestions for the GPU programming manual in this thread.

Hey @jack.clayton , awesome content! Very easy to run with the vscode extension :slight_smile:

I got an error on the first cell. The import is wrong: DeviceContext should be imported from “gpu.host” instead of “gpu”:

/tmp/mdlab/main.mojo:1:29: error: package 'gpu' does not contain 'DeviceContext'
from gpu import thread_idx, DeviceContext

Thanks for that, the fix will go out in the next nightly. I'm working on getting the code blocks tested in CI.

Some other feedback:

  • We could use more meaningful variable names and avoid aggressive abbreviation.
    I had to read it a few times to understand that “els” = “elements”, “in” = “input”, “dev” = “device”, and “out” = “output”. Also, “in_dev” does not mean “input_device”; it actually means “input_buffer_on_device”.
  • The tutorial has both “dtype” and “type”, but I don’t think we need both; it adds unnecessary complexity.
  • We could stick with grid_dim and block_dim having the same number of elements in the tuple, and always use tuples. Any shortening is syntactic sugar and actually makes the code harder to understand. Sugar can be introduced later on, once the user has grasped the concept. For example:
    ctx.enqueue_function[block_kernel](grid_dim=(2, 2), block_dim=2) can be written as ctx.enqueue_function[block_kernel](grid_dim=(2, 2), block_dim=(2, 1)) which is much easier to understand.
  • The markdown file can be split into two files. When working in a separate Mojo file and copy-pasting, the file can get really long, and we don’t necessarily re-use functions or variables. It’s also harder to run or read the specific section of the tutorial I’m interested in.
  • For thread-level SIMD, 4 seems like a magic number: it’s not explained how to find it, or how to query the GPU for the available SIMD size. A small explanation would be welcome in this section.
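
To make the grid_dim/block_dim point concrete, here is a minimal plain-Python sketch (not Mojo, and not the library's API). It assumes that missing components of a launch dimension default to 1, as they do with CUDA's dim3; normalize_dim and total_threads are hypothetical helper names used only for illustration.

```python
def normalize_dim(dim):
    """Pad an int or a short tuple to a full (x, y, z) triple, filling with 1.

    Assumption: omitted components default to 1, as with CUDA's dim3.
    """
    if isinstance(dim, int):
        dim = (dim,)
    dim = tuple(dim)
    if not 1 <= len(dim) <= 3:
        raise ValueError("a launch dimension has 1 to 3 components")
    return dim + (1,) * (3 - len(dim))


def total_threads(grid_dim, block_dim):
    """Number of threads launched: blocks in the grid times threads per block."""
    gx, gy, gz = normalize_dim(grid_dim)
    bx, by, bz = normalize_dim(block_dim)
    return gx * gy * gz * bx * by * bz


# The shorthand and the explicit tuple describe the same launch shape.
print(normalize_dim(2))        # (2, 1, 1)
print(normalize_dim((2, 1)))   # (2, 1, 1)
print(total_threads(grid_dim=(2, 2), block_dim=2))       # 8
print(total_threads(grid_dim=(2, 2), block_dim=(2, 1)))  # 8
```

Under that assumption both calls launch 4 blocks of 2 threads each, so the explicit tuple costs nothing and shows the shape directly.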

Overall, great work! I was not familiar with GPU programming, and reading and running the tutorial was very smooth!

Awesome, great feedback, thanks for all that @gabrieldemarmiesse. I’ll fix these up.