My first rasterized image from 3D Gaussians in Mojo (called as a PyTorch custom op)

This has been very much a work in progress for the past few months. It is more of a fun side experiment that I hope will work out one day.

I have been porting the GSplat kernels (which are written in CUDA C++) to Mojo as a way to experiment with Mojo GPU programming. I also constantly find myself fixing problems with CUDA code integration in PyTorch and all its pains, especially when the CUDA code is spread across multiple dependencies. One cannot just add a package with CUDA extensions to pyproject.toml and expect it to work with uv, for example, or to be installed properly by dependency managers in general.

For now I leave here the result of rendering 1 million Gaussians. I still have a few more kernels to finish and some bugs to fix before I publish the repo (hopefully soon). As of now I can call this custom operation from PyTorch to rasterize the Gaussians to pixels and get an image. The Mojo kernel is JIT-compiled, which is a good proof of concept for how nice it could be to integrate it as a library into any other project.

I don’t want to give false hopes on performance, since the code is still very much unfinished and not feature rich, but it looks quite promising and is perhaps on par with, if not faster than, the CUDA implementations out there. Benchmarks to come I hope… xD

As of now, a few limitations keep this from being a properly useful package for 3DGS PyTorch projects:

  • So far I haven’t managed to stop Mojo from recompiling the kernels when the input dimensions change, which means that if the number of Gaussians changes it has to recompile, making it pretty slow in real-life use cases
  • I haven’t experimented with or implemented the backward pass kernels yet
  • It depends on nightly versions of Modular right now
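
To make the first limitation a bit more concrete, here is a toy sketch in plain Python (not Mojo, and not the Modular API). It assumes a JIT that compiles one kernel per distinct number of Gaussians, and counts how many compilations a sequence of frames would trigger. It also shows one possible workaround, untested against the actual kernels: pad the inputs up to the next power of two so that nearby sizes share one compiled kernel. All function names are hypothetical, and padding with zero opacity assumes such Gaussians contribute nothing to the image.

```python
from typing import Callable, Iterable, List, Optional


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (hypothetical bucketing rule)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


def count_compilations(
    sizes: Iterable[int],
    bucket: Optional[Callable[[int], int]] = None,
) -> int:
    """Count distinct kernel specializations for a sequence of input sizes.

    Assumption: the JIT compiles once per distinct size it sees and
    caches the result, so the count is the number of distinct keys.
    """
    seen = set()
    for n in sizes:
        seen.add(bucket(n) if bucket is not None else n)
    return len(seen)


def pad_opacities(opacities: List[float]) -> List[float]:
    """Pad with zero-opacity entries up to the bucket size.

    Assumption: a Gaussian with opacity 0.0 does not affect the render.
    """
    target = next_power_of_two(len(opacities))
    return opacities + [0.0] * (target - len(opacities))


# Number of Gaussians per call, e.g. while a scene is being densified.
sizes = [1000, 1010, 1023, 1500, 2000]
exact = count_compilations(sizes)
bucketed = count_compilations(sizes, bucket=next_power_of_two)
print(f"compilations without bucketing: {exact}, with bucketing: {bucketed}")
```

The trade-off is wasted work on the padded entries (up to almost 2x in the worst case) in exchange for far fewer recompilations. Whether that is a net win here would need measuring.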
