[Project] Mojo for Robotics: Porting GPU Navigation Kernels (Jetson / Strix Halo)

Quick update on the Mojo port we kicked off in this thread.

What’s Done

Scaffolding repo is up: https://github.com/aleph-ra/kompass-mojo

The first of the three kernel groups from the original post, the Trajectory Cost Evaluator, is now running end-to-end in Mojo. Six kernels have been ported from our SYCL baseline, exposed through a C ABI via Mojo's @export decorator, and called from a standalone C++ benchmark binary that mirrors the kompass-core benchmarking harness.

If you have a Mojo-supported GPU, you can reproduce everything below with:

git clone https://github.com/aleph-ra/kompass-mojo
cd kompass-mojo
pixi install
./scripts/run_benchmarks.sh <platform_name>

Results from my devbox (NVIDIA RTX A5000, Ampere sm_86)

Same workload we use in EMOS kompass-core: 5,001 trajectories × 1,000 points, 10 s horizon, 4 cost functions enabled.

  • kompass-core SYCL (AdaptiveCpp / CUDA): 16.358 ms (±0.12)
  • kompass-mojo (Mojo 0.26.1 / CUDA): 15.973 ms (±0.09)

I should note that I used Claude (with the official skills) to translate the kernels to Mojo, and no optimization work has been done on the Mojo side yet. That makes the initial result quite impressive.

Next Steps

  • Port the Local Mapper (Bresenham raycasting) and Critical Zone Checker kernels into the same FFI layer.
  • Run on the real targets from the original post: Jetson Orin, Thor, and AMD Strix Halo APUs. Strix Halo should be an interesting case, since its shared-memory APU architecture also provides a zero-copy path.

Matching AdaptiveCpp on a first-pass port was not something I expected, and it’s a strong early signal for the Mojo-for-robotics question.