As a statistics PhD, my long-standing frustration is that R has no native equivalent of Numba or GPUArray. R also lacks anything like a LayoutTensor (in Mojo terms) that maps efficiently onto its column-major memory, so writing GPU kernels is impossible without dropping down to C++/CUDA. This has pushed me toward other languages (Python and Julia) for my research projects.
Mojo struck me as the ideal tool to solve this: its manual control over memory layout and its compilation to bare metal make it a natural fit for building a "Numba for R."
I am building MojoR, a JIT compiler that maps dynamic R semantics onto static Mojo kernels. I just benchmarked a bivariate Gibbs sampler on the JIT; each draw depends on the previous one, so the loop is sequentially dependent and impossible to vectorize in R.
The results (N = 100k, thin = 100, 10M random draws in total):

- GNU R: 19.28s
- MojoR: 0.16s
- Speedup: ~117x
# The Code (Standard R syntax)

```r
gibbs_kernel <- function(N, thin) {
  mat <- matrix(0, N, 2)
  x <- 0.0
  y <- 0.0
  # Sequential loop: notoriously slow in R, fast in MojoR
  for (i in 1:N) {
    for (j in 1:thin) {
      x <- rgamma(1, 3.0, y * y + 4.0)
      y <- rnorm(1, 1.0 / (x + 1.0), 1.0 / sqrt(2.0 * (x + 1.0)))
    }
    mat[i, ] <- c(x, y)
  }
  return(mat)
}

# JIT compile
fast_gibbs <- mojor::jit(gibbs_kernel)
```
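For readers outside the R world, here is a plain-Python port of the same kernel; it is an illustrative sketch, not part of MojoR. One parameterization detail: R's `rgamma(1, shape, rate)` takes a rate, while NumPy's `gamma(shape, scale)` takes a scale, so the port passes `1 / rate`. The loop-carried dependence on `x` and `y` is exactly why this cannot be vectorized and why an interpreter pays per-iteration overhead.

```python
import numpy as np

def gibbs_kernel(N, thin, rng=None):
    # Plain-Python port of the R kernel above. Each (x, y) draw depends on
    # the previous one, so the loop is inherently sequential.
    rng = rng or np.random.default_rng(0)
    mat = np.zeros((N, 2))
    x = y = 0.0
    for i in range(N):
        for _ in range(thin):
            # R: rgamma(1, 3.0, rate = y*y + 4.0); NumPy wants scale = 1/rate
            x = rng.gamma(3.0, 1.0 / (y * y + 4.0))
            y = rng.normal(1.0 / (x + 1.0), 1.0 / np.sqrt(2.0 * (x + 1.0)))
        mat[i, 0] = x
        mat[i, 1] = y
    return mat
```

Pure Python suffers from the same per-iteration interpreter overhead as pure R here, which is what makes this kernel a good benchmark for any JIT.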
I am currently building out the IR so that it can reliably accelerate R code through the JIT, generating Mojo kernels that run on the CPU/GPU without leaving the R session. Perhaps this can grow into a "MojoR language" that brings Mojo's systems-level performance directly into the R ecosystem.