NPU and other exotic accelerator support

Curious · June 12, 2026, 11:49pm

Hello Modular forumites,

Out of sheer curiosity, is there anything in the Mojo/MAX stack, or even in MLIR, that would prevent the Modular stack from supporting NPUs (e.g. Q.ANT), which already support C++, Python, etc.?

owenhilyard · June 13, 2026, 1:26am

Q.ANT seem to be a bit vague about their stack and programming model, so I have no idea. Mojo is designed to be capable of being hammered into a “technically functions” shape for anything with a program counter (FPGAs, CGRAs, etc are just really weird to try to make a programming language for). NPUs are a bit of a mixed bag, and at this point they’re kind of the “everything else” category. For instance, if I took something with the programming model of the Cray 1 Supercomputer and put it on a PCIe card, I could probably call it an NPU and nobody would blink an eye. Similarly, a lot of people have called the “Taalas HC1 Technology Demonstrator” an NPU, and it bakes a model into the silicon directly, so I don’t think Mojo really has a chance of programming that.

If you want to look at a slightly better documented NPU, Tenstorrent has done an excellent job of providing documentation for their hardware. That Mojo would have absolutely zero issues talking to, and you could probably make that happen within a day or two of the compiler going open source and being able to turn on the RISC-V LLVM backend (since Tenstorrent uses a bunch of RISC-V cores).

Curious · June 13, 2026, 1:54am

NPUs in general don’t really interest me because, from what I have seen, they are highly specialized copper-and-silicon chips, as opposed to the new photonic paradigm that Q.ANT is pushing forward. It seems obvious to me that old-school copper and silicon will not be the future of computing, any more than spinning disks or tape are the future of storage.

trojan_x · June 17, 2026, 10:36am

The Tenstorrent architecture is very distinct from NPUs.

The Tenstorrent architecture is basically a RISC-V GPU running on a Network on Chip (NoC).

One thing to note is that the tt-kernel source code is closed source. Additionally, they have the tt-lang DSL, which by comparison is very different from Mojo.

They use the TT-Forge compiler, and the TT-Metalium SDK is basically tied to the greater ecosystem, which has to be manually overridden to run open-source LLMs.

The major problem was that the TT-NN framework required converting tensors back to PyTorch just to run models. If you buy the Blackhole or Wormhole architecture, you have to manually override everything.

Though both Mojo and TT-Forge are built on top of MLIR, Tenstorrent hardware still relies on a closed ecosystem in which TT-Metalium is everything, along with middleman systems like Apache TVM.

To back this up, here is one of the discussions the Tenstorrent folks and I had in the Tenstorrent Discord server:

trojan_x · June 17, 2026, 1:33pm

To run NPU kernels in Mojo, you’ll use __mlir_op and link it against the NPU’s primary ISA (VLIW).

This is very experimental, though.

You can try this if you have an Apple Neural Engine on an M-series Mac (38+ TOPS) or a Linux laptop, such as a System76 Oryx or a Tuxedo, that has a built-in NPU like AMD XDNA.

NPUs differ a lot from one another. For example, AMD’s XDNA uses the MLIR-AIE (AI Engine) compiler, and there are others from Qualcomm (TurboX) and EdgeCortix.

trojan_x · June 17, 2026, 1:46pm

A counterintuitive opinion: NPUs are designed for the future using silicon, but they’re meant to prove that you can run heavier models with fewer TOPS or less power.

If you look at the NPU : GPU ratio of TOPS to performance output, it is often 1 : 2.5, meaning a 60 TOPS NPU gives you the edge necessary to perform like a roughly 150 TOPS GPU.
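
As a quick sanity check on that ratio, here is a minimal Python sketch. The function name `gpu_equivalent_tops` and its parameters are made up for illustration, and the flat 1 : 2.5 factor is just the rule of thumb from this post, not a measured benchmark.

```python
def gpu_equivalent_tops(npu_tops: float, gpu_per_npu: float = 2.5) -> float:
    """Estimate the GPU TOPS an NPU would "perform like".

    Assumes a flat NPU : GPU ratio of 1 : gpu_per_npu (2.5 is the
    rule-of-thumb figure from the post, not a measured value).
    """
    if npu_tops < 0 or gpu_per_npu <= 0:
        raise ValueError("npu_tops must be >= 0 and gpu_per_npu must be > 0")
    return npu_tops * gpu_per_npu


if __name__ == "__main__":
    # A 60 TOPS NPU at a 1 : 2.5 ratio
    print(gpu_equivalent_tops(60))  # prints 150.0
```

Real results will vary with the model, precision, and memory bandwidth, so treat the output as a ballpark figure only.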
