Does Mojo have a defined overflow behavior for `Int` and `UInt`?

Does Mojo have a defined overflow behavior? I know the default is “what C++ does”, but C++ only recently (C++20) mandated a two’s complement representation for signed integers, and signed overflow there is still undefined behavior.

This also leaves us with hazards around overflow on Int and UInt: many would expect those to be 64 bits, but some devices, such as Tenstorrent’s accelerators, are made up of massive grids of 32-bit cores, with special handling for 64-bit addresses in a network-on-chip port or DMA engine. Because these types vary in bit width, overflow is more likely, and because they are used for indexing, an overflow is more likely to crash the program.
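To make the width hazard concrete, here is a minimal C++ sketch (not Mojo) of the same index computation done in a 64-bit and a 32-bit index type. The helper `flat_offset` and the grid sizes are made up for illustration, and unsigned types are used so that the wrap is well defined in C++.

```cpp
#include <cstdint>

// Hypothetical helper: flat offset of element (row, col) in a row-major grid,
// computed entirely in the index type chosen by the caller.
// Intended for 32-bit and 64-bit unsigned index types, where overflow wraps.
template <typename Index>
constexpr Index flat_offset(Index row, Index cols, Index col) {
    return row * cols + col;
}

// With a 64-bit index, 70000 * 70000 = 4,900,000,000 fits.
// With a 32-bit index, the same product wraps modulo 2^32 and silently
// points at a much smaller offset.
constexpr std::uint64_t wide_offset = flat_offset<std::uint64_t>(70000, 70000, 0);
constexpr std::uint32_t narrow_offset = flat_offset<std::uint32_t>(70000u, 70000u, 0u);
```

Code that is correct where the index type is 64 bits can therefore compute a valid-looking but wrong offset on a 32-bit core, which is the portability concern raised above.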

If it does not, should it, and if so what should it be?

CC: @joe

AFAIK we do not have defined behavior for these types, but that’s something we should definitely have. The same goes for floating-point types, for which we need to define the fp-model, exceptions, etc.

> These types being variable bit width means that their overflow is more likely, and their use in indexing means that overflows are more likely to cause program crashes.

I do think it’s essential to separate the behavior that the stdlib will define from what the compiler will have for MLIR types/operations. That is, Int can technically have two’s complement signed overflow, but the mlir.index type (which is used in Int) may have UB.

I can tell you that the compiler loves UB, as that allows more aggressive optimizations. For example, having defined signed overflow behavior in

```c
int x = ...;
for (int i = 0; i < N; ++i) {
  a[x++] = i;  /* if x is allowed to wrap, consecutive stores need not be adjacent */
}
```

forces the vectorizer to use a scatter instead of a regular vector store, because the compiler can no longer assume that x keeps increasing across iterations.
So maybe the compiler-side behavior should be relaxed, and Mojo’s implementation (stdlib or user code) can add checks where needed.
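As a rough illustration of why defined wrapping blocks the contiguous store, this sketch lists the indices that `a[x++]` would visit if `x` wrapped in two’s complement. The helper name `visited_indices` is hypothetical, and the wrap is modeled with unsigned arithmetic so that the C++ itself stays well defined.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: the sequence of index values produced by `x++` over
// n iterations, assuming 32-bit two's complement wrapping on overflow.
inline std::vector<std::int32_t> visited_indices(std::int32_t start, int n) {
    std::vector<std::int32_t> out;
    // Unsigned increment is modular, which models the wrapping behavior.
    std::uint32_t x = static_cast<std::uint32_t>(start);
    for (int i = 0; i < n; ++i) {
        // Conversion back to signed is modular since C++20 (and wraps on
        // mainstream compilers before that).
        out.push_back(static_cast<std::int32_t>(x));
        ++x;
    }
    return out;
}
```

Starting at `INT32_MAX - 1`, the third index jumps to `INT32_MIN`, so the four stores are not adjacent in memory. A compiler that must honor this case cannot prove the stores are contiguous, whereas one that treats the overflow as UB may assume it never happens.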

Do you have any preferences?

Hi Owen, your excellent question triggered a discussion in the Mojo team about undefined behavior in general and integer overflow in particular. Thank you for raising this! We’ll build a document in the proposals folder and will post a link here.

In the meantime, to answer this specific question: all integer types (both sized, like Int8, and unsized, Int and UInt) use two’s complement and wrap on overflow. We should codify and publicly state that.

Sorry about the delay in my response.

My personal preference is to default to undefined behavior for performance reasons, but to provide explicit versions of operations with fixed, explicitly documented behavior. This means that we would have a wrapping_*, a saturating_*, and a carrying_* family, with the latter returning a tuple that includes the amount by which the operation overflowed (expensive to compute, I know). This does mean that everywhere we ...

Ideally, I would like debug mode, instead of defaulting to UB, to default to a panic or abort with a stack trace, at least until we have some way to make it raise in a way that can be turned off if you don’t care, e.g. `fn foo() raises OverflowException if is_defined["MOJO_ENABLE_OVERFLOW_CHECKS"]() else None`. I know that Evan Ovadia (no forum account that I can find) has some work in this area that might make it feasible to raise by default without the kernel teams at Modular having a hard time.

In general, in places where some specific kind of overflow behavior is expected, I personally prefer that it be explicitly called out instead of being an implicit property of the language or hardware, since that gives Mojo the ability to change later if hardware starts to move towards “overflow is UB” or “overflow is saturating” for some reason.

I think that debug mode is a good place to put sanitization behavior like this, since developers are trained to turn off the optimizer (or stop using -O2/3) if their code starts acting weird. I know this may cause havoc with the performance of SIMD operations in debug mode, but I think it’s valuable to have a good way to detect when operations start to overflow.
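For readers who have not seen explicitly named overflow operations, here is a small C++ sketch of what such a family could look like for 32-bit addition. None of this is Mojo API. The function names simply mirror the wrapping_*/saturating_*/carrying_* naming above, `carrying_add` is shown on unsigned operands (where the "amount of overflow" for an add is a single carry bit), and the `Checks` template flag is a made-up stand-in for a debug-mode switch.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Wrapping: the result is taken modulo 2^32 (two's complement wrap).
constexpr std::int32_t wrapping_add(std::int32_t a, std::int32_t b) {
    // Unsigned addition is modular; converting back to signed is modular
    // since C++20 and wraps on mainstream compilers before that.
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(a) +
                                     static_cast<std::uint32_t>(b));
}

// Saturating: the result is clamped to the representable range.
constexpr std::int32_t saturating_add(std::int32_t a, std::int32_t b) {
    const std::int64_t wide = static_cast<std::int64_t>(a) + b;
    if (wide > std::numeric_limits<std::int32_t>::max())
        return std::numeric_limits<std::int32_t>::max();
    if (wide < std::numeric_limits<std::int32_t>::min())
        return std::numeric_limits<std::int32_t>::min();
    return static_cast<std::int32_t>(wide);
}

// Carrying: returns the wrapped value plus the part that did not fit.
struct CarryingSum {
    std::uint32_t value;  // low 32 bits of the true sum
    std::uint32_t carry;  // high bits of the true sum (0 or 1 for an add)
};

constexpr CarryingSum carrying_add(std::uint32_t a, std::uint32_t b) {
    const std::uint64_t wide = static_cast<std::uint64_t>(a) + b;
    return {static_cast<std::uint32_t>(wide),
            static_cast<std::uint32_t>(wide >> 32)};
}

// Debug-style variant: with Checks = true an overflow is reported loudly
// (here via an exception); with Checks = false it just wraps.
template <bool Checks>
std::int32_t add_maybe_checked(std::int32_t a, std::int32_t b) {
    const std::int64_t wide = static_cast<std::int64_t>(a) + b;
    if (Checks && (wide > std::numeric_limits<std::int32_t>::max() ||
                   wide < std::numeric_limits<std::int32_t>::min())) {
        throw std::overflow_error("int32 addition overflowed");
    }
    return wrapping_add(a, b);
}
```

The point of the explicit names is that the call site documents which behavior the code relies on, so the default for a plain `+` can stay unspecified or change later without breaking code that opted in.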

@denis Thank you for the clarification on the current behavior. I’m glad to see that this will be written down somewhere easy to find.