But Mojo has a strong intersection with Python. I see no need to deviate from Python unnecessarily.
There are just so many Python programmers out there. We should not discourage them if Mojo wants to be successful.
For pointer indexing, I'd almost rather have the underflow, since that will likely cause a segfault I can see, as opposed to writing to or loading from before the allocation starts, where I've probably just corrupted the heap. It's still a logic error either way, but with underflow we can have a compiler switch to help people find the bugs, instead of a combination of negative indexing and signed values going below zero causing data to be mysteriously wrong.
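To make the two failure modes concrete, here is a Python sketch emulating 64-bit wraparound (Python ints are arbitrary precision, so the wrapping is done by hand; this models the semantics, not Mojo's implementation):

```python
# Sketch of the two failure modes for idx = 0 - 1 with 64-bit integers.
BITS = 64

def as_unsigned(x: int) -> int:
    """Wrap x into the range of a 64-bit unsigned integer (UInt-style)."""
    return x % (1 << BITS)

def as_signed(x: int) -> int:
    """Wrap x into the range of a 64-bit signed integer (Int-style)."""
    x %= 1 << BITS
    return x - (1 << BITS) if x >= (1 << (BITS - 1)) else x

# Unsigned underflow: 0 - 1 silently becomes a huge positive index,
# which typically lands far past the allocation (heap corruption risk).
print(as_unsigned(0 - 1))   # 18446744073709551615

# Signed result: 0 - 1 stays -1, so a load at base[-1] touches just before
# the allocation and is easier for a compiler switch or sanitizer to flag.
print(as_signed(0 - 1))     # -1
```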
I agree that defaulting to checked arithmetic is likely a good idea, possibly with alternative types which do wrapping/saturating by default to save people from writing wrapping_add hundreds of times. However, for many algorithms there are performance considerations to checked arithmetic, and having an opt-out, especially for floats, is important.
Perhaps we could track the ranges of values via Scalar[min: IntLiteral, max: IntLiteral] and only have checks in places where the operation could overflow. Combined with type refinement (either properly supported by the compiler or using some interesting workarounds), that could get rid of most of the overhead of checked arithmetic in the common case. The SIMD version of that will be a bit of a mess of a dependent type to handle with maximum performance, so we'd need to think very long about that. Values obtained via IO could also cause some problems if they are left to their full range, and this may be a performance footgun for new users of Mojo.
C++ having implicit conversion is the cause of many of their issues there, as I mention below in the thread.
The issue with having all of those is what happens when one person writes a library with CrashingInt and someone else is trying to recover from that and not have crashes in their code. You'd need to make every function that does math be generic over the desired behavior. To me, that feels like it would either lead to a very bad ecosystem split or cause a lot of headaches. That may still be necessary as we investigate this more, but I think that to start with compiler options are a safer path since most users simply don't care about correctness enough to have every range call raise.
The reason we want to deviate from Python is because Python hides a lot of very important complexity around these issues. Arbitrary precision integers are one of the reasons why everyone says for loops are slow in Python; checked arithmetic has a lot of performance overheads; using generics to track value ranges is going to rapidly increase the complexity of writing generic code; and the differences between checked + raise, checked + crash, saturating and wrapping are very large for some programs, with some behaviors being totally unacceptable (ex: checked + raise for ergonomics or performance, checked + crash for people who like having uptime, saturating/wrapping for people dealing with money or safety-critical code). Python makes no attempt to solve this problem, so we can't use it as a template for what to do here.
Hi Folks,
It is difficult for me to engage with this thread, because there are a lot of things going on here. Some high level comments though:
- We are actively removing implicit conversions between Int/UInt because they are unsafe and lead to predictability problems.
- C++ has these implicit conversions (and is a wildly unsafe language in general), so I don't consider data on "this is how common these types are in C++" to be particularly relevant to where we are going.
- It is very clear that APIs disagreeing on signedness leads to fragmentation and complexity in APIs. This fragmentation manifests itself as a ton of casts from one to the other all over code, which makes it ugly and difficult to read. Stay tuned for a patch that has to introduce a ton of these due to the first step of implicit conversion removal.
- Other (successful and scaled) languages have been through this before, e.g. Java doesn't even have an unsigned integer type (an extreme position). Swift does have it, but found that misuse leads to a lot of problems at scale. The API design guidelines directly address this, which Owen quoted above:

  Use UInt only when you specifically need an unsigned integer type with the same size as the platform's native word size. If this isn't the case, Int is preferred, even when the values to be stored are known to be nonnegative. A consistent use of Int for integer values aids code interoperability, avoids the need to convert between different number types, and matches integer type inference, as described in Type Safety and Type Inference.
- Making APIs more generic to try to solve for this, e.g. pervasive use of Indexer, isn't a very desirable solution, because it makes all APIs unnecessarily more complicated and ugly (taking Some[Indexer] for an argument instead of Int).
- There is no good solution for return types. len(yourList) needs to return some type - if you choose UInt then you'll need casts all over the place, because UInt is such a limited type.
There are some exciting and interesting ideas being floated in the thread. One FYI: the Mojo team is exploring eventually eliminating Int and UInt entirely - making them aliases for SIMD[index, 1] and SIMD[uindex, 1] - but it isn't clear if that will be possible yet.
Thank you for all the discussion on these issues; they are core topics that affect a lot of Mojo code. We'll be going through a transitional time, and that time will involve a lot of casts, but I expect things to normalize out as we work through some of the APIs that are incorrectly using UInt.
-Chris
You've brought up this point several times now (i.e. being unable to reach the upper elements of an array if indices are signed), but I don't think it's realistic.
Using an Int to index into an array only prevents you from accessing an entire 32-bit address space if:
- You have a single array covering ≥ 50% of the address space. (Very unlikely. Your entire program is nothing more than a single contiguous array that fills the entire address space? Really?)
- Furthermore, the elements of that array are all ≤ 1 byte. (If the elements are 2 bytes, then the array can hold at most 2^31 elements, so an Int is fine.)

This is an extremely unlikely combination. And if it ever happens, you can probably work around it fairly easily, by using a UInt and bypassing the standard API for indexing.
In his paper Bjarne pointed out that he ran benchmarks on several devices and saw no performance difference, likely because the ALU executes the two checks simultaneously. So he has drawn the opposite conclusion to you. (I see you commented on this later.)
"Having tons of bounds checks in a loop" often means you've structured your code poorly. Instead of iterating over i and passing that to __getitem__ on each iteration of a hot loop, you should probably be using an iterator. By using an iterator it's both harder to make an indexing mistake, and the iterator only needs to perform one comparison operation per iteration. ("Have I reached the end yet?")
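The shape of the two loops can be sketched in Python (illustrative only; Python itself always bounds-checks, the point is where the per-step comparisons land):

```python
# Two ways to sum a list in a hot loop: repeated indexing vs. iteration.
data = list(range(1_000))

def sum_by_index(xs):
    # Indexing: every xs[i] is a separate candidate for a bounds check.
    total = 0
    for i in range(len(xs)):
        total += xs[i]
    return total

def sum_by_iterator(xs):
    # Iterating: the only per-step comparison is "have I reached the end yet?"
    total = 0
    for x in xs:
        total += x
    return total

print(sum_by_index(data) == sum_by_iterator(data))  # True
```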
Alternatively, if you're doing random indexing rather than sequential iteration, you're going to be bottlenecked on cache misses, not bounds checking!
If I had to summarize this thread, I would say that we have come to an agreement that having Mojo's UInt implicitly underflow at 0 is a massive footgun, and we probably need to take action to address this. Until we do that, we can't even begin to contemplate having the standard library APIs take and return UInts!
Here are some steps we can take to solve this problem:
- Rename all of the UInt types, so that their names clearly reflect that they exhibit "modular arithmetic" below 0. I would suggest a name like Mod32, Mod64 for the fixed-size uints, and maybe ModIndex for the word-sized uint. This solves the problems seen in the C++ and Rust communities, where programmers often select UInt with the justification of "this number can't be negative", and then they hit a bunch of underflow issues. If the name is Mod64 they're going to notice that this type is for people who want modular arithmetic, rather than people who want non-negative integers.
- (Less important) Consider introducing a safe set of types that Mojo users can reach for when they want to store an integer that is proven to be non-negative. A name like Nat32, Nat64 etc. might make sense. The key difference from Mod32 etc. is that subtraction is checked/raises, so underflow bugs are impossible.
  - We would need to consider whether these types have enough utility to be included in the standard library, and whether the stdlib APIs should make use of them. This is currently unclear. The answer is quite possibly "no".
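A minimal Python sketch of the proposed Nat-style semantics (Nat64 is a hypothetical name from the suggestion above, not an existing Mojo type):

```python
class Nat64:
    """Hypothetical non-negative integer: subtraction raises on underflow
    instead of wrapping (unlike a Mod64-style modular type)."""
    MAX = (1 << 64) - 1

    def __init__(self, value: int):
        if not (0 <= value <= self.MAX):
            raise OverflowError(f"{value} out of range for Nat64")
        self.value = value

    def __sub__(self, other: "Nat64") -> "Nat64":
        result = self.value - other.value
        if result < 0:
            # The key difference from modular arithmetic: fail loudly.
            raise OverflowError("Nat64 subtraction underflowed")
        return Nat64(result)

a, b = Nat64(3), Nat64(5)
print((b - a).value)  # 2
try:
    a - b
except OverflowError as e:
    print("caught:", e)
```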
Once we have taken the above steps, we will be in a good position to answer questions about what numeric types should be used for indexing, etc.
Owen, I agree that Python hides a lot of very important complexity, but that is why Python is so successful. Python programmers are happy with this tradeoff.
I agree with you that Python bigints are too abstract for Mojo which aims to be higher performance than Python, especially now most CPUs are 64 bit. But I would be very wary about reverting to the C/C++ implicit conversions from signed to unsigned, which I personally have shot myself in the foot with repeatedly over the years.
I would be happy with array indexing not allowing negative values and using UInt, provided that conversion from Int to UInt is implicit and raises an exception if the Int is negative. That allows the Python style programmers to keep their current code and programming style. The check can be done once when the index is converted and the internals of Mojo can assume unsigned.
Wearing my Python programming hat as a new Mojo programmer, I want implicit conversions to be safe. I don't want to even think about the weird stuff that can happen in C/C++ when I mix signed and unsigned. Ugh, if I want that level of detail, I may as well use CUDA.
So, implicit conversion

n : UInt = some-integer-value

provided the compiler checks for negative values and throws an exception if I get it wrong. Same for

i : Int = some-unsigned-value
I expect the compiler, not me, to check that if the unsigned value would overflow (unlikely but possible on 64 bit) an exception will be thrown.
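The requested conversion behavior could be modeled like this (a Python sketch; the function names are made up for illustration and the widths assume 64-bit Int/UInt):

```python
UINT64_MAX = (1 << 64) - 1
INT64_MAX = (1 << 63) - 1

def to_uint(n: int) -> int:
    """Int -> UInt: raise instead of wrapping when n is negative."""
    if n < 0:
        raise ValueError(f"cannot convert negative value {n} to UInt")
    return n

def to_int(u: int) -> int:
    """UInt -> Int: raise when the value does not fit in a signed 64-bit Int."""
    if u > INT64_MAX:
        raise ValueError(f"{u} overflows Int")
    return u

print(to_uint(42))  # 42
try:
    to_uint(-1)
except ValueError as e:
    print("caught:", e)
```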
As a Python programmer, I want my time to be considered more valuable than the compiler's or the library writers'.
I agree with Nick that the implicit footguns are a serious problem.
And I second the suggestion for Nat32 and Nat64 types. Ada had those long ago, in the 1980s: integer types with range 0 .. INT_MAX. In theory this wastes one bit, but it is a much closer match to what programmers often want from the type.
I am not sure about this. No one has really attempted to make a fast BigInt. If you apply all the peephole optimizations LLVM does, use stack allocations, and eliminate a ton of branches, then I am sure you can build a BigInt type with comparable performance. BigInt is just a Variant[Int, UnsafePointer]. I don't think comparing an interpreted and a compiled language is valid here.
Fused add + jcc can be very competitive in performance.
Here is my take on Scalar[start, end] and other Int types:
This is less useful; the caller has to uphold additional guarantees and perform narrowing type casts.
// The address space is 48 bits wide.
fn __getitem__(UInt48) -> T
This is more useful; the caller does not have to uphold additional guarantees.
fn pop_count(Int) -> Scalar[0, 31]
We are actively removing implicit conversions between Int/UInt because they are unsafe and lead to predictability problems.
I haven't seen any objections to this. Indeed, many of the "problems with unsigned integers" are actually "problems with implicit conversions", and while I don't think we need to go quite as far as Rust for the final design, it's probably best to rip that band-aid off now, and Mojo can have safe implicit casts (strictly widening, possibly widening + sign change if that doesn't cause overload resolution ambiguity).
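The "strictly widening is safe" rule can be stated purely in terms of value ranges. A Python sketch (illustrative only, not a compiler implementation):

```python
def implicit_cast_ok(src_bits: int, src_signed: bool,
                     dst_bits: int, dst_signed: bool) -> bool:
    """True iff every value of the source type fits in the destination type,
    i.e. the cast is strictly widening (optionally with a sign change)."""
    src_lo = -(1 << (src_bits - 1)) if src_signed else 0
    src_hi = (1 << (src_bits - 1)) - 1 if src_signed else (1 << src_bits) - 1
    dst_lo = -(1 << (dst_bits - 1)) if dst_signed else 0
    dst_hi = (1 << (dst_bits - 1)) - 1 if dst_signed else (1 << dst_bits) - 1
    return dst_lo <= src_lo and src_hi <= dst_hi

print(implicit_cast_ok(8, False, 16, True))   # True: UInt8 -> Int16 widens
print(implicit_cast_ok(32, True, 32, False))  # False: Int32 -> UInt32 loses negatives
```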
C++ has these implicit conversions (and is a wildly unsafe language in general), so I don't consider data on "this is how common these types are in C++" to be particularly relevant to where we are going.
I can go dig up more data for Rust, which I think is the most conservative language in wide use with regards to integral casting. I can also look at how often casts are used to see how much friction mandating UInt for indexing would cause, since Rust won't even let you index with the UInt32/UInt64 equivalents.
It is very clear that APIs disagreeing on signedness leads to fragmentation and complexity in APIs. This fragmentation manifests itself as a ton of casts from one to the other all over code, which makes it ugly and difficult to read. Stay tuned for a patch that has to introduce a ton of these due to the first step of implicit conversion removal.
I think that keeping Indexer, or expanding to simply use SIMD and performing a similar operation to get the best of both worlds might make the most sense as a compromise.
If we can't do that, we are still left with the question of what API we want to pick for the ecosystem, since I think that what len returns will determine a lot of what the ecosystem does.
Other (successful and scaled) languages have been through this before, e.g. Java doesn't even have an unsigned integer type (an extreme position). Swift does have it, but found that misuse leads to a lot of problems at scale. The API design guidelines directly address this, which Owen quoted above: …
Java not having unsigned types is also one of the things I often hear discussed in the context of the failure of Java machines and JavaOS. In particular, that it made interacting with hardware much more difficult. This is one of my concerns with going Int by default, that it will force people who are interacting more directly with hardware to make tons of casts.
I'll trust your recollection on what happened in Swift, but my experience with Rust, which, once again, requires explicit casts everywhere and uses a UInt equivalent for indexing, is that there weren't really any major issues. My personal experience in Rust is that many of the casts I do are a result of talking to C code which can't decide what type it wants a kind of variable to be (ex: using both signed and unsigned).
I think that expanding with another part of the Swift docs is also relevant:
Use the Int type for all general-purpose integer constants and variables in your code, even if they're known to be nonnegative. Using the default integer type in everyday situations means that integer constants and variables are immediately interoperable in your code and will match the inferred type for integer literal values.

Use other integer types only when they're specifically needed for the task at hand, because of explicitly sized data from an external source, or for performance, memory usage, or other necessary optimization. Using explicitly sized types in these situations helps to catch any accidental value overflows and implicitly documents the nature of the data being used.
I think that Mojo is a language designed for use-cases very likely to hit that second paragraph, where the performance of checking for invalid negative values becomes a concern.
Making APIs more generic to try to solve for this, e.g. pervasive use of Indexer, isn't a very desirable solution, because it makes all APIs unnecessarily more complicated and ugly (taking Some[Indexer] for an argument instead of Int).
I have two ideas for this.
All SIMD

- Get Int and UInt into SIMD, then have alias Integer = SIMD[_, 1] requires Self.dtype.is_integral(). This means that, so long as you aren't trying to index with a float, everything "just works", and then we can put all of the checks into normalize_index (possibly moving that function to the prelude).
That would make a function that does indexing look like this:
fn __getitem__(ref self, var idx: Integer) -> ref [self] T:
idx = normalize_index["List"](idx, len(self))
...
This should both ensure widespread support for negative indexing, and let things coexist. It's also much nicer than the Indexer-based solution.
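As a sketch of the normalization logic (a Python stand-in; the real normalize_index lives in the Mojo stdlib and its signature may differ):

```python
def normalize_index(idx: int, length: int, container: str = "List") -> int:
    """Map a possibly negative index onto [0, length), or raise.
    Hypothetical stand-in for Mojo's normalize_index."""
    normalized = idx + length if idx < 0 else idx
    if not (0 <= normalized < length):
        raise IndexError(
            f"{container} index {idx} out of range for length {length}")
    return normalized

# Negative indices count back from the end, Python-style.
print(normalize_index(-1, 4))  # 3
print(normalize_index(2, 4))   # 2
```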
Make the Default Integral Type "The narrowest that will fit this Integer"
We can try to borrow from Ada's range type and make the type that IntLiteral decays to track the possible ranges of values and implicitly cast to "safe" dtypes.
Consider the following type:
struct Integer[min: IntLiteral, max: IntLiteral, backing_dtype=smallest_dtype_for_range[min, max]()]:
var value: Scalar[self.backing_dtype]
....
For example, right now Mojo defaults to Int, but a lot of things from libc will be Int32, which we can't safely cast to. That means that we'll get a lot of Int32(-1) or Int32(0). However, if the value was instead Integer[-1, -1], the compiler would be able to tell that the cast is safe. This also lets Mojo tolerate some degree of user input or uncertainty, for instance a command line flag that is between 0 and 100, or HTTP status codes, where there are known uses in the wild of 100 to 1000, with 100 to 600 being standard.

This "range tracking" also offers Mojo a way to solve the problem of Python developers being confused by fixed-width datatypes, and could potentially let Mojo automatically fall back to arbitrary precision if we deem that desirable (which is a whole other debate). If we don't do arbitrary precision, then it would let us, in library code, only do overflow and underflow checks when they are actually necessary, which I think would get rid of most of the performance penalty of having checked arithmetic.

This would benefit greatly from type refinement, but I think an MVP would be usable without that. We would, of course, need to hash out semantics for various operations, especially bit-wise ones. This also moves Mojo closer towards "proof assistant" capabilities. However, I acknowledge that doing this could have a compile-time impact, due to extra IR, that I am unequipped to gauge.
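As a toy model of the idea (Python, with hypothetical names; not the proposed Mojo API), range tracking might look like:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RangedInt:
    """Toy model of Integer[min, max]: the compile-time range rides along
    with the value, so checks are only needed when a range could overflow."""
    lo: int
    hi: int
    value: int

    def __add__(self, other: "RangedInt") -> "RangedInt":
        lo, hi = self.lo + other.lo, self.hi + other.hi
        # A real implementation would only emit an overflow check here if
        # [lo, hi] escapes the backing dtype; the common case needs none.
        return RangedInt(lo, hi, self.value + other.value)

    def fits_in(self, bits: int, signed: bool) -> bool:
        lo = -(1 << (bits - 1)) if signed else 0
        hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
        return lo <= self.lo and self.hi <= hi

status = RangedInt(100, 599, 404)        # e.g. an HTTP status code
print(status.fits_in(16, signed=False))  # True: a UInt16 is enough
print(status.fits_in(8, signed=False))   # False: needs a check or wider type
```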
There is no good solution for return types. len(yourList) needs to return some type - if you choose UInt then you'll need casts all over the place, because UInt is such a limited type.
My subjective experience is that Rust has mostly avoided that issue for reasons I don't fully understand. I think that some hard data on this is necessary.
we now have people trying to do things with NPUs where a vector which needs most of a 32 bit address space is now not out of the question
You've brought up this point several times now (i.e. being unable to reach the upper elements of an array if indices are signed), but I don't think it's realistic.
Using an Int to index into an array only prevents you from accessing an entire 32-bit address space if:
- You have a single array covering ≥ 50% of the address space. (Very unlikely. Your entire program is nothing more than a single contiguous array that fills the entire address space? Really?)
- Furthermore, the elements of that array are all ≤ 1 byte. (If the elements are 2 bytes, then the array can hold at most 2^31 elements, so an Int is fine.)

This is an extremely unlikely combination. And if it ever happens, you can probably work around it fairly easily, by using a UInt and bypassing the standard API for indexing.
I actually found the example I was looking for on this. Qualcomm's AI 100 NPU (DC hardware), a cut-down version of which is in all of the "Windows on ARM" Snapdragon X/X Plus/X Elite laptops and Qualcomm phone SoCs, is made up of Hexagon cores, which are 32 bit and have the ability to map that 4 GB address space to different parts of 64-bit host memory and move the mappings around while a kernel is running. This means that you are very, very likely to have almost the entire address space filled with model weights, especially on the models with 128 GB of memory. For reduction functions, it will likely be a single tensor mapped in. Qualcomm also recommends Int8 as the default datatype for these NPUs. Qualcomm uses this NPU IP across their many, many phone SoCs, where they have roughly 25% of worldwide smartphones with this thing in them. I feel that 25% of smartphones worldwide is sufficient volume to consider on its own, with Windows on ARM and the DC accelerators adding additional weight.
And if it ever happens, you can probably work around it fairly easily, by using a UInt and bypassing the standard API for indexing.
That would make doing bringup for these devices very, very annoying since it would mean that you would essentially have an entire section of MAX which ignores normal indexing.
Range checking is also a very good point; doubling the number of checks, especially on weaker processors, is going to cause problems.
In his paper Bjarne pointed out that he ran benchmarks on several devices and saw no performance difference, likely because the ALU executes the two checks simultaneously. So he has drawn the opposite conclusion to you. (I see you commented on this later.)
Bjarne tests "normal" code in most of his examples, not numerics. Mojo has a lot of code which is going to be very long for loops, which will hit the ALU saturation case Bjarne said would cause issues:
unless the instruction pipeline and the arithmetic units are saturated, the speed will be identical
That should be an or, not an and, since if you run out of ALU, all of the frontend in the world won't help you. However, the explicit goal of many compute kernels is to get as close to ALU saturation as they possibly can.
having tons of bounds checks in that loop may actually cause you to bottleneck on the scalar math
"Having tons of bounds checks in a loop" often means you've structured your code poorly. Instead of iterating over i and passing that to __getitem__ on each iteration of a hot loop, you should probably be using an iterator. By using an iterator it's both harder to make an indexing mistake, and the iterator only needs to perform one comparison operation per iteration. ("Have I reached the end yet?")
Until we get a masked SIMD type (which only works on ISAs with masked vectors, ruling out all consumer Intel CPUs and all consumer ARM CPUs), an iterator can't express the end of a loop unless the backing data has been padded, which may involve a substantial amount of copying. We also get fun things like variable-width SIMD with SVE, where I don't think the platform ABI describes how to return a vector from a function without indirection, since it's an unknown size at compile time. And, if we try to shove all of this into an iterator, it will need runtime information, meaning we lose the ability to use @parameter to unroll the loop.
Rename all of the UInt types, so that their names clearly reflect that they exhibit "modular arithmetic" below 0. I would suggest a name like Mod32, Mod64 for the fixed-size uints, and maybe ModIndex for the word-sized uint. This solves the problems seen in the C++ and Rust communities, where programmers often select UInt with the justification of "this number can't be negative", and then they hit a bunch of underflow issues. If the name is Mod64 they're going to notice that this type is for people who want modular arithmetic, rather than people who want non-negative integers.
I don't think the underflow/overflow issues are as present as you think they are, and they may actually be easier to check for on some platforms, since many platforms have good mechanisms for "branch on underflow/overflow", where the assumption that it's probably not overflowing or underflowing is baked into the CPU. When you get rid of that, the compiler loses the ability to help users when values go negative, and they now need to remember to check for negative values themselves.
I also think that having default types with their behavior controlled by compiler flags is probably a good idea, since many libraries are not going to propagate an enum which sets the desired behavior through every function tree, and making libraries more portable demands that they be generic.
Changing the names for basic integral types also seems like a recipe for confusion, since now we need to teach all of our users, instead of just the developers coming from higher-level languages, how overflow and underflow work.
I agree that implicit conversions are bad, which is why I said to only allow safe ones. Widening without changing the sign is always safe and shouldn't cause issues. Allowing the sign change for unsigned-to-signed conversions may cause ambiguous conversions in the type system and make things less ergonomic.
What are your thoughts on the generic "everybody except library authors wins" solutions I have above? Python programmers seem to very much desire negative indexing, so I think that having things error on negative indexing in that way would cause more confusion.
I think that UInt → Int should be explicit, for the reasons that we consider making it implicit a footgun in C++. We can make that constructor raise, but I'm not sure if I like the idea of an implicit raising constructor.
The reason we want to deviate from Python is because Python hides a lot of very important complexity around these issues. Arbitrary precision integers are one of the reasons why everyone says for loops are slow in Python; checked arithmetic has a lot of performance overheads
I am not sure about this. No one has really attempted to make a fast BigInt. If you apply all the peephole optimizations LLVM does, use stack allocations, and eliminate a ton of branches, then I am sure you can build a BigInt type with comparable performance. BigInt is just a Variant[Int, UnsafePointer]. I don't think comparing an interpreted and a compiled language is valid here.
I think you're discounting a lot of effort by a lot of people on Python's int, and there has been a lot of effort put into fast arbitrary-precision ints, such as GMP, boost::multiprecision, MPFR, and Haskell's Integer. You're comparing what is often a single-cycle operation with one that needs to do a branch, then a pointer dereference, then figure out what kind of data it has and what the other operand is. Even if you have a "small BigInt" optimization, so no heap allocations, the fast path will still likely do the pointer dereference. That's a lot of extra operations before we get to doing the math.
Fused add + jcc can be very competitive in performance.
It can be, because the branch predictor on every CPU I've ever poked at assumes that underflow/overflow doesn't happen. However, it usually still tracks these branches, and you end up making other parts of the program slower by thrashing the branch predictor cache. You also don't get "checked vector add" instructions, at least on x86, which means you lose the ability to do SIMD.
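For reference, the checked-add semantics being discussed, modeled in Python (illustrative only; real implementations branch on the CPU's overflow flag rather than comparing against limits):

```python
INT64_MIN, INT64_MAX = -(1 << 63), (1 << 63) - 1

def checked_add(a: int, b: int):
    """Model of a checked 64-bit add: one add plus a conditional branch
    that hardware predicts as not-taken in the common case."""
    result = a + b
    if result < INT64_MIN or result > INT64_MAX:
        return None  # the rarely-taken overflow path
    return result

print(checked_add(1, 2))          # 3
print(checked_add(INT64_MAX, 1))  # None (overflow detected)
```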
using generics to track value ranges is going to rapidly increase the complexity of writing generic code
Here is my take on Scalar[start, end] and other Int types: This is less useful; the caller has to uphold additional guarantees and perform narrowing type casts.
They have to do narrowing type casts if they don't already have a narrower type. For example, __getitem__(5) should "just work", and so long as ranges are tracked properly, the only place where narrowing would be necessary would be after operations have expanded the range of values beyond what the function can accept. When lacking this feature, code would likely either underflow/overflow or simply pass an invalid value to the function.
// The address space is 48 bits wide.
fn __getitem__(UInt48) -> T
This is more useful, the caller does not have to uphold additional guaranties.
fn pop_count(Int) -> Scalar[0, 31]
This should be fn pop_count(Int) -> Scalar[0, 64]: all 1s is a valid representation for Int, and Int is 64 bits wide in Mojo, since Mojo doesn't currently work on any 32-bit platforms.
A couple more ideas to toss out there, just FYI for discussion:
- I'm happy folks are generally agreeable about removing the implicit conversions - they really are problematic.
- One reason to bias APIs towards Int (in the absence of implicit conversions) is that var x = 42 defaults to Int.
- Another benefit of standardizing on Int is that (in my experience) programmers benefit from simple rules to follow, particularly when building large scale APIs that need to interoperate.
- One thing to clarify: both Int and UInt in Mojo have 2's complement behavior - there is no trapping on overflow in Mojo (as in Swift, or as in Rust in debug mode) because such a thing is very impractical on a GPU or with SIMD code. The consequence is that Int and UInt are very similar except for shift right, divide and remainder behavior.
- The point above about "use things like UInt8 for narrow cases" does absolutely come up in AI kernels (e.g. int8 quantization), though the data and logic in these algorithms is fiddly and narrow. We need to be able to do this and make it reasonable for kernel engineers, but far more code will work with list lengths and simple integer values than working at that level, despite Mojo's current focus.
- It would be possible to rename UInt to something long and scary, but I'm not sure if such a thing has a benefit.
- Owen seems concerned above about addressing the full 32-bit space of embedded systems. I agree with others that this is pretty niche, but if it were important, it seems that we could support it by adding (e.g.) specific overloads of getitem to certain APIs (e.g. UnsafePointer, Span, etc.) to make it possible to express and do these things, without burdening all application-level code everywhere.
-Chris
One reason to bias APIs towards Int (in the absence of implicit conversions) is that var x = 42 defaults to Int.
Do you want to investigate throwing a pile of metaprogramming at this, to make it so that var x = 42 can be a value backed by a UInt8 and var y = -551 can be a value backed by an Int16, with more compile-time value tracking so that casts which we can prove are safe can be done implicitly? This might give a best-of-both-worlds option where Mojo tries much harder to propagate information about compile-time constants, to help make types as restrictive as they can be for the value range they represent.
Another benefit of standardizing on Int is that (in my experience) programmers benefit from simple rules to follow, particularly when building large scale APIs that need to interoperate.
Rust seems to have managed with the guidance being to use usize for indexing and the most specific type that can represent your value range for everything else. Across 17 million source lines of code (as counted by cloc) from the top 1000 Rust libraries by download count, plus stuff I already had in the package manager cache, there were 130,055 casts, which equates to roughly one cast every 134 source lines of code. If we compare that to Mojo's standard library, and throw out the big float formatting table, Mojo comes in at roughly 80 sloc per cast, and MAX comes out to ~78 source lines per cast. I think it's very reasonable to ask why Rust has a lower density of casts than Mojo with implicit conversions, despite forcing explicit casts for all conversions, especially if the goal is to minimize the amount of casting that developers need to do.
One thing to clarify: both Int and UInt in Mojo have 2's complement behavior - there is no trapping on overflow in Mojo (as in Swift, or as in Rust in debug mode) because such a thing is very impractical on a GPU or with SIMD code. The consequence is that Int and UInt are very similar except for shift right, divide and remainder behavior.
I agree that for most cases 2's complement is fine, but having a compiler flag to make arithmetic checked, even at a substantial speed penalty, would be helpful in tracking down bugs, in much the same way as forcing GPU kernels to serialize so you can step through everything in a debugger is helpful.
The point above about "use things like UInt8 for narrow cases" does absolutely come up in AI kernels (e.g. int8 quantization), though the data and logic in these algorithms is fiddly and narrow. We need to be able to do this and make it reasonable for kernel engineers, but far more code will work with list lengths and simple integer values than working at that level, despite Mojo's current focus.
To me, this seems to be a point in favor of using SIMD/Scalar for indexing once Int and UInt are moved over. Having types that intentionally accept different kinds of indexing than others seems like it could cause fragmentation as libraries align themselves with their dependenices.
It would be possible to rename UInt to something long and scary, but Iâm not sure if such a thing has a benefit.
I donât see any benefit to intentionally making the name scary, but I could see an argument for making Int and Uint named other things, given that even in this thread people have made the mistake of assuming Int is 32 bit. I personally like Rustâs names of usize and isize, but Size and Offset might also be good options.
Owen seems concerned above addressing the full 32-bit space of embedded systems. I agree with others that this is pretty niche, but if it were important, it seems that we could support it by adding (e.g.) specific overloads of getitem to certain APIs (e.g. UnsafePointer, Span, etc) to make it possible to express and do these things, without burdening all application level code everywhere.
I am slightly concerned that Mojo may be relegating what could be, by volume, one of the most common ML accelerators to a second class citizen. Qualcomm is a major smartphone SoC vendor, and the NPUs in those SoCs have that problem. In 2024, Samsung, who primarily uses Qualcomm SoCs, shipped 222 million phones. Other vendors in that list also use Qualcomm SoCs. Given that Nvidia shipped around that same number of GPUs in 2024, and AMDâs 2024 financial report putting RDNA GPU sales nowhere near that. Given that Qualcomm reuses this NPU IP all over all sorts of product segments and has done so for years at this point, I donât think itâs a good idea to make design decisions for Mojo that would make supporting these accelerators difficult in the future.
I do sincerely hope we get DType.int and DType.uint and that way we can do
alias Integer[dtype: DType] = SIMD[dtype, 1] requires dtype.is_integral()
As for the wish to centralize everything around Int and arguing that IntLiteral defaulting to it, I am still strongly in disagreement. I think it is rather a lack of better control of the materialization of IntLiteral and FloatLiteral.
If we got a way to inject a function into the materialization decorator we could do something like:
# gets smallest dtype for the given value and signedness
fn _get_dtype(v: IntLiteral) -> DType: ...
@nonmaterializable(Scalar[_get_dtype(Self())])
struct IntLiteral: ...
No, definitely not. It is an intentional design that the type of var x = 0 doesnât change if you change it to -1 or make it larger. This is a feature, not a bug. Literals themselves have infinite size though, so if you want something like this, you can use alias x = 42
given that even in this thread people have made the mistake of assuming
Intis 32 bit.
It can understand that possibility for confusion, and I do expect a lot of programmers to be coming from C/C++, but I havenât experienced confusion with Swift or Go code. Swift migrated people from (objective-)C to Swift and thus had exactly this point to get over, but it wasnât a significant issue AFAIK. I think the outcome was good, and I think âIntâ will be good for Mojo as well.
I am slightly concerned that Mojo may be relegating what could be, by volume, one of the most common ML accelerators to a second class citizen.
Iâm a fan of QCOM and their devices, but I donât see this as a problem in practice even for their accelerators. As others pointed out upthread, having a single huge array that takes more than half of your address space is a very unusual workload. If youâre doing that, then there are going to be many other weird things going on. Mojo has plenty of ways to express this, it doesnât mean that List needs to be the way to do it.
-Chris
Iâm still not sure of what the issue is with having explicit casting everywhere, that is already the current state of things whenever programming anything more than a short script in Mojo.
I donât find sheltering users from learning about integer bitwidth should be the cause of having to check for negative values every other function call where user input or other librariesâ code might be involved.
I like the idea of the simplicity it might bring to the ecosystem, but I donât like the fact that it is pushing people towards unknowingly not checking for negative values. Underflow or overflow issues should be catched in debug builds and through testing, we cannot hand-hold everyone.
We could easily give people the option to use safe raising implicit conversions and another explicit one where it doesnât raise, just checks at debug time. We already have such an option if we get Int and UInt to be a DType and use SIMD.cast[]() for explicit conversion and the constructors can be switched to raising for example. This is just one of several options, Iâm not particularly fond of it.
But the amount of vulnerabilities (or crashes) that not checking for values under zero might cause (and the runtime overhead) makes my hair stand up, because this feels just like mistakes of not checking a null pointer in C.
I strongly believe we should use UInt where appropriate, I would rather have huge verbosity in casting than save a couple of lines and have code look a bit prettier.
I agree, and I think that, given that a very large body of Rust code somehow has a lower density of casts than the two largest Mojo codebases do, that unsigned indexing is likely not the root cause of having tons of casts.
I am in favor of removing the implicit casting between Int and UInt, but Iâm against removing Indexer.
There are other options not mentioned here:
- Keep the explicit casting between
IntandUInt, but add mixed overloads for all common operations betweenIntandUInt:
fn __add__(self: Int, other: UInt) -> Int: ...
fn __sub__(self: Int, other: UInt) -> Int: ...
fn __lt__(self: Int, other: UInt) -> Bool: ...
These will be slightly slower since they perform safety checks, but they can be avoided by hardware/performance-focused developers by not using mixed types.
This means we can make len return UInt without ergonomic issues and without hurting correctness and performance for users who care about them.
-
Make implicit raising constructors between
IntandUInt. Since itâs raising, it will allow implicit casting in raising functions likedef, providing ergonomics for Python users without hurting performance for power users. -
Add a
def implicit raising constructorthat only works indefcontext for Python users (either with a decorator or by creating it withdefinstead offn, meaning it only works indefcontext).
My personal thoughts:
Mojo markets itself as being excellent at metaprogramming with its powerful parameter system. However, if a few __getitem__ functions parameterized with Indexer are enough to hurt compile times significantly or are undesirable because theyâre âtoo verbose/uglyâ, and we should just use Int everywhereâit suggests that Mojoâs metaprogramming performance and ergonomics donât scale well or arenât user-friendly enough. This makes me question the entire premise of the language.
I can understand defaulting len() to Int as it currently does, but I donât understand the rationale for removing Indexer, sacrificing performance and generic programming ergonomics while requiring casts everywhere and making the language less extensible to other types.