Here is a second proposal, which I think many people would prefer to proposal A.
Proposal B: Keep Int as the default type, but take steps to ensure that people don’t misuse it.
Readers who are fans of Mojo’s current design will prefer this proposal, because only a few small changes need to be made.
As mentioned earlier, Int is misused whenever it stores a quantity unrelated to the number and/or position of values stored in memory. For example, you should never use an Int to store site_visits or world_population etc. By doing so, you are writing a program that is likely to break when deployed to certain targets.
Mitigation 1: Rename Int, so that people understand its purpose
In today’s Mojo, programmers are likely to misuse Int, because:
- Its name is so simple (simpler than Int64 etc.) that it seems like the type you’re supposed to use whenever you want a “normal” integer. But in reality, you are only supposed to use it to count or locate values in memory. If you use it for any other purpose, you will end up with code that is not portable.
- The name “int” has a different meaning in other language communities. In C++ it’s de facto 32 bits, and in Python it’s a BigInt. Overloading “int” with a third meaning is a recipe for confusion, especially for programmers who are adding Mojo into their existing Python and/or C++ projects.
I propose renaming Int to Len. Rationale:
- This name implies that Len is for storing lengths, sizes, and offsets into memory, especially given the implied connection to the len function.
- The name is reminiscent of ssize_t in C++ and isize in Rust.
- Intuitively, var site_visits: Len and var button_clicks: Len now feel like they have the wrong type, which is good, because they do!
As a bonus, Len ties in nicely with 0-based indexing. The array element x[0] is located at “length 0” from the start of the array.
Mitigation 2: Generate a compiler warning whenever an integer literal ≥ 2^31 is converted to a Len.
Large Len literals usually indicate a programming error. The now-famous world_population statement is one such example:
var world_population = 8142000000  # MISTAKE: Has nothing to do with the size of memory.
Beyond being a programming mistake, this statement is also non-portable. On 32-bit platforms, it would either fail to compile, or worse—it would overflow.
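To make the overflow case concrete, here is a small Python sketch that reinterprets the literal as a two's-complement integer of a given width. It is Python rather than Mojo because it only simulates the arithmetic. The helper name wrap_signed is made up for illustration, and silent wraparound is an assumption about how such an overflow would behave on a 32-bit target.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Reinterpret `value` as a two's-complement signed integer of width `bits`."""
    truncated = value & ((1 << bits) - 1)   # keep only the low `bits` bits
    sign_bit = 1 << (bits - 1)
    # If the sign bit is set, the stored pattern represents a negative number.
    return truncated - (1 << bits) if truncated & sign_bit else truncated


world_population = 8142000000

print(wrap_signed(world_population, 64))  # 8142000000 (fits in 64 bits)
print(wrap_signed(world_population, 32))  # -447934592 (wrapped in 32 bits)
```

The 64-bit result is unchanged, while the 32-bit result is a negative number that bears no relation to the intended value.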
The sensible thing to do here would be to emit a compiler warning saying “Warning: This statement will not compile on all targets. Large Len literals are usually a mistake. Did you mean to declare an Int64?”
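The check itself could be very simple. The following Python sketch shows the rule a compiler might apply when an integer literal is converted to a Len. The function name check_len_literal and the constant name are hypothetical, and this is not how the Mojo compiler is actually structured.

```python
from typing import Optional

# Smallest literal that no longer fits in a signed 32-bit integer.
LEN_LITERAL_WARNING_THRESHOLD = 2**31


def check_len_literal(value: int) -> Optional[str]:
    """Return a warning message if `value` is too large to be a portable Len literal."""
    if value >= LEN_LITERAL_WARNING_THRESHOLD:
        return (
            "Warning: This statement will not compile on all targets. "
            "Large Len literals are usually a mistake. "
            "Did you mean to declare an Int64?"
        )
    return None  # literal is safe on both 32-bit and 64-bit targets


print(check_len_literal(1024))        # None
print(check_len_literal(8142000000))  # the warning text
```

The threshold is 2^31 rather than 2^31 - 1 because the largest value a signed 32-bit integer can hold is 2^31 - 1, so 2^31 is the first literal that fails.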
Mitigation 3: Prevent adding Int64 to Len variables, and suggest that the Len variable should be an Int64.
This ties into the recent discussions about implicit casting. We should disallow code snippets like the following:
var nums: List[Int64] = ...
var total = 0  # inferred as Len (today's Int)
for num in nums:
    if <condition>:
        total += num
The problem with this program is that it is prone to overflow on 32-bit targets. As long as Len.__iadd__ doesn’t accept an Int64 on the RHS, and we don’t allow Int64 to be implicitly converted to Len, we should be good here. However, if we want to really help our users out, the error message should suggest declaring total to be an Int64, since that is likely what the programmer meant to do!
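As a rough illustration of the intended behaviour, here is a Python model of the two types in which Len's in-place addition refuses an Int64 operand and points the programmer at the likely fix. The classes are toy stand-ins written for this sketch, not Mojo's real types, and the error wording is only a suggestion.

```python
class Int64:
    """Toy stand-in for a fixed-width 64-bit integer."""

    def __init__(self, value: int):
        self.value = value


class Len:
    """Toy stand-in for the proposed Len type (today's Int)."""

    def __init__(self, value: int):
        self.value = value

    def __iadd__(self, other):
        if isinstance(other, Int64):
            # No implicit Int64 -> Len conversion: reject and suggest the fix.
            raise TypeError(
                "cannot add Int64 to Len; "
                "did you mean to declare this variable as Int64?"
            )
        if isinstance(other, Len):
            self.value += other.value
            return self
        return NotImplemented


total = Len(0)
total += Len(3)          # fine: Len += Len
print(total.value)       # 3

try:
    total += Int64(5)    # rejected: Int64 on the right-hand side
except TypeError as err:
    print(err)
```

In a compiled language this would be a compile-time error rather than a runtime exception, but the rule is the same: the addition is only defined when both sides are Len.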
A decent compromise?
This is a modest proposal. Its goal is to ensure that Mojo users work with integers in a manner that is both correct and portable, while retaining the “Int by default” behaviour of today’s Mojo.