I would strongly recommend considering static. Per @denis’s post, it doesn’t even seem to have been considered.
I agree with Sora. The precedent from many languages is that static is equivalent to “global variable”, and we will need some syntax for global variables at some point. I think that the current functionality of alias is quite different from what C, C++ and Rust do. For example, I would expect to be able to find a static variable in an ELF symbol table. In Rust, you can combine static with a mutex and have mutability, which you can’t with Mojo’s alias. Given that Mojo is likely to have a lot of SIMD lookup tables, having good syntax for them is important, and I think that the least surprising way to handle it is to make Mojo use static in a similar way to C, C++ and Rust, meaning that it is a thing which exists at runtime. I don’t want to muddy the waters by making a static mut have vastly different behavior than static, and I think keeping a “wall” between values that are compile-time only and those which exist at runtime is important.
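For readers less familiar with the Rust behaviour mentioned above, here is a minimal sketch of it in Rust (not Mojo). The names POWERS_OF_TWO, HITS, pow2 and hits are made up for illustration. It shows an immutable static table next to a static whose mutability comes from a mutex; both exist at runtime with a single storage location.

```rust
use std::sync::Mutex;

/// Immutable data with static storage. It exists at runtime and is
/// typically placed in the binary's read-only data section.
static POWERS_OF_TWO: [u32; 8] = [1, 2, 4, 8, 16, 32, 64, 128];

/// Mutable state with static storage. The mutex provides the mutability;
/// the static itself is never reassigned.
static HITS: Mutex<u64> = Mutex::new(0);

/// Looks up 2^exp in the table and records each successful lookup.
fn pow2(exp: usize) -> Option<u32> {
    let value = POWERS_OF_TWO.get(exp).copied();
    if value.is_some() {
        *HITS.lock().unwrap() += 1;
    }
    value
}

/// Returns how many successful lookups have happened so far.
fn hits() -> u64 {
    *HITS.lock().unwrap()
}
```

Nothing comparable is possible with a value that only exists at compile time, which is the gap being pointed out.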
Thank you for suggesting this @Nick, you are correct that we didn’t even consider static. I (and I think many others) coming from C++ have the damage you mention - in that language, the static keyword has MANY different meanings.
If I take a step back, I can definitely see advantages to it. static is the natural dual to “dynamic” which is what runtime code is. It would also work as a qualifier on if nicely.
Here are my concerns with it:
It has heavily loaded baggage from a large number of programming languages that people will be coming from (including C++ and Rust), which could make it confusing. People may default to assuming they know what it means, unlike comptime where they would default to “not” knowing what it means.
The meaning in these other languages is “static storage at runtime”, which (as you say) can be mutable or not. The meaning in Mojo is completely different: “comptime” doesn’t exist at runtime at all.
As Owen correctly says, we will need a word for static storage duration at runtime. Using static for this concept would force us to name “that thing” something else.
Let’s be a bit more structured about it. This proposal would enable:
x, y = foo() # Dynamic declaration/assignment of x and y
comptime x, y = foo() # Comptime assignment
static var x = foo() # [eventually] static storage duration
What would the third concept be named if we use “static” instead of “comptime” for the second one?
This is true, and ideally we can mitigate this by ensuring that Mojo’s static caters to many of the same use cases as static in other languages.
The meaning in these other languages is as “static storage at runtime”, which (as you say) can be mutable or not. The meaning in mojo is completely different
The meaning in other languages is “static address”, whereas an alias could be described as a static value, or static constant. If you take this perspective, it seems reasonable that static x = 0 could denote a static value, whereas static var x = 0 could denote a static variable. If Mojo still had a let keyword, you could even think of the difference as being static let versus static var.
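As a point of comparison, Rust already separates these two notions with different keywords: const for a compile-time value and static for an object with a fixed address. The sketch below is Rust, not Mojo, and LIMIT, TABLE, table_address and zeroed are illustrative names only.

```rust
/// A compile-time value: it can be used where the compiler needs a
/// constant, and it has no single guaranteed address of its own.
const LIMIT: usize = 4;

/// A static: one object with a fixed address for the whole program run.
static TABLE: [u8; 4] = [10, 20, 30, 40];

/// Returns the address of the static; every call sees the same location.
fn table_address() -> *const [u8; 4] {
    &TABLE as *const [u8; 4]
}

/// Uses the constant as an array length, which requires a compile-time value.
fn zeroed() -> [u8; LIMIT] {
    [0u8; LIMIT]
}
```

Under the naming suggested here, static x = 0 would play the role of Rust's const, and static var x = 0 the role of Rust's static.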
As Owen correctly says, we will need a word for static storage duration at runtime. Using static for this concept would force us to name “that thing” something else.
To the best of my knowledge “static storage duration at runtime” is just another way of saying “global variable”, as long as you remove the C++ quirk where static variables can be defined in local scopes. If we decide that we want a keyword for global variables, we’re in luck: Python already has a keyword named global for working with globals. There’s a caveat though: it’s currently used to “import” the identifier of a global into a function scope, rather than being used to define a global. That said, the syntax global x = 0 at module scope is entirely unused in Python, and is ripe for us to claim, if we want to.
As mentioned, I can see at least two options.
Option 1:
x, y = foo()
static x, y = foo() # "static value" (immutable)
static var x = foo() # "static address" but mutable value, owing to var
Option 2:
x, y = foo()
static x, y = foo() # "static value" (immutable)
global x = foo() # "global variable" (mutable)
I would prefer the second option because it’s more visually distinct. It’s also nice that it reuses the existing Python keyword for globals, so it at least would be unsurprising for Python programmers.
Note: Ideally it would be possible to take the address of both static and global variables, i.e. they both have “static storage duration”. Could the current phenomenon where aliases are always inlined/don’t have their own storage be relegated to an optimization?
Anyway, that’s my opinion on the matter. Maybe static can work, maybe it can’t. I mostly think that it would be nice to be able to talk about all of the “static things” in Mojo: static typing, static dispatch, static branching (static if), static loops (static for), and static values (static). These are Mojo’s constructs for zero runtime overhead, and we can contrast them against “dynamic things” in Mojo: dynamic typing (classes etc), dynamic dispatch, dynamic branching, dynamic loops, and dynamic values.
If we throw the word “comptime” into our vocabulary, that muddies the “static versus dynamic” story.
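To make the “static things” versus “dynamic things” contrast concrete, here is a small sketch in Rust (chosen only because the thread keeps comparing against it). Shape, Square, area_static, area_dynamic and lanes are hypothetical names, and the const generic branch is only a rough analogue of what a static if might do in Mojo.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// Static dispatch: the function is instantiated per concrete type, so the
/// call target is known at compile time.
fn area_static<S: Shape>(shape: &S) -> f64 {
    shape.area()
}

/// Dynamic dispatch: the call goes through a vtable selected at runtime.
fn area_dynamic(shape: &dyn Shape) -> f64 {
    shape.area()
}

/// Static branching: WIDE is a compile-time parameter, so in each
/// instantiation the compiler can drop the arm that is not taken.
fn lanes<const WIDE: bool>() -> usize {
    if WIDE { 64 } else { 16 }
}
```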
Yes, these are good points, and you’re right that the global keyword is conveniently what we want. We do need this to work inside of functions, but I don’t see why global wouldn’t work. One other intersection that isn’t ideal is with @staticmethod - a different notion of static already in Python.
FWIW for Mojo we probably don’t need a Python style decorator for static methods. We can just use the presence/absence of self in the method signature. This is what Rust does. (This might require making self a keyword.)
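For reference, this is roughly what the Rust convention looks like. The sketch is Rust, not Mojo, and the Celsius type and its functions are invented for the example: a function without a self parameter is called on the type, and one with self is called on a value.

```rust
struct Celsius {
    degrees: f64,
}

impl Celsius {
    /// No self parameter: an associated function (a "static method"),
    /// called as Celsius::from_fahrenheit(..).
    fn from_fahrenheit(fahrenheit: f64) -> Celsius {
        Celsius { degrees: (fahrenheit - 32.0) * 5.0 / 9.0 }
    }

    /// Takes &self: an ordinary method, called on a value.
    fn is_freezing(&self) -> bool {
        self.degrees <= 0.0
    }
}
```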
But maybe you’re more concerned about the use of the term “static method” in documentation etc. I’m not sure how/whether that should be addressed.
The meaning in other languages is “static address”, whereas an alias could be described as a static value, or static constant. If you take this perspective, it seems reasonable that static x = 0 could denote a static value, whereas static var x = 0 could denote a static variable. If Mojo still had a let keyword, you could even think of the difference as being static let versus static var.
Most other languages have static mean “these bytes exist in the binary” or they are functions which get added to a list of functions that are run by the language runtime as part of program startup (ex: to allocate memory). comptime, on the other hand, gives the compiler the freedom to eliminate everything that isn’t accessed.
To the best of my knowledge “static storage duration at runtime” is just another way of saying “global variable”, as long as you remove the C++ quirk where static variables can be defined in local scopes.
I’d say it’s more accurate to say that static usually means that you have allocated backing storage inside of the binary for the value and that the value will show up in the symbol table. Locally scoped but static variables are a useful feature for things like memoizing functions. Part of this is that there are also times when you need to stuff a big pile of immutable data into a binary in a particular place. Say, if you wanted to include binary PTX or MLIR for something. I don’t think blurring the lines between pure compile-time and immutable run-time is helpful for most people, and it’s actively harmful for people working at lower levels of the stack.
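As an illustration of the memoization use case, here is a minimal Rust sketch with a function-local static. The names slow_square, memo_square, SLOW_CALLS and slow_calls are hypothetical, and the call counter exists only to make the caching observable.

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Counts how often the slow path ran (demonstration only).
static SLOW_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for an expensive pure computation.
fn slow_square(n: u64) -> u64 {
    SLOW_CALLS.fetch_add(1, Ordering::SeqCst);
    n * n
}

/// Memoized wrapper. The cache is a static declared inside the function:
/// no other code can name it, yet it persists across calls.
fn memo_square(n: u64) -> u64 {
    static CACHE: Mutex<BTreeMap<u64, u64>> = Mutex::new(BTreeMap::new());
    let mut cache = CACHE.lock().unwrap();
    *cache.entry(n).or_insert_with(|| slow_square(n))
}

/// Returns how many times the slow path has run so far.
fn slow_calls() -> usize {
    SLOW_CALLS.load(Ordering::SeqCst)
}
```

Calling memo_square twice with the same argument runs slow_square only once, because the second call finds the entry in the cache.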
I see static and comptime as very different concepts as far as variable storage is concerned, and I think that mixing the two could remove optimization opportunities. We also have to contend with “static methods” from OOP, which multiple generations of programmers are familiar with. As you say, we might not need a decorator for that, but I think keeping the term is useful. In my opinion, we should aim to use the word people already use for things when they do the same thing, and not use words people already have meaning associated with in different ways. Maybe we use global for the Rust/C/C++ meaning of static, but I really think we shouldn’t mix static with things that are purely compile-time constructs.
static is indeed a strong contender, and it is appealing on many fronts. It deserves consideration.
I agree with Owen that static implies storage at runtime (per the C and Rust meanings). Specifically, I think it would look counterintuitive when used together with the “alias materialization” concept.
Currently an alias is “materialized” (explicitly or implicitly) at a place of use. Exaggerated example: the following creates two LLVM store ops in the loop, creating two copies of the lookup table:
static X: InlineArray[Int, 1024] = computeLookupTable()
fn find(value1: Int, value2: Int) -> Int:
    for j in range(0, 1024):
        if X[j] == value1 or X[j] == value2:
            return j
    return -1
My point is, static X .. = computeLookupTable() strongly suggests that X is a value computed and stored in static storage (in RODATA) and that X[j] does not inline-substitute the whole array there.
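For comparison, this is roughly the behaviour that the word static suggests to someone coming from Rust: the table is computed at compile time, stored once, and indexed in place. The sketch is Rust, not Mojo. compute_lookup_table is a hypothetical stand-in (here it just fills in squares), and find returns an Option instead of -1.

```rust
/// Builds the table at compile time. A while loop is used because for
/// loops are not allowed in const fn.
const fn compute_lookup_table() -> [u32; 1024] {
    let mut table = [0u32; 1024];
    let mut i = 0;
    while i < 1024 {
        table[i] = (i as u32) * (i as u32);
        i += 1;
    }
    table
}

/// One copy of the table with a fixed address; uses index into it in place.
static X: [u32; 1024] = compute_lookup_table();

/// Returns the first index whose entry equals either value, if any.
fn find(value1: u32, value2: u32) -> Option<usize> {
    X.iter().position(|&entry| entry == value1 || entry == value2)
}
```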
With explicit materialization, the following also looks weird to me:
static X = computeLookupTable()
...
.. materialize[X]()...
This would translate to plain English as “X is a static value, and it needs to be materialized into a dynamic value using materialize”.
I am not a big fan of static. In my opinion, the keyword is ambiguous: it could mean either that the object is statically allocated or that the value is statically evaluated. The fact is that the keyword has already been widely used for “statically-allocated” objects, yet a Mojo alias is a statically-evaluated value (staticeval would make more sense to me but it is too long).
The current behaviour of alias is surely not how we want it to work long-term, right? It seems like something that needs improvement. Maybe we should put aliases in static storage wherever that is sensible or necessary. And serendipitously, that would bring the semantics of alias closer to that of static in C/Rust/etc.
This is true. We would definitely be making a tradeoff. But… Mojo has done more dramatic things, like redefining the terms “argument” and “parameter”, defying a 50-year norm. Given that, I don’t know how bad it would truly be to tweak the norm of what “static” means. The Mojo docs are going to talk about metaprogramming heavily, so anyone who’s spent a day learning Mojo will likely be able to understand Mojo’s definition of static, and how it differs from global.
Given that, I don’t know how bad it would truly be to tweak the norm of what “static” mean
Sure, but what keyword should we pick to denote statically allocated objects in the future?
The current behaviour of alias is surely not how we want it to work long-term, right? It seems like something that needs improvement. Maybe we should put aliases in static storage wherever that is sensible or necessary
Not really (as far as I know). It is totally fair to complain that we don’t have a good way to materialize an alias into a static object (see, I am using static here again), but that is a separate issue from alias itself. To me, what you really need is a materialize_into_static-like API, not a fundamental change to the semantics of alias itself.
Earlier in the thread we discussed using global for this purpose (i.e. mutable variables with static storage). It’s an existing keyword in Python.
I’m not well positioned to make suggestions on this aspect of Mojo’s design. I am a fan of trying to keep the design as simple as possible though, which is why I have suggested trying to make alias “less troublesome” (e.g. by automatically creating appropriate storage), instead of asking the programmer to manually fiddle around with this. I expect a lot of Mojo programmers aren’t going to inspect the binary to see what storage and/or instructions have been generated at all of the places aliases are used, so the default behaviour needs to be a good/performant one.
Yeah, let me clarify here - implicit materialization can be very surprising and expensive. We discussed this and made a change relatively recently (I forget when, a month or two ago?) to address this - the compiler now refuses to implicitly materialize values unless they are of implicitly copyable types. For other things, you can use the explicit materialize[expensive]() syntax to make it clear where this is happening.
The remaining issue is that InlineArray is still implicitly copyable, because we don’t have conditional conformances implemented enough. Once it loses its implicit copyability (which needs to happen for other reasons!) it won’t be implicitly materializable.
The point about globals is very nice, I suspect we can make something like this work when there is time to worry about globals:
fn thing[paramVal: SomeExpensiveType]():
    # not what we want, but what people will do at first.
    var x = paramVal # Error, cannot implicitly materialize, use a global
    use(x)
    # Suggested solution gives you what you want with a predictable model:
    global g = paramVal # Ok! In RODATA
    use(g) # Just uses the address.
Seems neat and tidy! Maybe I can come to actually appreciate globals some day
I perhaps wasn’t clear earlier, but I am wondering why the compiler can’t just automatically do the equivalent of global g = paramVal in the cases where the function needs to read paramVal’s value and/or address at runtime. (This would happen once per function instantiation, i.e. once per distinct value of paramVal, not once per use of paramVal as an argument!)
That seems like a good default to me? Asking users to define a global with a different name to the alias just to read the alias’s value at runtime seems like a lot of ceremony.
The relevance to this thread is that if alias works as I described (i.e. it is implicitly stored in a global var as necessary), then it would be indistinguishable from an “immutable static variable”, in the C/Rust sense of static. That makes it far more justifiable to rename alias to static.
There are several important questions to ask before we decide whether the compiler should/can always automatically turn an alias into a global, IMO.
Are all types materializable to globals? Say Trees/Linked Lists (maybe?).
Should the value always have the same type before/after materializing into globals? It might not be a bad idea to be able to materialize a linked list into a global array, for instance.
Is a global always an acceptable way to materialize an alias, such that it is okay for the compiler to always implicitly do that for you (even for GPUs)?
You probably also don’t want the compiler to silently insert 100 global variables when you accidentally use the iterator in a dynamic function inside a parameter for loop.
To me, it might be better to just have a GlobalMateralizable trait to allow user defined behavior. But also note that we don’t even have global yet so this might be too early to be discussed.
This doesn’t sound like a problem to me. If the loop has been inlined 100 times, you have already generated a “large” amount of code. It makes sense that this would also generate static data proportionate to the size of the code. And you’re going to end up with the same amount of data no matter whether the materialization is automatic or manual, so it’s not like making it automatic causes less efficient code to be generated. Unless I’m missing something?
(Also, I’m not suggesting every alias would need to be compiled to a global. Small values could be inlined into the code, as long as their address isn’t read.)
Ok, I’m going to stop now, because you’re right, this is diverging somewhat from the topic at hand.
The link to this thread is: if aliases can be automatically backed by globals where necessary, then they would behave much closer to C/Rust static variables, albeit they are immutable. So we could use static for immutable data that ships with the binary, and global for mutable and/or runtime-initialized data.
I don’t think this is a good idea because it’s very hard to determine when that’s a good idea without very complex analysis. You would want to inline if and only if you can constant fold most of the data away, at least for lookup tables. The compiler being “wrong” about this based on some heuristic we try to figure out can have devastating performance consequences (ex: if inlining a 2 MB lookup table in an unrolled loop). I don’t think that the compiler will be able to figure out what the developer wants to happen in many cases. For example, in some cases you may want a table to exist because some code uses dlopen and grabs the table directly out of the file, and in other cases you may want it to be removed if unused because it’s supposed to be purely compile-time. There isn’t a way for the compiler to “figure it out”. For example, on some targets a 65 Kib value should be inlined because that’s the vector width of the processor and we want to make that a constant that gets loaded, but on most targets that’s a very large lookup table and should be put in .data. Trying to automatically figure out that kind of thing would involve “bouncing” the compiler up and down for every usage of an alias to determine if the compiler can constant fold accesses.
I think you’re trying to cut intrinsic complexity here. Whether or not a globally scoped thing exists at runtime is a decision I think the developer must make.
As a data scientist who’s not really committed to any language/terminology, I think comptime sounds intuitive and definitely appreciate the consistency across @parameter. Perhaps my disclaimer disqualifies me from the conversation though!
The problem we’re discussing is very similar to that of register allocation. In the 1970s a lot of programmers probably made analogous statements to yours. Something like:
Whether or not a local variable is allocated on the stack at runtime (vs. stored in a register) is a decision that the developer must make.
It’s not a perfect analogy, but the spirit is similar: do programmers actually need to know whether the compiler will store a value in .rodata or not, given they have no idea whether the compiler will store a local variable on the stack?
If the compiler can make a great decision 99% of the time (as happens for stack vs. register allocation), I think it makes sense to free programmers of this burden.
Modular has world class compiler engineers on their team, so I am hopeful that they can achieve something along these lines. But I will leave it to them.