Currently, Mojo’s strict aliasing rules cause problems for some kinds of structures, most notably arena memory and types that are logical groupings. I’d like to propose a trait that turns a struct’s origin into one that other types borrow, so the object behind a pointer to a GroupOrigin is effectively dropped only when the group is destroyed. On its own I don’t think this is that valuable, but combined with linear types, where we take a borrowed origin and then release it manually, I think it becomes a pretty natural pattern. It would open up a lot of designs that today effectively force UnsafePointer/Origin erasure in some places. It’s still useful to know whether the source object was dropped, and this just plays better with the rest of Mojo’s ecosystem. In terms of actual implementation, requiring that pointers to group origins be linear types might solve some complex tracking issues around ref-counting, and could make group origins provably safe by showing that no linear type outlives the group. (With the caveat of threading divergence, but what’s new.)
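# GroupAliased is the trait being proposed here; it does not exist in Mojo today.
struct Group(GroupAliased, Movable, ImplicitlyDestructible):
    def borrow(mut self) -> Borrow[origin_of(self)]:
        return Borrow[origin_of(self)](Pointer(to=self))

# Linear type: a Borrow must be consumed explicitly via release().
@explicit_destroy
struct Borrow[GroupOrigin: MutOrigin](Movable):
    var group: Pointer[Group, Self.GroupOrigin]

    def release(deinit self):
        ...

def main():
    var g = Group()
    # Two live borrows of the same group, which strict aliasing rejects today.
    var b1 = g.borrow()
    var b2 = g.borrow()
    b1^.release()
    b2^.release()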
Here, either g is dropped once nothing is left borrowing its lifetime, or we constrain groups to also be linearly typed. I think having this would avoid some forced anti-patterns that actively work around Mojo’s origin system right now.
I’m not opposed to exploring shared origins as a broader language primitive, but for Arenas specifically, I wonder if that introduces unnecessary compiler complexity.
It’s true that Mojo’s strict aliasing rules currently make true pointer-based arenas difficult to use. Any pointer that borrows the arena keeps the arena borrowed, which prevents further mutations or allocations. However, instead of fundamentally altering how the compiler handles origins (such as introducing a new origin trait), couldn’t we use a branding pattern to enforce safety at compile time?
Since Mojo has compile-time parameterized types, we can use a unique static tag to link an arena to its handles. Building an index-based arena this way feels like a much more natural, zero-cost fit that completely bypasses the pointer aliasing issue because indices don’t borrow the arena.
For example:
trait Tag:
    pass

struct TagA(Tag):
    pass

struct TagB(Tag):
    pass

# A handle is just an index, branded with the tag of the arena that issued it.
struct Handle[tag: Tag](ImplicitlyCopyable, Movable):
    var _index: Int

    def __init__(out self, index: Int):
        self._index = index

struct BrandedArena[T: Tag]:
    var data: List[Int]

    def __init__(out self, var data: List[Int] = []):
        self.data = data^

    def alloc(mut self, value: Int) -> Handle[Self.T]:
        self.data.append(value)
        return Handle[Self.T](len(self.data) - 1)

    def get(self, handle: Handle[Self.T]) -> Int:
        return self.data[handle._index]

def main():
    var arena_a = BrandedArena[TagA]()
    var arena_b = BrandedArena[TagB]()
    # User tracks the indices
    var obj_a1: Handle[TagA] = arena_a.alloc(1)
    var obj_a2: Handle[TagA] = arena_a.alloc(2)
    var obj_b1: Handle[TagB] = arena_b.alloc(1)
    arena_a.get(obj_a1)
    arena_b.get(obj_b1)
    arena_a.get(obj_b1)  # COMPILER ERROR: obj_b1 is a Handle[TagB], but arena_a expects a Handle[TagA]
Reduced to its essentials, we have a bump allocator whose regions we can point into, and we wrap each bump region in a linear type so we don’t forget to “drop” the memory at some point, since Mojo’s ASAP destruction can’t apply here.
A reduced form of the whole issue, and where I think branding doesn’t help us specifically:
@always_inline
def kernel[N: Int, o: MutOrigin](
    p: UnsafePointer[Scalar[DType.bfloat16], o],
) -> Float32:
    return p.load[width=32](0).cast[DType.float32]().reduce_add()

def via_pool(mut pool: ScratchPool, sb: Int) -> Float32:
    var lease = pool.borrow[Scalar[DType.bfloat16], 32]()
    var ptr = lease.as_ptr[Scalar[DType.bfloat16]](sb)  # UnsafePointer[BF16, o]
    var r = kernel[32](ptr)
    lease^.release()
    return r

# There is no path from `Handle[QkvTag]` to `UnsafePointer[BF16, o]`.
def via_branded(mut a: BrandedArena[QkvTag, Scalar[DType.bfloat16]]) -> Float32:
    var h = a.alloc(0)
    return kernel[32]( ??? )  # no expression of type UnsafePointer[BF16, o] exists
In other words, I think branding is a valid approach, but it doesn’t solve the issue I had in mind. There we really do have a single-lifetime allocation, but the interior pointers into it do not alias one another: each one is (I’m claiming by construction) independent within the origin, yet truly tied to that origin. Pointer slicing falls under the same theory, as does anything that wants to represent a group of independent (non-owning) subset views.
Thank you for raising this. I agree this is an important use case to solve for. On the push towards 1.0 we have been focusing on priorities more basic than the origin system, but one of the major missing links is a feature that I imagine as “indirect origins”. It’s basically the ability to say (e.g. in List.__getitem__) that “this list element reference that I’m returning is derived from self”. The consequence of this is that you can use the ref so long as you don’t mutate the list object.
This will be important for closing a memory safety hole in List and other types, but will also open up the ability to have arena allocators like the ones you discuss. The only difference between List.append and Arena.allocate is that List.append will be seen as mutating the list object (thus invalidating any outstanding references), whereas Arena.allocate will not be seen as mutating the arena in the same way (thus you can continue allocating into it without invalidating any existing refs). However, the __del__ method on both of them invalidates all outstanding refs!
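To make the List case concrete, here is a minimal sketch. It assumes a recent Mojo toolchain that supports `ref` bindings and list literals, and the names `items` and `first` are purely illustrative. The commented-out line marks the kind of use that an “indirect origins” feature would presumably reject; it is left disabled because the element reference may dangle once append reallocates.

```mojo
def main():
    var items: List[Int] = [1, 2, 3]

    # `first` is a reference derived from `items`.
    ref first = items[0]
    print(first)  # fine: `items` has not been mutated since the borrow

    # append mutates the list and may reallocate its backing buffer.
    items.append(4)

    # Hazard (left commented out): `first` may now point at freed memory.
    # print(first)

    # Re-indexing after the mutation is always fine.
    print(items[0])
```

Under the same assumptions, an arena’s allocate method would differ only in that it would not count as this kind of mutation, so a reference like `first` could stay live across it and be invalidated only by the arena’s destruction.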
There is no concrete design doc for this yet. The Mojo team is getting together sometime this summer and is eager to talk about things like this; we will share more when we can.