`UnsafePointer` questions (init_pointee_copy, destroy_pointee)

Hi all,

Reading the docs once again, I have some questions regarding UnsafePointer functionality.

There is this example in the docs, at Value destruction | Modular:

struct HeapArray(Writable):
    var data: UnsafePointer[Int, MutExternalOrigin]
    var size: Int

    def __init__(out self, *values: Int):
        self.size = len(values)
        self.data = alloc[Int](self.size)
        for i in range(self.size):
            (self.data + i).init_pointee_copy(values[i])

    def write_to(self, mut writer: Some[Writer]):
        writer.write("[")
        for i in range(self.size):
            writer.write(self.data[i])
            if i < self.size - 1:
                writer.write(", ")
        writer.write("]")

    def __del__(deinit self):
        print("Destroying", self.size, "elements")
        for i in range(self.size):
            (self.data + i).destroy_pointee()
        self.data.free()


def main() raises:
    var a = HeapArray(10, 1, 3, 9)
    print(a)

destroy_pointee for primitive types

To my understanding when working with UnsafePointer there is more responsibility for the developer when it comes to memory management.
One of those responsibilities is to call the appropriate lifecycle methods when necessary.

I totally get that calling destroy_pointee is necessary for more complex/nested types like UnsafePointer[String, ...] or UnsafePointer[List[Int], ...], because in those cases the pointee itself has allocated memory that must be freed, and the cleanup cascades down.

However, I do not understand why this should be necessary for a primitive type like Int.
Is this really necessary? Or is this just a very simple example where the whole for-loop + destroy_pointee will be optimized to a no-op?
Super curious to know if there are real benefits of any kind to keeping it when working with primitive types. :nerd_face:

init_pointee for primitive types

Similar question to above.
When I have the HeapArray + the following code:

def main() raises:
    var s = 5
    var a = HeapArray(s)

    for i in range(s):
        a.data[i] = i * i

    print(a)

Is there any value/advantage of the following code in HeapArray.__init__:

for i in range(self.size):
    (self.data + i).init_pointee_copy(values[i])

Again, I totally understand that init/destroy pointee are required for complex types where the pointee itself needs to allocate memory.
But I do not understand why this should be the case for simple primitive types.
The docs on UnsafePointer and its lifecycle are nice, but they do not go deep enough for me to understand this.

As far as I can tell, init_pointee is there so that the correct lifecycle methods are called. In this case the idea is probably to clean up (destroy) existing memory and set up (initialize) memory correctly for the new object.
However, with my limited understanding, there is absolutely nothing to do for primitive types, right?
What am I missing?
I am probably not deep enough into memory management yet… so it would be awesome if someone could enlighten me :fire: :star_struck:

Good question @JulianJS !
In short you are correct. destroy_pointee for any type that has a trivial destructor is a no-op. So as you pointed out, ptr.destroy_pointee() where ptr is an UnsafePointer[Int, origin] is essentially a no-op and the loop would ideally get optimized out by the compiler.
The Mojo example is likely just trying to show a “best practice”: if the pointer pointed to a non-trivially destructible type like List, or if it was used generically where you didn’t know whether the type was trivial, you should always call destroy_pointee().
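
To make the non-trivial case concrete, here is a minimal sketch with String elements. It assumes alloc is available exactly as in the docs example above, and the names count and names are made up for illustration. Each String may own its own buffer, so skipping destroy_pointee here could leak that buffer even though the array's block itself is freed.

```mojo
def main() raises:
    var count = 3
    # Allocates raw, uninitialized storage for three String values.
    var names = alloc[String](count)

    # Initialize each slot in place; no String lives there yet.
    for i in range(count):
        (names + i).init_pointee_move(String(i))

    # The slots are initialized now, so reading through [] is fine.
    for i in range(count):
        print(names[i])

    # Run each String's destructor so it can release what it owns...
    for i in range(count):
        (names + i).destroy_pointee()
    # ...and only then free the block that held the String values.
    names.free()
```

With Int in place of String, the destroy loop has nothing to do, which is why it can be dropped or optimized away for trivial types.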

Unfortunately the init_pointee case is a bit more nuanced. Ideally you should always be using init_pointee and never directly dereferencing the underlying pointer if that pointer points to uninitialized memory.
In your case when you do

var ptr = alloc[Int](size)
for i in range(size):
    ptr[i] = i * i

The [i] is calling UnsafePointer.__getitem__ under the hood, which, if you look at the signature, returns a ref [_] Int. The issue here is that a ref type in Mojo should never refer to uninitialized memory, as that could potentially cause undefined behavior. By calling ptr[i] = i * i you are creating an intermediate ref Int which refers to uninitialized memory, and that is not allowed. You should instead call (ptr + i).init_pointee_move(i * i), since this does not create an intermediate reference.

As of today the code ptr[i] = i * i simply “happens to work correctly” due to how the Mojo compiler lowers the code to LLVM. Theoretically we could change this in the future, which would then cause your code to become UB.

TL;DR:

  • destroy_pointee() is not necessary when the pointer’s type has a trivial destructor.
  • init_pointee_move/copy() should be used even when the type is trivial to avoid creating an intermediate ref to uninitialized memory.
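
Putting both points together for the Int case, a rough sketch might look like the following. It assumes alloc is available as in the docs example, and sum_of_squares is a hypothetical helper name, not something from the standard library.

```mojo
def sum_of_squares(size: Int) -> Int:
    # Raw allocation: the memory behind ptr is uninitialized here.
    var ptr = alloc[Int](size)

    # Initialize through init_pointee_move instead of ptr[i] = ...,
    # so no ref to uninitialized memory is ever formed.
    for i in range(size):
        (ptr + i).init_pointee_move(i * i)

    # Reading through [] is fine once the slots are initialized.
    var total = 0
    for i in range(size):
        total += ptr[i]

    # Int has a trivial destructor, so no destroy_pointee loop is needed;
    # freeing the allocation is still required.
    ptr.free()
    return total


def main() raises:
    print(sum_of_squares(5))  # 0 + 1 + 4 + 9 + 16
```

If the element type might later become generic or non-trivial, keeping the destroy_pointee loop before free() is the safer default.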

Thanks a lot for the great explanations! :nerd_face:


Would not have been an issue if it followed Python semantics and called __setitem__ instead.