It would, but I’m not quite sure how we would implement that aside from adding another argument convention like `__deinit_non_self`.
Thank you @clattner for cleaning up the argument conventions in the latest nightly!
Looking at the changes, I see that `deinit` is allowed in the function definitions in traits. This was a misunderstanding on my side; I thought it was only allowed in function implementations.
`Movable`:
```
fn __moveinit__(out self, deinit existing: Self, /):
    """Create a new instance of the value by moving the value of another.

    Args:
        existing: The value to move.
    """
    ...
```
In the Modular codebase, >99% (all?) of the `__moveinit__` and `__del__` implementations use `deinit` as the argument convention now. So there seems to be no mixup of `var` and `deinit` in the code. That's a good thing. The corner cases where you need to use `var` instead of `deinit` seem to be rare.
Now for the bikeshedding: I still prefer `del` to `deinit`. But that's more a matter of taste than an issue.
When I see `deinit`, I read it as:
- dei-nit: God-level nits
- dein-it: Why am I bowing down to it?
- de-innit: What’s the de- of “isn’t it”, but with a localized accent?!?!?!!?!!!
- de-init: Is the opposite of initialization strictly defined as destruction? Moving away from -init (what de- traditionally means) could be anything, could be anywhere; it’s a vector away from the source, but which way and how far and… (usually attributed to late nights and too much coffee)
- deinnit: Like dinnit or didn’t. “There is no do or not do, there is only dinnit.”
So please: `clear` or `consume` or `wipe` or `release` or even `del`, anything but deinnit.
@clattner I have a proper motivating example for keeping `__disable_del` or a similar escape hatch around. In this case, the easiest way to run into this issue is by using linear types to try to enforce more safety when freeing pointers to non-global allocators.
```
from memory import OpaquePointer, OwnedPointer
from sys.info import alignof, sizeof
from math import align_up
from memory.unsafe import bitcast


@explicit_destroy
struct UnsafeBumpAllocator[slot_alignment: UInt](Movable):
    var start: OpaquePointer
    var end: UInt
    var current: UInt
    var allocations: UInt

    alias Pointer[T: AnyType] = UnsafePointer[T, alignment=slot_alignment]

    fn __init__(out self, var start: OpaquePointer, var end: UInt) raises:
        self.start = start
        self.end = end
        self.current = 0
        self.allocations = 0

    fn allocate[T: AnyType](mut self, count: UInt) raises -> Self.Pointer[T]:
        constrained[alignof[T]() < slot_alignment]()
        var increment = align_up(UInt(sizeof[T]()) * count, slot_alignment)
        if self.current + increment > self.end:
            raise "Out of Memory"
        var ptr = self.start.bitcast[UInt8]().offset(self.current).bitcast[T]()
        self.current += increment
        self.allocations += 1
        return ptr

    fn deallocate[T: AnyType](mut self, var ptr: Self.Pointer[T]):
        @parameter
        fn safety_checks() capturing -> Bool:
            return ptr > __type_of(ptr)(self.start.bitcast[T]().address) and ptr < __type_of(ptr)(self.start.bitcast[UInt8]().offset(self.end).bitcast[T]().address) and self.allocations > 0

        debug_assert[safety_checks, "safe", cpu_only=False]("Invalid pointer")
        self.allocations -= 1

    fn deallocate_batch[T: AnyType, size: UInt](mut self, var ptrs: UnsafePointer[InlineArray[UnsafePointer[
        T, alignment=slot_alignment, *_, **_
    ], size]]):
        @parameter
        fn safety_checks() capturing -> Bool:
            var end_ptr = self.start.bitcast[UInt8]().offset(self.end).bitcast[T]()
            for i in range(size):
                var ptr = ptrs[][i]  # index with the loop variable
                if ptr < self.start.bitcast[T]() or ptr > end_ptr:
                    return False
            if self.allocations < size:
                return False
            return True

        debug_assert[safety_checks, StaticString("safe")]("Invalid pointer")
        self.allocations -= size

    fn release_allocator_memory(deinit self):
        @parameter
        fn safety_checks() capturing -> Bool:
            return self.allocations == 0

        debug_assert[safety_checks, StaticString("safe")]("Freed before all memory was returned")


@fieldwise_init
@explicit_destroy
struct NamedBumpAllocatorPtr[T: AnyType, name: StringLiteral, slot_alignment: UInt = 64](
    Movable, Copyable
):
    alias PointerT = UnsafePointer[T, alignment=slot_alignment]
    var _inner: Self.PointerT

    fn __getitem__(ref self) -> ref [self._inner] Self.PointerT:
        return UnsafePointer(to=self._inner)[]

    fn free(deinit self, mut allocator: NamedBumpAllocator[name, slot_alignment]):
        allocator._inner.deallocate[T](self._inner)


@explicit_destroy
struct NamedBumpAllocator[name: StringLiteral, slot_alignment: UInt = 64]:
    var _inner: UnsafeBumpAllocator[slot_alignment]

    fn __init__(out self, var start: OpaquePointer, var end: UInt) raises:
        self._inner = {start, end}

    fn allocate[T: AnyType](mut self, count: UInt) raises -> NamedBumpAllocatorPtr[
        T, Self.name, slot_alignment=slot_alignment
    ]:
        return NamedBumpAllocatorPtr[T, Self.name, slot_alignment](self._inner.allocate[T](count))

    fn deallocate[T: AnyType](mut self, var ptr: NamedBumpAllocatorPtr[
        T, Self.name, slot_alignment=slot_alignment
    ]):
        ptr.free(self)

    fn deallocate_batch[T: AnyType, size: UInt](mut self, var ptrs: InlineArray[NamedBumpAllocatorPtr[
        T, Self.name, slot_alignment=slot_alignment
    ], size]):
        self._inner.deallocate_batch[T, size](
            UnsafePointer(to=ptrs).bitcast[InlineArray[UnsafePointer[T, alignment=slot_alignment], ptrs.size]]()
        )

    fn release_allocator_memory(deinit self):
        var inner = self._inner^
        inner^.release_allocator_memory()
```
In a hot loop, this escape hatch allows me to “batch free” pointers, something frequently done in packet processing, where you will often have 64/128 packets that you process through a stage of the network stack together. After you transmit the packets, you free them back to the allocator in a batch like this. By forcing `deinit` to only be on `self` arguments, this goes from an O(1) operation to an O(n) operation, in an area where unrolling the loop is likely going to massively bloat performance-sensitive functions (especially so on a GPU). I don’t think that moving this up a level, to a place where `InlineArray` has to be concerned about this, is the correct answer.
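For concreteness, here is a rough usage sketch of the batch free I mean; the `free_rx_batch` function and the element type are hypothetical, but it calls the allocator API above as written:

```
# Hypothetical sketch only: free a whole batch of receive buffers in one call.
fn free_rx_batch[size: UInt](
    mut alloc: NamedBumpAllocator["rx_pool"],
    var pkts: InlineArray[NamedBumpAllocatorPtr[UInt8, "rx_pool"], size],
):
    # Freeing the pointers one at a time would decrement the allocation count
    # `size` times; the batch call does a single `allocations -= size`.
    alloc.deallocate_batch[UInt8, size](pkts^)
```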
edit: after some discussion with @Nick I’ve made a few edits to make the problem clearer. Note that while `NamedBumpAllocator.deallocate_batch` does work right now, it probably shouldn’t.
Okay, I have a bit of time to write, so here’s my proposed alternative to @clattner’s `deinit` syntax.
First, let’s recall our goals:
- Eliminate the special treatment of `__del__` and `__moveinit__`, i.e. that they don’t implicitly invoke `__del__` on their `var` argument.
- Find a design that extends naturally to “custom destructors”/linear types.
- Avoid the need for an ugly `__disable_del` statement.
Here is the main issue I see with the `deinit` design:
- Whether a function directly or indirectly deinitializes a variable should be an implementation detail that callers don’t need to think about. But by putting `deinit` in a function’s signature, we risk misleading Mojo users into thinking that “directly deinitializing an argument” is part of a function’s interface/type.
- As a concrete example, there is a decent chance of confusion when people are writing an implementation of `Movable` for their struct. If the trait declares that the second argument is a `var`, readers might mistakenly think their implementation needs to use `var`, and therefore they won’t reach for `deinit`. Conversely, if the trait specifies `deinit`, readers might mistakenly think they need to use `deinit`, when they are also allowed to use `var`.
The obvious way to resolve this issue is to stick with a single convention (`var`), and move the “deinit” declaration to the body of the function. Something like:
```
fn __moveinit__(out self, var existing: Self):
    # normal move stuff
    self.x = existing.x^
    # now we mark the argument as deinitialized
    deinit existing
```
This syntax would have the same effect as `__disable_del`. By introducing an explicit keyword, we are eliminating the magic from `__del__` and `__moveinit__`, which was our goal. And by leaving function signatures untouched, we avoid misleading people into thinking `deinit` and `var` are distinct “argument conventions” that callers of the function need to reason about.
Keyword alternatives
I think the above proposal is solid, but the name `deinit` isn’t ideal. Contrary to what the keyword suggests, the statement `deinit existing` doesn’t “perform deinitialization”. The statement merely signals that the fields of the argument should be moved or destroyed earlier in the function body, and that the argument should be considered deinitialized thereafter. Said another way: the argument is “pulled apart”, rather than going through `__del__`. (This is also why I find @christoph_schlumpf’s suggestion to use the `del` keyword problematic.)
We need a keyword that means “the argument is no longer live”. Here are two alternatives:
```
fn __moveinit__(out self, var existing: Self):
    self.x = existing.x^
    unassign existing
```
Explanation: The statement marks the argument as “unassigned”, i.e. it marks it as holding no value.
```
fn __moveinit__(out self, var existing: Self):
    self.x = existing.x^
    release existing
```
Explanation: The statement releases the memory of the argument, allowing the memory to be used for other purposes. (Note: the memory is not being released to the global allocator; it’s being released back to whoever the memory’s owner is. The owner could be a `List`’s buffer, or the stack, etc.)
Whatever keyword is chosen, it can be a “soft keyword”.
Summary
Having an explicit `unassign` statement at the end of `__del__` and `__moveinit__` implementations would have the following benefits:
- It eliminates the “`__del__` disablement magic” from these functions.
- It allows Mojo users to define destructors with arbitrary names, which are needed for linear types etc.
- (Different from Chris’s proposal) It keeps function signatures untouched, meaning that users of these functions don’t need to learn about `unassign`! Only implementers of these functions need to learn what it does. This is a major win: custom destructors/move constructors are rarely needed outside of systems programming, so `unassign` can be a keyword for “expert users”.
I think that only having two places in the language with an explicit “logical free” (`deinit`) is going to be rather odd. The arg convention approach is fine, but I think it needs to expand beyond `self` parameters, and we will still likely need an escape hatch of some sort.
To clarify: the `unassign x` statement would be permitted within any function definition that has “privileged access” to the type. It is not a feature specific to `__del__` and `__moveinit__`. The whole point of Chris’s proposal, and my counter-proposal, is to allow ordinary methods to behave in a destructor-like manner.
The semantics of `unassign x` are as follows:
- The compiler enforces that the fields of `x` be deinitialized prior to the statement being reached, either via an explicit transfer `x.field^`, or by an implicit invocation of `x.field^.__del__()`.
- The compiler marks `x` as uninitialized. Crucially, this prevents the destructor of `x` from running!
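To make those two rules concrete, here is a minimal sketch; `Pair` and `consume` are hypothetical, and `unassign` is of course only a proposal:

```
struct Pair:
    var a: String
    var b: String

    fn tear_down(var self):
        # Rule 1: every field must be deinitialized before `unassign` is
        # reached. `self.a` is transferred explicitly; `self.b` is destroyed
        # by an implicit `self.b^.__del__()` inserted by the compiler.
        # (`consume` stands in for any function taking ownership of a String.)
        consume(self.a^)
        # Rule 2: `self` is now marked uninitialized, so `Pair.__del__`
        # will not run on it.
        unassign self
```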
I imagine the semantics I’m describing are consistent with how `__disable_del` works.
Hi folks,
I’m sorry for the delay, I’ve had a bit of PTO recently and wasn’t able to follow all of this. I’ll take a crack at addressing some of the aggregate feedback instead of responding to each post, but let me know if I missed anything important upthread.
Macro comment: I agree with Nick’s summary of the constraints/goals and the positioning. This is definitely an implementation detail of the function, not part of its signature. Agreed.
`deinit` is not part of the type
One clarification: `deinit` is not part of the type of a function (and therefore it is technically meaningless in a trait), because it only impacts the implementation of a method, not its formal signature. In this respect, `deinit` works the same way as `var`. Here’s a testcase:
```
struct GoodDtor:
    fn __del__(deinit self): pass
    fn explicit_dtor(deinit self): pass
    fn normal_var(var self): pass

fn test_deinit_fn_types():
    # These are all the same.
    var fp1 : fn(var self: GoodDtor) -> None
    fp1 = GoodDtor.__del__
    fp1 = GoodDtor.explicit_dtor
    fp1 = GoodDtor.normal_var

    # expected-error @+1 {{'deinit' is not supported in function types, use 'var' instead}}
    var fp2 : fn(deinit self: GoodDtor) -> None
```
For the implementation, I chose to disallow `deinit` in function types (as above) just to avoid this sort of confusion. That said, it is allowed in traits, because I think it would be very confusing to declare `__moveinit__` and `__del__` as taking `var` in their traits: this would mislead people who are going to implement them in the standard way.
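Concretely, the trait side looks like the `Movable` signature quoted at the top of the thread:

```
trait Movable:
    # `deinit` is allowed in the trait declaration, even though it is not
    # part of the function's type.
    fn __moveinit__(out self, deinit existing: Self, /): ...
```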
On removing `__disable_del`
I realize that Rust has `std::forget`, but that doesn’t mean Mojo should. Such an operation is extremely dangerous because it breaks memory safety, e.g. it makes it trivial to introduce memory leaks in general code. I understand that Rust declares that leaks are not safety problems, but I don’t think that is the right approach for Mojo. We can and should aim higher.
That said, we do need the ability to perform this operation. The proposed approach gives the type author the ability to do this by implementing a method (which can otherwise be a no-op). Evan is actively implementing “extensions” for structs in Mojo, which will allow other code to add their own “forget” method to arbitrary other types as well, so I don’t think there is a need to add syntax.
There is a suggestion of renaming `__disable_del x` to just `del x`. This seems like a bad idea to me: `del self` literally already means something else in Python, and Mojo will eventually support classes. Reappropriating it in this way would cast a dark shadow on the future.
Nick also suggested renaming it to `deinit x`. The problem with this is that it implies you could `deinit` anything, not just `self`. It would be very strange to limit a statement to `self`, and equally strange to have a statement (required in every `__del__`) named `deinit_self`.
I don’t really understand Owen’s bump pointer example. Wouldn’t that just require types that have trivial destructors?
On naming `deinit`
I have no specific love for `deinit` and am happy to rename it to something better. That said, I don’t agree that `del` (an existing Python keyword with well-defined semantics) is the right way to go. `del` removes a key from a dictionary (including the “local variables” dictionary); it does not disable destructors.
I suspect that advocates for `del` as a spelling are arguing for it because it is 3 letters (shrug, seems like little motivation), but more likely because the `__del__` method would use it. Note though that `__del__` is a very different thing: `__del__` is the operation that tears down the struct when the last reference goes away (its “implicit” destructor). It is an operation, just like the one in Python. It is true that the `__del__` method implementation needs an argument with a disabled destructor, but the entire point of the proposal is that this isn’t specific to `__del__`; we need the ability for named methods to do the same thing.
The thing we’re talking about in this proposal is not an “operation”; it is a declarative specification of how the self argument of a method is processed. The choice of the word “deinit” specifically (which is loosely held) was motivated by aligning with the existing terminology of “init” for initializers/constructors, and providing a dual for that. I don’t consider names like “clear” or “consume” or “wipe” or “release” to be very evocative personally, but I’m open to discussion if y’all think one is well motivated and aligns with other existing nomenclature in the language: the “no del” ideas seem more directionally correct, since that’s what they do.
Maybe not an arg convention?
Another different direction to take this is to make it a decorator instead of an arg convention. For example:
```
@deinit
fn __del__(var self): ...
```
or (Mojo doesn’t support decorators on arguments, but let’s stretch our imagination):
```
fn __del__(@deinit var self):
```
One argument in favor of this is that it would make it clearer that this is an implementation detail of the method, not something that impacts its type. OTOH this seems overly verbose for no good reason.
I’m sorry if I missed a macro point upthread; if so, please let me know. Thank you all for the discussion here! This is how we shape Mojo’s evolution, and there is still time to make changes.
-Chris
> On removing `__disable_del`
>
> I don’t really understand Owen’s bump pointer example. Wouldn’t that just require types that have trivial destructors?
The example is showing a use case where making `self` the only thing you can `deinit` leads to an algorithmic performance regression. In this case, having `self` be `deinit` means that I need to keep decrementing the allocation count by 1 for each pointer, and hope that compiler magic merges all of those into a single add, something the compiler may not be able to do if the array size is not known at compile time, since that would involve realizing that there will be `len(array)` decrement operations and merging them.
The point with this is that the types might not have trivial destructors, but that I have in some way already taken care of that, and I now need to deal with a bag of linear types where I don’t want to call a function for each of them, but I, the programmer, know that they are fully destroyed, so the destructor should be disabled. This is mostly an escape-hatch thing, and while I don’t think it’s a good idea to use it often, not having it makes it harder to use linear types to build a `SlightlyLessUnsafePointer` type that lives in the hot loop of a program, because it constrains how I can manage them. I don’t see a good way to do this with `deinit` in a way that doesn’t force O(n) work in the number of instances of linear types I need to destroy. This isn’t a problem for many programs, but I have a lot of use cases where `n` is going to be 64/128/256 and the code will live in a hot loop that gets invoked half a million times per second, so that overhead adds up rapidly. Relying on the compiler to hopefully see through all of the allocator logic and figure out what’s going on seems like relying on compiler magic to me.
We can call it `__unsafe_bad_idea_disable_dtor` if you think that’s a sufficiently scary name to dissuade people from using this escape hatch, but I think the escape hatch needs to continue to exist. I agree with you that making it a safe operation with well-defined consequences in all circumstances is a bad idea. Using it should require reading the source of the linear type to understand what invariants get broken if you don’t properly free it, and it should come with a “here be dragons”. You can introduce OOM crashes via this mechanism, or break memory safety, or do any number of other horrible things, but it’s also a powerful tool when you really need it. I want it to be a keyword because it’s one of the things I want to be able to audit a codebase for, since it can be used incorrectly to cause issues, but if the compiler can tell me everywhere the equivalent behavior is used via some kind of audit mode then that’s also fine.
Similarly, the single-pointer `deallocate` also wants to `deinit` a type other than `self`, since the allocator holds responsibility for freeing the pointer. This is behavior I would like to see supported for cases exactly like this, where something might want to logically destroy a linear type but hold onto the raw memory for later reuse. Here, allowing `deinit` on non-self args removes the need for an escape hatch.
> On naming `deinit`
I agree that `del` is different and may create expectations of behavior that `deinit` does not have. I’ll toss `destroy` into the ring as an option, but that might not make sense for some linear types. `deconstruct` is another option but feels a bit long, so `decon` might be an option, since that ties into the OOP knowledge shared by a lot of programmers.
> Maybe not an arg convention?
This might be a better direction to go in. It helps clarify that `deinit` is not part of the function type and is an implementation detail. I like the arg-attached version better.
Thanks for responding Chris! I think we are close to being on the same page. I agree 100% with the requirements you’ve laid out. Where we differ is that I believe those requirements can be fulfilled by a statement, whereas you believe they are best fulfilled by some kind of annotation on the function.
I’ll attempt to explain why I believe a statement is most appropriate.
> I realize that Rust has `std::forget` but that doesn’t mean Mojo should. Such an operation is extremely dangerous because it breaks memory safety.
I agree: we shouldn’t offer a function that just “forgets” to clean up a `var`. That’s a massive footgun. (Except for super niche expert use cases.)
> Nick also suggested renaming it to `deinit x`. The problem with this is that it implies you could `deinit` anything, not just `self`. It would be very strange to limit a statement to `self`.
That’s right: for the sake of generality, I believe `deinit` should be invocable upon arguments beyond just `self`. However, we definitely need to restrict its use. So, I propose that the statement be invocable upon any argument for which the programmer has access to the private members. The logic being: in any context where the private members of an object are accessible, we are already trusting the programmer to “do the right thing” with the object. A programmer can break memory safety by overwriting an object’s private members, and they can also break memory safety by using `deinit` inappropriately. Therefore, I am proposing that Mojo’s mechanism for blocking access to private members should also be used to block the invocation of `deinit`. In my opinion, this is a good simplification of the model. It means Mojo doesn’t need any special restrictions specific to `deinit`!
`deinit` does NOT have the same behaviour as Rust’s `mem::forget`!
I need to clarify one last detail: I am not proposing that the `deinit` statement merely “disable the destructor”. Instead, what it does is “pull apart” the fields of the object, forcing them to be transferred elsewhere, either explicitly by the programmer, or by the implicit invocation of `thing.field^.__del__()`.
Given this behaviour, perhaps the `deconstruct` keyword suggested by Owen better conveys what’s happening.
Here’s a concrete example:
```
struct Foo:
    fn __init__(out self):
        pass

    fn __del__(var self):
        # This statement forces all of the fields of `self` to
        # be uninitialized. No fields exist, so this is trivial.
        # Finally, `self` is marked as uninitialized.
        deconstruct self
        print("An instance of Foo was destroyed")

    fn __moveinit__(out self, var existing):
        # Moving is also trivial.
        deconstruct existing


struct Composite:
    var x: Foo
    var y: Foo

    fn __del__(var self):
        # This statement invokes the destructor of `self.x`
        # and `self.y`. (The print statements are executed.)
        # Finally, `self` is marked as uninitialized.
        deconstruct self

    fn __moveinit__(out self, var existing: Self):
        self.x = existing.x^
        self.y = existing.y^
        # The fields of `existing` have already been deinitialized,
        # so all the following statement does is mark `existing` as
        # uninitialized. (Therefore, its destructor won't be run.)
        deconstruct existing

    fn special_move(out self, var existing: Self):
        self.x = existing.x^
        self.y = Foo()
        # The following statement invokes the destructor of
        # `existing.y`. (Its print statement is executed.)
        # The field `existing.x` has already been deinitialized,
        # so no action is required there.
        # Finally, `existing` is marked as uninitialized.
        deconstruct existing
```
Design summary
The proposed `deconstruct` statement offers a structured way to deconstruct an object. If you think of an object as being “encapsulated” inside a balloon, `deconstruct` merely pops the balloon, and allows the individual fields of the object to be transferred and/or destroyed, whatever its user wants to do. However, because `deconstruct` breaks encapsulation, it can only be invoked in a context where the programmer has access to the private members of an object, i.e. in a context where encapsulation is already absent.
Mojo doesn’t yet offer member privacy (`private`), but it’s on the roadmap. When privacy is introduced, it will become very difficult to break memory safety using `deconstruct`.
> > I realize that Rust has `std::forget` but that doesn’t mean Mojo should. Such an operation is extremely dangerous because it breaks memory safety.
>
> I agree: we shouldn’t offer a function that just “forgets” to clean up a `var`. That’s a massive footgun. (Except for super niche expert use cases.)
I agree that in many contexts having a “turn off the dtor” keyword is a footgun. However, I spend much of my time in “super niche expert use cases” chasing the limits of the hardware. I think that this escape hatch does need to exist to allow maximum performance while constructing low-level primitives like allocators, while using linear types to enhance their safety. I’m fine leaving this as a semi-internal feature with a `__` prefix and then only documenting it in the “Mojonomicon” or wherever else we document all of the things you can easily shoot yourself in the foot with in Mojo.
> > Nick also suggested renaming it to `deinit x`. The problem with this is that it implies you could `deinit` anything, not just `self`. It would be very strange to limit a statement to `self`.
>
> That’s right: for the sake of generality, I believe `deinit` should be invocable upon arguments beyond just `self`. However, we definitely need to restrict its use. So, I propose that the statement be invocable upon any argument for which the programmer has access to the private members. The logic being: in any context where the private members of an object are accessible, we are already trusting the programmer to “do the right thing” with the object. A programmer can break memory safety by overwriting an object’s private members, and they can also break memory safety by using `deinit` inappropriately. Therefore, I am proposing that Mojo’s mechanism for blocking access to private members should also be used to block the invocation of `deinit`. In my opinion, this is a good simplification of the model. It means Mojo doesn’t need any special restrictions specific to `deinit`!
I agree with Nick. We have my bump allocator example for single-pointer `deallocate` as somewhere you want to `deinit` something other than `self`. We may be able to restrict the ability to do that via something like C++'s friend functions, but that has all of the well-known downsides of friend functions. As another example, to do compile-time refcounting using linear types, I need to destroy 2 objects. If they are both for the same type, I need something like `fn merge_counts[T: AnyType, N: UInt, Lhs: LinearRcOwner[T, N], Rhs: LinearRc[T]](deinit lhs: Lhs, deinit rhs: Rhs) -> LinearRcOwner[T, N-1]`.
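Sketched out a bit more fully (these types and the counting scheme are hypothetical stand-ins, and the signature is simplified from the one above), it looks something like this:

```
@fieldwise_init
@explicit_destroy
struct LinearRc[T: AnyType](Movable):
    var _ptr: UnsafePointer[T]

@fieldwise_init
@explicit_destroy
struct LinearRcOwner[T: AnyType, N: UInt](Movable):
    var _ptr: UnsafePointer[T]

# Both arguments are logically destroyed here, but neither should run a
# destructor; the reference count just moves into the result's parameter.
fn merge_counts[T: AnyType, N: UInt](
    deinit lhs: LinearRcOwner[T, N], deinit rhs: LinearRc[T]
) -> LinearRcOwner[T, N - 1]:
    return LinearRcOwner[T, N - 1](lhs._ptr)
```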
I think there needs to be a discussion about encapsulation in Mojo. Making it so that `deinit` can only go on `self` is a fairly strong form of encapsulation, one I think might be a bit too strong, given that I think we want the ability for types to cooperate, at least types from the same file and possibly types in a library scope. To me, this clashes with not having private variables or other forms of encapsulation, because it means that the only reason I can’t move the body of a destructor to another type or a free-floating function is because the compiler says I can’t. I can see the point that, in most cases, you shouldn’t deinit anything other than `self`, but there are also places where you must do that. If we want to start adding encapsulation, I think we need more scopes than “everything is totally open to everywhere” and “nobody but `self` can touch this”. Rust’s `pub(crate)` visibility is a good example, since it allows anything in the library to look at and use the thing, but nothing outside of the library. Rust’s default private visibility is also module-scoped, which in Mojo would mean file-scoped visibility, and I think that being able to have a few closely related types which interact with each other’s internals in a single file is a good thing.
> `deinit` does NOT have the same behaviour as Rust’s `mem::forget`!
>
> I need to clarify one last detail: I am not proposing that the `deinit` statement merely “disable the destructor”. Instead, what it does is “pull apart” the fields of the object, forcing them to be transferred elsewhere, either explicitly by the programmer, or by the implicit invocation of `thing.field^.__del__()`.
I think this would lead to some very weird behavior unless you could only invoke it on totally uninitialized objects (meaning that everything has been moved away already), since due to linear types it would need to yield all members which haven’t been moved away.
```
struct Foo:
    fn __init__(out self):
        pass

    fn __del__(var self):
        # This statement forces all of the fields of `self` to
        # be uninitialized. No fields exist, so this is trivial.
        # Finally, `self` is marked as uninitialized.
        deconstruct self
        print("An instance of Foo was destroyed")

    fn __moveinit__(out self, var existing):
        # Moving is also trivial.
        deconstruct existing


struct Composite:
    var x: Foo
    var y: Foo

    fn __del__(var self):
        # This statement invokes the destructor of `self.x`
        # and `self.y`. (The print statements are executed.)
        # Finally, `self` is marked as uninitialized.
        deconstruct self

    fn __moveinit__(out self, var existing: Self):
        self.x = existing.x^
        self.y = existing.y^
        # The fields of `existing` have already been deinitialized,
        # so all the following statement does is mark `existing` as
        # uninitialized. (Therefore, its destructor won't be run.)
        deconstruct existing

    fn special_move(out self, var existing: Self):
        self.x = existing.x^
        self.y = Foo()
        # The following statement invokes the destructor of
        # `existing.y`. (Its print statement is executed.)
        # The field `existing.x` has already been deinitialized,
        # so no action is required there.
        # Finally, `existing` is marked as uninitialized.
        deconstruct existing
```
Having to manually write `deconstruct` seems like a recipe for footguns. For instance, does forgetting it in `__del__` mean you recursively call `__del__`? Also, per my previous point, how do I deal with `existing.y` being a linear type in `special_move`? What we could try is to make it so that `deconstruct` is an argument convention which invokes the destructors of any remaining members which haven’t been deconstructed at the end of the function.
However, I think this is cleaner as an arg convention which disables the top-level dtor (to avoid recursion), but leaves the member dtors intact.
```
struct Composite:
    var x: Foo
    var y: Foo

    fn __del__(deconstruct self):
        pass
        # call x.__del__() and y.__del__()

    # alt function
    fn __del__(deconstruct self):
        global_foo_trashcan.put(self.x^)
        # call y.__del__()


struct FooTrashcan:
    fn trash_x(mut self, deconstruct foos: Composite):
        self.put(foos.x^)
        # if Foo is linear, then error
        # If Foo is affine, then call foos.y.__del__()

    fn trash(mut self, deconstruct foos: Composite):
        self.put(foos.x^)
        self.put(foos.y^)
        # Nothing left to destroy.
```
One question this does leave is whether you should be able to `deconstruct` a type where you can see all of the linear members but some affine members are hidden from you. Another option would be to have a concept of `visibility(file) fn __del__(...)`, where some types have RAII which is only accessible in a given visibility scope. This would make them act like linear types for the purposes of `deconstruct` in random other files or libraries, meaning that being private would stop users from using `deconstruct`, but without any extra burden on the implementer of needing to make a pile of linear dtors.
Every proposal that we’ve seen requires a keyword to be manually written. In Chris’s original proposal, you need to manually write `deinit` in the argument list. In my proposal, you need to manually write `deconstruct` at the end of the function. All proposals will result in the exact same compile-time warning/error (not a footgun) if you forget to use the keyword.
The compiler would notice that `__del__` is calling itself recursively, and it would emit an error, suggesting that you use `deconstruct`.
Something similar would happen for moveinit: if the compiler sees that moveinit is implicitly invoking `__del__`, it would emit a warning (“this probably isn’t what you want”), and suggest that the user either invoke `deconstruct`, or manually invoke `__del__` if that’s really what they intended.
That’s exactly the semantics I have proposed for the `deconstruct` statement. It would behave in the manner you describe, even though it appears at the end of the function. It doesn’t need to appear in the syntactic position of an argument convention.
Again… that’s the semantics that I’ve proposed!
> Every proposal that we’ve seen requires a keyword to be manually written. In Chris’s original proposal, you need to manually write `deinit` in the argument list. In my proposal, you need to manually write `deconstruct` at the end of the function. All proposals will result in the exact same compile-time warning/error (not a footgun) if you forget to use the keyword.
Let me clarify. I don’t see an issue with it as an arg convention, since that’s modifying what the function is doing and is easy to remember, but it’s much easier for there to be a path through the function which doesn’t deconstruct something and results in calling `__del__`.
> The compiler would notice that `__del__` is calling itself recursively, and it would emit an error, suggesting that you use `deconstruct`.
How do you avoid this error tripping on deconstructing a singly linked list?
> > Also, per my previous point, how do I deal with `existing.y` being a linear type in `special_move`? What we could try is to make it so that `deconstruct` is an argument convention which invokes the destructors of any remaining members which haven’t been deconstructed at the end of the function.
>
> That’s exactly the semantics I have proposed for the `deconstruct` statement. It would behave in the manner you describe, even though it appears at the end of the function. It doesn’t need to appear in the syntactic position of an argument convention.
So it’s not a statement, it’s an expression which evaluates to some collection of linear (and non-linear?) members. That also causes issues with unmovable types, so we would need some guarantee that the members didn’t move. To me, it seems like making it an arg convention means that it’s more or less exactly the same, except that it’s not possible to run into the compiler error you describe as being needed. I’d rather make it so that it is impossible to ever see that error.
> How do you avoid this error tripping on deconstructing a singly linked list?
I’d need to see an example to be convinced that there’s actually a problem that needs addressing.
I am proposing that `deconstruct` be a statement, not an expression. I think you’ve misinterpreted something I said.
If people prefer, I would also find it acceptable for `deconstruct self` to appear before the deconstruction process begins.
```
fn __moveinit__(out self, var existing: Self):
    deconstruct existing
    self.x = existing.x^
    self.y = existing.y^
```
My primary goal is to keep the keyword out of the function signature, because that is going to cause bucketloads of confusion.
> I’d need to see an example to be convinced that there’s actually a problem that needs addressing.
```
struct ListNode[T: AnyType]:
    var next: OwnedPointer[Self]
```
This will have a corecursive `__del__`. If I do it myself with `UnsafePointer`, I can make it a recursive one very easily with `self.next.destroy_pointee()`. You said recursion would be an error, but this is a valid reason to recurse in a dtor.
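Roughly, the manual `UnsafePointer` version I have in mind looks like this (sketch only, using the current `deinit` convention on `self`):

```
struct RawListNode[T: AnyType]:
    # Payload field omitted for brevity; only the link matters here.
    var next: UnsafePointer[Self]

    fn __del__(deinit self):
        if self.next:
            # Destroying the pointee runs the next node's __del__, which
            # recurses down the rest of the list. That recursion is valid.
            self.next.destroy_pointee()
            self.next.free()
```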
> > So it’s not a statement, it’s an expression
>
> I am proposing that `deconstruct` be a statement, not an expression. I think you’ve misinterpreted something I said.
I think that I did, because my thought was that it would yield linear types for the user to deal with, but I guess it could just error if there’s an initialized linear type involved.
```
fn __moveinit__(out self, var existing: Self):
    deconstruct existing
    self.x = existing.x^
    self.y = existing.y^
```
At this point, why not this?
```
fn __moveinit__(out self, deconstruct existing: Self):
    self.x = existing.x^
    self.y = existing.y^
```
To me, forcing users to remember `deconstruct` means that one of two things needs to happen:
- We special-case `__moveinit__` and `__del__` to issue a compiler warning when you forget, meaning that those functions are still “special” as far as this keyword is concerned.
- Users can forget to call `deconstruct` and the compiler will let them continue, meaning that some dtors may not be called, or it may try to call a dtor on an already moved-from instance of a type.
When I said recursive invocation of `__del__` would be an error, I meant `self.__del__` invoking `self.__del__`. This is one of the challenges highlighted in Chris’s proposal, and I was responding to that.
I am imposing no restrictions on valid recursion.
> At this point, why not this?
We’re going in circles now. In my earlier posts, I explained in a lot of detail why I think putting the deconstruction functionality in function signatures is a mistake. It misleadingly suggests that `deconstruct` is an argument convention (an agreement between the caller and the callee) when it’s actually just a private detail of the function’s implementation. Mojo programmers will be confused/misled by this syntax. I have enough experience teaching and watching people learn new languages to confidently predict this.
> To me, forcing users to remember `deconstruct` means that one of two things needs to happen: We special-case `__moveinit__` and `__del__` to issue a compiler warning when you forget
The warning for `__del__` would be “unproductive recursion” and is a valid warning for all functions. The warning for `__moveinit__` would be a recommendation to `deconstruct` the `var` argument instead of implicitly `del`eting the argument. Yes, the warning would be specific to `__moveinit__`, but this is really not a big deal. In fact, it’s something that could be left to an external linter if we wanted. It’s not dangerous; C++ programmers are accustomed to this pattern. It’s just bad practice.
> Users can forget to call `deconstruct` and the compiler will let them continue, meaning that some dtors may not be called, or it may try to call a dtor on an already moved-from instance of a type.
If you don’t write `deconstruct` then the normal rules for destroying a `var` apply. If you’ve moved one of its fields then it has a “hole” in it and the destructor will refuse to run. Otherwise its destructor will be invoked. There is nothing dangerous about this, and this behaviour isn’t specific to `__del__` or `__moveinit__`. It’s just how `var` works in Mojo.
At the end of the day, I’d like to point out that nobody has presented a good reason to put the new `deinit`/`deconstruct` keyword into function signatures.
Consider the differences between:
```
fn __moveinit__(out self, deconstruct existing: Self):
```
…and:
```
fn __moveinit__(out self, var existing: Self):
    deconstruct existing
```
…and Chris’s suggestion of:
```
@deconstruct
fn __moveinit__(out self, var existing: Self):
```
None of these alternatives is particularly verbose, and the use of `deconstruct` will be rare in practice, so attempting to find a syntax that minimizes the number of ASCII characters is a waste of time in my opinion! The learnability of this feature is orders of magnitude more important. These are entirely different leagues of concern, and I have been prioritizing learnability.
So for those who would like `deconstruct`/`deinit` to appear in a function’s signature, I would hope to hear your perspective on why this is more learnable than making it the first statement appearing inside the function body. And I would hope to see you respond to my earlier posts on why I think putting the keyword in function signatures is going to invite confusion. (I can already envision all of the blog posts asserting that “Mojo’s argument conventions are `read`/`mut`/`var`/`deinit`.”)
Considering the solutions discussed so far, I like the decorator best, because it
- does not change the function’s signature
- has a clear unique place (in front of the function declaration instead of anywhere inside the function body)
- communicates that something special is happening inside this function.
The only disadvantages that I see are that
- it seems a little more wordy than an arg convention
- specifying which arguments are considered requires either string arguments for the decorator or some non-Pythonic special syntax.