I have been working for many weekends and evenings on FireBolt, a Mojo implementation of the Apache Arrow format started by Krisztián Szűcs. With recent Mojo releases, the time has come to clean up our memory story. My current thinking on the ownership structure is here.
trait Array(Movable, Representable, Sized, Stringable, Writable):
    fn as_data(var self) -> ref [self] ArrayData:
        """Return a read only reference to the ArrayData wrapped by self."""
        pass
With this trait definition, the compiler fails with:
firebolt/arrays/base.mojo|10 col 40 error| cannot return 'self's origin, because it might expand to a @register_passable type
|| fn as_data(var self) -> ref [self] ArrayData:
I do not intend to use this trait with @register_passable types. How do I communicate this to the compiler?
There is also a separate error once I start to implement the trait. Is it possible to indicate that I need any origin derived from __origin_of(self)?
firebolt/arrays/nested.mojo|68 col 20 error| cannot return reference with incompatible origin: 'self.data' vs 'self'
|| return self.data
Hi my friend, to solve the first issue, you can use fn func_name(ref self, …) …: since ref works with both register-passable types and memory types. If you want a read-only (immutable) origin, you will need fn func_name[self_origin: ImmutableOrigin](ref [self_origin] self, …) …:
The second issue is just your origin signature. Each field within your struct has its own origin, so your return type should be ref [self.data] instead of ref [self].
Happy to know you are working on the Apache Arrow format for Mojo. I was going to do the same, but I’ll see if I can contribute to your repo.
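As a minimal sketch of both points on a concrete struct (not on the trait), the accessor could look roughly like this. It assumes a recent Mojo release; ArrayData and PrimitiveArray here are hypothetical stand-ins with a single field each, not FireBolt's real definitions.

```mojo
# Hypothetical stand-in for the real ArrayData type.
struct ArrayData:
    var length: Int

    fn __init__(out self, length: Int):
        self.length = length


# Hypothetical array type that wraps an ArrayData in its `data` field.
struct PrimitiveArray:
    var data: ArrayData

    fn __init__(out self, length: Int):
        self.data = ArrayData(length)

    # `ref self` borrows the array instead of taking ownership of it.
    # The returned reference is tied to the origin of the `data` field,
    # so the declared origin matches the `return self.data` expression.
    fn as_data(ref self) -> ref [self.data] ArrayData:
        return self.data


fn main():
    var arr = PrimitiveArray(3)
    # Read a field through the returned reference.
    print(arr.as_data().length)
```

This only shows the signature working when the field is known. Declaring the same thing in a trait, where the implementer's field names are not visible, is a separate question.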
firebolt/arrays/primitive.mojo|116 col 49 error| cannot infer origin for a function result
|| ](ref [self_origin]self) -> ref [self_origin.data] ArrayData:
firebolt/arrays/primitive.mojo|117 col 20 error| cannot return reference with incompatible origin: 'self_origin._mlir_origin.data' vs 'self_origin._mlir_origin'
|| return self.data
These errors make sense to me; I just don’t know how to express what I need. Each object passed in as self contains a data field of type ArrayData, and I want to extract it and return a reference to it.
Samuel is right, using ref self is a good workaround for this.
Unfortunately, the origin system doesn’t have a good way to decouple origin references from variable accesses, and doesn’t support the ability to erase parts of the access pattern, which is what you need in a trait. This is something we need to build out, but it isn’t in the short-term plan due to other priorities.
Your as_data example seems like it should work though, could you post a small self-contained example that I can try out?
Ah, yes, that is precisely what we can’t do right now. We need a way to “erase” field information instead of keeping it precise. We don’t have syntax or an implementation that is confirmed, but that would allow you to write something like:
Are you also considering a way to “upcast” origins ([self.core] → [self]) in case someone wants the “compile-time RW lock” behavior that Rust has to uphold some invariants?
I pushed a variation on your suggestion to the main repo.
I also have a client repo where I play with the integration with Python/Parquet using some earlier PyCapsule work. There is a lot of work to do here; the direction I am focussing on is to complete enough of the API to make it useful. I am sure that the internals can be improved, especially as Mojo goes towards 1.0.