Struct Extensions Proposal

This seems like a way to spell the same thing as the following in Rust, but explicitly attaching the extension to structs and not requiring a trait.

impl<T> Roundable for T where T: Floatable {
    fn round(&self) -> isize { ... }
}

I’ve stolen Mojo’s name for float-like things because Rust doesn’t have a standard one.

I think this is a very interesting idea for having reusable non-conforming extensions, especially if we decide that all non-conforming extensions are automatically imported alongside the struct. Rust forbids wildcard (impl<T> T where ... { ... }) extensions due to the orphan rules, but there’s also the practical concern that, in most cases, extending every single struct is a bad idea.

However, I have an adjacent idea. What if we offered “trait extensions”?

trait extension Floatable:
    fn round(self) -> Int:
        ...

Trait extensions would be required to provide implementations of the behavior, essentially adding a way to add default functions onto a trait using only functions from that trait or other ones which are added as bounds.
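As a rough illustration of that idea, here is a minimal Python sketch (Python rather than Mojo, since the extension syntax above is only proposed). The extension body is written purely against what the trait guarantees, and every conforming type picks it up. The names `Floatable`, `to_float`, `round_ext`, and `Meters` are hypothetical and not part of any proposal.

```python
from abc import ABC, abstractmethod


class Floatable(ABC):
    """Stand-in for a float-like trait with one required method."""

    @abstractmethod
    def to_float(self) -> float: ...


def round_ext(self: Floatable) -> int:
    """The 'extension' body: it only uses what the trait guarantees."""
    value = self.to_float()
    # Round half away from zero, using integer truncation.
    return int(value + 0.5) if value >= 0 else -int(-value + 0.5)


# Attach the default once to the trait; conforming types inherit it.
Floatable.round = round_ext


class Meters(Floatable):
    """A conforming type that knows nothing about rounding."""

    def __init__(self, value: float) -> None:
        self.value = value

    def to_float(self) -> float:
        return self.value


print(Meters(2.5).round())   # 3
print(Meters(-1.2).round())  # -1
```

Note that `Meters` never mentions `round`; it gets the method only because it conforms to the trait.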

Evan proposed trait extensions in his original doc. :shushing_face:

So he did, that’s what I get for not re-reading the proposal.

Hi All

I just went through the proposal on git and this got me really excited. There was one point where I thought of a different way of writing it.

Decision 6: Are extensions automatically imported?

I must say that anything that is implicitly done is something that trips me up from time to time. Especially if I have to check code outside of an IDE, but it might just be me. So I am more in favor of it being explicit.

I did note, however, an issue: what would happen if the extension was imported instead of the target struct? In that case I would say it is an error, but I also thought of maybe making the import slightly different to prevent that from being possible.

Let’s say instead of this:

# Library A's __init__.mojo

from .spaceship import Spaceship
from .spaceship_extensions import Spaceship

it was more like this

# Library A's __init__.mojo | All extensions of Spaceship are included.

from .spaceship import Spaceship with extension *

This would prevent people from importing the extension without the struct.

If you gave extensions names you could import only those you need by name for that module or file.

# Library A's __init__.mojo | Can specify exactly which extensions to import

from .spaceship import Spaceship with extension (MyCoolSpaceshipExtensions)

Or perhaps, if you need to import an extension for a specific struct that was also imported, you could do something like this, but I am not sure:

# Main File main.mojo
from .super_spaceship import SuperSpaceship  # NOTE: was not extended in any other modules.
from .spaceship_extensions import extension MyCoolSpaceshipExtensions on SuperSpaceship

# OR SuperSpaceship was defined in the main.mojo file itself

The one benefit with the “import … with extension …” is that you cannot import the extension without the target struct.

Decision 1: Where can conforming extensions be? anywhere
Decision 2: Should we allow non-conforming extensions? yes
Decision 3: Where can non-conforming extensions be? anywhere
Decision 4: What can be in an extension? agree, maybe we can have struct inheritance adding fields and parameters.

Decision 5: Handling conflicts:

# geometric.mojo
struct Point:
    var x: Int
    var y: Int

struct Line:
    var start: Point
    var end: Point

trait Distance:
    fn distance(self) -> Float64: ...

trait Midpoint:
    fn midpoint(self) -> Point: ...


# euclidean.mojo
from math import sqrt
from geometric import Line
extension Line(Distance):
    fn distance(self) -> Float64:
        var dx = self.end.x - self.start.x
        var dy = self.end.y - self.start.y
        # Convert to Float64 first so this is a floating-point square root.
        return sqrt(Float64(dx * dx + dy * dy))

# manhattan.mojo
from geometric import Line
extension Line(Distance):
    fn distance(self) -> Float64:
        var dx = self.end.x - self.start.x
        var dy = self.end.y - self.start.y
        return abs(dx) + abs(dy)

# using_both.mojo
from geometric import Line
from euclidean import Line as EuclideanLine
from manhattan import Line as ManhattanLine

fn print_distances(line: Line):
    var euclidean_line: EuclideanLine = line # implicit cast
    var manhattan_line: ManhattanLine = line
    print("Euclidean distance: ", euclidean_line.distance())
    print("Manhattan distance: ", manhattan_line.distance())
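To make the intended semantics concrete, here is a small Python model of the same idea: two "views" of one `Line`, each carrying a different `distance`, with an explicit re-wrap standing in for the implicit cast. The helper `view` and the use of subclasses are assumptions made for illustration only; the proposal does not say extensions would be implemented this way.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: int
    y: int


@dataclass(frozen=True)
class Line:
    start: Point
    end: Point


class EuclideanLine(Line):
    """Line plus the 'euclidean' extension."""

    def distance(self) -> float:
        return math.hypot(self.end.x - self.start.x, self.end.y - self.start.y)


class ManhattanLine(Line):
    """Line plus the 'manhattan' extension."""

    def distance(self) -> float:
        return float(abs(self.end.x - self.start.x) + abs(self.end.y - self.start.y))


def view(line: Line, as_type: type) -> Line:
    """Stand-in for the implicit cast: same fields, different method set."""
    return as_type(line.start, line.end)


line = Line(Point(0, 0), Point(3, 4))
print(view(line, EuclideanLine).distance())  # 5.0
print(view(line, ManhattanLine).distance())  # 7.0
```

The base `Line` has no `distance` at all, so the caller always has to say which view is meant, which is the point of the import-with-alias approach.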

Picking any extension combination you want:

from ext1 import Line as ExtLine1
from ext2 import Line as ExtLine2
from ext3 import Line as ExtLine3

alias Ext1and2 = ExtLine1 & ExtLine2
alias Ext1and3 = ExtLine1 & ExtLine3

Fixing verbosity by automatically merging extensions with the same name:

from ext1 import Line
from ext2 import Line
from ext3 import Line

# Line has all 3 extensions since the name is the same
# syntax sugar for:
from ext1 import Line as ExtLine1
from ext2 import Line as ExtLine2
from ext3 import Line as ExtLine3
alias Line = ExtLine1 & ExtLine2 & ExtLine3

Decision 6: Are extensions automatically imported?

No. Conflicts can’t be handled automatically; they have to be resolved manually, because only the user knows which extension they want to use.

Decision 7: Need import extension?

No, I don’t see a compelling reason to require an explicit import extension statement.

Decision 8: What if we import only an extension but not its target struct?

Impossible, because in Python and Mojo there are no export statements, so when you import the base struct at the top of the file, you are also exporting it.

from geometric import Line # this is re-exporting Line from this file
# extension Line(Distance):... if we comment the extension we expect Line to be the base struct

Because extensions share the same name as the base struct, they are automatically imported when the base struct is imported. It is merging them with &.

Unless we add named extensions, it is impossible to not import the base struct when they have the same name, but even if we add named extensions I don’t see a reason to not import the base struct. What use case is there for using an extension without the base struct?

TL;DR:

  • User journey 4: a requires clause for extending traits would allow extending structs such as Variant, which would be very powerful.

I think my use case falls under user journey 4: modular/mojo/proposals/struct-extensions.md at main · modular/modular · GitHub. I currently need to do a full copy of the Variant struct in my own project for handling variants of many different struct types that implement a custom trait.

For (simplified) example, I replace

struct Variant[*Ts: ExplicitlyCopyable & Movable]

With HasBar trait:

struct Variant[*Ts: HasBar]

Where HasBar inherits ExplicitlyCopyable & Movable, but also defines a bar method call. This allows me to dispatch bar() over a variant:

struct VariantWrapper:
    alias Type = Variant[A,B] # Where A and B implement HasBar
    var impl: Self.Type
...
    fn bar(self) -> String:
        @parameter
        for i in range(len(VariadicList(Self.Type.Ts))):
            alias T = Self.Type.Ts[i]
            if self.impl.isa[T]():
                return self.impl[T].bar()
        return "<unknown type>"

The big issue with the above is that I need to make a full copy of the Variant struct and replace the Ts param with my own user-specific trait.

Overriding the default trait of Ts with a custom trait allows me to avoid writing long chains of isa and impl[SomeType], and allows me to quickly add additional types to the variant without needing to update each function call.
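The difference can be sketched in Python (used here as a stand-in, since the Mojo code above depends on a copied `Variant`). With a chain of type checks, every new type means touching every dispatch function; with a trait bound on the element types, the dispatch is written once. The names `C`, `bar_with_isa_chain`, and `bar_with_bound` are hypothetical.

```python
from typing import Protocol


class HasBar(Protocol):
    """Stand-in for the HasBar trait."""

    def bar(self) -> str: ...


class A:
    def bar(self) -> str:
        return "A.bar"


class B:
    def bar(self) -> str:
        return "B.bar"


class C:
    """A type added later, after the dispatch functions were written."""

    def bar(self) -> str:
        return "C.bar"


def bar_with_isa_chain(value: object) -> str:
    # Without the bound: one branch per known type, updated by hand.
    if isinstance(value, A):
        return value.bar()
    if isinstance(value, B):
        return value.bar()
    return "<unknown type>"


def bar_with_bound(value: HasBar) -> str:
    # With the bound: any conforming type works without edits here.
    return value.bar()


print(bar_with_isa_chain(C()))  # <unknown type>
print(bar_with_bound(C()))      # C.bar
```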

So using the spaceship example:

extension Spaceship requires engine_type: WarpEngineTrait:

I would be able to do:

extension Variant requires *Ts: HasBar:

Turning 40-50+ lines of source code into a one-liner. I have an identical need for this in a custom logging lib (different formatters, IO, filters).

I had an interesting thought today.

If struct extensions can be named, we can use this to select and/or override the trait implementation associated with a struct instance. For example:

from string_utilities import HashA, HashB  # import extensions

var x: Set[String with HashA]  # items hashed using HashA.__hash__

var y: Set[String with HashB]  # items hashed using HashB.__hash__

Benefits of such a feature:

  • This feature would prevent the need to create wrapper structs (i.e. newtypes) in cases where you want to replace an existing trait implementation.
  • If Mojo were to impose Rust’s orphan rule, this feature would provide an escape hatch.

Challenges:

  • If String already implements __hash__, then the type Set[String] must not be usable where a Set[String with HashA] or Set[String with HashB] is expected (and vice versa), because if a Set switches hash functions midway through its existence, it will malfunction. Therefore, if Mojo were to support the above feature, its use would need to be carefully constrained. (Or maybe this type incompatibility is a fatal flaw—that’s TBD.)
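The malfunction is easy to reproduce with a toy hash set. The Python sketch below is only a model: `BucketSet`, `hash_a`, and `hash_b` are hypothetical, and swapping `_hasher` by hand stands in for a `Set[String with HashA]` being passed where a `Set[String with HashB]` is expected.

```python
from typing import Callable


class BucketSet:
    """Tiny hash set whose hash function is chosen at construction."""

    def __init__(self, hasher: Callable[[str], int], buckets: int = 8) -> None:
        self._hasher = hasher
        self._buckets: list[list[str]] = [[] for _ in range(buckets)]

    def _bucket(self, item: str) -> list[str]:
        return self._buckets[self._hasher(item) % len(self._buckets)]

    def add(self, item: str) -> None:
        bucket = self._bucket(item)
        if item not in bucket:
            bucket.append(item)

    def __contains__(self, item: str) -> bool:
        return item in self._bucket(item)


def hash_a(s: str) -> int:
    return len(s)


def hash_b(s: str) -> int:
    return sum(s.encode())


s = BucketSet(hash_a)
s.add("mojo")
print("mojo" in s)  # True: stored and looked up with hash_a

# Simulate the set being reinterpreted as using a different hasher.
s._hasher = hash_b
print("mojo" in s)  # False: the lookup now probes a different bucket
```

The item is still physically in the set, but it can no longer be found, which is why the two set types would have to be incompatible.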

Related work:

  • Carbon has “adapters”, which serve the same purpose as the feature I sketched. Adapters have compatibility rules, which, at a glance, seem to address the above challenge.

The Carbon team have found a cool version of Rust’s “orphan rule” that is much more flexible. Here is a talk about it. (I’ve linked to Josh Levenberg’s section of the talk. I recommend watching the following 15 minutes, at minimum.)

There seems to be a lot that Mojo can learn from the Carbon model. I recommend taking a look at it @Verdagon. :slightly_smiling_face:

Hey all, I’ve had to step away from this thread for a bit (lots of priorities to juggle) but I should be back in action in a week or two. And by then, struct extensions’ conservative foundation will be in, so we can play with it while we decide what final form it will take. Lots of interesting things to discuss!

@Nick, I took a look. There’s a lot in there! I’ll have to soak on that for a while. Will leave a more in-depth reply when I’m back.

Welp, that was a long “week or two” :laughing:

So, updates since last time:

  • Around Oct 4, we landed the most conservative form of the feature (described in Intermediate struct extensions implementation ).
  • Since then, most of my time has been on other things (traffic cop, vacation, LLVM conference, Thanksgiving), but I did manage to get some bugfixes in. On Nov 12 (well, Nov 25 after a revert and re-merge), I successfully moved simd.mojo’s __extension SIMD: to other files (python/conversions.mojo and python/python_object.mojo), huzzah.

So, the most conservative form of struct extensions is in right now, and I have a moment to breathe and look forward to the final design a little bit.

Decision 1 (“Where can traitful/conforming extensions be?”):

  1. Option A: Either in the struct’s package or the trait’s package.
  2. Option B: Either in the struct’s file or the trait’s file.
  3. Option C: Anyone can extend any struct for any trait.

I’m tentatively vetoing Option B since it doesn’t support User Journey 1. Remaining choices are A and C.

Tentative conservative decision is A. We can relax this later, so I’m going to de-scope this for now.

We could relax this and allow traitful/conforming extensions in other packages, if we do something like Carbon does (thanks for the lead Nick!).

If anyone has strong opinions, please open a feature request!

Decision 2: Should we allow traitless/non-conforming extensions?

Resounding yes. Also I misspoke when I said Connor leans against this, he actually just doesn’t want traitless/non-conforming extensions outside the package of their target struct.

Also, our hand was kind of forced by Max here, they really wanted it, so it exists now. :person_shrugging: We can de-scope this for now.

Decision 3: Where can traitless/non-conforming extensions be?

  1. Option A: In the struct’s package.
  2. Option B: In the struct’s file.
  3. Option C: Anyone can add an extension anywhere for any struct.

I’m tentatively vetoing Option B since it doesn’t support User Journey 1. Remaining choices are A and C.

Tentative conservative decision is A. We can relax this later, so I’m going to de-scope this for now.

We could relax this and allow traitless/nonconforming extensions in other packages, if we have a nice way to import them, and maybe a nice way to disambiguate callsites who are accidentally calling methods from multiple overloads (“spooky action at a distance”).

If anyone has strong opinions, please open a feature request!

Decision 4: What can be in an extension?

Tentative conservative decision is methods, aliases, requires clauses. No annotations. We can relax this later, so de-scoping for now.

(As Owen says, we might want some sort of @export decorator at some point)

Decision 5: Handling conflicts

The current behavior is that normal overload resolution rules apply to it, and if there is a tie, the user is kind of out of luck.

As Owen and Nate say, we’ll want a disambiguation syntax for when there’s a conflict. Perhaps L.Spaceship.fly_to(ship, “Corneria”).

Interestingly, this conflict isn’t possible today with top-level functions; we can’t import a foo and also define a function foo, we get an error on the latter’s definition because of the name conflict.

Yinon brought up an interesting possibility of renaming an extension on import. That might be really useful for this.

This is still in-scope; I want to figure this out before users get into this situation.

Decision 6: Are extensions automatically imported?

  • Option A: Explicitly import extensions
  • Option B: Struct imports automatically import their extensions

People like A because it’s more explicit, and B because it’s more ergonomic. Tale as old as time!

Respectively:

  • I conservatively implemented option A, and it turns out to be very un-ergonomic. For example, if a user wants to say var x = Float64(my_python_object) they need to put from python.conversions import SIMD to import the extension SIMD: in python/conversions.mojo. This is a bit of a deal-breaker; people using simple Mojo for Python stuff shouldn’t have to know what SIMD is. So, let’s consider option A vetoed.
    • However, I’ll also mention an option A2: When main.mojo does from library_a.file_x import Spaceship, it imports any extensions in file_x (not just Spaceship extensions) and also any extensions in any files that file_x imports itself (as long as they’re in library_a).
  • B is stated a bit imprecisely. It should be “If you can see a struct, you can see all extensions in the struct’s package.”

So it’s down to A2 vs B. A2 says that (direct+indirect) imports make an extension visible. B means that an extension’s presence in a package makes it visible.
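The two rules can be compared with a toy model in Python. Everything here is an assumption made for illustration: the file names other than `file_x`, the extension labels, and the choice that A2 stops following imports once they leave the package.

```python
from dataclasses import dataclass, field


@dataclass
class File:
    package: str
    imports: list[str] = field(default_factory=list)
    extensions: list[str] = field(default_factory=list)


# Hypothetical layout: file_w is in library_a but nothing imports it.
FILES = {
    "library_a/file_x": File(
        "library_a",
        imports=["library_a/file_y", "library_b/file_z"],
        extensions=["x_ext"],
    ),
    "library_a/file_y": File("library_a", extensions=["y_ext"]),
    "library_a/file_w": File("library_a", extensions=["w_ext"]),
    "library_b/file_z": File("library_b", extensions=["z_ext"]),
}


def visible_a2(imported_file: str) -> list[str]:
    """A2: extensions reachable through imports, staying inside the package."""
    package = FILES[imported_file].package
    seen: set[str] = set()
    stack = [imported_file]
    found: list[str] = []
    while stack:
        name = stack.pop()
        if name in seen or FILES[name].package != package:
            continue
        seen.add(name)
        found.extend(FILES[name].extensions)
        stack.extend(FILES[name].imports)
    return sorted(found)


def visible_b(imported_file: str) -> list[str]:
    """B: every extension that lives anywhere in the struct's package."""
    package = FILES[imported_file].package
    return sorted(
        ext for f in FILES.values() if f.package == package for ext in f.extensions
    )


print(visible_a2("library_a/file_x"))  # ['x_ext', 'y_ext']
print(visible_b("library_a/file_x"))   # ['w_ext', 'x_ext', 'y_ext']
```

In this model the only difference is `w_ext`: it sits in the package but is not reachable by imports, so B sees it and A2 does not.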

This decision is still in scope, we’ll decide on it this week if all goes well.

Decision 7 (“Need import extension?”):

This was a pretty resounding no, but I gotta admit, it’s kind of growing on me. It could be useful if we want to have explicit control over which third-party traitless/non-conforming extensions we want to import.

However, it’s moot for first-party extensions, and we don’t support third-party traitless/non-conforming extensions yet. Let’s de-scope this. If anyone has strong opinions, please open a feature request!

Decision 8 (“What if we import only an extension but not its target struct?”):

As Yinon said, we accidentally got this for free. In this foo.mojo:

from A import Spaceship

extension Spaceship:
    fn barrel_roll(mut self):
        self.roll(360)
        maniacal_laughter()

when a user says from .foo import Spaceship, they actually will import the Spaceship because from A import Spaceship re-exported it.

Also, this is moot if we go with decision 6 option B.

Still, if we go with decision 6 option A2, we’ll need to decide this. Let’s keep it in scope.

Decision 9 (“Support importing multiple extensions?”):

Pretty resounding “yes”. Let’s consider this one closed.

Decision 10 (“Proposed Syntax”):

I’ll summarize this later.

Decision 11 (“What decorators can be on extensions’ methods”)

A lot of folks think we should have an allowlist. I actually think we shouldn’t, because I don’t really see why a method’s location should affect the decorators. I need to learn more about decorators. Let’s consider this still in-scope.

Other points

Josiah, I think you’re spot on. I encountered that exact same thing with Variant in my latest article about Mojo metaprogramming. And yeah, I see how extensions could help with that, so really good point, thank you!

I am also discovering that there are some risks because we have both overloading + extensions. Spooky performance regressions at a distance.

rd4com has an interesting idea, a sort of way to define an extension up front, but then allowlist which types it applies to. This is kind of similar to defining a trait and a bunch of impls, then an extension on things that conform to that trait, I think. Let’s table this for now.

Nick, you had an interesting idea with using extensions with with. But I think we only need var x: Set[String with HashA] if Mojo made a misstep somewhere. In Java, they require a .hash()/.equals() method on the object itself, but in C++ they decouple the hashing logic from the object by having a separate hasher and equator. Maybe there’s another example we can use?

Summary

  • Decision 1 (“Where can traitful/conforming extensions be?”): Either in the struct’s package or the trait’s package. We can relax this later, so de-scoping.
  • Decision 2 (“Should we allow non-conforming extensions?”): Yes, considering closed.
  • Decision 3 (“Where can non-conforming extensions be?”): In the struct’s package. We can relax this later, so de-scoping.
  • Decision 4 (“What can be in an extension?”): Methods, aliases, requires clauses. No annotations. We can relax this later, so de-scoping for now.
  • Decision 5 (“Handling conflicts”): Still in-scope, I want to figure this out.
  • Decision 6 (“Are extensions automatically imported?”): We’ll figure that out this week hopefully.
  • Decision 7 (“Need import extension?”): Moot, so de-scoping for now.
  • Decision 8 (“What if we import only an extension but not its target struct?”): Moot if we go with decision 6’s option B which we’ll figure out this week. We’ll see.
  • Decision 9 (“Support importing multiple extensions?”): Yes. Closed.
  • Decision 10 (“Proposed Syntax”): TBD.
  • Decision 11 (new, “What decorators can be on extensions’ methods”): Still in-scope, I want to figure this out.

Summarizing the summary, remaining open decisions are:

  • Decision 5 (“Handling conflicts”)
  • Decision 6 (“Are extensions automatically imported?”)
  • maybe Decision 8 (“What if we import only an extension but not its target struct?”)
  • Decision 10 (“Proposed Syntax”)
  • Decision 11 (new, “What decorators can be on extensions’ methods”)

Summarizing the summary of the summary: This week we’ll figure out decision 6, which will affect decision 8. Then we’ll think about the syntax and other details.

Summarizing the summary of the summary of the summary: struct extensions still goin’!

Decision 6:

From a compiler performance perspective, does it make sense to package scope this? Essentially, if you import one extension, you get all of the extensions to that type in that package. That solves the issue with the stdlib not working as expected while preventing transitive dependencies from clogging up types with methods that often aren’t really relevant (See Rust when using the tracing crate).

Decision 8:

One consideration for this is what happens once Mojo has reflection. I think there may be a bunch of weird interactions with not having the full information about a type present when reflecting over it. That might make it best to import everything the compiler knows about and present it in order to avoid odd behaviors, since I think that having reflection work nicely is more important than not cluttering local namespaces.

Decision 11:

My main concerns with decorators are things like @staticmethod and @implicit which change the behavior of the function in ways I’m not sure how to propagate through a trait boundary. If we can figure those out, it should be fine.

Other

I am also discovering that there are some risks because we have both overloading + extensions. Spooky performance regressions at a distance.

What kind of performance regressions? Now that we have downcasting, we can have a form of “in-function specialization” by using a top-level function to dispatch things which have similar type signatures (or at least the same argument count). While function overloading is nice to have, I think that this “specialization” without overloading is probably enough for most cases, and extensions are an important enough feature to have that I would personally toss out function overloading to have them.

Nick, you had an interesting idea with using extensions with with. But I think we only need var x: Set[String with HashA] if Mojo made a misstep somewhere. In Java, they require a .hash()/.equals() method on the object itself, but in C++ they decouple the hashing logic from the object by having a separate hasher and equator. Maybe there’s another example we can use?

I agree with you, Evan. I think that the desire to have this logic probably means that it’s more desirable to decouple the types and take a Some[Hasher[String]] (yet another case for parametric traits). It’s the same strategy pattern as many other languages employ and shouldn’t be too alien provided we have good “I don’t care about this” behavior.

Decision 5

I believe Dart handles extension conflicts in a very consistent and expected manner.

  • If a method on an extension conflicts with a method on a struct, calling this method will result in the struct’s own method being called.
  • If a method on an extension conflicts with one on another extension, calling the method is a compile-time error, because neither extension has any higher priority. Extensions are named in Dart, so you can call them by doing this.
    import A.Spaceship
    
    extension FlyingExt on Spaceship:
        fn fly_to(mut self, new_location: String):
            ...
    
    extension AnotherFlyingExt on Spaceship:
        fn fly_to(mut self, new_location: String):
            ...
    
    spaceship.fly_to(location) # compile time error
    
    FlyingExt(spaceship).fly_to(location) # calls the method on FlyingExt
    
    Calling FlyingExt(...) does not create a new struct, it simply signals to the compiler which struct should be used.
    Naming extensions has the side benefit that they can be used to help with controlling imports.
    from .spaceship import Spaceship, SpaceshipExt # Importing the extensions into scope
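The Dart-style rules above can be modelled as a small resolver. This Python sketch is a toy model only: `resolve`, `AmbiguousExtensionError`, and the method sets are hypothetical, and a real compiler would do this at compile time rather than by raising exceptions.

```python
from typing import Optional


class AmbiguousExtensionError(Exception):
    """Stands in for the compile-time error on an extension conflict."""


def resolve(
    method: str,
    own_methods: set[str],
    extensions: dict[str, set[str]],
    explicit: Optional[str] = None,
) -> str:
    """Return which provider ('struct' or an extension name) handles a call."""
    if explicit is not None:
        # FlyingExt(spaceship).fly_to(...): the caller named the extension.
        if method not in extensions[explicit]:
            raise AttributeError(f"{explicit} has no method {method}")
        return explicit
    if method in own_methods:
        # Rule 1: the struct's own method always wins.
        return "struct"
    providers = sorted(name for name, ms in extensions.items() if method in ms)
    if len(providers) > 1:
        # Rule 2: no extension outranks another.
        raise AmbiguousExtensionError(f"{method} is provided by {providers}")
    if not providers:
        raise AttributeError(method)
    return providers[0]


own = {"roll"}
exts = {"FlyingExt": {"fly_to", "roll"}, "AnotherFlyingExt": {"fly_to"}}

print(resolve("roll", own, exts))                         # struct
print(resolve("fly_to", own, exts, explicit="FlyingExt"))  # FlyingExt
try:
    resolve("fly_to", own, exts)
except AmbiguousExtensionError as err:
    print("error:", err)
```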
    

Decision 6

In many instances, extensions are useful for adding helper methods to common built-in structs. It is quite common for these extensions to clutter autocomplete with many duplicate or unwanted entries. Users should have control over whether these are in scope at all.

This is especially true for built-in types. For instance, package A has some helper methods for List. Package b has package A as a transitive dependency. Someone using package b would then have these methods in their scope although they are unwanted.

The downside to implicit/magic imports is losing control and predictability, making it difficult to understand (without using tools e.g., code review) what extensions are being used, and making it hard to resolve conflicts (without limitations such as simply forbidding it).

  • Is it possible to develop a library with a private internal extension without exporting it?
  • Is it possible to provide optional extensions from a package without forcing you to import them?

I think this can also be solved by allowing the extension author to specify an export condition, something like: when someone imports PythonObject, also export the extension associated with it (the SIMD extension).

This was limited to the struct’s extension (which shares the same name), but it doesn’t have to be.

I don’t have a great idea for how to represent this syntactically, but here are some ideas:

# stdlib/python/__init__.mojo

from .python_object import PythonObject
from .conversions import SIMD as PythonObject # Import the SIMD extension and export it alongside PythonObject.
# So when someone imports PythonObject, it also imports any extensions with that name, they don't have to be extensions of PythonObject.

This is a bit confusing, so maybe we need named extensions. The extension could be named SIMDPythonConvertibleExtension to differentiate it from the SIMD struct.

# stdlib/python/__init__.mojo

from .python_object import PythonObject
from .conversions import SIMDConvertibleToPythonExtension as PythonObject # Descriptive name.
# Still exported under the same name as PythonObject

Specify the extension name:

# stdlib/python/conversions.mojo

extension SIMD(ConvertibleToPython) as SIMDConvertibleToPythonExtension: 
    ...

Another option:

# stdlib/python/conversions.mojo

extension SIMD(ConvertibleToPython) for PythonObject: # Specify the "owning struct" of the extension.
    ...

Other ideas:

  • An export statement for extensions.
  • A struct + extension combination operator before exporting.