Do the traits for copyability need revisiting?

Today’s Mojo includes both the Copyable trait (for implicit copyability) and the ExplicitlyCopyable trait.

I think the current design of these traits might be suboptimal.

More precisely, I suspect ExplicitlyCopyable should have been named just Copyable, and then we shouldn’t offer a trait for implicit copyability. Instead, we can allow the @implicit decorator (currently used for constructors) to be added to a struct’s implementation of Copyable.copy, or Copyable.__init__, or whatever the method ends up being named. This would give us the y = x syntax that we want.

This design has several advantages over today’s model:

  1. This reduces the number of traits in the standard library by 1.
  2. In this design, a type only needs to define one copy method (with a standard name), regardless of whether it is implicitly or explicitly copyable. This is simpler than today’s model.
  3. This design forces programmers writing generic code to support both cheap-to-copy and expensive-to-copy types. This seems like a no-brainer to me. Why should generic code artificially restrict itself to only supporting copyable types that are implicitly copyable? Implicitness is just syntactic sugar, and it would be bizarre for a generic function to care about that.

The main disadvantage (?) of this design is that y = x syntax would only be available in contexts where the implementation of Copyable.copy is known to be annotated as @implicit. In generic contexts the implementation isn’t known, so copies must be explicit. This is fine IMO, because in generic contexts the copies may be arbitrarily expensive, so being explicit is a good thing.
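To make the generic-context point concrete, here is a rough analogy in Rust rather than Mojo, since the @implicit-on-copy syntax I’m suggesting doesn’t exist yet. Treat it as a sketch under assumptions: Rust’s Clone stands in for the single Copyable trait, and deriving Copy stands in for marking the copy method @implicit. The names Point, Buffer, duplicate and shift_right are made up for illustration. The analogy is imperfect, because Rust’s Copy is itself a trait that generic code can require, which is exactly what I’m proposing we not offer.

```rust
// Rust analogy only; this is not Mojo and not the proposed Mojo syntax.

// Cheap type: opts in to implicit copies (`q = p` leaves `p` usable).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Expensive type: copies happen only through an explicit `.clone()` call.
#[derive(Clone, Debug, PartialEq)]
struct Buffer {
    bytes: Vec<u8>,
}

// Generic code asks only for the explicit copy trait, so it accepts both
// kinds of type, and every copy it makes is visible in the source.
fn duplicate<T: Clone>(value: &T) -> (T, T) {
    (value.clone(), value.clone())
}

// In a concrete context the compiler knows `Point` is implicitly copyable.
fn shift_right(p: Point) -> (Point, Point) {
    let mut q = p; // implicit copy; `p` is still valid afterwards
    q.x += 1;
    (p, q)
}
```

Here duplicate works for both Point and Buffer but has to spell out each copy, while shift_right gets the short assignment syntax because the concrete type is known.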

Thoughts?

My one concern is that in generic contexts, there may be a case where the optimal logic is different for types that are cheap to copy and types that are expensive to copy. Maybe a nice middle ground would be to allow the @implicit decorator to automatically implement the Copyable trait.

+1 to @akneni's concern about generic contexts. I want a way to differentiate between “this fits in a register and is basically free to copy” and “I might copy 200GB if I copy this”. I think that, in the general case, everything which is implicitly copyable should still have the .copy() method and implement ExplicitlyCopyable, simply for the sake of having a unified trait. However, there are times when you may be forced to do a lot of copies with a type, and having generic code say “I only take Copyable” is a warning that it is probably going to copy a lot (meaning you should throw a refcount over the type if it has an expensive copy).
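As a sketch of what I mean by “throw a refcount over the type”, here is the idea in Rust, used only as an analogy because I can’t show it in Mojo’s trait syntax with any confidence. Dataset, fan_out and share are hypothetical names. The assumption is a generic function that copies its argument many times: the caller wraps the expensive value in Rc, so each copy is a reference-count bump instead of a deep copy.

```rust
use std::rc::Rc;

// Hypothetical expensive-to-copy type: cloning it duplicates every row.
#[derive(Clone)]
struct Dataset {
    rows: Vec<u64>,
}

// Hypothetical copy-heavy generic function: makes `n` copies of `value`.
fn fan_out<T: Clone>(value: &T, n: usize) -> Vec<T> {
    (0..n).map(|_| value.clone()).collect()
}

// The caller wraps the dataset in `Rc` first, so each copy made by
// `fan_out` only increments a reference count and shares the same rows.
fn share(data: Dataset, n: usize) -> Vec<Rc<Dataset>> {
    let shared = Rc::new(data);
    fan_out(&shared, n)
}
```

Calling fan_out directly on a Dataset would still compile, but it would deep-copy the rows n times, which is the case I want the signature to warn about.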

I think this proposal makes some good points. A couple of small thoughts:

  1. Requiring an @implicit decorator for implicit copy means that @value becomes slightly less useful for trivial types, but I’m personally fine with that tradeoff.
  2. I think the first justification should be the last because it is the weakest. The existence of more traits is not inherently bad, just traits that duplicate each other’s behavior (point 2).

I’m not fully on board with conflating triviality with copyability. An example would be types that are technically integers (like file handles) where we only want a single mutating owner at a time.

Under one of the other discussions, I proposed separating out Movable, Copyable and Dropable so that each has a trivial variant. An fd would then be TriviallyMovable but not Copyable, and Dropable but not TriviallyDropable. I think separating that out helps with cases like these.

This doesn’t make sense to me. You could make the same argument about any trait method. For example, perhaps your generic function takes a Stringable, and the optimal logic is different depending on whether a type is cheap to stringify or expensive to stringify. That doesn’t mean we should split the Stringable trait into CheaplyStringable and ExpensivelyStringable. I don’t see how Copyable is any different to this.

Also, we’re going to have a TriviallyCopyable trait soon, which means “it’s safe to memcpy the value”. This is already close to the distinction you’re looking for.

If a generic function “only takes a Copyable”, then it is incompatible with types that are (only) explicitly copyable. This would be a PITA in cases where:

  • I’ve decided that my type should be explicitly copyable.
  • I want to use your library.

The problem here is that you (as a library author) might be assuming that not being implicitly copyable means copying 200GB, whereas I (as the library user) am using it to mean “this takes 20 cycles to copy, and I want to know where I’m spending those cycles”. If your assumptions are different than mine, and you’re restricting my ability to use your library based on those assumptions, I’m going to end up frustrated.

The root problem is that implicit copyability is nothing more than syntactic sugar. It doesn’t actually provide performance guarantees, and it can’t ever do that, because different programmers are going to have wildly different opinions on what the performance threshold for implicit copyability should be.

Given this, in my opinion, library authors shouldn’t be defining APIs that only accept implicitly copyable types. If you do this as a library author, you’re basically saying: “If you opt into this syntactic sugar for your data type, you can use my library, but if you don’t want to use the syntactic sugar, then you can’t.”

This would be a bizarre situation to end up in.

@ConnorGray FYI when you’re back from holiday. I’ve lost track of this design here.

Aside: On Discord we recently discussed renaming __copyinit__ to __copy__, and making it an instance method. (GitHub issue.) If we restructure copyability as suggested in my original post, we would need to figure out whether the __copy__ dunder still has a purpose.