List slicing inconsistency

At the moment it‘s quite surprising for users that lst[0:4] returns a Span (without copying values) and lst[0:4:2] returns a new List (with copied values) which leads to different semantics.

How about adding a NonContiguousSpan to the stdlib? It would store the step size and a contiguous memory pointer into the original list. Its getter would then use the stored step size to access the non copied values in a non contiguous manner. This would greatly improve performance of non contiguous list slicing and align its semantics with contiguous list slicing.
Or is this considered a corner case that isn’t worth implementing?

1 Like

One thought I have about this, can we decouple Layout from LayoutTensor and use it more generally? I think that using LayoutTrait as a more general abstraction for an indexing scheme would help solve this issue, and could still enable better solutions to this issue. However, there would need to be some work to expose scatter/gather through a “LayoutSpan” and the ability to do some specialization for 1d indexing would help a lot.

1 Like

I like the idea of generalizing over indexing schemes. But isn‘t slicing a 1D list a much simpler thing than having a generalized LayoutTrait that has to be tweaked for this use case?

Or, to put it in other words: How about starting with a concrete LayoutSpan for list slicing and generalize afterwards?

This way the current inconsistency can be fixed in the short term and used as a starting point for further generalization.

So, it’s “start:end:1” versus “start:end.”

The former has value semantics, while the latter has reference semantics.

You define these different semantics as an inconsistency.

There are multiple solutions to unify the semantics, such as changing stepped slicing to a lazy, iterator-like abstraction, or changing non-stepped slicing to have value semantics via either the always-copy or copy-on-write approaches.

However, it should be noted that the always-copy approach was abandoned for performance reasons because a systems programming language would find it too tempting to simply increment a pointer and decrement a length.

I wonder if it’s truly worthwhile to dedicate resources to addressing this specific use case when the language has so many more compelling problems to tackle.

Indexing a 1d list is simpler than indexing an N-D thing, but since we can be generic over rank we should. Perhaps some conditional conformance to make things a bit nicer for 1d is in order, but I think a unified abstraction is valuable here. We would need some kind of LayoutSpan/View type to hold this.

Shall I try implementing a PoC for this?

@owenhilyard : Do you have input on how to start such a PoC?

The Idea would be to start with a struct that is generalizable but can be used in the concrete case of List slicing. One could start based on the current Span implementation or start from scratch with a LayoutView. The easiest way I think this could be implemented is having a pointer to the original list and using the slice info (start, end, step) for getter/setter logic.

The idea would be to always have reference logic for performance reasons. So slicing lst[:4:2] would have the same reference semantic and performance characteristics as lst[:4].

BTW: @Serotonin pointed out on Discord that numpy np.array has reference semantic for both, contiguous and non-contiguous slicing.

You essentially propose this(in simplied mojo sudo code):

struct NonContiguousSpan[T]:
    var span: Span[T]
    var step: Int
    fn __getitem__(self, idx: Int) -> T:
        return self.span[self.step * idx]
    fn __iter__(self) yields T:
        for i in range(0, len(span), self.step):
            yield self.span[i]

Yes, exactly, something in this direction was my first thought, but I think @owenhilyard has something more generic in mind.

Step 1 would be to make span that uses LayoutTrait. Once you have that, then most things should sort-of just work.

What is the stance of the Modular/Stdlib team on this?

  • it’s no inconsistency but intentional
  • not worth fixing this inconsistency before Mojo 1.0
  • Fix it with a concrete non-contiguous Span
  • Fix it by implementing LayoutTrait

See Can we sort elements in descending order in mojo? for an example using slicing with negative step size where the expectation might have been that the slicing was non copying.

LayoutTrait has been removed from Mojo.

Now we have one less option.

What is the stance of the Modular/Stdlib team on this?

  • it’s no inconsistency but intentional
  • not worth fixing this inconsistency before Mojo 1.0
  • Fix it with a non-contiguous Span