With the recent work on Iterators in the stdlib, I think it’s a good idea to have another discussion on Mojo’s iterator API.
Explicitly not in Scope
Things which are out of scope for this discussion.
- The inability to put trait bounds on `Iterator.Element`; `requires` solves this.
Current API
````mojo
trait Iterator(Copyable, Movable):
    """The `Iterator` trait describes a type that can be used as an
    iterator, e.g. in a `for` loop.
    """

    alias Element: Copyable & Movable

    fn __has_next__(self) -> Bool:
        ...

    fn __next__(mut self) -> Self.Element:
        ...

    fn bounds(self) -> Tuple[Int, Optional[Int]]:
        """Returns bounds `[lower, upper]` for the remaining iterator length.

        Returns a tuple where the first element is the lower bound and the second
        is an optional upper bound (`None` means unknown or `upper > Int.MAX`).

        This helps collections pre-allocate memory when constructed from iterators.

        The default implementation returns `(0, None)`.

        ### Safety

        If the upper bound is not None, implementations must ensure that `lower <= upper`.
        The bounds are hints only - iterators may not comply with them. Never omit safety
        checks when using `bounds` to build collections.

        Example:

        ```mojo
        fn build_list[I: Iterator & Iterable](iter: I) -> List[I.Element]:
            var lower, _upper = iter.bounds()
            var list = List[I.Element](capacity=lower)
            for element in iter:
                list.append(element^)
            return list
        ```
        """
        return (0, None)


trait Iterable:
    """The `Iterable` trait describes a type that can be turned into an
    iterator.
    """

    alias IteratorType[mut: Bool, //, origin: Origin[mut]]: Iterator

    fn __iter__(ref self) -> Self.IteratorType[__origin_of(self)]:
        ...
````
There is also `fn __next_ref__(mut self) -> ref [origin] Self.Element:`, which is present on many collections and is somewhat attached to the `Iterator` API.
Alternatives In Other Systems Languages
Rust
```rust
pub trait Iterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;

    // 76 optional methods with defaults based on `next`.
    // I have included some notable ones.

    // Used for chunked iteration, often with SIMD
    fn next_chunk<const N: usize>(
        &mut self,
    ) -> Result<[Self::Item; N], IntoIter<Self::Item, N>>
        where Self: Sized { ... }

    // Equivalent to Mojo's `bounds`
    fn size_hint(&self) -> (usize, Option<usize>) { ... }

    // We need to be able to union two origins in order to handle this
    fn chain<U>(self, other: U) -> Chain<Self, <U as IntoIterator>::IntoIter>
        where Self: Sized,
              U: IntoIterator<Item = Self::Item> { ... }

    // Do we want the ability to have an iterator over `out` values to handle things like this?
    fn collect_into<E>(self, collection: &mut E) -> &mut E
        where E: Extend<Self::Item>,
              Self: Sized { ... }
}
```
C++
The actual interface is too much to include here, so here’s a link: [std::iterator_traits - cppreference.com](https://en.cppreference.com/w/cpp/iterator/iterator_traits)
Included for completeness; I don’t think this is a great idea.
Issues with the current Mojo API
`__has_next__`
For many kinds of iterators, such as those which pull from linked lists, the work required to answer this and the work required to retrieve the next item are nearly identical. This can also introduce odd side effects for iterators which take multiple items from another iterator they contain; for instance, a SIMD iterator may need a `__has_n_next__[N: UInt]`, where `N` is the SIMD width. I am concerned that this API may lead to a lot of duplicated checking or buffering inside of iterators. Additionally, we should consider APIs which are pulling from *PMC queues, where `__has_next__` is a race condition without buffering.
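As a minimal sketch of that duplication (the `_Node` type and pointer layout here are hypothetical, not stdlib types):

```mojo
from memory import UnsafePointer


struct _Node[T: Copyable & Movable](Copyable, Movable):
    var value: T
    var next: UnsafePointer[_Node[T]]


struct LinkedListIter[T: Copyable & Movable](Copyable, Movable):
    var current: UnsafePointer[_Node[T]]

    fn __has_next__(self) -> Bool:
        # Checks whether the current node pointer is null...
        return Bool(self.current)

    fn __next__(mut self) -> T:
        # ...yet advancing relies on that exact same check, so every step
        # effectively pays for the null test twice (or buffers a node).
        var value = self.current[].value.copy()
        self.current = self.current[].next
        return value
```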
The other major issue is what iterators should do when `__has_next__` is false and the user calls `__next__`. The current behavior for `List` is to abort the program when debug assertions are enabled, or to exhibit UB in the default case. However, in some circumstances it may be preferable to raise. If we’re already raising, that’s functionally equivalent to returning `Tuple[Bool, T]`, at which point `Optional` is right there.
Iteration by reference
Right now, we can do iteration by reference via either the `__next_ref__` helper, which isn’t technically part of the iterator API, or by returning `Pointer[T, origin]`. The latter was considered a very strong point of annoyance as far as the community was concerned. However, owning iterators have no good way to implement this behavior, since they may be returning a reference to a temporary, as would be the case for `range`.
Several options have been discussed in the past to help handle this. One is `__deref__`, which would be used in a similar manner to Rust’s `Deref`/`DerefMut`. Another is a `__getattr__` overload, but that might introduce a lot of compiler overhead. A third option is an `OwnedOrigin` which can be used to shuffle everything through `ref`.
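For concreteness, here is a rough sketch of the two existing styles side by side (signatures are assumed from memory and may differ from the stdlib):

```mojo
struct ListRefIter[mut: Bool, //, T: Copyable & Movable, origin: Origin[mut]](Copyable, Movable):
    var index: Int
    var src: Pointer[List[T], origin]

    # Style 1: return a `Pointer` that the caller must dereference with `[]`,
    # which is the ergonomics the community objected to.
    fn __next__(mut self) -> Pointer[T, origin]:
        self.index += 1
        return Pointer(to=self.src[][self.index - 1])

    # Style 2: `__next_ref__` yields a `ref` directly, but is not part of the
    # `Iterator` trait itself.
    fn __next_ref__(mut self) -> ref [origin] T:
        self.index += 1
        return self.src[][self.index - 1]
```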
`Iterable.__iter__` taking `ref self`
This means that you can’t transfer ownership of a collection to an iterator, which may cause issues for iterators intended to drain the collection, such as those needed to destroy collections which hold Linear types.
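A hypothetical owning variant could look something like the following sketch; the trait name, the `__drain__` method, and the use of `owned self` are all assumptions, not existing API:

```mojo
trait OwningIterable:
    # An iterator that takes ownership of the collection and hands out its
    # elements by value, which is what Linear element types would need.
    alias DrainIteratorType: Iterator

    fn __drain__(owned self) -> Self.DrainIteratorType:
        ...
```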
`__has_next__` and Generators
Ideally, we want to have Python-style generators at some point, since they are useful both for implementing futures for async and for letting the compiler automate many of the currently manual transformations used when writing iterators. We could easily make all generators which return `None` map to an `Optional`-style iterator. However, we don’t really have a good way to do `__has_next__` without buffering.
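To illustrate, consider a hypothetical generator (`yield` does not exist in Mojo today): whether another item exists depends on data the body has not inspected yet, so `__has_next__` would have to run the body ahead to the next `yield` and buffer the result.

```mojo
fn evens(values: List[Int]):
    for v in values:
        if v % 2 == 0:
            # Answering `__has_next__` truthfully means scanning forward to
            # the next even value and stashing it somewhere.
            yield v
```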
Alternatives
Optional

- Solves many of the issues with `__has_next__`. `Iterable.__iter__` taking `ref self` is orthogonal.
- Requires returning `Pointer[T, _]` for references.
- Requires that `Element` can move/copy into `Optional`, which may require special compiler handling to elide.
- Problems with batching.
```mojo
# Iterable left the same
trait Iterator(Copyable, Movable):
    """The `Iterator` trait describes a type that can be used as an
    iterator, e.g. in a `for` loop.
    """

    alias Element: Copyable & Movable

    fn __next__(mut self) -> Optional[Self.Element]:
        ...

    fn bounds(self) -> Tuple[Int, Optional[Int]]:
        ...
```
Just return a boolean
This would require multiple `out` arguments to be allowed, which may be desirable anyway. It also has the issue that any user who forgets to check `is_valid` walks into an uninitialized `elem`, or one which may hold data from the previous iteration.
I personally don’t think this is a good idea outside of very specialized circumstances.
```mojo
# Iterable left the same
trait Iterator(Copyable, Movable):
    """The `Iterator` trait describes a type that can be used as an
    iterator, e.g. in a `for` loop.
    """

    alias Element: Copyable & Movable

    fn __next__(mut self, out elem: Element, out is_valid: Bool):
        ...

    fn bounds(self) -> Tuple[Int, Optional[Int]]:
        ...
```
`raise StopIteration`
The Pythonic approach.
```mojo
# Iterable left the same
trait Iterator(Copyable, Movable):
    """The `Iterator` trait describes a type that can be used as an
    iterator, e.g. in a `for` loop.
    """

    alias Element: Copyable & Movable

    fn __next__(mut self) raises StopIteration -> Element:
        ...

    fn bounds(self) -> Tuple[Int, Optional[Int]]:
        ...
```
Example Usage
This is what the implementation would need to look like after desugaring.
```mojo
try:
    while True:
        var item = next(iter)
        ...
except StopIteration:
    pass
```
Handling Batching
In order to make good use of SIMD, we’ll need an iterator interface that can handle batching items. We need to support non-masked SIMD, since many SIMD ISAs don’t have masking, which means that we can’t just have a `MaskedSIMD` type which has a paired mask. I’m going to implement these using the `Optional` form of the API, but most can be translated to `__has_next__` or one of the alternatives.
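For context, this is the shape of loop a batched iterator API has to be able to express: a full-width main loop followed by a scalar drain loop. A hand-written sketch, assuming the current `unsafe_ptr`/`load` and `simdwidthof` APIs:

```mojo
from sys import simdwidthof


fn sum_f32(values: List[Float32]) -> Float32:
    alias width = simdwidthof[DType.float32]()
    var acc = SIMD[DType.float32, width](0)
    var i = 0
    # Main loop: full SIMD batches, no masking required.
    while i + width <= len(values):
        acc += values.unsafe_ptr().load[width=width](i)
        i += width
    var total = acc.reduce_add()
    # Drain loop: leftover scalar elements.
    while i < len(values):
        total += values[i]
        i += 1
    return total
```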
Variant
This can handle SIMD and drain loops, and may have better instruction locality than a normal iterator. However, the interface is much more complex as a result.
```mojo
# Iterable left the same
trait Iterator(Copyable, Movable):
    """The `Iterator` trait describes a type that can be used as an
    iterator, e.g. in a `for` loop.
    """

    alias Element: Copyable & Movable
    alias FixedBatchElement: Copyable & Movable & Iterator requires FixedBatchElement.Element == Element
    alias DynamicBatchElement: Copyable & Movable & Iterator requires DynamicBatchElement.Element == Element

    fn __next__(mut self) -> Variant[FixedBatchElement, DynamicBatchElement, None]:
        ...

    # What should this return? The number of fixed batches left?
    fn bounds(self) -> Tuple[Int, Optional[Int]]:
        ...
```
Multiple Iterators
This could potentially be handled with an iterator that transforms along a progression of return types for more normal loops, but I can imagine circumstances (such as a producer/consumer queue) where having both fixed and dynamic batch sizes intermingled would be helpful.
For example:
```mojo
trait Iterator(Copyable, Movable):
    """The `Iterator` trait describes a type that can be used as an
    iterator, e.g. in a `for` loop.
    """

    alias Element: Copyable & Movable

    fn __next__(mut self) -> Optional[Element]:
        ...

    fn bounds(self) -> Tuple[Int, Optional[Int]]:
        ...


struct ListSimdIterator[mut: Bool, //, dtype: DType, origin: Origin[mut]](Iterator):
    alias Element = SIMD[dtype, simdwidthof[dtype]()]

    var idx: UInt
    var inner: Pointer[List[Scalar[dtype]], origin]

    fn __next__(mut self) -> Optional[Element]:
        ...

    fn bounds(self) -> Tuple[Int, Optional[Int]]:
        ...


extension[dtype: DType] List[Scalar[dtype]]:
    fn simd_iter(ref self) -> (ListSimdIterator[dtype, Origin[self]], ListIter[Scalar[dtype], Origin[self]]):
        ...
```
Additional misuse-resistance can be added using proof tokens or other techniques. In theory, we could also use type-level programming to produce a drain loop which uses fewer instructions (ex: make the main loop 256-bit, then have an optional 128-bit SIMD step, then a scalar iterator). However, as currently written this may require duplicating the main loop.
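For comparison with the hand-written loop earlier, here is a sketch of consuming the paired iterators returned by the hypothetical `simd_iter` above, using the `Optional`-based `__next__` from this proposal:

```mojo
from sys import simdwidthof


fn sum_f32_iter(values: List[Float32]) -> Float32:
    alias width = simdwidthof[DType.float32]()
    var simd_part, scalar_part = values.simd_iter()
    var acc = SIMD[DType.float32, width](0)
    # Main loop: each item is a full-width SIMD batch.
    while True:
        var batch = simd_part.__next__()
        if not batch:
            break
        acc += batch.value()
    var total = acc.reduce_add()
    # Drain loop: the second iterator yields the leftover scalars.
    while True:
        var element = scalar_part.__next__()
        if not element:
            break
        total += element.value()
    return total
```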
Masking
We can also decide that performance on anything that isn’t a recent ARM server chip, AMD Zen 4 or newer, an Intel server chip, or a RISC-V Vector CPU doesn’t matter that much. Most GPUs can do masking at the Warp/Wavefront level, so perhaps some creative abstractions can make it work there. If we do that, then batching becomes much less of an issue.
Other considerations
- Do we want to handle async iterators (Rust Streams)? Those need a discussion about async cancellation and how you add a timeout to `__next__`.
- This entire API needs to be duplicated for `raising` iterators, as well as any other “effects” that Mojo finds (`async`, `blocking`, interrupt safety stuff, potentially warp32 vs warp64 on AMD, etc).
- Given that Mojo tries to avoid magic compiler optimizations, how do we make iterators inspectable so that we can perform optimizations with `@parameter if` (ex: merging adjacent Filter/Map iterators, making `.count()` on an iterator with a known size just grab that size, etc)? A sketch of this is at the end of this section.
- Should we wait until Mojo has Generators and `yield` to implement this, so that the compiler is responsible for all of the state tracking and transformations?
For example, here’s what the current `ListIter` could turn into:
```mojo
extension[T: Copyable & Movable] List[T](Iterable):
    fn __iter__(ref self):
        for i in range(len(self)):
            # `get_unsafe` used to avoid bounds checks
            yield self.get_unsafe(i).copy()  # `ListIter.__next__` copies right now
```
Although we lose `bounds`, that might be possible to add back with extensions, provided it’s possible to name the generator. This might not be possible for compiler reasons, in which case we’d need to revisit the idea.
ex:
```mojo
extension[T: Copyable & Movable, origin: Origin] List[T].IteratorType[origin]:
    @always_inline
    fn bounds(self) -> Tuple[Int, Optional[Int]]:
        ...
```
Using generators also means that the compiler can check that a particular type is only returned once, for example `None` or the drain-loop part of a SIMD iterator.
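As a rough sketch of that idea (the generator syntax, heterogeneous `yield` types, and `ScalarDrainIter` are all hypothetical): the fixed-width batch type is yielded inside a loop, while the drain iterator is yielded from straight-line code the compiler can see runs at most once.

```mojo
from sys import simdwidthof


fn simd_batches[dtype: DType](ref values: List[Scalar[dtype]]):
    alias width = simdwidthof[dtype]()
    var i = 0
    # Fixed-size batches may be yielded any number of times.
    while i + width <= len(values):
        yield values.unsafe_ptr().load[width=width](i)
        i += width
    # The drain iterator is yielded outside the loop, so the compiler can
    # prove it is produced at most once.
    yield ScalarDrainIter(values, start=i)
```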
- What about bidirectional iterators?
- How do we handle Linear types and iterators which buffer things?
- Should thread-safe (`*PMC`) queues go under the iterator abstraction at all?
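Returning to the inspectability question from the list above, here is a sketch of the kind of dispatch `@parameter if` could enable. Everything specific is made up for illustration: `HAS_EXACT_SIZE` and `exact_size()` are hypothetical members, and the `Optional`-based `__next__` from the alternatives above is assumed.

```mojo
fn count[I: Iterator](mut iter: I) -> Int:
    @parameter
    if I.HAS_EXACT_SIZE:
        # An iterator that advertises an exact remaining size can answer
        # without being consumed.
        return iter.exact_size()
    else:
        # Otherwise fall back to walking the iterator to the end.
        var n = 0
        while iter.__next__():
            n += 1
        return n
```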