ADTs with swappable implementations

Occasionally, when working with rust, I’ll swap from using HashMap to using BTreeMap or BrownMap and find that I get a large performance boost. the issue is that the functions for these structures have slightly different signatures, so every time I swap, I need to do a little bit of refactoring.

I was thinking with mojo, it may be nice to have Dict in std be a trait that can be implemented by anyone’s implementations like HashDict, BtreeDict, RadixDict, etc. This also allows other people to create third party implementations of these ADTs and make it trivial for users to switch to these implementations of the ADTs from their existing implementations.

I’m using Dict as an example here, but this could extend to most common ADTs like sets, lists, trees, etc.

Since experimenting with different implementations becomes trivial, people become more likely to do this experimenting resulting in generally better performance across the board. It also helps reduce fragmentation and helps unify the API for common ADTs.

There’s also some precedent for this. Zig has done exactly what I described with allocators, and it was such as huge success that they decided to do it again with IO.

I’d like to get the community’s thoughts on this; I’ll create an official proposal if this gets enough support.

1 Like

This is probably overdue. I think the main reason it hasn’t happened is because right now there’s some anticipation of new capabilities around conditional conformance which may result in API changes, but getting the discussion started now is likely a good idea.

I wouldn’t call it Dict, since the ADT is more formally called Map, and there’s some extra capability stuff that I think should be in there (OrderedMap, RandomAccessMap, etc), but I agree with the overall idea. Same for List and Set. Since List and Set are taken, we’d probably want to call it ListLike and SetLike or something similar. The current Dict type is technically an ordered hash map (to match Python’s expectations), not the pure hash map people might expect, so having a pure hash map would make sense.

For Allocators, I have some ideas for what Zig would call General Purpose Allocators, but Mojo needs parametric traits before I think that doing more specialized (ex: fixed set of layouts) allocators are possible.

I also have some thoughts on IO, and came to those ideas without knowing about Andrew’s work in the area. We ended up in similar-ish places, except that I opted to do more things at compile time than Zig does. You can see my take on IO here: [stdlib] [proposal] Binary IO API by owenhilyard · Pull Request #4728 · modular/modular · GitHub

Probably the most “radical” part of it is that I think that the POSIX IO interfaces are not going to last much longer, mainly because they mandate a lot of copies and memory bandwidth is becoming increasingly precious. The proposal there is designed to point at DPDK, SPDK and io_uring as state of the art for their respective areas (DPDK and SPDK win at raw perf but make sacrifices many consider unacceptable around ease of use (possibly fixable) and portability).

Feel free to write up a proposal for Map/Set/List ADT traits, probably with RandomAccess and Ordered extensions to better capture some of the capabilities common ADTs have as a separate post (to contain the discussion so we can archive it) and ping me.

4 Likes

Is this something like collections.abc — Abstract Base Classes for Containers — Python 3.14.2 documentation?

Similar, Wikipedia has a good set of examples: Abstract data type - Wikipedia

Collection and Container are a bit too general to be super useful imo, but perhaps there are cases where they might be useful as a base trait.