Mojo MAX bindings

Can somebody clarify why the MAX bindings were removed from Mojo?

Will they be reintroduced later, or are we stuck with Python?

The team is working on it and soon we’ll be able to share more.

Thanks for asking. I’ll state upfront that this was not an easy decision to make; many of us worked really hard bringing up those initial Mojo Graph API interfaces and the proof-of-principle models built on them. It also absolutely does not rule out building something better in Mojo when we’re ready for it, or carrying these APIs on as a community project.

We decided to open-source the Mojo Graph and Driver APIs and then wind down their Modular-supported distribution because of the increasing maintenance burden these APIs carried. We built them at a time when Mojo was in a very different state, and when we examined what it would take to modernize them, we concluded that it would most likely require a complete ground-up rework. In their current state, they were standing in the way of cleanup and enhancements we wanted to make to improve the GPU programming experience in Mojo.

The Python Driver and Graph APIs came about because we wanted to make it as easy as possible for groups with large Python codebases and Python familiarity to use MAX in production. They let us embrace community standards, like Hugging Face model definitions, weights, and tokenizers, and leverage Python-based serving solutions. They also made it easy to provide smooth interoperability with NumPy arrays and PyTorch tensors, allowing for progressive uptake of MAX graphs in Python projects. You can see us continuing to improve Python integration via initial Python->Mojo interoperability and support for writing custom PyTorch ops in Mojo.

Building the Graph and Driver APIs in Python also let us work against a relatively stable language, so we didn’t need to push API/language coevolution in the same way we had when creating these interfaces in Mojo. That took some pressure off the Mojo team and allowed them to focus their efforts on near-term needs for Mojo as a high-performance language while planning the right long-term designs for what we wanted Mojo to become. Some core issues that had prevented ergonomic Mojo Graph API interfaces have since been fixed (I hear someone was hacking on list and dictionary literals over a weekend; see the sketch below), but there are other sharp edges we’d still like to address.
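As a rough illustration of what those literals unlock, here’s a minimal sketch, assuming a recent Mojo build where list and dictionary literals have landed (the explicit type annotations are there for clarity; exact inference behavior may differ):

```mojo
from collections import Dict

fn main():
    # List literal, annotated as List[Int] for clarity
    var shape: List[Int] = [1, 3, 224, 224]

    # Dict literal, annotated as Dict[String, Int] for clarity
    var attrs: Dict[String, Int] = {"axis": 1, "keep_dims": 0}

    print(len(shape), len(attrs))
```

Being able to write ordinary literal syntax like this, instead of building collections element by element, is exactly the kind of ergonomic gap that made the original Mojo Graph API interfaces feel clunky.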

We also learned a lot when we built the initial Graph API in Mojo, and used those lessons to make the Python APIs a complete “v2” rewrite. Were we to revisit a Mojo API for graph construction, we’d want to start from some of the patterns we’ve found to work really well in the Python APIs.

To date, we’ve been able to achieve state-of-the-art performance in the large-scale models we’ve driven via the Python MAX APIs, because almost all of the computational time is spent running the model graphs themselves. However, I’m personally a big fan of fast host languages for driving ML models, and I recognize that Python doesn’t scale down to embedded systems. I can see a point where we’d want to reinvest in these Mojo APIs at Modular, but by open-sourcing the Mojo Graph and Driver APIs we wanted to at least give the community the option to carry them on and maybe evolve them independently of us.
