How to import and run an exported MAX Model from MEF

We can export a compiled MAX Model via export_compiled_model. What exactly is the MEF, and is there a way to import and execute such an exported model at a later stage (in the Mojo API)?
I’d love to learn more about this :slight_smile:
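
Roughly what I have in mind (a sketch; `export_compiled_model` is the call named above, assuming its spelling carries over to the Python API, and `import_compiled_model` is purely hypothetical, it's the counterpart I'm looking for):

```python
from max import engine
from max.dtype import DType
from max.graph import Graph, TensorType, ops

# Build and compile a trivial graph (exact TensorType/device arguments
# vary between MAX versions).
with Graph("double", input_types=(TensorType(DType.float32, (2, 2)),)) as graph:
    graph.output(ops.add(graph.inputs[0], graph.inputs[0]))

session = engine.InferenceSession()
model = session.load(graph)

# This part works today:
model.export_compiled_model("double.mef")

# This is what I'm asking about -- hypothetical, a way to get a
# runnable Model back from the exported MEF later on:
restored = session.import_compiled_model("double.mef")
```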

I found MEF instances in MAX's model cache, which are probably meant to avoid unnecessary recompilation of topologically similar graphs:

.magic/envs/default/share/max/.max_cache/mof/mef

While playing around with that built-in model caching and my own Dictionary-based caching mechanism, I noticed that the built-in one, which presumably loads compiled models from disk, is still pretty slow in comparison and rather unusable for actual JIT-style work. Will or can this be improved in the future?
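
For reference, my hand-rolled variant is essentially the following (a sketch; `build_graph` stands in for whatever produces a graph for a given input shape):

```python
from max import engine

# Compiled models held in a plain dict keyed by input shape; the hot
# path is a lookup plus execute() on an already-resident model.
_model_cache: dict[tuple[int, ...], engine.Model] = {}

def run_cached(session: engine.InferenceSession, build_graph, shape, *inputs):
    model = _model_cache.get(shape)
    if model is None:
        # Cold path: compile once, keep the live Model object around.
        model = session.load(build_graph(shape))
        _model_cache[shape] = model
    # Hot path: no disk I/O, no deserialization, just execution.
    return model.execute(*inputs)
```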

MEF is a device- and MAX-version-specific cache format produced by the graph compiler. It is not portable between devices or MAX versions, and is primarily intended for caching the results of compiling a MAX Graph on a single system for repeated runs of the same graph. The methods for manually saving and loading MEFs were put in place at a time when automatic graph caching didn't work as reliably as it does today, and are largely obsolete; automatic caching of graph compilation is pretty solid now.

As for performance, it’s unclear what you’re comparing here. I will say that the Mojo and Python Graph APIs do diverge in how they handle graph caching: the Mojo API inlines model weights with the compiled graph, whereas the Python API by default keeps the weights external to the graph. The latter speeds up loading the cached graph and massively reduces its footprint on disk. This technique was developed as we were working on the Python Graph API and hasn’t yet been brought to the Mojo Graph API.
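
To make the difference concrete, the external-weights path in the Python API looks roughly like this (a sketch; exact `Weight`, `TensorType`, and `weights_registry` spellings vary between MAX versions):

```python
import numpy as np
from max import engine
from max.dtype import DType
from max.graph import Graph, TensorType, Weight

# The graph carries a named placeholder for the weight instead of an
# inlined constant.
with Graph("linear", input_types=(TensorType(DType.float32, (1, 4)),)) as graph:
    w = graph.add_weight(Weight("w", DType.float32, (4, 4)))
    graph.output(graph.inputs[0] @ w)

session = engine.InferenceSession()

# The actual weight bytes are supplied at load time, so the cached
# compiled graph on disk stays small and loads fast.
model = session.load(
    graph,
    weights_registry={"w": np.ones((4, 4), dtype=np.float32)},
)
```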

Thank you very much for your reply, very insightful. Please don’t get me wrong here, I love that this caching mechanism exists. :fire:

Performance-wise, I meant comparing it to keeping a compiled model as an object in e.g. a list or dictionary, fetching it from that collection each iteration, and running it on some input. In my experience this is often around 100x faster than the automatic caching mechanism that MAX provides. Could you elaborate on which part of loading a model from the cache takes so much time in comparison? Could the cached models be kept closer to the running process to reduce loading time?
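
For concreteness, my comparison has roughly this shape (crude wall-clock timing; `session`, `graph`, `x`, and the dict-held `resident_model` are assumed to come from a setup like the sketch in my first post, and both paths hit a warm cache):

```python
import time

def bench(fn, iters: int = 100) -> float:
    # Average wall-clock time per call; crude, but enough to show the
    # order-of-magnitude gap I'm describing.
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters

kept = bench(lambda: resident_model.execute(x))       # dict-held model
auto = bench(lambda: session.load(graph).execute(x))  # reload via MAX's cache
print(f"resident: {kept * 1e6:.1f} us/iter, auto-cache: {auto * 1e6:.1f} us/iter")
```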