Encoder-Decoder (T5) model serving support

Does MAX have plans to support T5 and other encoder-decoder models for serving the OpenAI /v1/embeddings endpoint?
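For concreteness, this is the kind of client call such support would enable. A minimal sketch against an OpenAI-compatible server; the base URL, port, and model id are illustrative assumptions, not confirmed MAX defaults:

```python
# Hypothetical client call against an OpenAI-compatible /v1/embeddings
# endpoint. The base_url and model id below are illustrative assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.embeddings.create(
    model="sentence-transformers/sentence-t5-base",  # hypothetical model id
    input=["An encoder-only T5 used as a text embedder."],
)
print(len(response.data[0].embedding))  # embedding dimensionality
```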

Unfortunately, vLLM declined to support this going forward: [Model] Add T5 model (2/2) by NickLucche · Pull Request #11901 · vllm-project/vllm · GitHub


Right now, model support is added because one of Modular’s customers wants that model, because the model is notable (gpt-oss-120b), or because a community member contributes an implementation. As far as I am aware, MAX has no reason to reject encoder-decoder models, and contributions are welcome. However, having Modular employees implement support will likely remain low priority until Modular has a need for it, since Modular is also trying to keep up with the tidal wave of LLMs.


That’s a very accurate summary, Owen! :ocean:


Yep, I understand that. This topic was only meant to start a conversation or to track this feature somewhere. You can treat this as a feature request :slight_smile:

I have been planning to contribute this model! While T5 is an NLP classic, T5EncoderModel and its variants like UMT5EncoderModel have effectively become the most popular text encoders for SOTA diffusion models, including HunyuanVideo-1.5, the Wan-2.x series, and Flux.1.
I am currently finalizing the second stage of my Z-Image pipeline integration PR. Once that is merged (or in parallel with its fourth stage), implementing T5/UMT5 support is next on my roadmap, paving the way for porting those two video-gen models in particular.
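For anyone unfamiliar with why the encoder-only variant matters here: diffusion pipelines discard the decoder and consume the encoder’s hidden states as conditioning. A minimal sketch with Hugging Face transformers (the model id is illustrative; Flux.1, for example, ships with a much larger T5 variant):

```python
# Minimal sketch: using T5's encoder stack alone as a text encoder,
# the way diffusion pipelines consume it. Model id is illustrative.
from transformers import AutoTokenizer, T5EncoderModel

tokenizer = AutoTokenizer.from_pretrained("google/t5-v1_1-base")
encoder = T5EncoderModel.from_pretrained("google/t5-v1_1-base")

inputs = tokenizer(
    "a cinematic shot of a fox in the snow",
    return_tensors="pt",
)
# last_hidden_state is the per-token embedding sequence that image/video
# diffusion models use as cross-attention conditioning.
hidden = encoder(**inputs).last_hidden_state
print(hidden.shape)  # (batch, seq_len, d_model)
```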