BentoCloud deployment status questions

While creating deployments on BentoCloud, I have been seeing some unintuitive behavior in the reported status, both in the web GUI and in CLI responses:

  1. When a deployment has failed and I redeploy, the status changes from “Updating” to “Image Build Failed” before going to “Image Building”. The “Image Build Failed” status is transient and seems premature: I don’t see any evidence of a failure in the log before the build immediately starts and succeeds. Since I am trying to verify deployment status automatically, this seemingly false failure makes it difficult to maintain ground truth.

  2. Sometimes my deployments appear to succeed but stay stuck in the “Image Building” state for hours despite responding 200 to readyz. Accurate automation is difficult when the status does not update to a success state in a timely manner.
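
For anyone automating around both behaviors in the meantime, here is a minimal sketch of a polling loop that (a) only treats “Image Build Failed” as real if it persists past a grace period and (b) trusts a 200 from readyz over a stale status. Everything here is an assumption for illustration: `get_status` and `is_ready` are hypothetical callables you would supply yourself (for example, wrappers around whatever CLI or HTTP calls you already use), the status strings are the ones shown in the GUI, and the timings are arbitrary.

```python
import time
from typing import Callable, Optional

# Status text as displayed in the GUI; the exact values returned
# programmatically are an assumption and may differ.
FAILURE_STATES = {"Image Build Failed"}


def wait_for_deployment(
    get_status: Callable[[], str],   # hypothetical: returns the current status text
    is_ready: Callable[[], bool],    # hypothetical: True when readyz answers 200
    timeout_s: float = 1800.0,
    poll_s: float = 10.0,
    failure_grace_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Return "ready", "failed", or "timeout"."""
    start = clock()
    failed_since: Optional[float] = None
    while clock() - start < timeout_s:
        status = get_status()
        # Readiness is treated as ground truth, even if the status is stale.
        if is_ready():
            return "ready"
        if status in FAILURE_STATES:
            if failed_since is None:
                # First sighting of a failure: start the grace period.
                failed_since = clock()
            elif clock() - failed_since >= failure_grace_s:
                # The failure persisted, so it is probably real.
                return "failed"
        else:
            # Status moved on (e.g. to "Image Building"): the failure was transient.
            failed_since = None
        sleep(poll_s)
    return "timeout"
```

One caveat with this approach: if the previous revision keeps serving traffic during a redeploy, a 200 from readyz may reflect the old version rather than the new one, so the readiness check may need to confirm the version as well.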

Bump bump

Hi Alex, do you have more specific logs for this deployment?

It seems like older build logs aren’t saved for long, so I don’t have the logs for those exact incidents pictured above. The first case is easy to recreate and I copied it into this gist (this forum won’t let me attach): BentoML deployment log when status reports build failure but no error in log · GitHub

The second case is harder to reproduce consistently, so I don’t have it on hand. When I checked the logs, the deployment seemed to have succeeded as normal: it responded to /readyz and the inference endpoint returned valid responses, despite being stuck in an “Image Building” state.

When an image fails to build, it is either:

  • the PyPI package installation had some problems
  • BentoCloud’s image builder is not available

We have addressed the second point here. Can you check if this is the case?

For 2, can you point me to a deployment name so that I can verify this on our side? Thanks.

You have addressed it? I still see the issue unless I’m misunderstanding what you mean.

My test deployment horizon-contact-classification-preview-2204 has examples of both issues. The long/stale “Image Building” can be seen at timestamp (UTC): 2026-05-21 22:59:57

Hello, any updates on this front?