I tried Gemma 4 31b through the website. The response took a few seconds, but the timing came back as 0.1 ms/tok.
That implies 10k tok/s, which isn’t possible given that the output was under 100 tokens but took more than 1 second.
I can add a screenshot but it’s easy to recreate on the website. Is that calculation accurate?
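For reference, the arithmetic behind that sanity check can be sketched in a few lines of Python. The function names are illustrative only and are not taken from the playground; the inputs are the figures quoted above (0.1 ms/tok reported, fewer than 100 tokens, more than 1 second).

```python
def tokens_per_second(ms_per_token: float) -> float:
    """Convert a per-token latency in milliseconds to throughput in tokens/second."""
    return 1000.0 / ms_per_token


def ms_per_token(elapsed_seconds: float, token_count: int) -> float:
    """Average per-token latency in milliseconds over a whole response."""
    return elapsed_seconds * 1000.0 / token_count


# Throughput implied by the reported 0.1 ms/tok figure.
reported_throughput = tokens_per_second(0.1)

# Lower bound on ms/tok for a response of at most 100 tokens
# that took at least 1 second of wall-clock time.
minimum_plausible = ms_per_token(1.0, 100)

print(f"reported figure implies {reported_throughput:.0f} tok/s")
print(f"observed response implies at least {minimum_plausible:.0f} ms/tok")
```

Any response with fewer tokens or a longer duration only raises that lower bound, so the reported 0.1 ms/tok is off by at least two orders of magnitude.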
Good catch, @anandtyagi. You’re right that the number wasn’t accurate. The playground was pulling timing data from a backend measurement that didn’t account for the full streaming duration, so the ms/tok figure was understated for streaming responses. We shipped a fix for this today. Give it another try and let us know if the numbers look more reasonable.
Unfortunately not quite yet, @dunnoconnor. I timed it myself: this took ~7.5 seconds, so it should be 132.2 ms/tok.
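As a rough cross-check of those two numbers, here is a short Python sketch. The token count is not stated in this post, so the value below is only what the quoted ~7.5 seconds and 132.2 ms/tok imply together; the variable names are illustrative.

```python
elapsed_seconds = 7.5         # hand-timed wall-clock duration quoted above
expected_ms_per_token = 132.2  # per-token latency quoted above

# Token count implied by the two quoted figures (an inference, not a measured value).
implied_tokens = elapsed_seconds * 1000.0 / expected_ms_per_token

print(f"implied response length: about {implied_tokens:.1f} tokens")
```

The implied length is well under 100 tokens, which is consistent with the response size described at the top of the thread.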

