Tinygrad claims GLM 5.2 can reach 120 tok/s on a dual-machine interconnected Blackwell configuration priced at $150,000


BlockBeats News, June 21: GPU retailer Tinygrad announced that, according to reliable sources, the GLM 5.2 model can reach an inference speed of 120 tokens per second on two networked Blackwell-architecture tinyboxes.

The configuration is priced at $150,000. Buyers can choose either two standard tinyboxes or one tinybox Pro, and both options are said to deliver the performance described above. Tinygrad is promoting this as a selling point, emphasizing a private-deployment route of "one-time purchase, never pay cloud fees" that competes directly with pay-as-you-go cloud inference services.
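To put the "one-time purchase" pitch in perspective, the reported figures can be turned into a rough break-even estimate. The sketch below is illustrative only: the $150,000 price and the 120 tokens-per-second rate come from the report, while the cloud price per million tokens, the utilization factor, and the function names are hypothetical assumptions. It also treats 120 tok/s as the machine's total sustained throughput and ignores electricity, hosting, and maintenance costs, none of which Tinygrad has detailed.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tokens_per_day(tokens_per_second: float, utilization: float = 1.0) -> float:
    """Tokens generated in one day at a given rate and utilization (0..1)."""
    return tokens_per_second * SECONDS_PER_DAY * utilization


def break_even_days(
    hardware_cost_usd: float,
    tokens_per_second: float,
    cloud_price_per_million_usd: float,  # hypothetical cloud price, not from the report
    utilization: float = 1.0,            # assumed share of the day the box is busy
) -> float:
    """Days until cloud spend for the same token volume equals the hardware price."""
    daily_tokens = tokens_per_day(tokens_per_second, utilization)
    daily_cloud_cost = daily_tokens / 1_000_000 * cloud_price_per_million_usd
    if daily_cloud_cost <= 0:
        raise ValueError("cloud price and throughput must be positive")
    return hardware_cost_usd / daily_cloud_cost


if __name__ == "__main__":
    # Reported figures: $150,000 and 120 tok/s. The $10 per million tokens
    # cloud price is a placeholder chosen purely for illustration.
    print(f"tokens per day: {tokens_per_day(120):,.0f}")
    print(f"break-even days: {break_even_days(150_000, 120, 10.0):,.0f}")
```

At full utilization, 120 tokens per second works out to 10,368,000 tokens per day. The break-even point depends heavily on the assumed cloud price and on how busy the machine actually is.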

As of now, the GLM team has not officially confirmed the claim, and Tinygrad has not disclosed further technical details.


Source: BlockBeats

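Disclaimer: This content is provided from a third-party perspective or was translated directly by AI from a third-party perspective. CoinEx does not guarantee the authenticity, accuracy, or originality of the content, and it should not be regarded as investment advice from CoinEx. Cryptocurrency prices are highly volatile, so please be mindful of the potential risks.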