Citi: AI Inference Demand Remains Tight, Bottleneck Shifting From Chips to Power and Data Centers

BlockBeats News, June 16: Citigroup said demand for AI inference remains intense, with the compute shortage spilling over from the latest generation of chips to previous-generation GPUs and pushing model vendors to accelerate monetization through pricing, quotas, and routing mechanisms.

Citigroup analysts including Heath Terry wrote in a report released on June 14 that A100 GPU rental prices rose 0.6% in the past week and are up a cumulative 11% over six weeks, indicating that demand for AI compute is not confined to the most advanced hardware. At the same time, some cutting-edge models have raised prices significantly after improving their intelligence scores, with Citigroup stating that "scarcity is being monetized faster than it is being solved."

The report pointed out that no model vendor currently leads on intelligence, speed, and price at the same time. The intelligence score of the latest cutting-edge models has increased by about 4 points, but their overall price has nearly doubled; meanwhile, mid-range models have gained speed, with the median output speed of the top 20 models rising from 64 tokens/s to 105 tokens/s over six weeks.
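To put the six-week figures on a common footing, the short Python sketch below converts them into a percentage change and an average weekly compounded rate. The function names are illustrative, and the assumption of steady weekly compounding is a simplification, not something the report states.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


def avg_weekly_rate(total_pct: float, weeks: int) -> float:
    """Constant weekly rate (in %) that compounds to total_pct over `weeks`.

    Assumes steady compounding, which is a simplification.
    """
    return ((1.0 + total_pct / 100.0) ** (1.0 / weeks) - 1.0) * 100.0


# Median output speed of the top 20 models: 64 -> 105 tokens/s (from the report)
speed_gain_pct = pct_change(64, 105)

# A100 rental prices: +11% cumulative over six weeks (from the report)
a100_avg_weekly_pct = avg_weekly_rate(11, 6)

print(f"Median output speed gain: {speed_gain_pct:.1f}%")
print(f"A100 rental, implied average weekly rise: {a100_avg_weekly_pct:.2f}%")
```

Under these assumptions the speed gain works out to about 64%, and the 11% cumulative rise implies an average of roughly 1.75% per week, so the latest week's 0.6% increase is below the six-week average.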

Citigroup also stated that the capability gap between closed-source and open-source models is widening, with closed-source models now leading open-source models in intelligence by around 10 points, up from about 6 points. This means that top model vendors still tend to rely on stronger capabilities to hold the high-end market rather than compete directly on price with open-source models.

Beyond computing power, electricity and data center siting are becoming new constraints on AI expansion. The report mentioned that a private neocloud has already signed contracts for 4.9GW of demand, while its planned pipeline exceeds 40GW, highlighting the gap between surging demand and the supply actually coming online. Citigroup stated that data centers tend to be located in regions with electricity prices of around 9-12 cents/kWh, while the share of renewable energy and long-term power purchase agreements also affect siting decisions.
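For a sense of scale, the rough Python sketch below turns the power figures into two back-of-the-envelope numbers: the contracted share of the pipeline and the yearly electricity bill for 1GW of load at the quoted price range. Full, continuous utilization is an assumption made here for illustration; the report does not give utilization figures.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_cost_usd(load_gw: float, usd_per_kwh: float,
                           utilization: float = 1.0) -> float:
    """Yearly electricity cost for a given load.

    utilization=1.0 (continuous full load) is an illustrative assumption.
    """
    kwh = load_gw * 1_000_000 * HOURS_PER_YEAR * utilization  # 1 GW = 1e6 kW
    return kwh * usd_per_kwh


contracted_gw = 4.9   # from the report
pipeline_gw = 40.0    # the report says "exceeds 40GW", so this is a lower bound

# Because the pipeline is at least 40GW, this share is an upper bound.
contracted_share = contracted_gw / pipeline_gw

cost_low = annual_energy_cost_usd(1.0, 0.09)   # 9 cents/kWh
cost_high = annual_energy_cost_usd(1.0, 0.12)  # 12 cents/kWh

print(f"Contracted share of pipeline: at most {contracted_share:.1%}")
print(f"1GW for a year: ${cost_low / 1e9:.2f}B to ${cost_high / 1e9:.2f}B")
```

On these assumptions, contracted demand is at most about 12% of the planned pipeline, and each gigawatt running continuously costs roughly $0.79 billion to $1.05 billion a year in electricity at 9-12 cents/kWh.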

The report expects AI infrastructure costs to keep rising. With component prices, power access costs, and upfront infrastructure investment all increasing, capital expenditure per unit of H100-equivalent computing power is rising, and electricity costs are shifting from the operating phase to upfront capital investment.

Citigroup stated that the next phase of value may flow to the "inference routing layer," a platform that can determine which model, quantization method, and hardware to use for different tasks. This layer can reduce inference costs and improve output efficiency, but enterprise data, intellectual property, and privacy protection will be implementation challenges.
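The idea of a routing layer can be illustrated with a minimal Python sketch: given a task's minimum quality and speed requirements, pick the cheapest combination of model, quantization, and hardware that satisfies them. Everything here is hypothetical, including the catalog entries, scores, speeds, and prices; it is not a description of any vendor's product or of a method in the Citigroup report.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ServingOption:
    model: str            # hypothetical model name
    quantization: str     # e.g. "fp16", "int8", "int4"
    hardware: str         # e.g. "H100", "A100"
    quality: float        # hypothetical intelligence score
    tokens_per_s: float   # hypothetical output speed
    usd_per_mtok: float   # hypothetical price per million output tokens


# Illustrative catalog; all numbers are made up for the example.
CATALOG: List[ServingOption] = [
    ServingOption("frontier-large", "fp16", "H100", 60.0, 60.0, 30.0),
    ServingOption("mid-size", "int8", "H100", 50.0, 105.0, 6.0),
    ServingOption("open-small", "int4", "A100", 40.0, 90.0, 1.0),
]


def route(min_quality: float, min_tokens_per_s: float = 0.0,
          catalog: List[ServingOption] = CATALOG) -> Optional[ServingOption]:
    """Return the cheapest option meeting both thresholds, or None."""
    eligible = [
        o for o in catalog
        if o.quality >= min_quality and o.tokens_per_s >= min_tokens_per_s
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda o: o.usd_per_mtok)


if __name__ == "__main__":
    for need in (35.0, 45.0, 55.0):
        choice = route(min_quality=need)
        print(need, "->", choice.model if choice else "no option")
```

With this toy catalog, no single option is best on intelligence, speed, and price at once, so a request that demands both the top quality score and the top speed finds no match. A real routing layer would also have to weigh the data, intellectual property, and privacy constraints the report mentions, which this sketch leaves out.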

Across the industry chain, the report points not only to GPUs but also to data centers, electricity, optical communication, cloud infrastructure, and model applications. In its appendix, Citigroup listed related companies under coverage, such as Ciena, Lumentum, and MiniMax, showing that the AI inference cycle is spreading from chips to a broader range of infrastructure and application layers.

Source: BlockBeats

