Nvidia, AMD, Amkor, Arista @ UBS Tech Conference
Thoughts from various conversations at the UBS conference last week:
Nvidia
No Replacement Cycles Yet
Colette confirmed that there hasn’t been a datacenter GPU replacement cycle yet:
Timothy Arcuri: And I get the question a lot about how much of what you’re shipping is replacing existing GPUs versus just additive to the existing base. And it seems like almost all of what you’re shipping is just additive to the base. We haven’t even begun to replace the existing installed base. Is that correct?
Colette Kress: It’s true. It’s true that most of the installed base still stays there. And what we are seeing is the advanced new models want to go to the latest generation because a lot of our codesign was working with the researchers of all of these companies to help understand what they’re going to need for their next models. So that’s the important part that they do. They move that model to the newest architecture and stay with the existing. So yes, to this date, most of what you’re seeing is all brand new builds throughout the U.S. and across the world.
On the one hand this is fairly obvious: GPUs, even older ones, are useful whether you're pre-training, post-training, fine-tuning, serving inference, labeling data, simulating autonomy, generating synthetic data, running ablation studies, regression testing, and so on. R&D teams everywhere can absorb essentially unlimited amounts of old GPU compute. Every lab has more experiments it wants to run than budget for new GPUs.
So why throw out old GPUs that can still crank out tokens, even if the throughput is lower? Especially if they are nearly or fully depreciated!
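The counter-argument comes down to simple arithmetic: when power, not silicon, is the scarce input, a "free" depreciated GPU still has an opportunity cost, because the watts it draws could power a more efficient new part instead. A minimal sketch of that tradeoff, using entirely made-up numbers (none of these figures are vendor specs):

```python
# Hypothetical illustration: fixed power budget, old vs. new GPUs.
# All numbers below are assumptions for the sake of the arithmetic,
# not actual specs for any product.

OLD_GPU_WATTS = 700          # assumed board power of an older GPU
OLD_TOKENS_PER_SEC = 1_000   # assumed inference throughput per old GPU

NEW_GPU_WATTS = 1_000        # assumed board power of a newer GPU
NEW_TOKENS_PER_SEC = 4_000   # assumed throughput per new GPU

def tokens_per_watt(tps: float, watts: float) -> float:
    """Efficiency metric: tokens served per second per watt drawn."""
    return tps / watts

# Total throughput from the same fixed 1 MW budget under each choice.
BUDGET_WATTS = 1_000_000
old_fleet_tps = BUDGET_WATTS / OLD_GPU_WATTS * OLD_TOKENS_PER_SEC
new_fleet_tps = BUDGET_WATTS / NEW_GPU_WATTS * NEW_TOKENS_PER_SEC

print(f"old fleet: {old_fleet_tps:,.0f} tokens/s "
      f"({tokens_per_watt(OLD_TOKENS_PER_SEC, OLD_GPU_WATTS):.2f} tok/s/W)")
print(f"new fleet: {new_fleet_tps:,.0f} tokens/s "
      f"({tokens_per_watt(NEW_TOKENS_PER_SEC, NEW_GPU_WATTS):.2f} tok/s/W)")
```

Under these assumed numbers the new fleet serves far more tokens from the same megawatt, so a replacement cycle starts to make sense once the efficiency gap outweighs the capex of the new GPUs, even if the old ones are fully depreciated.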
But it does raise the question: what would cause GPU replacement cycles?
Power Budget Reallocation
Recall that power is a constraint. Remember how Andy Jassy answered a capacity question on the Amazon earnings call in terms of power and not chips?
Justin Post: I’ll ask on AWS. Can you just kind of go through how you’re feeling about your capacity levels and how capacity constrained you are right now?
Andrew Jassy: On the capacity side, we brought in quite a bit of capacity, as I mentioned in my opening comments, 3.8 gigawatts of capacity in the last year with another gigawatt plus coming in the fourth quarter and we expect to double our overall capacity by the end of 2027. So we’re bringing
...