TPUv7: Google Takes a Swing at the King

The two best models in the world, Anthropic’s Claude 4.5 Opus and Google’s Gemini 3, run the majority of their training and inference on Google’s TPUs and Amazon’s Trainium. Now Google is selling TPUs as physical systems to multiple firms. Is this the end of Nvidia’s dominance?

The dawn of the AI era is here, and the cost structure of AI-driven software deviates considerably from that of traditional software. Chip microarchitecture and system architecture play a vital role in how these innovative new forms of software are developed and scaled. The hardware infrastructure on which AI software runs has a notably larger impact on Capex, Opex, and consequently gross margins than in earlier generations of software, where developer costs were relatively larger. This makes it even more important to devote considerable attention to optimizing AI infrastructure: firms with an infrastructure advantage will also have an advantage in deploying and scaling AI applications.
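To make that contrast concrete, here is a minimal sketch in Python of how usage-driven inference cost flows into gross margin. All figures are hypothetical, chosen purely for illustration; the point is the structure of the calculation, not the specific numbers:

```python
# Illustrative sketch only: every figure below is hypothetical, chosen to
# show how infrastructure cost dominates gross margin for AI software.

def gross_margin(revenue: float, cogs: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - cogs) / revenue

# Traditional SaaS: serving a request is cheap, so cost of goods sold is
# mostly hosting; developer salaries sit below the gross-margin line.
saas_revenue_per_user = 20.00   # hypothetical monthly subscription
saas_cogs_per_user = 1.50       # hypothetical hosting/serving cost

# AI software: every query consumes accelerator time, so COGS scales with
# usage and with the efficiency of the underlying hardware.
ai_revenue_per_user = 20.00     # hypothetical monthly subscription
queries_per_user = 300          # hypothetical monthly usage
cost_per_query = 0.02           # hypothetical inference cost per query
ai_cogs_per_user = queries_per_user * cost_per_query

print(f"SaaS gross margin: {gross_margin(saas_revenue_per_user, saas_cogs_per_user):.0%}")
print(f"AI gross margin:   {gross_margin(ai_revenue_per_user, ai_cogs_per_user):.0%}")

# Halving inference cost (better chips, better utilization) moves margin
# directly, which is why an infrastructure advantage compounds for AI firms.
print(f"AI margin at half cost: {gross_margin(ai_revenue_per_user, ai_cogs_per_user / 2):.0%}")
```

Under these assumed numbers, the SaaS business keeps roughly 93% gross margin while the AI business keeps 70%, and cutting per-query compute cost in half lifts it to 85%. The operator of cheaper, more efficient hardware wins that margin directly.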

Google had floated the idea of building AI-specific infrastructure as far back as 2006, but the problem came to a boiling point in 2013, when it realized it would need to double its datacenter footprint to deploy AI at any scale. As such, it started laying the groundwork for its TPU chips, which were put into production in 2016. It is interesting to compare this to Amazon, which in that same year of 2013 realized it needed to build custom silicon too and started the Nitro Program, focused on developing silicon to optimize general-purpose CPU computing and storage. Two very different companies optimized their infrastructure efforts for different eras of computing and software paradigms.

We’ve long believed that the TPU is among the world’s best systems for AI training and inference, neck and neck with Nvidia, the king of the jungle. Two and a half years ago we wrote about TPU supremacy, and that thesis has proven to be very correct.

The TPU’s results speak for themselves: Gemini 3 is one of the best models in the world and was trained entirely on TPUs. In this report, we will cover the huge changes in Google’s strategy to properly commercialize the TPU for external customers, making it the newest and most threatening merchant silicon challenger to Nvidia.

We plan to:

  • (Re-)Educate our clients and new

  • ...
Read full article on SemiAnalysis →