Google recently shipped its latest AI chip, the TPU v5e, and introduced the “A3 Supercomputer” powered by NVIDIA H100 GPUs. A3 VM instances are set to launch next month, marking a significant leap in AI infrastructure. Google has also revealed an expanded partnership with AI chip leader NVIDIA.
2x Training Performance, 2.5x Inference Boost, and 50% Cost Reduction
Google is looking to take the lead in AI training with its latest fifth-generation TPU. Here are a few highlights:
TPU v5e was designed for improved training performance, inference performance, and cost-effectiveness. Compared with TPU v4, its training performance is doubled and its inference performance is improved by 2.5 times.
TPU v5e offers these advancements at less than half the cost, enabling cost-efficient training and deployment of larger AI models. Moreover, Google is turning its sights on scalability, including configurations accommodating up to 256 chips and INT8 computational power reaching 100 PetaOps.
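A quick sanity check on those figures: assuming the 100 PetaOps number describes a full 256-chip configuration (an assumption; the announcement does not break the figure down per chip), the implied per-chip INT8 throughput works out as follows:

```python
# Back-of-the-envelope check on the quoted pod-level figures.
# Assumption: 100 PetaOps of INT8 refers to one full 256-chip configuration.
CHIPS_PER_POD = 256          # maximum configuration cited above
POD_INT8_PETAOPS = 100       # aggregate INT8 compute cited above

per_chip_teraops = POD_INT8_PETAOPS * 1000 / CHIPS_PER_POD
print(f"~{per_chip_teraops:.0f} INT8 TOPS per chip")  # ~391 TOPS
```

Under that assumption, each chip contributes roughly 391 INT8 TOPS to the pod-level total.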
TPU v5e offers a range of virtual machine configurations to cater to various user needs. Included in the release is a new capability called “Multislice,” a service for users requiring even more computational power that distributes model computations across tens of thousands of TPU chips.
Google is also partnering with NVIDIA to launch the A3 virtual supercomputer. The machine is designed to address the growing computational demands driven by generative artificial intelligence and large language models.

With so much advancement, the next step is predictable: Google Cloud has integrated an additional 20 AI models, bringing the total of supported models to 100. This expanded lineup gives customers the flexibility to choose from a variety of models to meet their operational needs.