NVIDIA CEO: ‘This isn’t computing the old-fashioned way; this is a whole new way of doing computing’
SAN FRANCISCO – At this week’s Google Cloud Next conference, Google announced that generative AI technology from NVIDIA is now available and optimized for Google Cloud users. The partnership touches nearly every aspect of computing, from infrastructure design to extensive software enablement, in an effort to accelerate AI application creation for Google Cloud developers.
NVIDIA CEO Jensen Huang joined Google Cloud CEO Thomas Kurian on the keynote stage to discuss the expanded partnership and detail just how transformative generative AI is. According to both Huang and Kurian, the pairing will deliver “significant” and “unprecedented” performance gains for all kinds of AI applications and will accelerate large language models (LLMs). Huang also told the audience that, more broadly, the companies are working together to accelerate Google’s Vertex AI platform, as well as AI models and software for the world’s researchers and developers.
“This isn’t computing the old-fashioned way; this is a whole new way of doing computing,” he said. “We’re working together to reengineer and re-optimize the software stack… [and] push the frontiers of large language models distributed across giant infrastructures so that we can save time for the AI researchers, scale up to gigantic next generation models, save money, save energy. All of that requires cutting-edge computer science.”
In a massive leap forward for cutting-edge computer science, PaxML, Google’s framework for building large language models, is now available on the NVIDIA NGC container registry, which the companies say will let developers easily run it on H100 and A100 Tensor Core GPUs.
“This JAX-based machine learning framework is purpose-built to train large-scale models, allowing advanced and fully configurable experimentation and parallelization,” further explained Dave Salvator, director of product marketing in the Accelerated Computing Group at NVIDIA, in a blog post. “PaxML has been used by Google to build internal models, including DeepMind as well as research projects, and will use NVIDIA GPUs.”
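The parallelization Salvator describes rests on a data-parallel pattern: each GPU trains on its own slice of the batch, and the gradients are averaged across devices before every update. A minimal sketch of that idea, in plain NumPy with toy sizes (this is an illustration of the concept, not PaxML’s actual API):

```python
import numpy as np

# Toy linear-regression step, data-parallel style: the batch is split
# across simulated "devices", each computes a local gradient, and the
# gradients are averaged -- the same all-reduce pattern a framework
# like PaxML runs across real GPUs. Sizes here are illustrative only.

rng = np.random.default_rng(0)
n_devices, per_device, dim = 4, 8, 3

w_true = np.array([2.0, -1.0, 0.5])          # target weights
X = rng.normal(size=(n_devices * per_device, dim))
y = X @ w_true

w = np.zeros(dim)                            # replicated parameters
lr = 0.1

def local_grad(w, X_shard, y_shard):
    # Gradient of mean squared error on one device's shard.
    err = X_shard @ w - y_shard
    return 2.0 * X_shard.T @ err / len(y_shard)

for step in range(200):
    X_shards = np.split(X, n_devices)        # shard the batch
    y_shards = np.split(y, n_devices)
    grads = [local_grad(w, xs, ys) for xs, ys in zip(X_shards, y_shards)]
    w -= lr * np.mean(grads, axis=0)         # "all-reduce": average grads

print(np.round(w, 3))  # should approach w_true
```

Because averaging the per-shard gradients equals the full-batch gradient, the replicated parameters stay in sync on every device; real frameworks add model and pipeline parallelism on top of this for models too large to replicate.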
The companies also announced the integration of Google’s serverless Spark with NVIDIA GPUs through Google’s Dataproc service.
“Generative AI is revolutionizing every layer of the computing stack, and our two companies … are joining forces to reinvent cloud infrastructure for generative AI,” Huang said at the conference. “We are starting at every single layer, starting with the chips — H100 for training and data processing — all the way to model serving with L4 [NVIDIA’s L4 GPU]. This is a reengineering of the entire stack from the processors to the systems, to the networks and all the software.”