Intel claims to have doubled the performance and value of its Xeon family of processors for data centres with the introduction of a new sixth-generation Xeon 6 line, plus new Gaudi 3 accelerators. The former – presented as Xeon 6 with P-cores (‘performance’ cores) – doubles the performance of its fifth-generation Xeon central processing units (CPUs) on artificial intelligence (AI) and high-performance computing (HPC) workloads, it said; the latter offers “up to” 20 percent more throughput and twice the compute value (“price/performance”) of Nvidia’s H100 graphics processing unit (GPU).
It based the latter measure on inference throughput with the Llama 2 70B foundation language model. The new P-core chips for enterprise data centres, codenamed Granite Rapids, replace the old fifth-generation ‘Scalable Xeon’ (‘Emerald Rapids’) units; they sit alongside Intel’s second-track E-core (‘efficient’ cores) platform, codenamed Sierra Forest and announced previously, for cloud customers. The twin-track nomenclature splits the family between P-core CPUs for enterprise-geared HPC in data centres and E-core CPUs for cloud-oriented core density and energy efficiency in multi-threaded workloads.
Intel’s new Xeon 6 with P-cores family comprises five CPUs so far, available with 64, 72, 96, 120, or 128 cores – doubling the core count of the fifth-generation Xeon line at the high end. They are produced on Intel’s new 3nm-class process technology (Intel 3), reckoned by some observers to be closer to a 5nm-class node, where the previous generation was on Intel 7, a 10nm-class process. The Xeon 6 also features double the memory bandwidth of its predecessor, plus AI acceleration (Intel’s AMX matrix extensions) in every core. “This processor is engineered to meet the performance demands of AI from edge to data center and cloud environments,” the firm said.
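That per-core acceleration is tapped transparently by mainstream frameworks rather than through a Xeon-specific API. As a rough illustration only – not Intel sample code – the sketch below shows bfloat16 CPU inference in PyTorch, the path through which the oneDNN backend can dispatch matrix multiplications to AMX units on supporting Xeons; the model is a toy stand-in.

```python
# Illustrative sketch (not Intel sample code): bfloat16 CPU inference
# in PyTorch. On AMX-capable Xeons, oneDNN can route these bf16 matmuls
# to the per-core matrix units; no Xeon-specific API call is needed.
import torch

# Toy stand-in model; a real deployment would load a trained network.
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 1024),
).eval()

x = torch.randn(32, 1024)
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = model(x)  # matmuls run in bf16, eligible for AMX dispatch
print(y.dtype, y.shape)
```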
Meanwhile, the Gaudi 3 accelerator features 64 tensor processor cores (TPCs) and eight matrix multiplication engines (MMEs) for deep neural network computations. It includes 128GB of HBM2e memory for training and inference, and 24 ports of 200Gb Ethernet for scalable networking. It is compatible with the PyTorch framework and with Hugging Face transformer and diffuser models. Intel is working with IBM to deploy Gaudi 3 AI accelerators as-a-service on IBM Cloud – to further “lower total cost of ownership (TCO) to leverage and scale AI, while enhancing performance”, it said.
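To give a flavour of that PyTorch compatibility, here is a minimal, hedged sketch of Hugging Face inference on a Gaudi device. It assumes the Intel Gaudi (formerly Habana) software stack is installed, which registers the accelerator with PyTorch as the “hpu” device; the model is a small placeholder rather than a production choice.

```python
# Minimal sketch, assuming the Intel Gaudi (Habana) software stack is
# installed; it exposes the accelerator to PyTorch as device "hpu".
# The model below is a small placeholder, not a production choice.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import habana_frameworks.torch.core as htcore  # Gaudi-PyTorch bridge

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained(
    "gpt2", torch_dtype=torch.bfloat16
).to("hpu")  # place the weights on the Gaudi accelerator

inputs = tokenizer("AI in the data centre", return_tensors="pt").to("hpu")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=20)
htcore.mark_step()  # flush Gaudi's lazy-mode execution graph
print(tokenizer.decode(out[0], skip_special_tokens=True))
```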
Intel is also working with OEMs including Dell Technologies and Supermicro to develop co-engineered systems tailored to specific customer needs for effective AI deployments, it said. Dell is currently co-engineering retrieval-augmented generation (RAG) solutions using Gaudi 3 and Xeon 6. Intel stated: “Transitioning gen AI from prototypes to production systems presents challenges in real-time monitoring, error handling, logging, security and scalability. Intel addresses these through co-engineering efforts with OEMs and partners to deliver production-ready RAG solutions.”
It continued: “These solutions, built on the Open Platform Enterprise AI (OPEA) platform, integrate OPEA-based microservices into a scalable RAG system, optimized for Xeon and Gaudi AI systems, designed to allow customers to easily integrate applications from Kubernetes, Red Hat OpenShift AI, and Red Hat Enterprise Linux AI.” It also talked up its Tiber portfolio as a way to tackle developer challenges with cost, complexity, security, and scalability. It is offering Xeon 6 preview systems for evaluation and testing, and early access to Gaudi 3 for validating AI model deployments.
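For readers unfamiliar with the pattern, a RAG pipeline retrieves relevant passages from a corpus and feeds them to the generator alongside the question. The sketch below is a bare-bones illustration of that flow only – it is not OPEA code, and the embedding model and corpus are placeholder assumptions.

```python
# Bare-bones RAG illustration (not OPEA code): embed a tiny corpus,
# retrieve the passage most similar to the query, and prepend it to
# the prompt that would be sent to the generator model.
import numpy as np
from sentence_transformers import SentenceTransformer

corpus = [
    "Xeon 6 with P-cores targets compute-intensive data-centre workloads.",
    "Gaudi 3 pairs 64 tensor cores with 128GB of HBM2e memory.",
    "OPEA composes RAG pipelines from interchangeable microservices.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedder
doc_vecs = embedder.encode(corpus, normalize_embeddings=True)

def retrieve(query: str) -> str:
    """Return the corpus passage most similar to the query (cosine)."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    return corpus[int(np.argmax(doc_vecs @ q))]

query = "How much memory does Gaudi 3 have?"
context = retrieve(query)
prompt = f"Answer using this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # the augmented prompt then goes to the LLM microservice
```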
Justin Hotard, executive vice president and general manager of Intel’s Data Center and AI Group, said: “Demand for AI is leading to a massive transformation in the data center, and the industry is asking for choice in hardware, software and developer tools. With our launch of Xeon 6 with P-cores and Gaudi 3 AI accelerators, Intel is enabling an open ecosystem that allows our customers to implement all of their workloads with greater performance, efficiency and security.”