Dell Technologies among partners using Intel Xeon 6 and Gaudi 3 for RAG-based enterprise AI
Alongside the recent launch of its latest Xeon 6 CPUs and Gaudi 3 artificial intelligence (AI) accelerators, Intel also announced a full-stack solution meant to help enterprises use AI more cost-effectively, quickly, reliably and securely. Intel AI for Enterprise RAG, set to launch in Q4, can support a range of use cases, including audio, text and visual Q&A, code generation and translation, and content summarization.
In an interview with RCR Wireless News during Intel’s recent Enterprise Tech Tour in Hillsboro, Oregon, Intel Vice President of Data Center and AI Software Bill Pearson described a retrieval-augmented generation (RAG) approach to enterprise AI as a practical, cost-effective way to realize the benefits of AI without the cost and complexity of fine-tuning a large language model (LLM).
LLMs serve as the foundation for many AI-enabled chatbots like ChatGPT. But in the context of an enterprise use case, a commercial LLM generally lacks domain-specificity and certainly doesn’t have access to a business’s proprietary data. One option would be to take that LLM and fine-tune it for the needs of a particular company working in a particular sector; however, that’s a big ask for many companies. RAG instead supplements an LLM with another data set, such as an enterprise’s internal process documentation: relevant passages are retrieved at query time and supplied to the model alongside the question, helping it deliver responses aligned with the context of that particular query.
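To make the idea concrete, here is a minimal Python sketch of the retrieve-then-prompt pattern behind RAG. It is illustrative only and is not Intel's implementation: the sample documents, the function names and the simple word-overlap scoring are all assumptions, and a production system would typically use an embedding model and a vector database for retrieval before handing the prompt to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical internal documents standing in for an enterprise knowledge base.
DOCUMENTS = [
    "Expense reports must be submitted within 30 days of travel.",
    "VPN access requests are approved by the IT security team.",
    "New laptops are provisioned through the hardware portal.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query (assumed scoring)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Combine retrieved context with the question; this text goes to the LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("Who approves VPN access?", DOCUMENTS))
```

The point of the pattern is that the model itself is left unchanged: updating the enterprise's documents changes what is retrieved, with no retraining step.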
Pearson said this approach works for customers running AI workloads in on-premises data centers or in the public cloud. Intel’s goal, he said, is to deliver solutions “that are going to have the most applicability across the various customer bases.” And regardless of where the workloads are run or what the AI approach is, he said step one is to understand the problem, define the use case, then examine technology options.
In various sessions throughout the Enterprise Tech Tour, Intel executives touched on some of the complexities of enterprise AI deployment in general and of using RAG specifically. The main ones involved customization, performance and scalability, total cost of ownership, and access to validated, enterprise-ready systems and support.
Check out this deck for a closer look at Intel AI for Enterprise RAG.
Looking at larger trends around enterprise AI adoption, Pearson called out the importance of a strong partner ecosystem. Dell Technologies, for one, is working with Intel to build RAG-based solutions on Xeon 6 and Gaudi 3. Intel also brings to the table the Open Platform for Enterprise AI (OPEA), a Linux Foundation “sandbox-level project…that enables the creation of open, multi-provider, robust and composable [generative AI] solutions that harness the best innovation across the ecosystem.”
On the silicon side, Intel this month detailed its Xeon 6 with P-cores, which delivers twice the performance of its predecessor with an increased core count, double the memory bandwidth and embedded AI acceleration. The Gaudi 3 AI Accelerator is optimized for generative AI at scale, and it features 64 Tensor Processor Cores (TPCs) and 128 gigabytes of HBM2e memory for AI training and inferencing.
Intel EVP and GM of the Data Center and AI Group Justin Hotard said in a statement, “Demand for AI is leading to a massive transformation in the data center, and the industry is asking for choice in hardware, software and developer tools. With our launch of Xeon 6 with P-cores and Gaudi 3 AI accelerators, Intel is enabling an open ecosystem that allows our customers to implement all of their workloads with greater performance, efficiency and security.”