To complement the Qualcomm Cloud AI 100 Ultra accelerator, the company has developed a software suite for AI inference workloads
From an enterprise perspective, AI is all about putting data to work in ways that improve process and workflow efficiency and create new revenue opportunities. The center of data gravity is at the edge, where connected devices of all sorts produce a steady stream of information that potentially contains valuable insights, if only it could be parsed quickly and effectively and fed forward into whatever process or workflow the user has identified. At the moment, the center of AI gravity is in the cloud, although broad industry discourse suggests edge AI is a priority given the clear benefits around cost, latency, privacy and other factors. The high-level idea is to bring AI to your data rather than bringing your data to AI.
Qualcomm has built a compelling narrative around edge AI and its role in bringing to market products that propel AI from a series of point solutions to a larger system. Last month during the Consumer Electronics Show in Las Vegas, Qualcomm made a range of consumer-facing announcements covering automotive, personal computing and smart home tech; but it also had an interesting launch that speaks to enterprise adoption of edge AI solutions.
During the show, the company announced its Qualcomm AI On-Prem Appliance Solution and Qualcomm AI Inference Suite which, when combined, let enterprises “run custom and off-the-shelf AI applications on their premises, including generative workloads,” according to a press release. This, in turn, can accelerate enterprise AI adoption in a way that reduces TCO as compared to relying on someone else’s AI infrastructure estate.
The combined hardware and software offering “changes the TCO economics of AI deployment by enabling processing of generative AI workloads from cloud-only to a local, on-premises deployment,” Qualcomm’s Nakul Duggal, group general manager for automotive, industrial IoT and cloud computing, said in a statement. On-prem enablement of a range of AI-based automation use cases “reduces AI operational costs for enterprise and industrial needs. Enterprises can now accelerate deployment of generative AI applications leveraging their own models, with privacy, personalization and customization while remaining in full control, with confidence that their data will not leave their premises.”
Industrial giant Honeywell is working with Qualcomm to design, evaluate “and/or” deploy “AI workflow automation use cases” using the new hardware and software products. Aetina, a Taiwanese edge AI specialist, “is among the first OEMs to provide on-premises equipment for deployments based on the AI On-Prem Appliance Solutions;” that’s in the form of Aetina’s MegaEdge AIP-FR68. And, “IBM is collaborating to bring its watsonx data and AI platform and Granite family of AI models for deployment across on-prem appliances, in addition to cloud, to support a range of enterprise and industrial use cases in automotive, manufacturing, retail and telecommunications.”
The appliances leverage Qualcomm’s Cloud AI 100 Ultra accelerator card. Relevant specs include:
- ML capacity (INT8) of 870 TOPS
- PCIe full-height, 3/4-length (FH3/4L) form factor
- 64 AI cores per card
- 128 GB LPDDR4x on-card DRAM
- 576 MB on-die SRAM
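Some back-of-the-envelope ratios from the published specs above give a feel for the card's balance of compute and memory. This is illustrative arithmetic only; sustained throughput depends heavily on model, precision and batching:

```python
# Derived per-core ratios from the Cloud AI 100 Ultra card specs listed above.
# These are peak/nameplate figures, not sustained real-world performance.
tops_int8 = 870   # peak INT8 ML capacity per card (TOPS)
ai_cores = 64     # AI cores per card
dram_gb = 128     # on-card LPDDR4x DRAM (GB)
sram_mb = 576     # on-die SRAM (MB)

tops_per_core = tops_int8 / ai_cores    # ~13.6 peak INT8 TOPS per core
dram_per_core_gb = dram_gb / ai_cores   # 2 GB of card DRAM per core
sram_per_core_mb = sram_mb / ai_cores   # 9 MB of on-die SRAM per core

print(f"{tops_per_core:.1f} TOPS/core, "
      f"{dram_per_core_gb:.0f} GB DRAM/core, "
      f"{sram_per_core_mb:.0f} MB SRAM/core")
```

The 128 GB of on-card DRAM is the figure most relevant to generative workloads, since it bounds the size of the models (or model shards) a single card can host without spilling to host memory.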
The inference software suite includes ready-to-use apps and agents for chatbots, code development, image generation, real-time transcription and translation, retrieval-augmented generation (RAG), and summarization.
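Qualcomm has not published the suite's API in this announcement, but on-prem inference stacks of this kind are commonly driven through an OpenAI-style REST endpoint. The sketch below shows what a chatbot request to such an appliance might look like; the endpoint URL, model name and field names are all illustrative assumptions, not a documented Qualcomm interface:

```python
import json

# Hypothetical endpoint on a local appliance; the URL and path are assumptions,
# styled after the widely used OpenAI-compatible chat-completions convention.
ENDPOINT = "http://appliance.local/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "granite-8b") -> str:
    """Build a JSON body for an assumed OpenAI-compatible chat API.

    The model name is a placeholder; per the announcement, IBM's Granite
    family is among the models targeted for on-prem deployment.
    """
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    })

body = build_chat_request("Summarize today's production-line alerts.")
```

Because the data never leaves the premises, the same request shape could back any of the listed apps (RAG, summarization, transcription) without the compliance review a cloud API call might trigger.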