The laws of physics, economics and countries suggest edge AI inference—where your device is the far edge—just makes sense
In the race to build out distributed infrastructure for artificial intelligence (AI), there’s a lot more glitz and glam around the applications and devices than around the cooling, rack space and semiconductors doing the heavy lifting in hyperscaler clouds. While that may be a function of what most people find interesting, the attention isn’t misplaced. For AI to live up to the world-changing hype it’s riding high upon, distributing AI workload processing makes a lot of sense. Running AI inferencing on a device reduces latency, which improves the user experience; it saves the time and cost of piping data back to the cloud for processing; and it helps safeguard personal or otherwise private data.
To put that another way, when you read an announcement for a new class of AI-enabled PCs or smartphones, don’t think of it as just another product launch. Think of it as an essential piece of building out the connected edge-to-cloud continuum that AI needs to rapidly scale.
During a panel discussion last week at the Consumer Electronics Show (CES) in Las Vegas, Nevada, Qualcomm’s Durga Malladi, senior vice president and general manager of technology planning and edge solutions, not only defined the edge, but made the case for why it’s a key piece in the larger AI puzzle. “Our definition of the edge is practically every single device we use in our daily lives.” That includes PCs, smartphones and other devices you keep on your person as well as the Wi-Fi access points and enterprise servers that are one hop away from those devices.
“The definition of the edge is not just devices but absolutely something close to it,” Malladi continued. “But the question is why?” Why edge AI inference? First, “Because we can. The computational power we have in the devices today is significantly more than what we’ve seen in the last five years.” Second is the immediacy and responsiveness that come from low latency when inferencing is done on-device. Third is that contextual data improves AI outcomes, and keeping that data on the device also enhances privacy.
He expanded on the privacy point. From a consumer perspective, Malladi explained the contextual nature of AI and gave the example of asking an on-device assistant when your next doctor’s appointment is. For schedule planning, it’d be great for the AI assistant to know about your medical appointments, but it could be concerning if that data left your device. With on-device inference, it doesn’t have to. In the enterprise context, Malladi talked about how enterprises fine-tune AI models by loading in proprietary corporate data to, again, contextualize the information and improve the outcome. “There’s a lot of reasons why privacy becomes not just a consumer-centric topic,” he said.
AI is the new UI
As the conversation expanded, Malladi got into an idea that he, I think, debuted last year at the Snapdragon Summit, an annual Qualcomm-hosted event. The idea is that on-device AI agents will access your apps on your behalf, connecting various dots in service of your request and delivering an outcome not tied to one particular application. In this paradigm, the user interface of a smartphone changes; as he put it, “AI is the new UI.”
He traced computing from command-line interfaces to graphical interfaces accessible with a mouse. “Today we live in an app-centric world…It’s a very tactile thing…The truth is that for the longest period of time, as humans, we’ve been learning the language of computers.” AI changes that; when the input mechanism is something natural like your voice, the UI can transform, using AI, to become more custom and personal. “The front-end is dominated by an AI agent…that’s the transformation that we’re talking of from a UI perspective.”
He also talked through how a local AI agent will co-evolve with its user. “Over time there is a personal knowledge graph that evolves. It defines you as you, not as someone else.” Localized context, made possible by on-device AI, or edge AI more broadly, will improve agentic outcomes over time. “Lots of work to be done in that space though,” Malladi acknowledged. “And that’s a space where I think, from the tech industry standpoint, we have a lot of work to do.”