Intelligence at the edge is exciting. The edge allows devices to compute and analyze data close to the user rather than in a centralized data center far away, which benefits the end user in many ways. It promises low latency because the brains of the system are nearby, rather than thousands of miles away in the cloud; it functions over a local network connection rather than an internet connection, which may not always be available; and it offers a stronger guarantee of privacy because a user's information is not transmitted to and shared with remote servers. We will soon be able to process data closer to (or even inside) endpoint devices, letting us reap the full potential of intelligent analytics and decision-making.
But the computing power, storage and memory required to run current AI algorithms at the endpoint are hampering our ability to optimize processing there. These are serious limitations, especially when the operations are time-critical.
To make intelligence at the edge a reality, the most critical capability is the ability to understand, represent and handle context.
What does that mean? It means that we give computing systems the tools to identify and learn what is needed, and only what is needed. Why generate and analyze useless or low-priority data? Capture what is needed for the purpose required and move on. Intelligent machines at the edge should be able to “learn” new concepts needed to perform their tasks efficiently and they should also be able to “systematically forget” the concepts not needed for their tasks.
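As a toy illustration of that "systematic forgetting," consider the sketch below: a store of learned concepts that tracks how recently each one actually contributed to the device's task and prunes the ones that go unused. The class, its fields and the forgetting threshold are illustrative assumptions, not a description of any particular product.

```python
from collections import defaultdict

class ConceptStore:
    """Illustrative store of learned concepts with usage-based forgetting."""

    def __init__(self, forget_after=1000):
        self.concepts = {}                  # concept name -> learned representation
        self.steps_since_used = defaultdict(int)
        self.forget_after = forget_after    # prune a concept unused for this many steps

    def learn(self, name, representation):
        """Add or refresh a concept the device has found it needs."""
        self.concepts[name] = representation
        self.steps_since_used[name] = 0

    def use(self, name):
        """Using a concept for the task resets its staleness counter."""
        self.steps_since_used[name] = 0
        return self.concepts[name]

    def step(self):
        """Advance one time step and 'systematically forget' stale concepts."""
        for name in list(self.concepts):
            self.steps_since_used[name] += 1
            if self.steps_since_used[name] > self.forget_after:
                del self.concepts[name]
                del self.steps_since_used[name]
```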
Humans learn contextually. Computer systems at the edge don’t – at least, not quite yet. But when they can, the power of AI and machine learning will be transformational.
The “Edge” of innovation
There are many definitions of context. Among other things, context can be relational, semantic, functional or positional. For our discussion we will use this definition [1]: “A system is context-aware if it uses context to provide relevant information and/or services to the user, where relevancy depends on the user’s task.”
Here’s a simple example in which object recognition is strongly influenced by contextual information [1]. Recognition makes assumptions about an object’s identity based on its size and location in the scene. Consider the region within the red boundary in both images. Taken in isolation, it is nearly impossible to classify the object within that region because of the severe blur in both images. However, when viewed within the entire image, we can easily identify the object as a car in the street in the left image and a pedestrian in the street in the right image. This example shows that our brains take strong cues from surroundings and orientation, judging what we may be observing based on scene context.
As another example, suppose you install a smart kitchen assistant in your home. It learns the layout of the home, determines what you are trying to cook and provides timely assistance based on what it observes and learns about your behavior over time. You could purchase an assistant powered by a trained model that gives it a basic understanding of key recipes and kitchen tasks, but the assistant doesn’t inherently know the specifics of your kitchen environment that signal these tasks are taking place.
Once it is installed, it uses a camera to get a sense of your kitchen. It starts to learn its environment based on what it already knows about the basics of kitchen tasks. Its auditory sensors pick up the sound of a faucet running to fill the coffee pot. It recognizes you turning on the morning news and then heading to a cabinet to take out a canister and scoop to begin filling the coffee filter.
Then, after a couple of days of observation, the assistant knows that 7 a.m. is when it should be time to make your coffee.
It can then compare this input to what it knows about the coffee-making task and store only the information related to that task, disregarding the irrelevant data. So, coffee pot filled with water at 7 a.m.? Check. Relevant. Morning news comes on at 7:02? Delete. Not necessary for the task.
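A minimal sketch of that filtering decision might look like the following. The event format, the coffee-task profile and the relevance rule are assumptions made for illustration; a real assistant would learn them from observation rather than have them hard-coded.

```python
from datetime import time

# Hypothetical task profile learned after a few days of observation:
# which sensors and which time window matter for the coffee-making task.
COFFEE_TASK = {
    "sensors": {"faucet_audio", "cabinet_camera", "coffee_pot_scale"},
    "window": (time(6, 45), time(7, 30)),
}

def is_relevant(event, task):
    """Keep an event only if its sensor and timestamp match the task context."""
    in_window = task["window"][0] <= event["time"] <= task["window"][1]
    return event["sensor"] in task["sensors"] and in_window

def filter_events(events, task):
    """Store task-relevant events at the edge; discard the rest."""
    return [e for e in events if is_relevant(e, task)]

events = [
    {"sensor": "faucet_audio", "time": time(7, 0), "value": "running"},
    {"sensor": "tv_audio", "time": time(7, 2), "value": "morning news"},
]
print(filter_events(events, COFFEE_TASK))  # keeps the faucet event, drops the news
```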
These two examples show what context can provide when we compute at the edge. Context is all around us and provides some of the most important information we use to make choices.
The benefits of these contextual optimizations can be considerable. In a case like this with an AI assistant at the edge, we can ultimately optimize performance at low power. Low power means lower cost. Plus, less data means processing can occur at lower capacity, another time- and money-saving attribute.
We’re testing this concept using machine vision algorithms for the Intel® Movidius™ VPU (vision processing unit) chipset platform, using just the information from the encoder for analysis. Because the analysis works from the encoder’s compressed output, we’re saving 25x in bandwidth and 10x in computing power. Such a reduction in required resources can translate into comparable savings in analytics cost or, as in our case, a more-than-12x increase in the number of simultaneous video streams analyzed on the same platform in real time.
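As a rough illustration of the idea (not the actual Movidius pipeline), the sketch below triages video using only encoder-side metadata: frames whose motion vectors show little activity are skipped, and only the remainder would be decoded and passed to the full vision model. The EncodedFrame structure and the motion threshold are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class EncodedFrame:
    """Illustrative stand-in for a compressed frame plus the encoder's motion metadata."""
    frame_id: int
    motion_vectors: List[float]   # motion-vector magnitudes reported by the encoder
    payload: bytes                # compressed bitstream, never decoded in this sketch

def needs_full_analysis(frame: EncodedFrame, threshold: float = 2.0) -> bool:
    """Decide from encoder metadata alone whether the heavy vision model is worth running."""
    if not frame.motion_vectors:
        return False
    mean_motion = sum(frame.motion_vectors) / len(frame.motion_vectors)
    return mean_motion > threshold

def triage(frames: List[EncodedFrame]) -> List[int]:
    """Return only the frame ids worth decoding and analyzing in full."""
    return [f.frame_id for f in frames if needs_full_analysis(f)]
```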
With self-learning algorithms, contextual learning and multimodality, AI and edge technologies will one day, at last, be integrated into a productive and efficient system. And then, the sky is the limit for AI applications at the edge.