Telefonica: “I think one of the killer apps of network programmability can be ML”
Before diving into the role of a programmable data plane in supporting telco AI applications, Telefonica Research’s Senior Research Scientist Eduard Marin Fabregas gave attendees of the Telco AI Forum 2.0 (available on demand) a brief history lesson. He began with the typical composition of a network: a management plane used to monitor and configure devices remotely; a control plane, including the protocols that populate forwarding tables; and a data plane made up of all the devices that forward packets.
This setup, Fabregas said, was designed “with resiliency in mind, which was great in the early days. But it also came with some problems. One of the problems…is that the data plane and the control plane are tightly coupled together on the same device.” Hardware-centric routers lacked programmability and supported a fixed set of functions, and the ASICs they were built on added cost and complexity. “This prevents us from innovating,” he said. “And, of course, this has had some impact on many applications that we could develop in the networks. AI is one of them.”
The rise of software-defined networking (SDN) controllers, along with other technologies such as network functions virtualization, container management and eBPF, has improved operators’ ability to monitor packets, data flows and other network telemetry, and has made networks more programmable overall. Specific to SDN, Fabregas said, open, standard interfaces and disaggregation have enabled a larger ecosystem to deliver more innovation. “And this has changed a lot the way we can do ML in the network.”
He continued: “Now what we can do is we can do feature collection on the data plane, so on the router itself.” Inference, however, still cannot be done in the routers. Instead, operators configure the data plane to sample specific data points, either periodically or when triggered by defined events; that information is then forwarded to the control plane, where inference takes place. “The problem is we can’t really do per packet inference at live speed…Still this is not ideal. But this is not the end of the story.”
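The sampling model described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not anything shown in the talk: the `Packet` and `FeatureSampler` names, the one-second period and the 1,400-byte size trigger are all hypothetical, and a plain list stands in for the control plane.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Packet:
    timestamp: float  # arrival time in seconds
    size: int         # packet length in bytes
    flow_id: str


@dataclass
class FeatureSampler:
    """Hypothetical data-plane sampler: exports on a timer or on an event."""
    period: float = 1.0         # assumed export interval in seconds
    size_threshold: int = 1400  # assumed event trigger: unusually large packet
    last_export: float = 0.0
    # Stand-in for the control plane: (reason, flow_id, size) records.
    exported: List[Tuple[str, str, int]] = field(default_factory=list)

    def observe(self, pkt: Packet) -> None:
        periodic = pkt.timestamp - self.last_export >= self.period
        event = pkt.size >= self.size_threshold
        if not (periodic or event):
            return  # this packet never reaches the model
        reason = "event" if event else "periodic"
        self.exported.append((reason, pkt.flow_id, pkt.size))
        if periodic:
            self.last_export = pkt.timestamp  # restart the timer


packets = [
    Packet(0.1, 100, "a"),
    Packet(0.5, 1500, "b"),  # large packet: event-triggered export
    Packet(1.2, 200, "a"),   # timer expired: periodic export
    Packet(1.3, 300, "c"),
]
sampler = FeatureSampler()
for p in packets:
    sampler.observe(p)
print(len(sampler.exported), "of", len(packets), "packets exported")
```

In this toy run only two of the four packets are exported, which is the limitation Fabregas points to: a model sitting in the control plane sees a sample of the traffic rather than every packet.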
He said that using programmable data planes for ML tasks is “one of the biggest innovations that we’ve had in many years.” Using the P4 programming language, CSPs can essentially set routers up to conduct specified operations at terabit speed for real-time decision making and improved network visibility. “We can decide things on the network in real time on the routers themselves…We can extract many more insights from the network.” Because routing and switching infrastructure sits between user devices and the core network, “It can play an important role in many more of the functional aspects behind ML,” Fabregas said.
Fabregas went on to describe a protocol-independent switch architecture (PISA) consisting of a parser, a programmable pipeline and a de-parser (see above image). Essentially, he said, PISA provides multiple stages where ML inference can be applied, albeit with challenges around memory. “These devices offer huge opportunities for parallelization, so operations that don’t depend on each other could be placed on the same stage.” Adding ML inferencing in the data plane allows for more complex feature extraction and customization.
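The point about placing independent operations on the same stage can be illustrated with a small dependency-based placement routine. This is a sketch only: the operation names are hypothetical, and it ignores the per-stage memory and stage-count limits that a real switch compiler has to respect.

```python
from typing import Dict, FrozenSet, Set


def assign_stages(deps: Dict[str, Set[str]]) -> Dict[str, int]:
    """Place each operation in the earliest stage after all its dependencies."""
    stage: Dict[str, int] = {}

    def place(op: str, visiting: FrozenSet[str] = frozenset()) -> int:
        if op in stage:
            return stage[op]
        if op in visiting:
            raise ValueError(f"dependency cycle involving {op!r}")
        # An operation with no dependencies lands in stage 0.
        s = 1 + max(
            (place(d, visiting | {op}) for d in deps.get(op, ())),
            default=-1,
        )
        stage[op] = s
        return s

    for op in deps:
        place(op)
    return stage


# Hypothetical per-packet operations and what each one depends on.
operations = {
    "pkt_len": set(),
    "inter_arrival": set(),
    "flow_byte_count": {"pkt_len"},
    "classify": {"flow_byte_count", "inter_arrival"},
}
stages = assign_stages(operations)
for name, s in sorted(stages.items(), key=lambda kv: kv[1]):
    print(f"stage {s}: {name}")
```

Here `pkt_len` and `inter_arrival` share stage 0 because neither depends on the other, while `classify` has to wait for both of its inputs and ends up in stage 2.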
Fabregas gave the example of anomaly detection. Traditionally, the router samples packets and may forward them to another appliance running an ML model, which decides, for instance, whether a packet is malicious; the verdict is then sent back to the router, where a particular policy can be applied. Putting that ML model directly on the switch could support anomaly detection on every single packet without affecting throughput. He described this use of a programmable data plane as “a first line of defense. You could think of this as deploying a relatively lightweight, simple ML model perhaps just to detect attacks.”
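One way to picture a “relatively lightweight, simple ML model” on a switch is a small decision tree flattened into range-match rules, so that classifying a packet becomes a table lookup. The talk does not specify a model type, so the tree, its thresholds and the `Rule` and `classify` names below are assumptions made for illustration, written in Python rather than P4.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rule:
    size_range: Tuple[int, int]  # inclusive packet-size range in bytes
    port_range: Tuple[int, int]  # inclusive destination-port range
    action: str                  # policy to apply on a match


# Hypothetical, hand-written rules equivalent to a depth-2 decision tree:
#   size <= 63 ? (dst_port <= 1023 ? drop : forward) : forward
# The thresholds are illustrative and not taken from a trained model.
TABLE: List[Rule] = [
    Rule((0, 63), (0, 1023), "drop"),
    Rule((0, 63), (1024, 65535), "forward"),
    Rule((64, 65535), (0, 65535), "forward"),
]


def classify(size: int, dst_port: int, table: List[Rule] = TABLE,
             default: str = "forward") -> str:
    """Return the action of the first rule matching both fields."""
    for rule in table:
        size_ok = rule.size_range[0] <= size <= rule.size_range[1]
        port_ok = rule.port_range[0] <= dst_port <= rule.port_range[1]
        if size_ok and port_ok:
            return rule.action
    return default  # no rule matched: fall back to forwarding


for size, port in [(40, 22), (40, 8080), (1500, 22)]:
    print(size, port, classify(size, port))
```

Because every packet is checked against the same small table, the decision is made inline and no packet has to leave the device, which matches the “first line of defense” role described above.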
Big picture, Fabregas said, “The future is self-driven autonomous networks, networks that will make decisions based on data.” Adopting closed-loop automation for monitoring, analysis and action will lead to adaptive, resilient networks. Advancements in hardware, unified standards and APIs to share information between switches, and further model development using synthetic and augmented data, will drive further programmability. “I think one of the killer apps of network programmability can be ML,” he said.