What you can’t see could cost you millions
by Brendan O’Flaherty, CEO, cPacket Networks
Financial trading firms have invested millions in what they thought were “best of breed” network monitoring systems. They’re now finding that these systems are blind to the problems that affect them most: unseen traffic bursts and spikes that create risks measured in millions of dollars per second.
According to a Greenwich Associates report, financial services firms spend more than $1.2 billion annually on financial-market data feeds, with spending growth averaging above 6% per year since 2012, and firms and exchanges are looking to get maximum value from these large investments. Because market trading is event-driven, however, these organizations still lose money to missed trades and to errors caused by extreme traffic fluctuations or by traffic that exceeds network capacity. For financial services firms and their customers, network performance and the cost of downtime are serious business: split seconds can have million-dollar, or even billion-dollar, ramifications.
For proof of the market’s inherent volatility, look at the unexpected Brexit vote. It triggered the worst one-day market drop on record ($2 trillion), with global equity markets losing $3.01 trillion over two days, the largest loss ever recorded against S&P’s Global Broad Market Index. Since market exchanges and brokerages earn money on each transaction, millions of dollars are at stake for every second a network is down, or whenever trades are missed because market feed data was incomplete.
As if lost profits weren’t enough, firms also risk significant regulatory compliance fines when a trade can’t be executed within milliseconds of being received. In 2015, for example, seven of the world’s top banks were fined close to $10 billion over Forex-market misconduct, with four pleading guilty to manipulating rates by influencing when customer transactions were recorded.[1]
The Challenges
There are a number of challenges to monitoring financial market environments. Market trading networks are designed to deliver data in real time, at the very moment the data becomes available, so that traders around the world can transact on an even playing field. These real-time networks create fundamental obstacles for network performance monitoring (NPM) systems. For example, NPM systems cannot meet stringent regulations such as MiFID II, which takes effect on January 3, 2018 and requires nanosecond timestamping of trades, along with new rules requiring longer retention of trading-related data.
To achieve real-time data distribution, these networks typically rely on UDP (User Datagram Protocol) to stream global pricing data. UDP is the most efficient streaming method available, but it has no mechanism to retransmit lost packets, which means that dropped data is permanently lost to all downstream traders.
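Market data feeds typically carry per-packet sequence numbers precisely so that receivers can at least detect this kind of loss, even though UDP itself cannot recover it. The sketch below is a minimal illustration of that idea, assuming a hypothetical feed whose first four bytes are a big-endian sequence number; the multicast group and port are placeholders, and real feed formats use their own framing and recovery channels.

```python
import socket
import struct

# Hypothetical feed: assumes the first 4 bytes of each UDP datagram are a
# big-endian sequence number. Group/port are placeholders for illustration.
FEED_GROUP, FEED_PORT = "239.1.1.1", 30001

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("", FEED_PORT))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                struct.pack("4sl", socket.inet_aton(FEED_GROUP), socket.INADDR_ANY))

expected = None
while True:
    data, _ = sock.recvfrom(65535)
    seq = struct.unpack("!I", data[:4])[0]
    if expected is not None and seq != expected:
        if seq > expected:
            # UDP cannot retransmit: the gap can be detected and reported,
            # but the missing ticks are gone unless a separate recovery
            # channel replays them.
            print(f"gap detected: sequences {expected}..{seq - 1} were lost")
        else:
            print(f"out-of-sequence packet: got {seq}, expected {expected}")
    expected = seq + 1
```

A receiver like this can raise an alert the instant a gap appears, but it cannot tell whether the loss happened on the exchange side, in the firm’s own network, or somewhere in between; answering that question is the monitoring problem discussed below.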
Network traffic by nature is “bursty,” which makes accurate capacity planning difficult, since it’s not good enough to plan for averages. Spikes and bursts in the traffic have the potential to exceed available bandwidth, causing critical data to be lost. And, because these “microbursts” can be as short as a single millisecond, they are often invisible to standard analysis or NPM tools.
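A quick back-of-the-envelope calculation shows why averaged counters miss these events. Assuming a 10 Gbps link (the numbers scale linearly for other speeds), a microburst that saturates the link for a single millisecond barely registers in a one-second average:

```python
# Back-of-the-envelope: why a 1 ms microburst is invisible to 1-second averages.
# Assumes a 10 Gbps link; the numbers scale linearly for other speeds.
link_bps = 10e9                      # 10 Gbps line rate
burst_ms = 1                         # microburst duration in milliseconds

burst_bytes = link_bps / 8 * (burst_ms / 1000)
print(f"1 ms at line rate = {burst_bytes / 1e6:.2f} MB")   # ~1.25 MB

# If the rest of the second is idle, the 1-second average utilization is:
avg_util = (burst_ms / 1000) * 100
print(f"1-second average utilization = {avg_util:.1f}%")   # 0.1%
# A counter sampled once per second reports 0.1% utilization even though the
# link was saturated, and possibly dropping packets, for that millisecond.
```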
Even the smallest microburst can create a litany of problems, from errors in trading spreads and prices to halting a trade entirely. Financial services firms and exchanges also broker market feeds to one another, which brings Service Level Agreements (SLAs) with their customers into play. If a firm cannot prove whether or not it was the source of dropped data, the financial liabilities and losses can be major.
Current Monitoring Systems Are Inadequate for Next-Gen Trading Networks
In just a few years, market data network speeds have increased 100-fold, jumping from 100 megabits per second (100 Mbps) to 10 gigabits per second (10 Gbps)[2]. The problem? Network speeds typically increase faster than the ability to analyze those networks. Current monitoring systems are designed to function properly at 100 Mbps and 1 Gbps, and are therefore unable to support speeds of 10 Gbps and higher.
One attempt to cope with the jump from 1 Gbps to 10 Gbps was the introduction of the Network Packet Broker (NPB)[3], at best a work-around designed to bridge that speed gap. NPBs add cost and complexity, however, by creating a parallel network just for monitoring. Worse, they fail when traffic exceeds 50 percent of a link’s capacity, so microbursts that briefly exceed capacity typically result in lost trading data. Further complicating the situation, NPBs tend to generate false errors, where the reported problem is actually occurring in the copied traffic inside the monitoring network rather than on the production network.
Additionally, many network monitoring solutions struggle to perform their analysis in real time. They often employ a technique called packet capture, in which traffic is stored on a hard drive array before being analyzed. This creates a delay in the monitoring cycle, so when an error occurs, multiple users are impacted before the source of the problem can be isolated. Another limitation is that most monitoring devices analyze data at only one-second resolution, that is, in one-second slices of time. This renders microbursts invisible; only a higher-resolution tool, with resolution in the millisecond range, can make them visible.
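To make the resolution point concrete, the sketch below bins the same packet records at one-second and one-millisecond resolution. It is a simplified illustration over synthetic data, not any particular vendor’s analysis, but it shows how a burst near line rate simply disappears in the coarser view:

```python
from collections import defaultdict

# Synthetic packet records: (timestamp in seconds, bytes).
records = [(0.500 + i * 1e-6, 1250) for i in range(1000)]   # ~1.25 MB burst inside 1 ms
records += [(t / 10.0, 1250) for t in range(10)]            # light background traffic

def peak_rate_bps(records, bin_seconds):
    """Peak observed rate when traffic is binned at the given resolution."""
    bins = defaultdict(int)
    for ts, nbytes in records:
        bins[int(ts / bin_seconds)] += nbytes
    return max(bins.values()) * 8 / bin_seconds

print(f"peak rate @ 1 s bins : {peak_rate_bps(records, 1.0) / 1e9:.3f} Gbps")   # ~0.010 Gbps
print(f"peak rate @ 1 ms bins: {peak_rate_bps(records, 0.001) / 1e9:.3f} Gbps") # ~10 Gbps
# The 1-second view reports a negligible average; the 1-millisecond view
# reveals a burst at line rate that can overflow buffers and drop market data.
```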
A New Approach Is Needed
For network monitoring systems to scale to the needs of next-generation market feeds, a new architectural approach is needed, one that reduces or eliminates the complexity of legacy monitoring systems. Supporting tomorrow’s trading environments will require higher-resolution analytics, at faster speeds and with much broader coverage.
Placing intelligence closer to the wire, at the network edge, enables real-time drill-down into the events that affect traders most. Monitoring high-utilization 10 Gbps, 40 Gbps and 100 Gbps links requires an analytics engine that can deliver troubleshooting detail on demand while detecting the bursts and spikes that take networks down, even if only momentarily. Millisecond-resolution analysis is required for several reasons.
First, it is needed to identify otherwise unseen microbursts that exceed network capacity and drop market data. This enables predictive analytics that warn of potential problems, such as gaps or out-of-sequence trading data, so network engineers can proactively find, troubleshoot and fix issues before end-user trades are impacted.
Second, wherever the network contains link speed conversions (today typically 1 Gbps to 10 Gbps, or 40 Gbps to 10 Gbps), such as at packet broker aggregation points, millisecond resolution is needed to see whether spikes and bursts are causing dropped traffic; a simplified sketch of this check follows the final point below.
Finally, and most importantly, better visibility into network behavior and the data streams helps ensure that profits for financial services firms, exchanges and end customers are generated consistently and without issue, while saving millions of dollars otherwise lost to missed or incorrect trades or regulatory fines.
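The sketch referenced in the second point above models a 40 Gbps to 10 Gbps conversion in the simplest possible terms: per-millisecond ingress byte counts are compared with what the slower egress link can drain plus an assumed amount of device buffering. The 512 KB buffer and the traffic pattern are illustrative assumptions, not measurements of any real device.

```python
# Simplified model of a 40 Gbps -> 10 Gbps speed conversion (e.g. at a
# packet-broker aggregation point). For each millisecond, compare the bytes
# arriving on the fast side with what the slow side can drain plus an
# assumed buffer; any excess is traffic at risk of being dropped.
EGRESS_BPS = 10e9          # 10 Gbps output link
BUFFER_BYTES = 512 * 1024  # assumed 512 KB of usable buffering (illustrative)

def at_risk_milliseconds(bytes_per_ms):
    """Return (millisecond index, excess bytes) for every 1 ms interval in
    which the accumulated backlog exceeds egress capacity plus buffering."""
    drain_per_ms = EGRESS_BPS / 8 / 1000          # bytes the egress can send per ms
    flagged, backlog = [], 0.0
    for i, ingress in enumerate(bytes_per_ms):
        backlog = max(0.0, backlog + ingress - drain_per_ms)
        if backlog > BUFFER_BYTES:
            flagged.append((i, backlog - BUFFER_BYTES))
    return flagged

# Example: steady ~6.4 Gbps background with a 3 ms burst near 40 Gbps.
traffic = [800_000] * 10 + [5_000_000] * 3 + [800_000] * 10
for ms, excess in at_risk_milliseconds(traffic):
    print(f"ms {ms}: ~{excess / 1e3:.0f} KB at risk of being dropped")
```

At one-second resolution this same traffic averages well under the 10 Gbps egress rate, which is exactly why millisecond-level analysis is needed at conversion points to catch the intervals where data is actually at risk.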