Hadoop to impact how mobile operators tackle big data
Editor’s Note: With 2016 now upon us, RCR Wireless News has gathered predictions from leading industry analysts and executives on what they expect to see in the new year.
Last January we predicted 2015 would be the year of data-driven applications built from the ground up on Hadoop. The validity of this prediction was confirmed with the Hadoop application focus at Strata this year.
Mobile operators truly have “big data.” I’d like to look forward to 2016 and build on the original post based on what we have experienced working with some of the largest, most visionary mobile operators in the world.
In 2015 we coined the phrase ‘native Hadoop applications vs. alien applications.’ We believe that in 2016, big data-savvy customers will demand data-driven applications built from the ground up on Hadoop.
In 2015, a typical question was “Do you support Hadoop?” As customers move past installing Hadoop clusters and become more educated about Hadoop, we are seeing more and more questions like:
• How does your application run on a Hadoop architecture?
• How does your application store data in Hadoop?
• Do you support querying through Impala?
• Do you support YARN?
• Do you support Kerberos and Sentry?
• Is your application built on native Hadoop components?
In 2016, as customers gain more knowledge of Hadoop, they will demand native applications over alien ones.
Fraud detection will be one of the top three big data/Hadoop application initiatives in every major mobile communications company
Mobile operators understand that data silos reduce the ability to analyze data effectively, and nearly all of the leading mobile operators have a data lake strategy. A communication service provider (CSP) data lake can consist of:
• Batch billing data – TD.35, CDR, TAP 3
• Real-time call packets – ISUP, BSSAP, MAP
• Real-time fixed call packets – ISUP, Diameter
• Real-time VoIP call packets – SIP, H.323, Diameter
• Real-time data packets – GTP, Diameter
• Business data – CRM, billing
Based on our experience, fraud and security applications are the low-hanging fruit for rapid Hadoop return on investment against a CSP data lake. The new top three types of real-time fraud attacks are international revenue share fraud, interconnect bypass and premium rate service fraud.
We predict that in 2016 revenue threats in the form of fraud threats, profit threats, and service level agreement threats will be discovered by using real-time analytics against these new CSP data lakes. We also predict fraud and revenue threat analytics applications will be one of the top three Hadoop initiatives for mobile operators in 2016.
Packaged machine learning applications built natively on Hadoop will become the preferred way to detect fraud and revenue threats in mobile networks against massive CSP data lakes
Donald Rumsfeld is famous for the phrase:
“There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don’t know. But there are also unknown unknowns. There are things we don’t know we don’t know.”
Last year we discussed the challenge of detecting “known unknown” and “unknown unknown” types of attack. Traditional rules-based approaches simply cannot detect new, unknown zero-day attacks. Given enough data and signal, modern machine learning using anomaly detection can now make these types of attacks visible.
We predict that in 2016, existing rules-based systems will sit side-by-side with modern machine learning (anomaly detection) fraud detection systems. Known unknowns and unknown unknowns will move from a cloak of invisibility to obvious anomalies.
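To illustrate the contrast between a fixed rule and anomaly detection, here is a minimal Python sketch. It is not any vendor's implementation: the subscriber IDs, the per-hour international call counts, the rule limit of 100 and the 3.5 cutoff are all hypothetical, and the median/MAD "modified z-score" is just one simple anomaly detection technique chosen for illustration.

```python
from statistics import median


def rule_based_flags(counts, limit):
    """Static rule: flag any subscriber whose count exceeds a fixed limit."""
    return sorted(k for k, v in counts.items() if v > limit)


def mad_anomalies(counts, cutoff=3.5):
    """Flag subscribers whose count is far from the population median.

    Uses the modified z-score, 0.6745 * |x - median| / MAD, where MAD is
    the median absolute deviation. The 3.5 cutoff is an assumed default.
    """
    values = list(counts.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values are identical; anything different stands out.
        return sorted(k for k, v in counts.items() if v != med)
    return sorted(
        k for k, v in counts.items() if 0.6745 * abs(v - med) / mad > cutoff
    )


# Hypothetical international calls per subscriber in one hour.
calls = {"A": 12, "B": 15, "C": 11, "D": 14, "E": 13, "F": 60}

print(rule_based_flags(calls, limit=100))  # the fixed rule sees nothing unusual
print(mad_anomalies(calls))                # subscriber F stands out from its peers
```

In this toy data, subscriber F never crosses the fixed limit, so the rule stays silent, yet F is far outside the behavior of its peers. That is the sense in which the two approaches can sit side by side.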
Packaged graph analysis applications built natively on Hadoop will become the preferred way to detect crime rings attacking mobile networks
Today, the investigation of criminal attacks against a mobile network typically relies on “flat-world forensics.” By contrast, LinkedIn discovers business relationships through one or more degrees of separation using graph analysis.
We believe that 2016 will see the same techniques used to detect organized criminal relationships through one or more degrees of separation, and that these applications will be packaged up to make them simple for any analyst to use.
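The idea of degrees of separation can be sketched as a breadth-first search over a call graph. This is an illustrative Python example only: the SIM identifiers and call pairs are hypothetical, and a packaged Hadoop application would run this kind of traversal on a distributed graph engine rather than in memory.

```python
from collections import deque


def degrees_of_separation(call_pairs, source, max_depth=2):
    """Return {subscriber: degree} for everyone within max_depth hops of source.

    call_pairs is an iterable of (caller, callee) tuples; calls are treated
    as undirected relationships for the purpose of this sketch.
    """
    graph = {}
    for a, b in call_pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distances[node] == max_depth:
            continue  # do not expand beyond the requested depth
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)

    del distances[source]  # report only the related subscribers
    return distances


# Hypothetical call records around a SIM already confirmed as fraudulent.
calls = [
    ("fraud_sim", "x1"),
    ("x1", "x2"),
    ("x2", "x3"),
    ("fraud_sim", "y1"),
]
related = degrees_of_separation(calls, "fraud_sim", max_depth=2)
print(sorted(related.items()))
```

Starting from one known fraudulent SIM, an analyst would see not only its direct contacts but also the subscribers one step further out, which is where ring members who never call each other directly tend to surface.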
Packaged native Hadoop machine learning applications will become the preferred way to detect anomalous, suspicious behavior in IoT networks
We believe mobile and “Internet of Things” networks in the SIM-connected world will share the need for advanced anomaly detection and machine learning. The IoT industry is estimated to grow in the range of $3.9 trillion to $11 trillion, and we believe native Hadoop machine learning applications, used to detect anomalous, suspicious behavior, will be an important part of that industry.
We predict that in 2016 anomaly detection at the packet level, to detect suspicious behavior, will be required across both mobile and IoT networks.