Companies across a wide range of verticals are moving toward real-time data analysis of information generated by both people and machines, according to a new report by OpsClarity.
A recent survey by the company found that 92% of respondents in tech/telecom, retail, health care, government and life sciences were investing in real-time or streaming data analysis rather than batch data analysis. The majority (55%) of those surveyed defined “real-time” data processing as taking place within five minutes or less, with 27% saying real time meant less than 30 seconds and 19% putting it within 30 minutes. Sixty-five percent of those surveyed said they already had near-real-time data pipelines in production, and another 24% said they planned to put them into active use by the end of this year.
Market projected to reach $1.37 billion by 2019
451 Research has estimated that revenue generated by event/stream processing vendors will reach $1.37 billion by 2019, a 29% compound annual growth rate over the next few years. In an executive summary on 2016 trends in data platforms and analytics, 451 Research identified stream processing as one of five trends “indicative of the very real changes that are happening within enterprises as they look to take advantage of the opportunities for generating business intelligence.”
“Event/stream processing has been around for many years, but is now seeing increasing take-up outside early adopter markets (financial services) as more companies look to increase their rate of analysis in order to improve the speed at which they can make business decisions and respond to change,” 451 Research concluded. “The velocity of data (the rate at which it is produced) has long been accepted as a key aspect of ‘big data,’ and the rise of the ‘internet of things’ is driving more enterprises to consider how they can take advantage of data produced by sensors and other data-generating machines. However … frequency of analysis (the rate at which data is queried by the business) is also a key consideration that is driving change not just in terms of the amount of data that is available for businesses to analyze, but also the way in which they want to analyze it. Stream processing technologies enable enterprises to act on ‘fast data.’”
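To make the batch-versus-stream distinction concrete, here is a minimal sketch (not taken from the report or from 451 Research) of a continuously updating query written with PySpark Structured Streaming, one of the frameworks cited in the survey findings below. The socket source, host and port are illustrative assumptions; the point is that results update as events arrive rather than after a bounded job completes.

```python
# Illustrative sketch only: a streaming word count with PySpark Structured Streaming.
# The socket source, host and port are assumptions for demonstration purposes.
from pyspark.sql import SparkSession
from pyspark.sql.functions import explode, split

spark = SparkSession.builder.appName("fast-data-sketch").getOrCreate()

# Read an unbounded stream of lines instead of a fixed input file (batch).
lines = (spark.readStream
         .format("socket")
         .option("host", "localhost")
         .option("port", 9999)
         .load())

# The same DataFrame operations used in batch jobs, applied continuously.
words = lines.select(explode(split(lines.value, " ")).alias("word"))
counts = words.groupBy("word").count()

# Results are refreshed as new data arrives rather than when the job finishes.
query = (counts.writeStream
         .outputMode("complete")
         .format("console")
         .start())
query.awaitTermination()
```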
Among OpsClarity’s other findings:
-Apache Spark and MapReduce were the most popular data processing technologies, and many respondents reported using more than one. Apache Spark was used by 70% of respondents for data processing, and MapReduce by 50% (see the batch-style sketch after this list).
-Ninety-one percent of those surveyed said they relied partly or fully on open-source distributions of fast-data processing technologies.
-Sixty-eight percent used cloud or hybrid cloud architectures to run their data processing infrastructure and applications.
-Lack of expertise in big data was cited as the biggest barrier to operating and managing data pipelines, followed by a lack of holistic visibility into data processing systems. “It is tedious and time-consuming to configure monitoring and even then, they are stuck with point dashboards for each framework, not being able to look at common performance concerns that affect their overall data pipeline,” OpsClarity reported. Lack of expertise was also cited as a cause of instability in data pipelines, and respondents reported that troubleshooting was inefficient and drawn-out.
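For contrast with the streaming sketch above, a classic batch job of the kind the Spark and MapReduce figures refer to reads a bounded dataset, processes it once and terminates. The sketch below is illustrative only; the input and output paths are hypothetical. It expresses a MapReduce-style word count using Spark’s RDD API.

```python
# Illustrative sketch only: a MapReduce-style batch word count with Spark's RDD API.
# The HDFS paths are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("batch-sketch").getOrCreate()
sc = spark.sparkContext

# A batch job reads a bounded dataset, processes it once, and then terminates.
lines = sc.textFile("hdfs:///logs/input/*.log")            # hypothetical input path
counts = (lines.flatMap(lambda line: line.split(" "))      # map: emit words
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))            # reduce: sum counts per word

counts.saveAsTextFile("hdfs:///output/word_counts")         # hypothetical output path
spark.stop()
```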