Big data analytics often deals with identifying trends in known areas. But what about unrealized and newly emerging trends amid the spam? How do you recognize the big data “unknowns”?
Katherine Matsumoto, natural language processing expert and product manager for Attensity, said that amid spam mentions and unrelated terms, accurate “social listening” to end users can be a challenge.
Attensity focuses on natural language processing analytics. The company, founded in 2000, started out with a focus on text analytics and understanding the context around unstructured data – that is, data that isn’t in an easily categorized or discrete form. Attensity still does text analytics, but in the past few years has extended its expertise in context to social analytics.
Matsumoto described the Internet and its vast trove of social media as “the world’s largest consumer focus group. Everyone is on there talking about what do they like, what do they not like, what do they want, and other interests – and we’ve created Attensity Q to allow users to ask any question they would want to ask of that kind of consumer focus group.”
The company has a particularly strong presence in telecom, consumer electronics and high-tech industries, Matsumoto said, because of the massive volume of customer feedback and sentiment generated by end users. It counts T-Mobile US and Verizon Wireless among its clients.
“Everyone has a phone and a phone plan,” Matsumoto said, and users often have very emotional connections to the mobile products they use, which tends to generate social media chatter about their experiences and impressions of a brand.
Attensity recently launched upgrades to its Attensity Q platform that, the company said, enable identification of emerging themes in real-time data as well as automatic profiling and detection of unknowns in the data.
The company’s platform combines sentiment and trend analysis with geospatial information and information on trend influencers. Attensity said its approach of analyzing the conversations around emerging trends enables the platform to act as an “early warning” system for market shifts.
“Analytical monitoring tools typically look for what they’re told to find, thereby potentially missing important data,” said James Purchase, VP of product management at Attensity, in a statement. “By analyzing the conversation thread emerging from a topic, organizations are able to better understand conversations beyond simplistic keywords and see the larger implications for them and their markets.”
Matsumoto told RCR Wireless News that part of the problem of getting a good picture from social listening is the sheer amount of spam, such as clickbait, tweets about great prices on the iPhone 6, Samsung Galaxy giveaways and the like. Simple mention counts aren’t enough without sufficient context. She estimated that 30% to 50% of content is spam, which means “you can’t even really rely on the metrics that you get back, when one-third to half are not really mentions of the brands or products.”
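To put that estimate in concrete terms, here is a rough arithmetic sketch in Python. The function name and the figure of 10,000 raw mentions are hypothetical and chosen only for illustration; the 30% and 50% bounds come from Matsumoto's estimate.

```python
def genuine_mention_range(raw_mentions, spam_low=0.30, spam_high=0.50):
    """Return (fewest, most) genuine mentions for a given spam-share range.

    The default bounds reflect the 30% to 50% spam estimate quoted above.
    The function itself is an illustrative assumption, not a vendor metric.
    """
    if not 0 <= spam_low <= spam_high <= 1:
        raise ValueError("spam shares must satisfy 0 <= low <= high <= 1")
    fewest = round(raw_mentions * (1 - spam_high))  # worst case: half is spam
    most = round(raw_mentions * (1 - spam_low))     # best case: 30% is spam
    return fewest, most


# Hypothetical example: a dashboard reports 10,000 raw brand mentions.
fewest, most = genuine_mention_range(10_000)
print(f"Genuine mentions: between {fewest} and {most}")
```

Under those assumptions, a reported count of 10,000 mentions could correspond to anywhere from 5,000 to 7,000 real ones, which is why a raw mention count on its own says little.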
One of Attensity’s areas of interest has been accurately filtering out spam and recognizing from context whether, say, a mention of “sprint” refers to a race or to the wireless carrier.
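As a toy illustration of what context-based disambiguation means, the Python sketch below counts cue words that appear alongside “sprint” in a post. This is not Attensity's method, which the article does not describe; the cue-word lists and the function name are assumptions made up for this example, and a production system would rely on far richer language models.

```python
import re

# Hypothetical cue words, chosen for illustration only.
CARRIER_CUES = {"carrier", "plan", "phone", "iphone", "coverage",
                "contract", "store", "bill", "network", "wireless"}
RACE_CUES = {"race", "run", "ran", "track", "finish",
             "marathon", "meter", "lap", "runner"}


def classify_sprint(post):
    """Guess whether 'sprint' in a post means the carrier or a race."""
    words = set(re.findall(r"[a-z0-9]+", post.lower()))
    if "sprint" not in words:
        return "no mention"
    carrier_hits = len(words & CARRIER_CUES)
    race_hits = len(words & RACE_CUES)
    if carrier_hits > race_hits:
        return "carrier"
    if race_hits > carrier_hits:
        return "race"
    return "unclear"  # no cues, or a tie between the two senses


print(classify_sprint("Switched my phone plan to Sprint and the coverage is great"))
print(classify_sprint("Won the 100 meter sprint at the track meet"))
```

The first post is labeled as a carrier mention and the second as a race. A post with no surrounding cues comes back as "unclear", which is exactly the case where simple keyword matching falls short.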
Discovering the conversational context behind mere mentions, Matsumoto said, is key to making social data something on which companies can act. Are people posting, for example, about long lines at a particular carrier’s stores to buy the iPhone 6? Attensity expects conversational discovery to be a key part of business intelligence.
“In order for that data to be actionable, we find customers have to be able to say at a glance: What is it they’re talking about? What are they latching on to? What are the conversations that people are having?” Matsumoto said.