Guest Post: Big Data versus Big Content

This guest post was written by Rik Tamm-Daniels of Attivio.

One of the most important distinctions within the Big Data space is the difference between unstructured data (think unparsed data such as logs or sensor data) and unstructured content (any source where the insight you want is locked away in human-created text). The reason I separate these is that the profiles of these data sources are very different, as are the tools required to gain analytic insight from them.

For example, log data can arrive in huge volumes and at high velocity, but a large percentage of it just isn’t valuable at a per-record level. However, when you apply analytic algorithms via MapReduce, you can very efficiently crunch these data sets down to reveal overall trends. On the other hand, consider email as a source. Email has a high value per record, is not typically created at the same volume and velocity, and requires linguistics and text analytics to extract meaningful analytic insight.
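To make the log side of that contrast concrete, here is a minimal single-machine sketch of the map-and-reduce pattern in Python. It is not a Hadoop job. The log format, the sample lines, and the function names are assumptions made up for illustration.

```python
from collections import Counter
from functools import reduce

# Hypothetical log lines in the form "<timestamp> <status> <path>".
LOG_LINES = [
    "2013-01-01T10:00:01 200 /search",
    "2013-01-01T10:00:02 500 /checkout",
    "2013-01-01T10:00:03 200 /search",
    "2013-01-01T11:15:00 500 /checkout",
    "2013-01-01T11:20:00 200 /home",
]


def map_record(line):
    """Map step: turn one log line into a partial count keyed by (hour, status)."""
    timestamp, status, _path = line.split()
    hour = timestamp[:13]  # keep the date and hour, e.g. "2013-01-01T10"
    return Counter({(hour, status): 1})


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def count_by_hour_and_status(lines):
    """Run the map step over every line, then fold the results together."""
    return reduce(reduce_counts, map(map_record, lines), Counter())


trend = count_by_hour_and_status(LOG_LINES)
```

No single line here says much on its own, but the aggregated counts per hour and status start to show a trend, which is the point about per-record value.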

Fortunately, the Big Data conversation is evolving, and the term Big Content is increasingly being used by organizations like Gartner and AIIM. This new term is well justified given the very significant differences in the data itself, and allows us to have a more focused conversation, driven by business-value creation. When companies start to compare the business value of also investing in Big Content versus just Big Data, they start to realize that Big Content is, in many cases, the most important aspect from a value standpoint – an “Unsung Hero” if you will.

Consider what we can learn from applying text analytics to this example of a single airline complaint email:

[Image: big content]

With these new pieces of data, imagine what the customer experience could be like the next time the customer calls in. The customer service rep would not only see the transaction history, but also the interaction history that represents the current state of the relationship. Using Tableau, this data can be visually summarized to help the CSR understand the state of the relationship at a glance in order to proactively offer an entirely new and transformative level of service.

Now consider the value of a lot of emails. In the case of this example airline, they could instantly answer questions such as “Which of our high-value customers are having a high ratio of negative interactions?”, “What routes are they flying?” and “Why are they unhappy?” - all through a single Tableau dashboard.

[Image: sentiment analysis]
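As a rough illustration of the first two questions, the sketch below assumes a text-analytics step has already tagged each email with a customer, a route, and a sentiment label. The field names, sample rows, and the 0.5 threshold are hypothetical. This is not Attivio's or Tableau's API, only the kind of aggregation a dashboard would sit on top of.

```python
from collections import defaultdict

# Hypothetical output of a text-analytics step: one row per email.
INTERACTIONS = [
    {"customer": "C1", "high_value": True, "route": "BOS-SFO", "sentiment": "negative"},
    {"customer": "C1", "high_value": True, "route": "BOS-SFO", "sentiment": "negative"},
    {"customer": "C1", "high_value": True, "route": "BOS-JFK", "sentiment": "positive"},
    {"customer": "C2", "high_value": True, "route": "JFK-LAX", "sentiment": "positive"},
    {"customer": "C3", "high_value": False, "route": "BOS-SFO", "sentiment": "negative"},
]


def unhappy_high_value_customers(rows, min_ratio=0.5):
    """Return high-value customers whose share of negative emails is >= min_ratio."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    routes = defaultdict(set)
    for row in rows:
        if not row["high_value"]:
            continue  # only high-value customers are of interest here
        customer = row["customer"]
        totals[customer] += 1
        if row["sentiment"] == "negative":
            negatives[customer] += 1
            routes[customer].add(row["route"])  # routes tied to complaints

    report = {}
    for customer, total in totals.items():
        ratio = negatives[customer] / total
        if ratio >= min_ratio:
            report[customer] = {
                "negative_ratio": ratio,
                "routes": sorted(routes[customer]),
            }
    return report


report = unhappy_high_value_customers(INTERACTIONS)
```

The "why are they unhappy?" question is the harder part and depends on the linguistic analysis itself, so it is deliberately left out of this sketch.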

Taking things a step further, in customer experience analysis alone, there are a large number of Big Content sources of customer interactions (email, CRM case notes, call center notes, social media, SMS, company forums, survey comments, etc.) that are all highly relevant and highly valuable from an analytic perspective.

So, as you jump into the Big Data waters, take some time to think about which sources of information contain the insight you need to “move the needle” for your business. You’ll likely conclude that Big Content is the place to start realizing Big Value from Big Data. Here are some suggestions by vertical market to help you get started!

Rik Tamm-Daniels is an Attivio co-founder and currently serves as Vice President of Technology for the company's Channels and Alliances division. He is responsible for developing and executing Attivio's technical strategy for OEM, SaaS, SI, VAR and Technology Alliance partner recruitment and enablement.

Attivio's award-winning Active Intelligence Engine® (AIE®) is the leading unified information access platform, enabling customers to bring together all relevant information — internal and external, structured and unstructured — for data discovery, analysis and decision support.