In the Future of Analytics, the Analyst Will Remain Central

Seven years ago, I predicted that IBM’s Watson would become a prevalent platform for data analysis. Watson had just won its famous Jeopardy! match, and it was already apparent that the technology behind this remarkable system could be adapted to a number of domains.

IBM has done just that, having released a version called Watson Analytics that exploits statistical and machine learning methods to discover hidden patterns in data sets. Watson’s natural language front-end enables it to converse with analysts in ordinary terms and is, along with Wolfram Alpha, an instance of what I call verbal analytics.
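
To make the idea concrete, here is a minimal sketch of what a verbal-analytics front-end does at its core: translate a plain-English question into a data operation. The tiny regex grammar, the sample data, and the answer() function below are invented for illustration; they bear no relation to Watson Analytics’ actual implementation.

```python
# A toy verbal-analytics front-end: map one shape of plain-English
# question onto a data-frame operation. Everything here is a
# hypothetical stand-in, not any product's real API.
import re
import pandas as pd

sales = pd.DataFrame({
    "region": ["East", "West", "East", "West"],
    "revenue": [120, 95, 140, 110],
})

def answer(question: str, df: pd.DataFrame) -> pd.Series:
    """Recognize one question shape: 'average <measure> by <dimension>'."""
    match = re.search(r"average (\w+) by (\w+)", question.lower())
    if not match:
        raise ValueError("Question not understood")
    measure, dimension = match.groups()
    return df.groupby(dimension)[measure].mean()

print(answer("What is the average revenue by region?", sales))
# region
# East    130.0
# West    102.5
```

A real system replaces the regex with a full natural language pipeline, but the shape is the same: speech in, analytic operation out.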

What Is Verbal Analytics?

Verbal analytics reflect the scenario in the film Minority Report, where Tom Cruise points to a room-size megapixel display and requests charts and links between distributed information sources. The video wall in that film is derived, in turn, from the Put That There project at the MIT Media Lab.

The value of verbal analytics is indisputable. Experts and novices alike can benefit from gathering information and analyzing patterns by speaking to an automated agent. But the reality is that this technology is still in its infancy.

Verbal Analytics vs. Visual Analytics

Visual analytics have demonstrated an equivalent benefit. The term was coined in Illuminating the Path, a research agenda led by Jim Thomas at the Pacific Northwest National Laboratory (PNNL). The approach outlined in that document links hypothesis generation, analytical reasoning, exploratory data analysis, and visualization to illuminate associations in large and diverse collections of data. Visual analytics is not the chart that gets created from the data, but the journey to that chart.

Unlike verbal analytics, visual analytics invite the user to touch, drag, and otherwise manipulate visualizations in order to request details, link sources, or assess trends. And in its best manifestation, it feels like you’re playing with data.
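
The “journey” is easier to see in code. The sketch below walks through one hypothetical exploratory loop: a hypothesis prompts a filter, the filter prompts an aggregation, and the chart appears only at the end. In a visual analytics tool each step would be a drag or a click; the data and column names here are made up.

```python
# A rough sketch of the 'journey to the chart' that visual analytics
# tools make interactive. Each step stands in for one drag or click.
import pandas as pd
import matplotlib.pyplot as plt

orders = pd.DataFrame({
    "quarter": ["Q1", "Q1", "Q2", "Q2", "Q3", "Q3"],
    "segment": ["retail", "b2b", "retail", "b2b", "retail", "b2b"],
    "units":   [340, 510, 390, 480, 310, 620],
})

# Step 1: a hypothesis ("B2B is pulling ahead") prompts a filter.
b2b = orders[orders["segment"] == "b2b"]

# Step 2: an exploratory aggregation to inspect the trend per quarter.
trend = b2b.groupby("quarter")["units"].sum()

# Step 3: the chart is the end of the journey, not the analysis itself.
trend.plot(kind="bar", title="B2B units by quarter")
plt.show()
```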

This idea was a breakthrough, and it came to market embodied in a new crop of software applications like Tableau. We are now seeing large-scale adoption of visual analytics and, as a result, a major disruption of the old model of how organizations conduct analytics.

Analytics as a Conversation

Verbal analytics carry on a verbal conversation with the analyst. Visual analytics carry on a visual conversation. Which approach will win? Both. If I am correct in this prediction, then the companies and open-source projects that best understand the value of each approach will be the ones most likely to succeed in the future world of big data analytics. Here’s why.

Verbal and visual processing are fundamental to human actions and understanding. Neither approach can replace the other, because we gather information through several channels and process that information in different ways depending on the channel.

Furthermore, each approach has its optimal context or setting. Our auditory channel is best suited for mobile analytics, for example, and our visual channel is best suited for a desktop, megapixel display, or office environment.

What the Future Looks Like

Will both approaches eventually merge into single analytic platforms? Possibly. Software applications tend to grow into each other’s space because of market demands. This possibility is long-term, however. There is much left to do to realize the potential of both approaches.

Premature attempts to solve all the problems in these areas will result in buggy and simplistic applications—the kind of software that drives users to false conclusions or causes them to miss important details. Software like Joseph Weizenbaum’s ELIZA program, which fooled some observers into thinking they were conversing with a real psychotherapist, should make us skeptical of grand claims for “automatic analyzers.” Machine learning has come a long way since the days of ELIZA, but it still has a long way to go.

The Importance of Human Judgment

Perhaps the more troubling question raised by the rising popularity of machine learning in analytics is whether machine learning will replace the analyst in drawing inferences. William I. B. Beveridge (The Art of Scientific Investigation, 1957) had this to say on the topic:

“… the person who possesses the flair for choosing profitable lines of investigation is able to see further whither the work is leading than are other people, because he has the habit of using his imagination to look far ahead instead of restricting his thinking to established knowledge and the immediate problem.”

IBM Watson uses machine learning and statistical methods to reach its conclusions. And some visualization systems use machine learning to refine visualizations. If inferences lead to action, however, humans must be part of the analytic loop.
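
One way to picture that loop is a simple gate between inference and action. In this hypothetical sketch, the machine proposes a finding with its own confidence score, but nothing happens until the analyst explicitly approves it. The Suggestion shape and the review prompt are invented for illustration and describe no particular product.

```python
# A minimal human-in-the-loop sketch: the machine proposes an
# inference, but only an analyst's explicit decision turns it into
# action. All names here are hypothetical stand-ins.
from dataclasses import dataclass

@dataclass
class Suggestion:
    finding: str
    confidence: float  # the model's own score, in [0, 1]

def analyst_approves(s: Suggestion) -> bool:
    """Stand-in for an interactive review of the model's evidence."""
    reply = input(f"{s.finding} (confidence {s.confidence:.0%}); act on this? [y/N] ")
    return reply.strip().lower() == "y"

suggestion = Suggestion("Churn risk is concentrated in the West region", 0.82)
if analyst_approves(suggestion):
    print("Inference promoted to action.")
else:
    print("Held for further analysis.")
```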

Machine learning methods will continue to improve at incorporating outside information, backtracking, and exploring far-fetched hypotheses. But they will never match the imagination humans bring to critical domains where we must apply our best judgment. The value of both verbal and visual analytics lies in the computer guiding our decisions, not making them.

So the forecast by some analytics gurus that verbal analytics will replace statistics packages, BI platforms, dashboards, machine learning systems, and every other type of analytic environment is at best overheated and at worst ignorant of basic psychology.

Future systems will allow us to speak to the computer in ordinary language and to access data repositories for relevant metadata. But these will not substitute for the immediate impact of an effective visualization and the ability to explore complex data sets through our own eyes.