Jock Mackinlay on Visualization & Big Data
Melinda Minch, Software Engineer at Tableau, covers Jock Mackinlay's standing-room-only talk at Tableau Customer Conference 2011.
I became interested in big data while I was involved in a project on Big History, which is the history of the Earth since the Big Bang. The varying scales on which information had to be presented were daunting. My team struggled to present the thousands of years of human history in context with billions of years of geological time.
Jock and a packed house at #TCC11.
The value in big data is similar: billions of records come together to tell a story, but each data point might also be individually interesting. The challenge is to present the whole story in a cohesive and intuitive way.
Jock started with a short explanation of how people use their visual systems, together with interaction with data, to expedite analysis. A quick look at a well-crafted viz can tell a story much more easily than any other means.
He identified three categories of big data: wide (lots of columns), tall (lots of rows) and multiple sources.
Wide data: there is a limit to how much data you can put in a single view and make it understandable (Bertin's barrier). Small multiples and coordination between views using things like actions and dashboards help you make sense of wide data. They break multiple dimensions up into vizzes that are easy to understand at a glance.
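To make the small-multiples idea concrete, here is a minimal sketch in plain Python, not Tableau itself. It takes a wide table and splits it into one small series per measure, which is the shape of data that sits behind a grid of small charts. The column names and numbers are made up for illustration, and `small_multiples` is a hypothetical helper, not something from the talk.

```python
# Hypothetical "wide" data: one row per year, many measure columns.
wide_rows = [
    {"year": 2009, "sales": 10, "profit": 2, "returns": 1},
    {"year": 2010, "sales": 14, "profit": 3, "returns": 2},
    {"year": 2011, "sales": 19, "profit": 5, "returns": 1},
]


def small_multiples(rows, x_field):
    """Return {measure: [(x, value), ...]}, one series per small chart."""
    # Every column other than the shared x-axis field becomes its own panel.
    measures = [name for name in rows[0] if name != x_field]
    return {m: [(row[x_field], row[m]) for row in rows] for m in measures}


panels = small_multiples(wide_rows, "year")
for measure, series in panels.items():
    # Each panel shares the same x-axis, so they can be compared at a glance.
    print(measure, series)
```

Each panel holds a single measure, so no one view has to carry every column at once.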
Tall data: you need filtering and aggregation to make tall data less tall. Either drill into smaller areas of the data or take a higher-level view with aggregations. The main challenge is making your display fast and interactive enough for users to choose a level of detail that is appropriate for the questions they want to answer. The "pause automatic updates" feature can make vizzes with tall data much more manageable and responsive.
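As a rough illustration of filtering and aggregation, here is a small sketch in plain Python rather than Tableau. The rows, the field layout, and the `aggregate` helper are all assumptions made up for this example. The point is only that the same tall table can be rolled up to a coarse level or filtered and viewed at a finer one.

```python
from collections import defaultdict
from datetime import date

# Hypothetical "tall" data: one row per sale (day, region, amount).
rows = [
    (date(2011, 1, 5), "West", 120.0),
    (date(2011, 1, 19), "West", 80.0),
    (date(2011, 2, 2), "East", 200.0),
    (date(2011, 2, 14), "West", 50.0),
    (date(2010, 12, 30), "East", 75.0),
]


def aggregate(rows, key):
    """Sum amounts per group; `key` picks the level of detail."""
    totals = defaultdict(float)
    for day, region, amount in rows:
        totals[key(day, region)] += amount
    return dict(totals)


# Higher-level view: one total per year.
by_year = aggregate(rows, key=lambda day, region: day.year)

# Drill in: filter to one region, then aggregate by month.
west = [r for r in rows if r[1] == "West"]
west_by_month = aggregate(west, key=lambda day, region: (day.year, day.month))

print(by_year)
print(west_by_month)
```

Swapping the `key` function changes the level of detail without touching the underlying rows, which is the choice the talk says users need to be able to make quickly.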
Multiple sources: joining data tables can be hard. Make sure that you're joining on the right fields so that you're comparing the right data. Data blending is a great way to accomplish that.
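One way to check that you are joining on the right field is to look at which keys match before trusting the joined result. The sketch below does that in plain Python. It is an ordinary inner join, not Tableau's data blending, and the tables, field names, and the `join_report` and `inner_join` helpers are hypothetical.

```python
# Two hypothetical sources that share a "customer_id" field.
orders = [
    {"customer_id": 1, "amount": 30.0},
    {"customer_id": 2, "amount": 45.0},
    {"customer_id": 4, "amount": 10.0},
]
customers = [
    {"customer_id": 1, "region": "West"},
    {"customer_id": 2, "region": "East"},
    {"customer_id": 3, "region": "East"},
]


def join_report(left, right, field):
    """Show which key values match and which appear on one side only."""
    left_keys = {row[field] for row in left}
    right_keys = {row[field] for row in right}
    return {
        "matched": sorted(left_keys & right_keys),
        "left_only": sorted(left_keys - right_keys),
        "right_only": sorted(right_keys - left_keys),
    }


def inner_join(left, right, field):
    """Inner join; assumes `field` is unique in the right-hand table."""
    lookup = {row[field]: row for row in right}
    return [{**l, **lookup[l[field]]} for l in left if l[field] in lookup]


report = join_report(orders, customers, "customer_id")
joined = inner_join(orders, customers, "customer_id")

print(report)
print(joined)
```

A long `left_only` or `right_only` list is a hint that the join field is wrong, or that the two sources do not describe the same things.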
These are techniques to help bring big data to a human scale.