Primer: What exactly is clustering, and why would you use it?

One great new feature in Tableau 10 is clustering. What was once only available to people who could use complex statistical tools can now be done with a drag and a drop. I know my mean from my mode and what a normal distribution looks like, but like many of our customers, I’m no statistical expert. I wanted to answer a simple question: What exactly is clustering and why would you use it?

Clustering is like grouping. This quick demonstration explains how it can be compared to our regular grouping feature, but in this case, you let a statistical model decide your groups for you:

That’s pretty cool, right? How can I use that in the real world? Below is an example that a global tourism company might use, if it were looking to expand its markets:

What’s going on with clustering?

As a non-statistical expert, I’m well aware of the perils of using statistical models. “Correlation does not equal causation” is such a familiar refrain that I know to be careful when using trend lines and regressions. When I first started using clustering, I was a little wary. Is it risky to draw conclusions from the groups made by the clustering model? I sought help from the experts at Tableau.

“Clustering is slicing your data much like creating bins in your data,” says Bora Beran, product manager at Tableau. “The nice thing about methods like clustering is that the results aren’t extrapolations like forecasting.”

“Clustering is just a different way of aggregating or grouping the data,” says David Sigerson, a Tableau sales consultant and ex-employee of SPSS. “Clustering allows you to use multiple variables to create that grouping.”

That convinces me I’m pretty good to go. Clustering: It’s like grouping, but instead of manually grabbing some marks and making the groups yourself, you’re using a model (k-means clustering, if you want to be precise) to do the grouping for you.
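If you’re curious what “letting a model do the grouping” looks like under the hood, here is a minimal sketch of plain k-means in Python. It is not Tableau’s implementation, just the textbook idea: pick some starting centers, assign each point to its nearest center, move each center to the average of its points, and repeat. The `kmeans` function, the sample data, and the two made-up variables are all assumptions for illustration.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Group points (tuples of numbers) into k clusters.

    Returns one cluster label (0..k-1) per point. Illustrative only:
    a real tool would also scale the variables and test for convergence.
    """
    rng = random.Random(seed)
    # Start from k distinct points chosen at random as the initial centers.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


# Hypothetical data: two variables per market (made-up, unitless scores).
points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (5.0, 5.2), (5.3, 4.8), (4.9, 5.1)]
labels = kmeans(points, k=2)
print(labels)
```

The first three markets end up in one group and the last three in the other, which is the same decision you would make by eye. The difference is that the model makes it using both variables at once. Which group gets called 0 and which gets called 1 depends on the random starting centers.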

Other use cases

I also looked around the internet for some use cases. You probably know about market segmentation, hospital research, and the like. Here are some of the more unusual and interesting examples I found. I hope they inspire you to get clustering!

What will you use clustering for? Let us know in the comments! And check out Bora Beran's blog post for another great explanation on the power of clustering.
