This data intrigued me for a couple of reasons. First of all, I'm a sports fan. Second, I was inspired by the New York Times visualization of Olympic medal counts. (And by the way, they recently showed a screenshot of a medal treemap from ManyEyes as well.)
It's not easy to find a complete, historical, up-to-date, simply-formatted Olympics data set. Thanks to the work and guidance of Ross Bunker, one of the developers here, I was able to get something workable. Here I'll be looking at country-by-country medal counts for modern Olympic summer games.
First, here's a Tableau map showing how the various countries did in Beijing. The size of each pie represents a country's total medal count, and the slices show how that total splits into gold, silver, and bronze medals. As you probably heard, the US took home the most total medals, but China dominated the gold medal tally, and this certainly shows up here.
The overall geographic distribution of medaling countries is quite remarkable as well. I put the year on the Pages shelf, so if you download the packaged workbook you can flip through the years to see how the geographic distribution of medalists changes over time.
When I looked through that, I was particularly surprised by America's domination of the 1904 summer games -- so I wondered if there was some way to examine rankings by year and pick out the lopsided ones. After some more consultation with Ross, I was able to construct a table showing the top (N) countries by year. Here's a snippet:
The games that stood out as lopsided were the early ones -- when it wasn't so easy or compelling for athletes to travel to the Olympics, and the host country typically cleaned up -- and the ones boycotted by the US (Moscow, 1980) and the Soviet Union (Los Angeles, 1984).
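For readers who want to try the top-N-by-year idea outside Tableau, here is a minimal Python sketch. It is not how the workbook builds the table; the function name `top_n_by_year`, the row layout, and the sample numbers are all made up for illustration. It ranks countries within each year and also reports each country's share of that year's medals, which is one rough way to spot a lopsided games.

```python
from collections import defaultdict

# Hypothetical sample rows: (year, country, total medals). Not real results.
SAMPLE_ROWS = [
    (1900, "A", 50), (1900, "B", 20), (1900, "C", 5),
    (1904, "A", 90), (1904, "B", 6), (1904, "C", 4),
]


def top_n_by_year(rows, n):
    """Return {year: [(rank, country, medals, share), ...]} for the top n countries."""
    by_year = defaultdict(list)
    for year, country, medals in rows:
        by_year[year].append((country, medals))

    table = {}
    for year, entries in by_year.items():
        total = sum(medals for _, medals in entries)
        # Sort by medal count descending, breaking ties by country name.
        ranked = sorted(entries, key=lambda e: (-e[1], e[0]))[:n]
        table[year] = [
            # share = this country's fraction of all medals awarded that year
            (rank, country, medals, medals / total if total else 0.0)
            for rank, (country, medals) in enumerate(ranked, start=1)
        ]
    return table


if __name__ == "__main__":
    for year, ranking in sorted(top_n_by_year(SAMPLE_ROWS, 2).items()):
        print(year, ranking)
```

A year in which the top-ranked country holds a very large share is a candidate for the "lopsided" label.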
Now this is where it got interesting. I was lamenting to Ross that we could not use auto-generated latitude and longitude in Tableau calculations, and he told me that, actually, you can do that. First, you generate a very basic map and put all the dimensions and measures of interest on the Level of Detail shelf. Then, you export data using one of the File => Export options and connect to the resulting data source.
In the packaged workbook, you'll see two "(Source)" maps that I exported data from. Now, by connecting to the exported data, I was able to compute the approximate distance each team traveled to the host city by digging out some old spherical-trigonometry formulas. If you edit the [Distance (miles)] calculated field you'll see how I did it.
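If you don't have the workbook handy, the same kind of calculation can be sketched in Python. This version uses the haversine formula, which may not be the exact formula in the [Distance (miles)] field, and it assumes a spherical Earth with a mean radius of about 3,958.8 miles. The function name `great_circle_miles` is made up for this example.

```python
import math

# Assumption: mean Earth radius in miles, treating the Earth as a sphere.
EARTH_RADIUS_MILES = 3958.8


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in miles between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)

    # Haversine formula: a is the squared half-chord length between the points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    a = min(1.0, a)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))
```

Because it works on a sphere and measures from one representative point per country to the host city, the result is only an approximation of the distance any given athlete actually traveled.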
What's the distribution of distances traveled by medaling teams? Here it is, binned into 500-mile increments.
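The binning itself is simple to reproduce. As a rough sketch (the helper name `bin_distances` and the sample values are hypothetical, not taken from the data set), each distance is assigned to the lower edge of its 500-mile bin and the bins are counted:

```python
from collections import Counter


def bin_distances(distances, width=500):
    """Count distances per bin, keyed by the lower edge of each bin in miles."""
    # Floor division maps e.g. 1250 -> bin 2, whose lower edge is 2 * 500 = 1000.
    counts = Counter(int(d // width) * width for d in distances)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Made-up distances, purely to show the shape of the result.
    print(bin_distances([0, 120.5, 499.9, 500, 1250, 5400, 5499]))
```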
The handful of distinctive peaks is interesting at first, but after looking at the individual underlying data points I think it's mostly an artifact of the uneven distribution of Earth's population.
Another question: What's the joint distribution of distances traveled by teams and the medals they were awarded?
Here, each dot represents a medal type, country, and year -- for instance, the number of silver medals won by the Soviet Union in 1972 (Munich) is a single data point. We can see those distinct clumps again -- which I've annotated in general terms at the top. It's a mixed picture, but broadly speaking it seems that teams that travel less win more. In particular there is some advantage to being the home team, as one might expect, and as China showed this year. Now, is the advantage because more athletes will travel when the distances are shorter, or because they're better rested or more familiar with the surroundings? I don't have the data to fully answer those questions yet but I plan to keep digging.
And, finally, let me note my newfound admiration for Barbara Kendall of New Zealand, the world's longest-traveling summer Olympic gold medalist. In 1992, she traveled 12,000+ miles from New Zealand to Barcelona and won the women's sailboard competition. Or put another way, she's the orange dot all the way on the right.
[NOTE: This post was updated to correct some unclear wording in the second paragraph.]