By Raif Majeed 2008/09/04
Now that the Olympics are done and the results are in, it's time to look at the data. Using Tableau maps and some creative exporting, I was able to ask new questions about who won and why.

This data intrigued me for a couple of reasons. First of all, I'm a sports fan. Second, I was inspired by the New York Times visualization of Olympic medal counts. (And by the way, they recently showed a screenshot of a medal treemap from ManyEyes as well.)

It's not easy to find a complete, historical, up-to-date, simply-formatted Olympics data set. Thanks to the work and guidance of Ross Bunker, one of the developers here, I was able to get something workable. Here I'll be looking at country-by-country medal counts for modern Olympic summer games.

First, here's a Tableau map showing how the various countries did in Beijing. The size of each pie represents the total medal count; the size of each pie slice represents gold, silver, and bronze medals respectively. As you probably heard, the US took home the most total medals, but China dominated the gold medal tally, and this certainly shows up here.

Geographic View by Medaling Country and Year

The overall geographic distribution of medaling countries is quite remarkable as well. I put the year on the Pages shelf, so if you download the packaged workbook you can flip through the years to see how the geographic distribution of medalists changes over time.

When I looked through that, I was particularly surprised by America's domination of the 1904 summer games -- so I wondered if there was some way to examine rankings by year and pick out the lopsided ones. After some more consultation with Ross, I was able to construct a table showing the top (N) countries by year. Here's a snippet:

The games that stood out as lopsided were the early ones -- when it wasn't so easy or compelling for athletes to travel to the Olympics, and the host country typically cleaned up -- and the ones boycotted by the US (Moscow, 1980) and the Soviet Union (Los Angeles, 1984).
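For anyone who wants to try the same top-N-by-year ranking outside Tableau, here is a minimal Python sketch. It is only an illustration of the idea, not how the workbook builds its table: the function names, the `(year, country, medals)` row layout, and the "leader's share" measure of lopsidedness are all assumptions of mine, and the sample rows use placeholder countries with made-up counts.

```python
from collections import defaultdict


def top_n_by_year(rows, n=3):
    """Return {year: [(country, medals), ...]} with the n highest totals per year.

    rows is an iterable of (year, country, medals) tuples (hypothetical layout).
    Ties are broken alphabetically by country so the output is deterministic.
    """
    by_year = defaultdict(list)
    for year, country, medals in rows:
        by_year[year].append((country, medals))
    return {
        year: sorted(entries, key=lambda e: (-e[1], e[0]))[:n]
        for year, entries in sorted(by_year.items())
    }


def leader_share(rows):
    """Return {year: fraction of that year's medals won by the top country}."""
    totals = defaultdict(int)
    best = defaultdict(int)
    for year, _country, medals in rows:
        totals[year] += medals
        best[year] = max(best[year], medals)
    return {year: best[year] / totals[year] for year in sorted(totals) if totals[year]}


# Placeholder data: countries "A".."C" and counts are invented for illustration.
sample_rows = [
    (2000, "A", 10), (2000, "B", 30), (2000, "C", 20),
    (2004, "A", 5), (2004, "B", 5),
]
ranking = top_n_by_year(sample_rows, n=2)
shares = leader_share(sample_rows)
```

A year where the leader's share is unusually high would be a candidate "lopsided" games under this rough measure.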

Now this is where it got interesting. I was lamenting to Ross that we could not use auto-generated latitude and longitude in Tableau calculations, and he told me that, actually, you can do that. First, you generate a very basic map and put all the dimensions and measures of interest on the Level of Detail shelf. Then, you export data using one of the File => Export options and connect to the resulting data source.

In the packaged workbook, you'll see two "(Source)" maps that I exported data from. By connecting to the exported data and digging out some old spherical-trigonometry formulas, I was able to compute the approximate distance each team traveled to the games. If you edit the [Distance (miles)] calculated field, you'll see how I did it.
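For readers who don't have the workbook open, here is a minimal Python sketch of one standard spherical-trigonometry approach, the haversine formula. It is an illustration and not necessarily the formula inside the [Distance (miles)] field. The function name and the perfectly-spherical-Earth assumption are mine; the 3960-mile radius is the value mentioned in the comments on this post.

```python
import math

# Earth radius in miles. 3960 is the value mentioned in the comments on this
# post; treating the Earth as a perfect sphere is a simplifying assumption.
EARTH_RADIUS_MILES = 3960.0


def great_circle_miles(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_MILES):
    """Great-circle distance between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))


# Sanity check: equator to pole is a quarter of the circumference.
quarter = great_circle_miles(0.0, 0.0, 90.0, 0.0)
```

With country-level latitude and longitude, every team from a country gets the same starting point, so the result is only as precise as that single point per country.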

What's the distribution of distances traveled by medaling teams? Here it is, binned into 500-mile increments.

The handful of distinct peaks is interesting at first, but after looking at the individual underlying data points, I think it's mostly an artifact of the uneven distribution of Earth's population.
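The 500-mile binning is easy to reproduce outside Tableau as well. The sketch below is one way to do it in Python; the function name is hypothetical, the distances are made-up sample values, and labeling each bin by its lower edge is my own choice rather than anything taken from the workbook.

```python
from collections import Counter


def bin_distances(distances, width=500):
    """Count distances per fixed-width bin.

    Each key is the lower edge of a bin, so with width=500 the key 1000
    covers distances from 1000 up to (but not including) 1500.
    """
    counts = Counter(int(d // width) * width for d in distances)
    return dict(sorted(counts.items()))


# Made-up distances in miles, for illustration only.
sample_distances = [0, 499.9, 500, 1250, 1499]
histogram = bin_distances(sample_distances)
```

Peaks in a histogram like this show where many medaling teams sit at roughly the same distance from the host city.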

Another question: What's the joint distribution of distances traveled by teams and the medals they were awarded?

Here, each dot represents a medal type, country, and year -- for instance, the number of silver medals won by the Soviet Union in 1972 (Munich) is a single data point. We can see those distinct clumps again -- which I've annotated in general terms at the top. It's a mixed picture, but broadly speaking it seems that teams that travel less win more. In particular there is some advantage to being the home team, as one might expect, and as China showed this year. Now, is the advantage because more athletes will travel when the distances are shorter, or because they're better rested or more familiar with the surroundings? I don't have the data to fully answer those questions yet but I plan to keep digging.

And, finally, let me note my newfound admiration for Barbara Kendall of New Zealand, the world's longest-traveling summer Olympic gold medalist. In 1992, she traveled 12,000+ miles from New Zealand to Barcelona and won the women's sailboard competition. Or put another way, she's the orange dot all the way on the right.

[NOTE: This post was updated to correct some unclear wording in the second paragraph.]


Nice workbook!

I had been trying to get that distance between 2 points thing to work a few months back, but no luck. This example will serve as a great method for replicating that logic. You should consider simplifying this for users by making this formula a canned choice in your functions in a future release.

Daniel, thank you for the feedback! That's a great feature request -- let me bring that up with the maps team.

Alex, you bring up a great point. I hadn't heard of Vincenty's formulae before -- thanks for the pointer -- but it definitely is important to account for asphericity as appropriate. Country-level geocoding also introduces some uncertainty in the method I've used.

I should also mention that the Earth-radius value I used (3960 mi) seems to have been about a half-percent too low, resulting in similar distance errors.


What about using the Shape shelf to encode home vs. away in the joint distribution dot plot (home as a filled square and away as a filled circle), or using text color or another dimension on the column shelf for your table? That way, folks would not have to reach back and remember where the Games were held in a particular year.

Peace and All Good!

Michael, thanks for the suggestions! :)

Unstated in my original post was that I was looking for some way to encode distance, meaningfully but not confusingly, along with other information. I couldn't come up with anything before "press time."

I like the idea of highlighting home vs away on the joint distribution plot -- shape is reasonable, plus a text label to identify the team/country in question.

In the table, I might use size encoding to identify home/away. Color would also work; I'd just have to give up the color encoding of medal types.


Oh, yeah. I forgot that you are already using color in the table.