My Summer Job Made Me Angry, So I Looked to the Data

By Guest Author 2016/09/06

Note: The following is a guest post by Malte Witt, a student of Tourism Management at the Munich University of Applied Sciences (MUAS). This piece kicks off our back-to-school series.

Okay, I'll admit, the headline might be a little bit too much. But really, the inspiration for this post came from my workplace.

I spent my summer days helping people book their holidays. More specifically, I work for, which is a distribution platform for high-quality, high-cost holidays.

A few of those weeks were really annoying because I had to let many customers know that their dream trips were already sold out. I had a feeling the time span between the initial inquiry and the desired departure date was getting smaller every day. Naturally, I needed to validate my feelings with hard facts.

Luckily, my boss allowed me to extract some data from our database as long as it was anonymized.

Why Do You Fail Me, Excel?

I first tried visualizing the data in Excel, which soon turned out to be a huge pain. It seems as though Excel is amazing for cleaning and shaping the data (at least amazing enough for my purposes), but not that great at visualizing the data.

Good thing I am a student. Enter: the free Tableau license for students. I applied for the free license, got a response on the same day, and am now a happy customer for life. Well done, Tableau.

With Tableau, it was an absolute breeze to visually explore the data and then create the graphs you see below. From now on, my workflow will probably always be: importing, cleaning, and shaping the data in Excel, then visualizing and exploring the data in Tableau.

Let's Validate Those Feelings

With Tableau, it was really easy to aggregate the average number of days between inquiry and departure date for each month over a couple of years. A small side note: I hate the fact that bar graphs are so good at visualizing this data set. Bar graphs are boring, but effective.
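For readers without Tableau, the same aggregation can be sketched in a few lines of Python. This is only a minimal sketch and not what I actually ran: the records are made up for illustration, and the function name average_lead_time_by_month is hypothetical. It groups inquiries by the month they were made and averages the days until the desired departure.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical, made-up records: (inquiry date, desired departure date).
# The real data is anonymized and not reproduced here.
inquiries = [
    (date(2016, 1, 10), date(2016, 7, 1)),
    (date(2016, 1, 25), date(2016, 8, 15)),
    (date(2016, 4, 3), date(2016, 4, 20)),
    (date(2016, 4, 18), date(2016, 5, 2)),
    (date(2016, 7, 5), date(2016, 7, 9)),
]


def average_lead_time_by_month(records):
    """Return {inquiry month: mean days between inquiry and departure}."""
    days_by_month = defaultdict(list)
    for inquiry_date, departure_date in records:
        # Subtracting two dates gives a timedelta; .days is the whole-day gap.
        lead_days = (departure_date - inquiry_date).days
        days_by_month[inquiry_date.month].append(lead_days)
    return {month: mean(days) for month, days in sorted(days_by_month.items())}


averages = average_lead_time_by_month(inquiries)
for month, avg in averages.items():
    print(f"month {month:2d}: {avg:.1f} days")
```

Grouping by the month of the inquiry (rather than the month of departure) is an assumption here; swapping the key to departure_date.month would give the other view.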

I could end the post here. The graph clearly shows that inquiries made between April and July are on much shorter notice than inquiries made during the winter months.

Why are the averages still so high? I did not bother to calculate the standard deviation, but the values are obviously spread over a very wide range. My guess is that for July, a very simplified data set of days between inquiry and departure would look something like 4, 4, 4, 4, 160, 160, 150.
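To see how a few long-range inquiries can pull the average up, here is a small Python sketch using only the simplified July values guessed at above. The numbers are illustrative, not real data, and the variable name july_days is made up. For these seven values the mean works out to roughly 69 days while the median is 4.

```python
from statistics import mean, median, pstdev

# The simplified, guessed July values from the text:
# days between inquiry and desired departure.
july_days = [4, 4, 4, 4, 160, 160, 150]

# The mean is dragged upward by the three long-range inquiries,
# while the median stays with the last-minute majority.
print(f"mean:   {mean(july_days):.1f}")
print(f"median: {median(july_days)}")

# Population standard deviation, treating the list as the whole data set.
print(f"stdev:  {pstdev(july_days):.1f}")
```

With a spread this wide, plotting the median next to the average would probably show the short-notice trend more clearly than the average alone.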

I was hooked and wanted to know more. How long does it take our customers on average to make the decision to book?

This validates my theory even further. In April, customers are surprised, like: "Whaaat? Easter holidays again?!" and need to book a cool trip on very short notice. During August, customers realize the weather in Germany is terrible and they become very booking-happy. I can't really explain the data in February, though. If anyone has an idea, I'd love to hear it.

One more metric that strengthens my theory is the average number of days between booking and departure.


I was so into my analysis by this point that I didn't really think about whether I needed any more graphs; my mind just demanded them. So I plotted the average trip duration by month as well as the average cost. (Unfortunately, the costs come without labels, because, you know, business secrets.)

One major thing I’m taking away from this data analysis is my new workflow with Tableau. I now have a tool at my disposal that allows me to visually understand my data. Before, I had to work through the numbers in Excel and then choose the right graph. Now I can just drag and drop, and switch things around at my leisure without worrying about anything but the view.

I hope you enjoyed my frustration-born journey into my employer’s data. As always, comments and constructive criticism are very welcome.

To learn more about Malte's data journey, check out his blog, The Sigma.


Submitted by Roedolf Smit (not verified)

Great work.

Regarding the February data: Valentine's Day? A last-minute decision to buy a trip as a gift, but the trip may be future-dated by some margin. This explains both the short time between enquiry and booking on the one hand, and the relatively long lead time between booking and travel on the other.

On the last 2 graphs: Could you note whether month is "Month of Booking" or "Month of Travel", as I think that might be an interesting distinction.

Submitted by Jana (not verified)

Re: February: My theory is that because February is the most depressing month, people decide to book a trip at this time so that they have something to look forward to, to cheer them up.

Submitted by uma r.

Very well and smoothly explained. It encouraged me to create a project of my own to explore the insights in data.

Submitted by Garrett (not verified)

I'm not sure what your demographics are like, so this might not be super relevant... but Spring Festival (Chinese New Year) often falls in February. That's supposedly the largest human migration event in recorded history :)

Submitted by Brian H (not verified)

My two cents regarding February: in January, many people are living on a pretty restricted budget due to having to pay off the costs of Christmas excess and maybe also some bills (e.g. professional subscriptions? Insurances? High energy bills?) which become payable at the start of the calendar year.
Once the 'January blues hump' is overcome, they may be in the happy position of being able to afford a 'fancy weekend away' break, so they make a quick decision to treat themselves for their personal 'financial upturn'.
I'd also add the 'Valentine's Day' effect mentioned above by Roedolf.
