Drinking from the Twitter Firehose with Python

Editor's note: As part of #DataInTheWild, we are excited to have one of Tableau's resident Python gurus, Eric Hannell, contributing this guest blog post for us.

Hi, I'm Eric and I'm a Product Consultant here at Tableau (you can find me on Twitter and LinkedIn). I really enjoy building things (like my Tableau colour palette creator app), and sometimes they even work!

I love Tableau, and I am intrigued by Twitter (you can find me there as @erichannell), so I wanted to find a way to get Twitter data into Tableau. DataSift is a fantastic option for this, but I wanted to flex my (not-yet-so-developed) Python muscles, so I decided to have a go at writing a script myself. Editor's note: if you'd like to learn Python, check out our post from last month, Learn to be a Python Charmer.

Here was my plan:

  1. Get tweets
  2. Save tweets
  3. TABLEAU!

Let's visualize those tweets!

In other words, I thought that it would be great to be able to find the tweets I wanted based on a specific word (for instance, a certain hashtag). More than that, I also wanted to get some additional information about the tweeter (like their username, location, etc.) which I could then visualize in Tableau. Something like:

I wanted to be able to do something like this – Twitter to Tableau

Luckily for me, many very smart people have worked very hard to create amazing Python modules that let you do almost anything, including connecting to Twitter. I found a module called “TwitterSearch”, which lets you connect to the Twitter API pretty seamlessly and start finding content. There are many other options (find a list of them here), but I decided to go with TwitterSearch.

Note: To get anything out of Twitter you need to supply TwitterSearch with a bunch of somewhat cryptic-sounding codes (the “consumer key”, “consumer secret”, “access token” and “access token secret”).
These are basically codes that the Twitter API uses to confirm that you are who you say you are when you go asking it for data. Getting these codes is essential. You need a Twitter account (set one up if you do not have one already) and then you need to create a “Twitter App” (less scary than it sounds). This tutorial explains how to get the tokens.

Now, with those keys, let’s look at some code:

All of the code is available on GitHub.


The first piece of TwitterSearch code

First we import the modules that we are going to use. We need TwitterSearch to search Twitter and the csv module to write the data to a CSV file.

Then we define a function that finds things on Twitter and saves them to a CSV file. Lines 9-11 create the CSV file with a few headers: user, time, tweet, latitude and longitude. Lines 13-15 set up a search on Twitter, and in lines 17-21 we supply our keys so that we can access Twitter.
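
For readers who do not have the GitHub code open, here is a rough sketch of what this setup step might look like. It is not the original script. It is written for Python 3 and assumes the TwitterSearch package's `TwitterSearchOrder` and `TwitterSearch` classes. The helper names (`create_csv`, `build_search`, `connect`) and the environment variable names are made up for illustration.

```python
import csv
import os

HEADERS = ["user", "time", "tweet", "latitude", "longitude"]


def create_csv(path):
    """Create the output file and write the header row."""
    # newline="" stops the csv module from adding blank lines on Windows.
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow(HEADERS)


def build_search(term):
    """Describe what to search for (assumes the TwitterSearch package)."""
    # Third-party import kept inside the function so the CSV helper
    # above still works on a machine without TwitterSearch installed.
    from TwitterSearch import TwitterSearchOrder

    order = TwitterSearchOrder()
    order.set_keywords([term])
    return order


def connect():
    """Hand the four keys to TwitterSearch.

    Reading the keys from environment variables is an assumption made
    here to keep secrets out of the script; the variable names are
    hypothetical.
    """
    from TwitterSearch import TwitterSearch

    return TwitterSearch(
        consumer_key=os.environ["TWITTER_CONSUMER_KEY"],
        consumer_secret=os.environ["TWITTER_CONSUMER_SECRET"],
        access_token=os.environ["TWITTER_ACCESS_TOKEN"],
        access_token_secret=os.environ["TWITTER_ACCESS_TOKEN_SECRET"],
    )
```

Keeping the keys out of the source file also means you can share the script without sharing your credentials.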

Next we need to get some tweets:


The code to get the tweets

In line 31 we are saying: for each tweet that is found, do something. Then in lines 32 to 38 we capture the name of the user, the time that they tweeted and the content of their tweet (lines 36 and 37 remove extra spaces that make the text look garbled). In lines 39-44 we capture the geo-location data if we can get it (apparently only about 3% of people share this data). Finally, on line 46, we write the user, time, tweet, latitude and longitude data to the CSV file.
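
To make that loop concrete, here is a small sketch of the same idea, again not the original code. It assumes each tweet arrives as a dictionary shaped like the Twitter API's JSON, with `user`, `created_at`, `text` and an optional GeoJSON `coordinates` entry. The function names `tweet_to_row` and `save_tweets` are hypothetical.

```python
import csv


def tweet_to_row(tweet):
    """Flatten one tweet dictionary into a row for the CSV file."""
    user = tweet["user"]["screen_name"]
    time = tweet["created_at"]
    # Collapse newlines and runs of whitespace so each tweet stays on one line.
    text = " ".join(tweet["text"].split())

    # Most tweets carry no location, in which case "coordinates" is None.
    point = tweet.get("coordinates")
    if point:
        # GeoJSON points are ordered [longitude, latitude].
        longitude, latitude = point["coordinates"]
    else:
        latitude = longitude = ""
    return [user, time, text, latitude, longitude]


def save_tweets(path, tweets):
    """Append one row per tweet to an existing CSV file."""
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for tweet in tweets:
            writer.writerow(tweet_to_row(tweet))
```

With TwitterSearch, the `tweets` argument would typically be the iterable returned by its `search_tweets_iterable` method. Any list of tweet-shaped dictionaries works too, which makes the function easy to try out without touching the API.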

Then, to kick everything off, we ask the user for a term to search Twitter for, set the maximum number of tweets to be returned, and call the function:


The magic piece of code where you put the terms you want to search for and how many tweets you want returned
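
A minimal sketch of this kick-off step, under the same caveats as before: the names `limit`, `output_path` and `main` are hypothetical, and capping the results with `itertools.islice` is one way to enforce a maximum, not necessarily the way the original script does it.

```python
from itertools import islice


def limit(tweets, max_tweets):
    """Stop after max_tweets results, however many the search could return."""
    return islice(tweets, max_tweets)


def output_path(term):
    """Name the CSV file after the search term, e.g. 'Tableau' -> 'Tableau.csv'."""
    return term + ".csv"


def main():
    term = input("Search Twitter for: ").strip()
    max_tweets = 2000
    print("Saving up to", max_tweets, "tweets to", output_path(term))
    # At this point the real script calls its search-and-save function
    # with the term and the maximum number of tweets.
```

Calling `main()` at the bottom of the script is what sets everything in motion.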

Finally, here is our output. I searched for ‘Tableau’ and set the maximum number of tweets to be returned to 2,000. The output looked like this:

The output for ‘Tableau’

These two tweets (and the other 1,998 tweets) are all saved in a file called Tableau.csv.

Then we can open the CSV in Tableau and take a look at the data (note that this is just one of many vizzes you could make!):

We can now visualize our tweets!

Hi Twitter user JNVT!

I hope this has inspired you to have a go at drinking from the Twitter firehose yourself. To fork the code I used, please visit GitHub.
