Yellow Pages Canada offers information on local businesses, products, and services. The company works with billions of rows of data: a single table collected 57 billion rows within just 25 months. The enterprise data team at Yellow Pages stores this data in a Cloudera Hadoop data lake and analyzes it in Tableau with help from AtScale, a Tableau and Cloudera partner. Today, the team analyzes live data quickly with Tableau Desktop and shares insights across the enterprise by publishing to either Tableau Server or Tableau Online.
In this video, Richard Langlois, Enterprise Data Management Director, speaks about his experience running Tableau and AtScale on top of Hadoop. Today, the team can analyze big data 10 to 20 times faster, leading to better, faster decision making.
Tableau: How big is the database that you’re working with?
Richard Langlois, Enterprise Data Management Director: In the database we're using, one table has 57 billion rows. This is only for 25 months of data, so we keep a rolling 25 months.
So if you want to do all that on Hadoop, even though something like Hadoop is fast, it might not be fast enough.
Tableau: Can you talk about your experience with AtScale and Tableau?
Richard: Actually, connecting Tableau to AtScale is like connecting to any other source. And Tableau is quite good at connecting to a lot of different sources.
The job of AtScale is to intercept the query that Tableau issues, create aggregates behind the scenes, and rewrite the SQL that Tableau generates so that the user never knows.
Now, when you're trying to do analysis, you don't want to slow people's line of thought, right? They have something they're trying to solve. So if they can get an answer back much faster, it's great.
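To make the intercept-and-rewrite idea Richard describes more concrete, here is a minimal sketch of aggregate-based query rewriting. It is not AtScale's implementation: the table names (`events`, `agg_events_month_region`), the sample rows, and the `rewrite` helper are all hypothetical, and an in-memory SQLite database stands in for the Hadoop data lake. The point is only that a query written against the raw table can be answered from a much smaller pre-aggregated table without the user changing anything.

```python
import sqlite3


def build_demo_db():
    """Create a tiny raw table plus a pre-computed aggregate (both hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (month TEXT, region TEXT, clicks INTEGER)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [
            ("2016-01", "QC", 3),
            ("2016-01", "QC", 5),
            ("2016-01", "ON", 2),
            ("2016-02", "QC", 7),
            ("2016-02", "ON", 4),
            ("2016-02", "ON", 1),
        ],
    )
    # The aggregate holds one row per (month, region) instead of one per event.
    conn.execute(
        "CREATE TABLE agg_events_month_region AS "
        "SELECT month, region, SUM(clicks) AS clicks_sum "
        "FROM events GROUP BY month, region"
    )
    return conn


def rewrite(sql):
    """Toy rewrite: point the query at the aggregate table.

    Assumption: a real engine would parse the SQL and check that the
    aggregate can actually answer it; plain string substitution is used
    here only to keep the sketch short.
    """
    return (
        sql.replace("SUM(clicks)", "SUM(clicks_sum)")
        .replace("FROM events", "FROM agg_events_month_region")
    )


# The query as the BI tool might generate it against the raw table.
ORIGINAL_SQL = "SELECT region, SUM(clicks) FROM events GROUP BY region ORDER BY region"

conn = build_demo_db()
raw_result = conn.execute(ORIGINAL_SQL).fetchall()

rewritten_sql = rewrite(ORIGINAL_SQL)
agg_result = conn.execute(rewritten_sql).fetchall()

print(rewritten_sql)
print(raw_result == agg_result)
```

Both queries return the same totals per region, but the rewritten one scans four aggregate rows instead of six raw rows. At the scale of billions of rows, that difference in rows scanned is where the speedup would come from.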