Cloudera Impala Integration: Fast Analytics for Hadoop
Last week, I blogged about the Big Data announcements we made in conjunction with the Strata-Hadoopworld conference. One additional highlight didn't make that post because the technology had not yet been announced. It is a very significant announcement for customers using Tableau on top of Cloudera's Hadoop distribution (CDH). At the conference, Cloudera announced a new Hadoop technology called Impala, a real-time query processing engine for Hadoop. Tableau was chosen as one of the first partners to integrate with Cloudera Impala, and we previewed an early version of the Impala connector at the conference.
Cloudera Impala was developed in response to one of the biggest complaints about using Hadoop (and Hive) for analytics: latency. While Hadoop was great for batch-oriented workloads that churned through massive volumes of data, it did not provide a fast, interactive experience for users doing ad hoc analytics. Even very simple queries could take 20 seconds or more because of the MapReduce overhead occurring behind the scenes.
Impala bypasses the MapReduce layer in Hadoop. In its internal tests, Cloudera has reported that Impala is anywhere from 3x to 90x faster than Hive, depending on the type of query and workload. This should provide significant performance gains over Tableau's existing Hive connectivity.
Customers interested in trying out a preview version of the Impala connector should contact their Tableau account manager for details about how to participate in the early access program. For more information about Impala, I encourage you to read Cloudera's blog post that describes the technology in more detail.
PS - many thanks to Franz Funk for creating the dashboard above that we showed at the Strata-Hadoopworld conference!
Learn more about the powerful combination of Tableau, Hadoop, and Impala
Want to learn more about how Tableau and Hadoop can work together? To find more case studies, user stories, news and info, visit our Hadoop resource page.