Update 2-20-2015: The connector for Spark SQL is now released and available for version 8.3.3 and newer.
We are thrilled to announce that Tableau is launching a new native Spark SQL connector (currently in beta), providing users an easy way to visualize their data in Apache Spark.
Spark is an open source processing engine for Big Data that brings together an impressive combination of speed, ease of use and advanced analytics. Spark enables applications in Hadoop clusters to run in-memory up to 100x faster than MapReduce, while also delivering significant speed-ups when running purely on disk. Spark SQL provides an interface for users to query their data from Spark RDDs as well as other data sources such as Hive tables, Parquet files and JSON files. Spark’s APIs in Python, Scala and Java make it easy to build parallel apps. Lastly, Spark provides strong support for streaming data and for complex analytics that rely on iterative calculations, such as machine learning and graph algorithms - this is where Spark shines brightest. Spark’s versatility has led users to call it “the Swiss Army knife” of processing engine platforms, as users can combine all of these capabilities in a single platform and workflow.
Spark is also the hottest open source big data project currently on the planet. Involvement from the open source community has grown rapidly, with over 330 contributors in the last 12 months alone. Spark is more than just hype, though. Within the last 8 months, all of the major Hadoop distributors, including Cloudera, Hortonworks and MapR, have committed to shipping Spark as part of their distributions and to helping accelerate the development of the project.
Tableau’s integration with Spark brings tremendous value to our customers by putting a fast and versatile data processing engine at their fingertips. Our integration also provides new capabilities to the Spark community - users can visually analyze their data without writing a single line of Spark SQL code. That’s a big deal because a visual interface to your data extends Spark beyond data scientists and data engineers to all business users. The Spark connector takes advantage of Tableau’s flexible connection architecture, which gives customers the option to connect live and issue interactive queries or to use Tableau’s fast in-memory database engine. Tableau also lets users blend Spark data with data from any of our other 40+ direct connectors, so they can leverage their existing data assets wherever they are.
In the beta launch, Tableau supports Spark SQL as a named connector in Tableau Desktop on both Windows and Mac. To see Tableau and Spark SQL in action, we have created a short video demonstrating how users can connect to a Spark cluster and interact with data in Tableau.
Read: To learn more about what Tableau’s integration means to Spark users and about Tableau’s recent addition to Databricks’ “Certified on Spark” program, please check out our guest post on the Databricks blog.
Respond: Do you have an interesting big data use case? We’d love to hear about it. Please reach out and let us know.