Tableau & Spark SQL: Big data just got even more supercharged
Update 2-20-2015: The connector for Spark SQL is now released and available for version 8.3.3 and newer.
Update II 4-04-2017: Learn more about Tableau for Big Data, or see other native integrations.
We are thrilled to announce that Tableau has launched a new native Spark SQL connector, providing users an easy way to visualize their data in Apache Spark.
What is Apache Spark SQL?
Spark is an open source processing engine for Big Data that brings together an impressive combination of speed, ease of use and advanced analytics. Spark enables applications in Hadoop clusters to run in-memory up to 100x faster than MapReduce, while also delivering significant speed-ups when running purely on disk. Spark SQL provides an interface for users to query their data from Spark RDDs as well as from other data sources such as Hive tables, Parquet files and JSON files. Spark’s APIs in Python, Scala and Java make it easy to build parallel apps. Lastly, Spark provides strong support for streaming data and for complex analytics that rely on iterative calculations, such as machine learning and graph algorithms, which is where Spark shines brightest. Spark’s versatility has led users to call it “the Swiss Army knife” of processing engines, since users can combine all of these capabilities in a single platform and workflow.
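To make the Spark SQL part concrete, here is a minimal sketch of querying a JSON file with SQL from Python. It is an illustration under stated assumptions, not part of the Tableau connector: it assumes the pyspark package and a local Spark runtime are installed, it uses the SparkSession API from later Spark releases (2.0 and newer) rather than the API available when this post was written, and the file name, the `sales` view, and the `region` and `amount` fields are all hypothetical.

```python
import json
import os
import tempfile


def write_sample_json(directory=None):
    """Write a few hypothetical sales records as JSON Lines and return the file path."""
    directory = directory or tempfile.mkdtemp()
    path = os.path.join(directory, "sales.json")
    rows = [
        {"region": "East", "amount": 120.0},
        {"region": "West", "amount": 80.0},
        {"region": "East", "amount": 30.0},
    ]
    with open(path, "w") as f:
        for row in rows:
            # Spark's JSON reader expects one JSON object per line by default.
            f.write(json.dumps(row) + "\n")
    return path


def totals_by_region(json_path):
    """Load a JSON file into Spark, query it with SQL, and return {region: total}."""
    # Imported lazily so the helper above works without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")          # assumption: run locally on a single core
        .appName("spark-sql-sketch")
        .getOrCreate()
    )
    try:
        df = spark.read.json(json_path)          # schema is inferred from the data
        df.createOrReplaceTempView("sales")      # expose the DataFrame to SQL
        result = spark.sql(
            "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
        )
        return {row["region"]: row["total"] for row in result.collect()}
    finally:
        spark.stop()
```

Calling `totals_by_region(write_sample_json())` on a machine with Spark available runs the aggregation through Spark SQL. The same `spark.sql` call works unchanged if the view is backed by a Parquet file (`spark.read.parquet`) or a Hive table instead of JSON.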
Spark is also the hottest open source big data project currently on the planet. The level of involvement from the open source community has grown rapidly over the last year with over 330 contributors in the last 12 months alone. Spark is more than just hype though. Within the last 8 months, all of the major Hadoop distributors, including Cloudera, Hortonworks and MapR, have committed to ship Spark as a part of their distribution as well as help accelerate the development of the project.
What value does Tableau's integration with Spark SQL provide?
Tableau’s integration with Spark brings tremendous value to our customers by providing a fast and versatile data processing engine at their fingertips. Our integration also provides new capabilities to the Spark community - users can visually analyze their data without writing a single line of Spark SQL code. That’s a big deal because creating a visual interface to your data expands the Spark technology beyond data scientists and data engineers to all business users. The Spark connector takes advantage of Tableau’s flexible connection architecture that gives customers the option to connect live and issue interactive queries, or use Tableau’s fast in-memory database engine. Tableau also provides users the capability to blend Spark data with data from any of our other 40+ direct data connectors, empowering users to leverage their existing data assets wherever they are.
Installation Guide for Spark + Tableau
Ready to connect Spark and Tableau? We've got a simple guide all set up to help you out.
Read the Tableau Spark Connector guide
Read: To learn more about what Tableau’s integration means to Spark users and Tableau’s recent addition to Databricks’ “Certified on Spark” program, please check out our guest post on the Databricks blog.
Respond: Do you have an interesting big data use case? We’d love to hear about it. Please reach out and let us know.