We’re introducing a new performance feature for some extracts in the newest Tableau beta (2018.3)—multiple table storage. When creating extracts, you can now select a new “multiple tables” storage option. When selected, the individual database tables will be stored in the .hyper extract file separately, mirroring the database structure. In certain cases, this will result in smaller extract file sizes, faster extract creation, and potentially faster queries.
This feature changes how data is stored in the extract file behind the scenes. Previously, extracts always joined all data tables together to produce one output data table, regardless of the number of tables in the data tab. Now, with the new "multiple tables" option, users can choose to store each table independently in the extract file. There is still only a single extract file produced and stored on disk—the tables comprising it are stored separately within the single file.
Single table storage (the default storage type) and multiple table storage each have better file size and performance characteristics in different scenarios, so we allow you to choose.
The storage type affects file size because certain types of joins cause data storage redundancy. Previously, Tableau stored the result of the join, so it would store all the redundant data, often resulting in large extract files. If the number of rows after your join is larger than the sum of the rows in your input tables, then your data source is a great candidate for multiple table storage. Joins that are likely to cause data storage redundancy include joins between fact tables and entitlement tables in some row-level security scenarios.
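The row-count rule of thumb above can be checked before you build an extract. The following is a minimal sketch, not part of Tableau: the function names and the sample key values are hypothetical, and it assumes an inner join on a single key column. It counts how many rows the join would produce and compares that with the sum of the input table rows.

```python
from collections import Counter


def inner_join_row_count(left_keys, right_keys):
    """Rows produced by an inner equi-join, given each table's join-key column."""
    left = Counter(left_keys)
    right = Counter(right_keys)
    # Each key contributes (occurrences on the left) * (occurrences on the right).
    return sum(n * right[key] for key, n in left.items())


def suggest_storage(left_keys, right_keys):
    """Apply the rule of thumb: joined rows > sum of input rows favors multiple tables."""
    joined = inner_join_row_count(left_keys, right_keys)
    separate = len(left_keys) + len(right_keys)
    return "multiple tables" if joined > separate else "single table"


# Hypothetical join-key columns: 6 rows in one table, 5 in the other.
table_a = ["Chairs"] * 4 + ["Tables"] * 2
table_b = ["Chairs"] * 3 + ["Tables"] * 2

print(inner_join_row_count(table_a, table_b))  # 4*3 + 2*2 = 16 joined rows
print(suggest_storage(table_a, table_b))       # 16 > 11, so "multiple tables"
```

This only compares row counts. Actual file size also depends on column widths and compression, so treat the result as a hint about which option to test first.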
In addition to file size, the storage type can affect extract creation speed and visualization query speed. With single table storage, your source database performs the join during extract creation. With multiple table storage, Tableau Desktop performs the join inside Tableau’s data engine at visualization query time instead. As a result, multiple table storage extracts may be created faster, because creating them only requires copying the individual tables, with no join. On the other hand, they might be slower at query time because the join happens then.
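To make the "join at creation time" versus "join at query time" trade-off concrete, here is a small analogy using Python's built-in sqlite3 module. It is not how the .hyper format or Tableau's data engine works internally. The table names, columns, and rows are all hypothetical. It shows that both layouts answer the same question, but one pays for the join once up front and the other pays for it on every query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, sub_category TEXT, sales REAL);
CREATE TABLE managers (sub_category TEXT, manager TEXT);
INSERT INTO orders VALUES (1, 'Chairs', 100.0), (2, 'Chairs', 50.0), (3, 'Tables', 200.0);
INSERT INTO managers VALUES ('Chairs', 'Ann'), ('Chairs', 'Bob'), ('Tables', 'Cy');

-- "Single table" analogy: run the join once at creation time and store the result.
CREATE TABLE extract_single AS
    SELECT o.order_id, o.sub_category, o.sales, m.manager
    FROM orders AS o
    JOIN managers AS m ON o.sub_category = m.sub_category;
""")

# Query against the pre-joined table: no join needed at query time.
single_rows = conn.execute(
    "SELECT manager, SUM(sales) FROM extract_single "
    "GROUP BY manager ORDER BY manager"
).fetchall()

# "Multiple tables" analogy: the tables stay separate, so the query performs the join.
multi_rows = conn.execute(
    "SELECT m.manager, SUM(o.sales) "
    "FROM orders AS o JOIN managers AS m ON o.sub_category = m.sub_category "
    "GROUP BY m.manager ORDER BY m.manager"
).fetchall()

print(single_rows == multi_rows)  # True: same answer, different place to pay for the join
conn.close()
```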
These performance differences are more noticeable with large amounts of data. If you are working with a large data set, you'll want to experiment with both techniques to determine which gives you the best performance and size benefits.
If you can’t decide which to use, stick with the default, single table storage, because multiple table storage has some functional limitations—including no incremental refresh and no extract filters. We plan to address these limitations in future releases.
Here's an example where multiple table storage results in substantial performance improvements.
My data is an Excel file with two tables:
I join the tables via an inner join on "Sub-Category" and measure the results.
The multiple table storage extract performs substantially better because it prevents data storage redundancy at extract creation time.
Here's an example where multiple table storage degrades performance, so single table storage is the better choice.
My data is an Excel file with two tables:
I join the tables via an inner join on "Product Name" and measure the results.
In this scenario, the inner join on "Product Name" does an implicit filter on the Superstore table. So, all the non-bookcase orders are omitted from the single table extract file, resulting in a smaller file size.
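The implicit filter can be illustrated with a few lines of Python. This is a sketch with made-up rows, not the actual Superstore data: the product names, the second table's contents, and the variable names are assumptions for illustration only.

```python
# Hypothetical order rows: (product_name, sales)
orders = [
    ("Bookcase A", 120.0),
    ("Bookcase B", 80.0),
    ("Desk Lamp", 25.0),
    ("Office Chair", 210.0),
    ("Bookcase A", 95.0),
]
# Hypothetical second table that only lists bookcase products: (product_name, material)
bookcases = [("Bookcase A", "Oak"), ("Bookcase B", "Pine")]

# Inner join on product name: orders with no match in `bookcases` are dropped.
joined = [o + b[1:] for o in orders for b in bookcases if o[0] == b[0]]

single_table_rows = len(joined)                    # rows stored by single table storage
multiple_table_rows = len(orders) + len(bookcases)  # rows stored by multiple table storage

print(single_table_rows)    # 3: only the bookcase orders survive the join
print(multiple_table_rows)  # 7: every input row is kept, matched or not
```

Because the join discards the non-matching orders, the joined result has fewer rows than the inputs combined, which is the opposite of the redundancy case where multiple table storage helps.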
Note: All time measurements were taken on the same computer: Windows 10, 32 GB RAM, 16 cores across dual processors (2 x Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz, 2401 MHz, 8 cores and 16 logical processors each).
One important use case for multiple table storage is row-level security scenarios where entitlement tables are joined and extracted with your data. Previously, joining an entitlement table and a fact table into a single table would result in too many rows for extract creation to be feasible. Now, these tables can be stored separately in the extract file—before the join—so the extract creation is much faster.
For more information about row-level security and how multiple table storage can be leveraged, refer to Bryant Howell’s blog post.
We are working hard to get Tableau 2018.3 out the door, and we would appreciate your feedback to help us catch any issues and ensure the highest quality for new features.
Join the Tableau pre-release community to:
Some of the functionality described above may not yet be available in the beta program; some features will be added in the coming weeks. The beta program is available to existing Tableau customers. Customers with an active maintenance license can upgrade for free when Tableau 2018.3 is released.