Tableau Metadata Model, Version 6.1

Overview | What you will learn:

This whitepaper provides a detailed view into the metadata approach encompassed in Tableau’s products. It answers questions about how Tableau effectively meets the needs of the IT department while equipping the business with answers they need today.

We've also pulled out the first several pages of the whitepaper for you to read. Download the PDF on the right to read the rest.

Most Business Intelligence platforms fall into one of two metadata camps: either model the entire enterprise as a first step, or model nothing. Traditional Business Intelligence solutions require highly skilled individuals with expert knowledge of the business, the data and the platform to model the metadata as part of an extensive exercise. The start-up cost and the change cost are high, but the enterprise gains control and structure. Many ‘next generation’ Business Intelligence Platforms claim that one of their innovations is to do away with metadata. This is a great solution for quickly getting started with simple data, but rapidly runs into problems when the platform usage spreads beyond a small team and into enterprise scale deployments. The start-up costs are low, but control and structure are sacrificed.

Tableau has taken a hybrid approach that satisfies the requirement for rapid analysis with true ad hoc reporting and also fulfills the needs for broader metadata management. The metadata system in Tableau has been thoughtfully designed so that IT can add value by providing a rich metadata layer – yet business people are empowered to modify and extend it. Tableau has been so successful in making metadata seamless, approachable and transparent, that customers often believe Tableau does not have a metadata layer.

Tableau has developed products with the following philosophical points about metadata:

  • Metadata in existing systems should be leveraged when beneficial
  • Analysis should be possible without a metadata modeling exercise
  • Metadata is a useful abstraction, but should not be constraining
  • Metadata are defaults; they can be changed at runtime
  • Business users don’t need to understand metadata to be successful

With these points in mind, Tableau has created a simple, elegant and powerful metadata system known as the Data Source. It is a three-tier system with two layers of abstraction and a run-time model (the VizQL Model). Again, a user can become an expert in Tableau and successfully deploy and operate a complete Business Intelligence platform with no knowledge of how Tableau handles metadata. However, individuals familiar with the concepts of metadata will have a better appreciation for Tableau’s approach to metadata after reading this document.

Tableau Metadata in Action

Despite the richness of the metadata model and the power it offers, using Tableau is very simple. An example is the best way to see this. Consider a user who is new to Tableau and wants to understand some data in their corporate data warehouse.

First, the user connects to the database by entering the appropriate information: the server name, their credentials and the database and tables to access.

The user then instantly sees a list of all available Dimensions and Measures (also known as fields or variables). By double-clicking on Sales and Order Date, the user creates a result. That’s it. Start to finish in a couple of seconds.

Tableau automatically categorized the fields and assigned default roles for them. When the user added the Sales measure, Tableau automatically applied an aggregation. When Order Date was added, Tableau automatically built a date hierarchy and summarized the Sales by Year. Additionally, the patented Show Me! technology automatically chose a line chart as the best-practice method for visualizing a trend over time.
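The summarization step described above (a default aggregation on Sales, rolled up to the Year level of Order Date) can be sketched in a few lines of Python. This is only an illustration of the idea, not Tableau's implementation: the function name `sum_by_year`, the choice of SUM as the default aggregation, and the sample rows are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date


def sum_by_year(rows, date_field, measure_field):
    """Aggregate a measure with SUM at the Year level of a date field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[date_field].year] += row[measure_field]
    # Sort by year so the result reads as a trend over time.
    return dict(sorted(totals.items()))


# Made-up sample rows, used only to exercise the function.
rows = [
    {"Order Date": date(2011, 3, 14), "Sales": 100.0},
    {"Order Date": date(2011, 9, 2), "Sales": 50.0},
    {"Order Date": date(2012, 1, 20), "Sales": 75.0},
]
sales_by_year = sum_by_year(rows, "Order Date", "Sales")
```

The yearly totals are the kind of series a line chart would then plot as a trend over time.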

The Metadata Model


The first layer of abstraction is the Connection. The Connection stores information about how to access the data and what data to make available to Tableau. It includes the connection attributes for the database, the tables, views and columns to use, and any joins or custom SQL used to access the data. Note that the Connection is not a copy of the data or an import of the data. As such, it is not useful to define filters, calculations, aggregations or any other data modeling at this layer. The Connection defines the scope of data available to Tableau, and this is the layer where database-level security is applied. Since it is an abstraction layer, changes can be made at any time.
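To make the contents of the Connection layer concrete, here is a minimal sketch of it as a plain data structure. The class and attribute names (`Connection`, `Join`, `available_columns`) and the server name are hypothetical and chosen for illustration; they do not reflect Tableau's internal format. The point is that the Connection holds access details and scope only, with no filters, calculations or copies of the data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Join:
    left_table: str
    right_table: str
    condition: str  # e.g. "orders.customer_id = customers.id"


@dataclass
class Connection:
    """Hypothetical Connection layer: how to reach the data and what to expose."""
    server: str
    database: str
    tables: Dict[str, List[str]] = field(default_factory=dict)  # table -> columns
    joins: List[Join] = field(default_factory=list)
    custom_sql: Optional[str] = None

    def add_table(self, name: str, columns: List[str]) -> None:
        # Because this is an abstraction layer, scope can be widened at any time.
        self.tables[name] = list(columns)

    def available_columns(self) -> List[str]:
        # The scope of data visible to the layers above.
        return [f"{t}.{c}" for t, cols in self.tables.items() for c in cols]


conn = Connection(server="dw.example.com", database="sales")
conn.add_table("orders", ["order_date", "sales", "customer_id"])
conn.add_table("customers", ["id", "region"])
conn.joins.append(Join("orders", "customers", "orders.customer_id = customers.id"))
```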

Data Model

The second layer of abstraction, the Data Model, is the meat of the data source. Tableau automatically characterizes fields as Dimensions or Measures. When connecting to cubes, this is simply read from the robust metadata in the cube. When connecting to relational data, Tableau chooses whether a field is a dimension or measure based on intelligent heuristics that leverage the database’s metadata. The Data Model is the repository for additional metadata such as data types, roles, defaults, aliases and more. Additionally, it is the repository for user-generated fields such as calculations, sets, groups, etc.
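The whitepaper does not spell out the heuristics used for relational data, so the following is only a guess at the general shape of such a rule, not Tableau's actual logic. It assumes that numeric columns become measures with a default SUM aggregation, that columns whose names look like identifiers stay dimensions, and that everything else is a dimension. The names `Field`, `classify` and `NUMERIC_TYPES` are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed set of database types treated as numeric.
NUMERIC_TYPES = {"integer", "float", "decimal"}


@dataclass
class Field:
    name: str
    data_type: str
    role: str                           # "dimension" or "measure"
    default_aggregation: Optional[str]  # a default that can be changed later


def classify(name: str, data_type: str) -> Field:
    """Assign a default role from database metadata (illustrative rule only)."""
    looks_like_key = name.lower().endswith("id")
    if data_type in NUMERIC_TYPES and not looks_like_key:
        return Field(name, data_type, "measure", "SUM")
    return Field(name, data_type, "dimension", None)


sales = classify("Sales", "float")
order_date = classify("Order Date", "date")
customer_id = classify("Customer ID", "integer")
```

Because the result is a default rather than a constraint, a user could still switch a field's role or aggregation afterwards.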

The Data Model is independent of the Connection. This means that it is insensitive to changes to the Connection and that its components are reusable in unrelated Connections. This abstraction enables the Connection to automatically adapt to changes in the underlying data structure without requiring manual changes. The process is so elegant that it has eliminated the need for a whole set of features and tools that legacy BI vendors have often called ‘harvesting’ or ‘change management tools’. These time savings and this risk mitigation are passed on to Tableau customers as a natural part of using Tableau.

Tableau further leverages this data model abstraction by allowing the Connection and Data Model to be defined organically. As authors forage for data, explore and analyze the data, and refine and ask more questions, the Data Model can grow and adapt as needed, for example by renaming fields, adding definitions or creating new calculations. Therefore, there is no need to perform any metadata modeling exercises prior to getting started with your analysis or project.

For example, database tables can be added to the Connection at any time and the Data Model will automatically adapt by making the new fields immediately available for analysis. If a field is removed from the database or renamed, it can simply be remapped to a substitute column or automatically removed from all sheets where it is used.
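The adaptation described above can be sketched as a small synchronization routine between the columns exposed by the Connection and the fields held in the Data Model. This is a simplified illustration under assumed names (`DataModel`, `sync`, `remap`), not Tableau's mechanism: new columns become fields automatically, and fields whose column has disappeared are reported so they can be remapped to a substitute column.

```python
class DataModel:
    """Hypothetical Data Model: maps field names to source columns."""

    def __init__(self):
        self.fields = {}  # field name -> source column

    def sync(self, connection_columns):
        """Add fields for new columns; return fields whose column is gone."""
        for col in connection_columns:
            if col not in self.fields.values():
                self.fields.setdefault(col, col)
        return sorted(
            name for name, col in self.fields.items()
            if col not in connection_columns
        )

    def remap(self, field_name, new_column):
        """Point an existing field at a substitute column."""
        # Drop the auto-created field for the substitute column, if any,
        # so the column is not represented twice.
        if (new_column != field_name
                and self.fields.get(new_column) == new_column):
            del self.fields[new_column]
        self.fields[field_name] = new_column


model = DataModel()
model.sync(["sales", "order_date"])
model.sync(["sales", "order_date", "region"])          # a table or column was added
missing = model.sync(["revenue", "order_date", "region"])  # "sales" was renamed
model.remap("sales", "revenue")
```

After the remap, anything built on the "sales" field keeps working against the renamed column.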

For relational databases, the Data Model also has two connection states. It can use the Connection to dynamically access a live database, or it can use an extract (an import) of the data in the Connection. This means that a business user can choose at any time whether to use a live connection to the data or to use a local cache of data extracted to Tableau’s Fast Data Engine. It is simply a state that can be toggled at any time.

For example, a user has built a workbook using a live connection to a data warehouse. Recently, they have decided to take advantage of Tableau’s Fast Data Engine. The user simply chooses the option to create an extract, and the entire data source and all the sheets that use it are now pointed to the extracted data that sits in memory on the user’s machine. At any time, the user can choose to toggle back to the live data or even refresh their snapshot of the data.
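The live/extract toggle can be pictured as a single piece of state on the data source, which is why switching does not require rebuilding any sheets. The sketch below is a toy model with hypothetical names (`DataSource`, `create_extract`, `refresh_extract`, `use_live`, `use_extract`); a Python list stands in for the database, and nothing here reflects how the Data Engine actually stores data.

```python
class DataSource:
    """Toy data source that can read live data or a local snapshot."""

    def __init__(self, fetch_live):
        self._fetch_live = fetch_live  # callable returning rows from the database
        self._extract = None
        self.mode = "live"

    def create_extract(self):
        # Snapshot the data and point everything at the snapshot.
        self._extract = list(self._fetch_live())
        self.mode = "extract"

    def refresh_extract(self):
        # Re-snapshot without changing the current mode.
        self._extract = list(self._fetch_live())

    def use_live(self):
        self.mode = "live"

    def use_extract(self):
        if self._extract is None:
            raise ValueError("no extract has been created yet")
        self.mode = "extract"

    def rows(self):
        # Sheets call rows() and never need to know which mode is active.
        if self.mode == "extract":
            return self._extract
        return list(self._fetch_live())


warehouse = [{"sales": 10}]                 # stand-in for a live database
source = DataSource(lambda: warehouse)
source.create_extract()
warehouse.append({"sales": 5})              # the live data changes afterwards
rows_from_extract = len(source.rows())
source.use_live()
rows_from_live = len(source.rows())
source.refresh_extract()
source.use_extract()
rows_after_refresh = len(source.rows())
```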

Want to read more? Download the rest of the whitepaper!
