How does Tableau connect to enterprise data warehouses?
Tableau connects directly to the leading EDW databases, including Teradata, Oracle, SQL Server, Netezza, Vertica, Postgres, Essbase and MySQL.
Do I need to do anything special to my database for Tableau to access it?
No. Tableau will access any data that reside in the database “as is.” There is no need to create special schemas, special metadata, or alter your database in any other way.
We have hundreds of tables in our EDW. How do business users create the right connections?
Tableau behaves like all good applications and provides security clearance before it is granted access to any tables in your database. Getting “connected” to an enterprise data warehouse is as straight forward as providing an active User Name and Password at appropriate Tableau prompts. User Name and Password then grant access to appropriate tables. In fact, if your EDW is like most deployments, you may allow users to connect to tables directly, but often you prefer to give them views that are optimized to their understanding of the data or to join indexes. Tableau can use any or all of those connection means and even can connect to SQL statements to each of those. Those connections can be saved and published for easy access by others so that users can rapidly see their data the way that you, the DBA, want them to see it.
Without a semantic layer, how does Tableau differentiate dimensions and measures?
During its initial read of the database data dictionary, Tableau applies certain criteria to determine if data should be categorized as either dimensions or measures. This initial categorization is presented to the Tableau user and can be changed as required. For example, it is a simple “drag and drop” move for a Tableau user to re-categorize data between dimensions and measures. Once a Tableau connection is saved, the definitions of these dimensions and measures are stored with the connection and can be shared across the Tableau community.
How do business users decipher the cryptic column names that we use?
For users connecting directly to data, initial “deciphering” may require a consultation with a DBA or at least with the data dictionary. Either way, if a user is not sure of the business meaning of column names as presented in the Tableau data display, they should consult one internal sources.
Alternatively, IT management staff may choose to create a Tableau “data connection”. Within this connection, they can literally specify connections to the right tables, views and indexes and specify how the metadata should be presented. This could include connecting to SQL statements, renaming columns, hiding columns, defining roll-up groups, writing custom calculations and aliasing data stubs or content. The metadata can be customized without changing the underlying data; there’s always a traceable path as to the source data elements.
Thus, with regard to understanding the meaning of the display of terms and names on Tableau results, Tableau provides the capability to display aliases for database terms. These aliases appear only on Tableau results.
Some tables have dozens of unusable columns. How do business users manage around this?
One of the great advantages of Tableau is that it does not presume to know the business value of data in a data warehouse. Consequently, within the security policies of a business, Tableau makes any and all data available to a Tableau user who has been granted access to the data. Not all data are relevant. However, decisions with regard to relevance are made by the Tableau user. Part of the definition of a Tableau data connection allows for hiding unneeded fields. An individual user may show/hide the unneeded fields, or the data steward creating the shared connections can hide the unneeded columns. And of course, if the data steward is creating database views for users to access, only the columns defined in the views will be displayed.
I have some hand-optimized SQL that defines the set of data I want to work with. Can Tableau use this SQL?
Yes. When initially establishing a connection to a database, there is a choice to either browse the list of tables (and views) available on the database to work with, or there is the option to provide a custom SQL statement to use as the starting point for your analysis. This custom SQL statement is useful when performing tasks, such as unions or working with highly optimized SQL statements to define a subset of data for analysis.
Can Tableau work with views?
Yes. Creating database views is a very common best practice to help shield the complexity of the database from the business users.
How big can my data be? Are there any size limits when using Tableau?
There is no size limit to the databases that Tableau can access. Tableau’s design is such that it doesn’t take data out of the database to host it somewhere else. It accesses data directly from the database.
Tableau is a visual analysis application, so the individual visualizations, reports and dashboards that are created in Tableau generally retrieve a relatively smaller number of rows from the database, automatically building the queries to have the database aggregate and filter the data before returning it to Tableau. As such, the data volume returned to Tableau is generally modest compared to the terabytes that are stored in the database.
How does Tableau prevent queries that would cripple the data warehouse?
Because Tableau is a good citizen, it will take full advantage of all database services, both those provided by the technology and those provided by DBAs. There is no substitute for performance testing and ongoing collection of statistics and monitoring to ensure the behavior and performance of any application, and Tableau applications are no exception. However, one of the great advantages of Tableau is the natural restrictions it places on the selection of data simply because of the human constraints of the viewer. Tableau only selects data required to populate the visual in front of the viewer.
Generally a Tableau visualization includes only a few Dimensions and Measures simply because the viewer is using visuals to reduce the complexity of the data in the EDW in the first place. Tableau does not ask for all of the members of a Cube to be populated so that it has all of the data that might be required when a Cube is “sliced and diced”. Tableau only needs to request the data that are required of the visual in front of the viewer at any moment in time view is constrained the number of dimensions the viewer can visually comprehend. So between good DBA practices and the natural constraints of visualizing data at any moment in time, Tableau’s behavior with large databases is generally exceptional.
Is Tableau secure? What data can end users access?
If the database requires a username and password to establish a connection, then Tableau will require the same. Typically an individual user is given grant authority to only a subset of the tables and views in the database, which is how DBAs control the data that can be accessed.
Does Tableau Server also require individual user accounts?
I would like to shield those users from having to login again.
Tableau Server has the option for the original author of a set of analysis to specify the username and password that the database which should always be used for that workbook. This is like an ‘application password’, where anyone accessing that workbook will be automatically connected to the database as a pre-defined username. This capability is optional and may be restricted by the Tableau Server administrator.
Does Tableau support data level security (row and column)?
Yes. Any data level restrictions defined by the DBA as part of view definitions is automatically inherited by Tableau. When using Tableau Server in conjunction with database, additional row level data constraints may be specified to subset the data retrieved based on the individual viewing the content.
Can I work disconnected from my database? Or do I need to be online all the time?
Yes. Tableau has the concept of an Extract File to allow working offline from the database. The extract file allows you to take a snapshot of some data from the original database and hold it locally on your PC. This allows you to work disconnected from the database. Once you complete your offline work, simply select a menu item to return your workbook to querying the live database, and it will automatically not use the data snapshot any longer.
I have a large number of tables in my database. Do I have to include them all when I make a connection in Tableau?
No. Tableau encourages working only with the set of tables that is relevant to the current topic being analyzed. This is generally called a subject-oriented perspective. The basic principle is to include only the minimum set of tables in the active connection to provide the data for the current subject. For example, if you are doing some retention analysis from an HR database, there is no reason to have your connection definition include tables from a finance or customer database. By taking a subject-oriented view, the business analyst has just the measures and dimensions applicable to their questions at hand. This simple concept is a key best practice in making the data warehouse useful to business analysts. Data stewards and DBAs can help by creating Tableau connection definitions (called TDS files) that are subject oriented and sharing them with the rest of the Tableau community.
Does Tableau require me to have a username?
The desktop version of Tableau does require each individual to have a username and password for each database accessed. In addition, Tableau has capabilities to enable secure means of sharing data.