New in Tableau Catalog: Improved search and monitoring for data quality warnings
Data discovery and trust have been core principles of Tableau Catalog (part of Tableau Data Management) since its introduction with Tableau 2019.3. With every release, we continue to add features that help users find and use trusted data with confidence.
Tableau Catalog is unique in that users can search for data in Tableau and jump into analysis right away, with no need to switch between tools and accounts. In 2021.1, we focused on improving search relevance so people can find the data they need and get to analysis faster. Because data discovery is an important part of the cataloging experience, we want you to get the most relevant search results when looking for databases and tables in Tableau Server or Tableau Online.
Our customers love data quality warnings, so we’ve also added a new feature based on a popular request! Starting with Tableau 2021.1, you can configure monitoring for two events: extract refresh failures on data sources and flow run failures. When a refresh or flow run fails, a quality warning is automatically generated, visible on the asset itself and on downstream content items like workbooks and dashboards.
Find the right tables and databases faster
Searching for data isn’t trivial. When you know the exact name of a table or a database, it simplifies the connection process. But what if there are tens or hundreds of tables with similar names? How will users know which is the right one to use?
Tableau Catalog lets users search by asset name and description (across databases, files and tables), as well as search for tables by column names and column descriptions. With the new improvements, the results will be ranked higher if the search term matches the asset’s name—the closer the match, the higher the ranking. Matching columns will still show up in the search results, but will be ranked lower than matched table names. This way, the results aren’t overwhelming, but people can still find the right table even with just some keywords.
Tableau Catalog offers trust indicators—like certification badges and data quality warnings—to help others find the most relevant results. Certified assets will be ranked higher than non-certified assets with similar names; assets with data quality warnings will be ranked lower. This means as a data steward, you can feel confident that people are finding the data they should be using for analysis.
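The ranking rules described above can be illustrated with a small Python sketch. This is not Tableau's actual search algorithm; the `Asset` class, the scoring weights, and the asset names are all hypothetical, chosen only to reproduce the stated ordering: name matches above column matches, closer name matches higher, certified assets boosted, and assets with warnings pushed down.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Asset:
    """Hypothetical stand-in for a database or table in the catalog."""
    name: str
    columns: List[str] = field(default_factory=list)
    certified: bool = False
    has_warning: bool = False


def name_closeness(term: str, name: str) -> float:
    """Fraction of the asset name covered by the term (1.0 = exact match)."""
    term, name = term.lower(), name.lower()
    if not term or term not in name:
        return 0.0
    return len(term) / len(name)


def score(term: str, asset: Asset) -> Optional[float]:
    """Return a ranking score, or None if the asset does not match at all."""
    closeness = name_closeness(term, asset.name)
    if closeness > 0:
        base = 1.0 + closeness  # name matches always outrank column matches
    elif any(term.lower() in col.lower() for col in asset.columns):
        base = 0.5              # column-only match
    else:
        return None
    if asset.certified:
        base += 0.2             # assumed boost for certified assets
    if asset.has_warning:
        base -= 0.2             # assumed penalty for data quality warnings
    return base


def search(term: str, assets: List[Asset]) -> List[str]:
    """Return matching asset names, best match first."""
    hits = [(score(term, a), a) for a in assets]
    hits = [(s, a) for s, a in hits if s is not None]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [a.name for _, a in hits]


catalog = [
    Asset("sales", columns=["test_flag"]),
    Asset("test_old", has_warning=True),
    Asset("test_a"),
    Asset("test_v2", certified=True),
    Asset("orders", columns=["order_id"]),
]
results = search("test", catalog)
print(results)  # ['test_v2', 'test_a', 'test_old', 'sales']
```

In this sketch the certified asset comes first even though another name is a slightly closer match, the asset with a warning drops below its uncertified peers, and the table that matches only on a column name appears last.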
In the example below, one of the test databases used for new features validation has been certified and is recommended for use. After searching for “test”, the certified database is at the top of the search results. This way, you don’t need to guess which asset is the correct one to use—you can get right into analysis after a quick and easy search.
Automate communicating problems with your data
People who work with data understand that data refreshes can fail. There can be various reasons, and it’s impossible to prevent failures completely. With data quality warnings, users who are aware of the problems around data sets can communicate issues by setting warnings on databases, tables, published data sources, or flows.
Until 2021.1, these warnings had to be set manually or through REST APIs. It’s now possible to automate communications around failed extracts or flows through the UI. If you’re running flows during off-hours, your early bird analysts can be informed of the status of the data without requiring manual action from data owners or stewards. With monitoring for data quality warnings, everyone can have more confidence that the data can be trusted.
Let’s look at published data sources first. You can set up extract refresh monitoring on a data source so that if the extract refresh fails, a warning automatically appears on the data source and downstream content items. If the next refresh succeeds, the warning is cleared. These automated warnings are set in addition to any manual data quality warnings the data source already has. The warning can also be cleared by disabling the monitoring.
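To make the lifecycle of a monitoring warning concrete, here is a minimal Python model of the behavior described above. It is an illustration only, not Tableau's implementation: the `DataSource` class, its method names, and the warning text are hypothetical. It assumes that a failed refresh sets one automated warning, a later success clears it, disabling monitoring clears it, and manual warnings are never touched.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataSource:
    """Hypothetical model of a published data source with warnings."""
    name: str
    monitoring_enabled: bool = False
    manual_warnings: List[str] = field(default_factory=list)
    auto_warning: Optional[str] = None

    def record_refresh(self, succeeded: bool) -> None:
        """Update the automated warning after an extract refresh attempt."""
        if not self.monitoring_enabled:
            return  # without monitoring, refresh results change nothing
        self.auto_warning = None if succeeded else "Extract refresh failed"

    def disable_monitoring(self) -> None:
        """Turn monitoring off and drop any automated warning."""
        self.monitoring_enabled = False
        self.auto_warning = None

    def active_warnings(self) -> List[str]:
        """Manual warnings plus the automated one, if present."""
        auto = [self.auto_warning] if self.auto_warning else []
        return self.manual_warnings + auto


ds = DataSource(
    "Seattle Crime",
    monitoring_enabled=True,
    manual_warnings=["Under maintenance"],  # example manual warning
)

ds.record_refresh(succeeded=False)
after_failure = ds.active_warnings()

ds.record_refresh(succeeded=True)
after_success = ds.active_warnings()

ds.record_refresh(succeeded=False)
ds.disable_monitoring()
after_disable = ds.active_warnings()

print(after_failure)  # ['Under maintenance', 'Extract refresh failed']
print(after_success)  # ['Under maintenance']
print(after_disable)  # ['Under maintenance']
```

Keeping the automated warning in its own field, separate from the manual list, is what lets a successful refresh clear it without affecting warnings that a data steward set by hand.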
In the example below, extract refresh monitoring is set on the published data source Seattle Crime, which refreshes on a weekly cadence. To set this up, open the actions menu and select “Quality Warning”, then “Extract Refresh Monitoring”.
If the extract refresh fails (for example, if the database is unreachable), the warning will automatically appear on the data source. Similar to other types of data quality warnings, it will also be visible to anyone working with dashboards or workbooks that rely on this data source. Thanks to this automation, people can easily see if the data they need is as fresh as they expect—no more unnecessary questions and panic.
For organizations using Prep Conductor, you can also leverage prep flow failure monitoring. Just like with extracts, monitoring can be set on prep flows to inform people using downstream content if a flow has failed. These warnings are automatically created on the flows and visible everywhere in the downstream lineage.
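The idea of a warning being visible everywhere in the downstream lineage can be sketched as a graph traversal. The example below is a simplified illustration, not how Tableau Catalog stores or queries lineage: the `lineage` dictionary, the `downstream_of` helper, and every asset name other than Seattle Crime are hypothetical.

```python
from collections import deque
from typing import Dict, List


def downstream_of(lineage: Dict[str, List[str]], start: str) -> List[str]:
    """Return every item reachable downstream of `start`, nearest first."""
    seen = set()
    order: List[str] = []
    queue = deque(lineage.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # an item can be reached by more than one path
        seen.add(node)
        order.append(node)
        queue.extend(lineage.get(node, []))
    return order


# Each key feeds the items in its list (upstream -> downstream).
lineage = {
    "Clean Crime Data (flow)": ["Seattle Crime"],
    "Seattle Crime": ["Crime Workbook", "Crime Dashboard"],
    "Crime Workbook": ["Crime Dashboard"],
}

# If the flow fails, these items would all surface the warning.
affected = downstream_of(lineage, "Clean Crime Data (flow)")
print(affected)  # ['Seattle Crime', 'Crime Workbook', 'Crime Dashboard']
```

The `seen` set ensures that a dashboard fed by both a data source and a workbook is listed only once.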
To learn more about these new Tableau Catalog features, check out our Help articles:
If your organization isn’t using Tableau Catalog or Prep Conductor, we invite you to learn more about Tableau Data Management.