A Tableau champion's guide to migrating workbooks and data sources
Note: The following is a guest post by Cole Shelton of InterWorks.
Every company adopts Tableau at a different pace. Some Tableau deployments take off like wildfire while others plod along, inching forward slowly but gaining momentum with the inevitability of a glacier.
If you are a Tableau champion, adoption throughout your organization is wonderful to behold as your employees discover the secrets of their own data. But as with all things, growth comes with challenges, and scaling Tableau within an organization is no different.
Ensuring data quality
One of the primary challenges organizations face is data quality. Some data is almost certain to be so important that its accuracy cannot be in doubt: financial data, sales data, and any other data used to make critical business decisions fall into this category.
To ensure data quality, organizations need quality assurance (QA) processes. Those QA processes will be multi-step and will most likely require multiple environments to support them. However, one of the challenges of managing multiple Tableau environments is that you need to migrate Tableau workbooks and published data sources from one environment to another.
Before we continue, you might be wondering “What is an environment?” An environment, in the context of software or hardware, is a separate area or instance that is segregated from all other environments. These environments are then bridged by processes that move data or artifacts (in Tableau’s case, workbooks and data sources) across the environments.
Common names for these environments are development, test/QA, staging/pre-production, and production. There are typically three to four environments (staging/pre-production is often omitted). These environments can exist as either separate projects or as sites on the same Tableau Server instance. In some cases, each environment might be a separate instance of Tableau Server, but in this scenario customers need to read their Tableau EULA very carefully. Tableau deems “development of visualizations and any similar content creation” as a production activity that therefore cannot occur on a non-production instance of Tableau Server. So if your enterprise does indeed require separate development, QA, or pre-production servers, those servers generally need to be licensed for production use.
The QA process is straightforward when we look at it from a high level. Below is a simplification of the workflow. Realistically, you will have a back-and-forth between the QA analyst and developer as issues are found.
- Developer creates workbook or data source
- Developer tests workbook or data source for correctness
- Developer publishes workbook or data source to development environment of Tableau Server
- Developer tests workbook or data source in development environment
- Developer pushes to test environment of Tableau Server
- QA analyst tests workbook or data source in test environment
- Release manager pushes to production environment
- Release manager does final testing on production environment
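To make the promotion order in the workflow above concrete, here is a minimal Python sketch. The environment names follow the three-environment setup described earlier (staging omitted), and the names `ENVIRONMENTS`, `PROMOTED_BY`, and `next_environment` are purely illustrative, not part of any Tableau tool.

```python
# Ordered list of environments; content only ever moves left to right.
ENVIRONMENTS = ["development", "test", "production"]

# Who pushes content INTO each environment, per the workflow above.
PROMOTED_BY = {
    "development": "developer",
    "test": "developer",
    "production": "release manager",
}


def next_environment(current):
    """Return the environment that follows `current`, or None at the end."""
    position = ENVIRONMENTS.index(current)
    if position + 1 < len(ENVIRONMENTS):
        return ENVIRONMENTS[position + 1]
    return None
```

A deployment script could call `next_environment("test")` to find the promotion target and check `PROMOTED_BY` to confirm the right person is running it.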
How do I move all the things?
Once you’ve decided on a process to migrate your workbooks, the next step is to determine how you will be migrating them across the environments. Below we will outline the options that are available and then discuss how to choose the right one for you.
There are a few options, each with its own tradeoffs, and we will cover the pros and cons of each. Currently, the options available for migrating your workbooks and data sources are:
- Tabcmd Script
- REST API
- TabMigrate
- Enterprise Deployment Tool by InterWorks
Tabcmd script
Tabcmd is the utility that comes with Tableau and allows you to interact with Tableau Server through a command line. By using tabcmd within a script, you can migrate your workbooks. You simply do a “tabcmd get” from the source and “tabcmd publish” to your destination.
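As a rough sketch of that get-then-publish flow, the Python below assembles the tabcmd command lines and, separately, offers a helper to run them. The server URLs, user name, project, and password file are placeholders you would supply, and the function names are our own invention rather than anything tabcmd provides. It also assumes the workbook's URL name (the name as it appears in the server URL) is known.

```python
import subprocess


def build_migration_commands(url_name, display_name, source, target,
                             project, user, password_file):
    """Return the tabcmd calls for one workbook: get from source, publish to target."""
    local_file = f"{url_name}.twbx"
    login = ["-u", user, "--password-file", password_file]
    return [
        ["tabcmd", "login", "-s", source] + login,
        # Download the packaged workbook from the source server.
        ["tabcmd", "get", f"/workbooks/{url_name}.twbx", "-f", local_file],
        ["tabcmd", "logout"],
        ["tabcmd", "login", "-s", target] + login,
        # Publish to the destination project, replacing any existing copy.
        ["tabcmd", "publish", local_file, "-n", display_name,
         "-r", project, "--overwrite"],
        ["tabcmd", "logout"],
    ]


def run_migration(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Keeping command construction separate from execution lets you print and review the commands before anything touches a server.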
Sounds easy enough, right? The challenge is that if you need to make any changes, like modifying data sources, you’ll have to programmatically manipulate the .twb XML, which is very error-prone.
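To give a feel for what that XML manipulation involves, here is a minimal Python sketch that repoints database connections from one server to another. The sample XML is a heavily simplified stand-in for a real .twb file, which is far larger and may nest connections differently depending on the Tableau version, and the host names are made up.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment of a .twb file.
SAMPLE_TWB = """<workbook>
  <datasources>
    <datasource name="sales">
      <connection class="sqlserver" dbname="Sales" server="dev-db.example.com" />
    </datasource>
  </datasources>
</workbook>"""


def repoint_connections(twb_xml, old_server, new_server):
    """Return the workbook XML with matching connection servers replaced."""
    root = ET.fromstring(twb_xml)
    # Walk every <connection> element, however deeply it is nested.
    for connection in root.iter("connection"):
        if connection.get("server") == old_server:
            connection.set("server", new_server)
    return ET.tostring(root, encoding="unicode")


updated = repoint_connections(SAMPLE_TWB, "dev-db.example.com", "test-db.example.com")
```

Even this small example hints at the fragility: the parser discards comments and the XML declaration by default, and a packaged .twbx would first need to be unzipped to reach the .twb inside.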
Tabcmd also does not carry over a workbook’s permissions, so you must reapply them manually at the destination.
Another limitation is that tabcmd’s publish will fail if there are two live data-source connections in the workbook.
Tableau REST API
An API is an application programming interface, and REST denotes the style in which a programmer interacts with it, in this case through HTTP requests to Tableau Server. Tableau’s REST API allows a programmer to interact with Tableau Server through various API calls. Similar to tabcmd, you may "get" and "publish" workbooks and data sources, but unlike tabcmd, you can also set permissions programmatically.
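For a sense of what those calls look like, the Python sketch below builds three REST API requests: signing in, downloading a workbook, and granting a group read permission on it. It only constructs the method, URL, and body; sending them is left to whatever HTTP client you prefer. The API version, server name, and IDs are placeholders, and the function names are our own.

```python
import xml.etree.ElementTree as ET

API_VERSION = "3.4"  # assumption: use the version your Tableau Server supports


def signin(server, user, password, site_content_url=""):
    """Sign in; the response carries a token to send as the X-Tableau-Auth header."""
    request = ET.Element("tsRequest")
    credentials = ET.SubElement(request, "credentials", name=user, password=password)
    ET.SubElement(credentials, "site", contentUrl=site_content_url)
    url = f"{server}/api/{API_VERSION}/auth/signin"
    return "POST", url, ET.tostring(request, encoding="unicode")


def download_workbook(server, site_id, workbook_id):
    """The "get" step: fetch the workbook file from the source."""
    url = f"{server}/api/{API_VERSION}/sites/{site_id}/workbooks/{workbook_id}/content"
    return "GET", url, None


def grant_group_read(server, site_id, workbook_id, group_id):
    """The step tabcmd cannot do: set a permission on the migrated workbook."""
    request = ET.Element("tsRequest")
    permissions = ET.SubElement(request, "permissions")
    grantee = ET.SubElement(permissions, "granteeCapabilities")
    ET.SubElement(grantee, "group", id=group_id)
    capabilities = ET.SubElement(grantee, "capabilities")
    ET.SubElement(capabilities, "capability", name="Read", mode="Allow")
    url = f"{server}/api/{API_VERSION}/sites/{site_id}/workbooks/{workbook_id}/permissions"
    return "PUT", url, ET.tostring(request, encoding="unicode")
```

Publishing is omitted here because it requires a multipart request that bundles the workbook file with its metadata, which is where much of the development effort tends to go.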
The largest drawback to using the REST API is that it will require a developer to create the code. And even if you have access to a developer, it will take time to write the code and then test it.
For more info on the Tableau REST API, check out this Online Help guide.
TabMigrate
TabMigrate is an open-source utility, created and released by Tableau developers, for migrating workbooks and data sources. You can use TabMigrate as is, or, if you have development capabilities at your disposal, you can customize it to your needs.
TabMigrate has a minimal UI that allows you to export all the workbooks and data sources from one server to a local folder, then upload those files to another server. You can manually transform the files once you have them in your local folder. Configuring the migration requires some editing of XML and other files.
TabMigrate makes it possible to do repeatable bulk migrations as well as limited transformations, like data source replacement. One limitation of TabMigrate is that it does not set permissions on migrated data sources.
Enterprise deployment tool by InterWorks
Enterprise Deployment Tool (EDT) is a third-party software tool for doing workbook and data source migrations. EDT has a wizard-type interface that walks you through selecting workbooks, applying transformations, setting permissions, then deploying your workbooks and data sources. It also includes rollback functionality, letting you roll back an entire deployment if you use its archive workbook feature.
EDT allows you to save your migrations and rerun them later, or execute them from a command line or via script.
EDT is the most full-featured option on this list, but it’s also the only one you have to purchase, which is its biggest limitation.
What’s the right decision?
The right decision is what works best for you. The first step you need to take is to understand your needs. Start by asking these questions:
- How many workbooks or data sources are you deploying? How much time is required?
- Are you using separate projects, sites, or Tableau Servers as your environments?
- Do you have separate data sources for each environment? For example, the test environment uses a different database from the development environment.
- Are any transformations being applied during the migration (e.g. watermarking workbooks)?
- Do you have a software developer available to help with your project?
The more complex your deployment scenario, the harder and longer it will be to write the scripts necessary to do the deployment. As your deployments get larger and more complex, it makes sense to begin looking at TabMigrate and EDT to see which one will best suit your needs.