Porting Tableau Server to Linux, Part 2
In Part 1 of this series, we discussed the nuts and bolts of porting Tableau Server to Linux. Now, in Part 2, we discuss all the other issues that came up when it was time to bundle up and release the software.
Packaging, distributions, and the Linux way
Releasing Tableau software on Linux is not as simple as putting code up on GitHub and letting the world go at it. Tableau needs to support the customers who run the software. We could not support every distro, nor could we hand customers a few binaries and a config file and let them figure it out. Here’s what we did to provide what our customers need.
First steps: standardize on CentOS, systemd
Internally, we proceeded conservatively. First, stakeholders from Dev and IT, in consultation with Sales, agreed that we would standardize on CentOS. IT liked CentOS because many consider it the most stable free version of Red Hat Enterprise Linux (RHEL), and Sales liked it because a quick survey showed that many of our biggest customers also used it. With that decided, it was up to Dev to choose which other Linux distros and versions we would support. The biggest decision was to support only distros that used systemd rather than init.
Customers do not launch Tableau Server manually. Tableau Server integrates deeply with the OS so that it is reliable and behaves like any other service installed on the machine. We chose systemd (even though systemd was controversial) because it was the service manager on the distros we planned to support and it provided useful standardization we could build on. We also decided that we would not support two service managers, so there would be no support for init. The consequence was that Tableau Server would not support RHEL 6-based systems, including CentOS 6 and Amazon Linux 1, only version 7 and later. At the time, this seemed like a risky choice since RHEL 6-based systems were so prevalent, but I believe that we made the right decision. Even though we standardized our internal processes on CentOS 7+, we initially supported other distros for our customers, including Ubuntu 16.04 LTS and other RHEL-based systems like Oracle Linux 7.
When packaging the bits, we chose the native distro package format, either .rpm or .deb, rather than a generic format like a .zip file or a tarball. We wanted to control the install experience and run scripts both before and after installation to verify that the machine met specifications and to configure particular system settings. We did this both to ensure a consistent experience and to minimize install headaches, since Linux machines can vary so much.
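To make the idea of a pre-install check concrete, here is a minimal sketch of what such a package hook could look like. It is not Tableau's actual script: the function names and the 8 GB memory threshold are assumptions invented for the example. The systemd test relies on the `/run/systemd/system` directory, which is the check documented for `sd_booted(3)`.

```shell
#!/bin/bash
# Sketch of a pre-install check. Function names and thresholds are hypothetical.

# Succeed when systemd is the running service manager
# (the directory test documented for sd_booted(3)).
has_systemd() {
    [ -d /run/systemd/system ]
}

# Print total physical memory in kB as reported by /proc/meminfo.
total_mem_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

# Succeed when $1 (kB available) is at least $2 (kB required).
meets_minimum() {
    [ "$1" -ge "$2" ]
}

# Run every check; print a reason and fail on the first one that does not pass.
preinstall_check() {
    local min_kb=$((8 * 1024 * 1024))   # assumed 8 GB minimum, for illustration only
    has_systemd || { echo "systemd is required" >&2; return 1; }
    meets_minimum "$(total_mem_kb)" "$min_kb" || { echo "not enough memory" >&2; return 1; }
}
```

Both .rpm and .deb packages can run a script like this before unpacking files, so a failing check stops the install before anything is written to disk.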
The Linux way or the Windows way?
When choosing how to bundle and release Tableau Server on Linux, there was a tension between doing things the "Linux way" and the "Windows way." The Windows way is designed to be a narrow path where it's hard to make a mistake. Our one-click Windows installer is a good example of this. To install Tableau Server on Windows, you download a single executable, click Next, Next, Next, and it's running. It's been an important feature for us. We very much wanted a similar experience with the Linux installer. We knew a lot of Tableau Server on Linux customers would be transitioning from Windows and we also wanted to make the experience as easy as possible for customers who didn't know a ton about Linux. As a result, Tableau Server on Linux can appear to be not very "Linux-y." The Linux mantra is "one tool, one job" and Tableau Server on Linux is definitely not that. The product wants to take over the configuration of your machine and it's necessary to perform specific steps manually to make it work.
The push and pull between parity with Windows and doing things the Linux way went further than this, however. The new Tableau Services Manager (TSM) interface requires authentication. Because authentication on Windows was initially going to be based on membership in the Windows "Administrators" group, we were going to allow anyone in the Linux "wheel" group to have TSM access. This turned out to be a very bad decision because being a member of wheel and having access to the TSM interface are orthogonal to each other. Many system admins want users who can control Tableau Server not to have general admin privileges on the machine itself. Instead, we create (or the user chooses) a new group whose membership grants access to the TSM interface.
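The following sketch illustrates why a dedicated group separates the two concerns. It is only an illustration: the group name `tsmadmin` is a placeholder for whichever group is chosen at install time, and how TSM itself verifies membership is not shown here.

```shell
#!/bin/bash
# Sketch of a group-membership check. Group names are hypothetical.

# Succeed when group name $2 appears in the space-separated list $1.
list_has_group() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep -qxF -- "$2"
}

# Succeed when user $1 belongs to group $2 (primary or supplementary).
user_in_group() {
    list_has_group "$(id -nG "$1")" "$2"
}
```

With this in place, `user_in_group alice tsmadmin` can succeed while `user_in_group alice wheel` fails, which is exactly the separation described above. An administrator would typically grant access with something like `usermod -aG tsmadmin alice`, run as root.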
The previous version of Tableau Server on Windows (known as classic tabadmin) worked a certain way during upgrades and multi-node installation that we initially tried to copy on Linux. First, installing a new version of the software began by upgrading the server, which entailed taking a backup, stopping the existing server, and upgrading to the new version. Second, when installing the software on a new node, the new node would pull the install bits from the first node and write them to disk. While this was very nice for the Tableau Server administrator on Windows, it would never work on Linux.
A new way of upgrading
Installing Tableau Server on Linux involves running the installer script as root so that we can make the appropriate system modifications. We could never allow the software, which runs as an unprivileged user, to download newer versions of itself and install them as root. That would violate basic Linux security practice. This forced us to separate the installation of the software from the upgrade of the software.
The installation is an OS operation and the upgrade is a Tableau operation. This required an entirely new way of upgrading: the system administrator would be responsible for installing each new version of the software on each node in a Tableau Server cluster, and multiple versions of the software would have to live side by side, something not possible with the old “classic tabadmin” version of Tableau Server on Windows. It turned out that side-by-side Server installs were not only necessary from a security point of view but also very much appreciated by administrators. They could use whatever software distribution system they normally used to install Tableau Server and then kick off the actual upgrade at a later time. The scheme worked well enough that Tableau Server on Windows with TSM now uses the same strategy for upgrades.
If you have ever poked around in the installed bits of Tableau Server on Linux, you may have noticed that the C++ libraries used to run the server are all in the bin directory. This is another way the Linux version followed the Windows version because of inertia. On Windows, it's extremely common for dependent libraries to live alongside the binaries that use them. This is not common at all on Linux, where dependent libraries are usually stored in a nearby lib directory.
Regardless of where we placed dependent shared objects, we needed a way for binaries to locate their dependent libraries. On Windows, having the binaries and libraries adjacent is very convenient because the PATH is searched and the current directory is always implicitly on the PATH. Binaries on Linux follow no such convention. We had no desire to force the issue by setting LD_LIBRARY_PATH either. Luckily, on Linux we could take advantage of the RPATH linker tag. Setting it to $ORIGIN means that any binary or shared object searches for its dependent libraries relative to its own location. Here is the RPATH of one of our libraries:
% cd /opt/tableau/tableau_server/bin.20182.0000.0.0.0
% chrpath -l libtabsys.so
libtabsys.so: RPATH=$ORIGIN/.
Running ldd on the shared object shows that it does indeed find its dependent libraries relative to itself. This is a very effective technique that allowed us to ensure that the binaries we ship depend only on the libraries we ship and not on other system libraries that happen to be installed at the time.
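For readers who want to try the technique, here is a small self-contained sketch. It assumes gcc and binutils are installed, and every file and function name in it is made up for the example; it says nothing about how Tableau's own build sets the tag. Depending on the toolchain, the linker may record the entry as RUNPATH rather than RPATH, so the helper looks for both.

```shell
#!/bin/bash
# Sketch: build a tiny shared library and a program that finds it via $ORIGIN.
# All names are hypothetical; requires gcc and binutils.

build_rpath_demo() {
    local dir="$1"
    mkdir -p "$dir" || return 1
    printf 'int answer(void) { return 42; }\n' > "$dir/answer.c"
    printf 'int answer(void);\nint main(void) { return answer() == 42 ? 0 : 1; }\n' > "$dir/main.c"
    gcc -shared -fPIC -o "$dir/libanswer.so" "$dir/answer.c" || return 1
    # Single quotes stop the shell from expanding $ORIGIN; the dynamic loader
    # replaces it at run time with the directory that contains the binary.
    gcc -o "$dir/demo" "$dir/main.c" -L"$dir" -lanswer -Wl,-rpath,'$ORIGIN' || return 1
}

# Print the RPATH or RUNPATH entry recorded in an ELF file.
show_rpath() {
    readelf -d "$1" | grep -E 'RPATH|RUNPATH'
}
```

After `build_rpath_demo /tmp/rpath-demo`, the `demo` program runs from any working directory without LD_LIBRARY_PATH, and moving the whole directory elsewhere keeps it working because the lookup is relative to the binary.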
Tableau Server comes with a few command line utilities, two of which are the old tabcmd and the new tsm CLI. Both are pure Java programs. The previous version of Tableau Server on Windows shipped a C++ Java launcher program to make running these Java programs easy, and we initially planned to copy that approach for the tsm CLI.
While it's possible to use the -jar argument to launch a Java application with a single argument using the default Java executable, this was very awkward. The launcher program worked just fine but required that we ship a new C++ binary and C++ runtime. It was all very heavyweight and bulky. Instead of any of these options, we decided to use a technique known as a bash jar that is somewhat common on Linux. The idea is that a Java jar file is really just a zip file, and zip files have the property that all leading text that isn't part of the archive is ignored. You take the single jar that makes up your Java application and prepend a tiny bash script that launches a java command with the appropriate options and points the java command at itself. Then, when the java command parses the file blob, it skips the leading bash script and successfully reads the jar. The best consequence of this technique is that both programs are a single file that can run without any other dependencies besides an installed Java Runtime Environment.
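The bash jar technique is short enough to sketch in full. The helper names below are invented for this example and the stub is deliberately minimal; the real tabcmd and tsm launchers presumably pass additional Java options that are not shown here.

```shell
#!/bin/bash
# Sketch of building a "bash jar". Helper names are hypothetical.

# Write the launcher stub to file $1. Inside the stub, "$0" is the path of the
# combined file, so java is pointed back at the file being executed. Using exec
# replaces the shell, which means bash never reads past the stub into the
# binary jar data that follows it.
write_stub() {
    cat > "$1" <<'EOF'
#!/bin/bash
exec java -jar "$0" "$@"
EOF
}

# Concatenate stub $1 and jar $2 into the executable file $3.
make_bash_jar() {
    cat "$1" "$2" > "$3" && chmod +x "$3"
}
```

Given a runnable jar with a Main-Class entry in its manifest (here a hypothetical `myapp.jar`), `write_stub stub.sh && make_bash_jar stub.sh myapp.jar myapp` produces a single file that can be started with `./myapp`.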
Since we didn't retrofit tabcmd on Windows, it still uses the C++ Java launcher program. If you compare the tabcmd standalone installers for Windows and Linux, you can see that the Linux version is about 10x smaller (7 MB vs. 65 MB) because of this. This technique was so useful that the new tsm program on Windows uses it also.
Windows users assume their fonts will be everywhere. If a Windows user runs Tableau Desktop and authors a viz, the chances are extremely high that whatever font they use will also exist on the Windows machine running Tableau Server. We knew that customers who moved their data from a Windows server to a Linux server would not accept vizzes that looked different because of missing fonts.
While Linux distributions provide very good font replacements for many of the most popular Windows fonts, we needed the original. Tableau workbook formatting can be very sensitive to changes in font metrics and we did not want to risk cascading formatting differences in all of our customers’ carefully curated dashboards. This was doubly important for Tableau Public and Tableau Online where we planned to transition from Windows to Linux and the change needed to be completely transparent to all of our users.
We knew we needed to license the fonts, but which fonts? We found the answer by looking at which fonts were already in use on Tableau Public.
Armed with this information, we negotiated licenses for the following fonts: Arial, Calibri, Courier New, Georgia, Meiryo, Times New Roman, Trebuchet MS, Verdana. We may add more in the future. It took a surprisingly long time to secure access to those fonts: nine months! Who knew it could be so exciting to ship Arial?
What about performance?
The first question we usually get about Tableau Server on Linux is something like, “How does it run compared to the Windows version? It must be (faster/slower/more buggy/less buggy/insert your favorite word here)!”
The answer is yes, it’s all of those things.
The Linux version generally runs about the same as the Windows version. Many vizzes on Linux are faster than on Windows and vice versa. One day, we may release a whitepaper comparing performance but we haven’t done that yet. Performance is very data and environment dependent. Some customers have reported performance gains when they transitioned their data to Linux. Others see the same performance. Amusingly, some customers think of Linux in total as a stable rock (Linux generally has a very good reputation industry wide) and somehow think that running Tableau Server on Linux will eliminate all crashes. Sorry, it will not. Dereferencing a bad pointer on either OS will result in a crash. And remember, it’s exactly the same code.
The performance of Tableau Server on Linux was definitely not equivalent the first time we ran it internally in a semi-production environment against a real workload. After a short time, all the processes that render vizzes (vizqlserver processes) were hanging at 100% CPU. We swarmed over the problem, and it was the first time we used perf top to diagnose a hotspot.
First, it turned out that we had hit a bug in the unixODBC library that had just been found and fixed by someone else. But even with this bug fixed, there was still a ton of time spent in a function that parses the tiny /etc/odbcinst.ini text file. perf top shows globally where the CPUs on your system are spending time, and we were spending over 40% of the time on all CPUs in strcmp! The fix was pretty simple and didn’t affect the Windows code since it was in the platform-specific driver code. The moral of the story is that database drivers for Windows can be more mature than drivers on Linux. In fact, when we find vizzes that are much faster to render on Windows than Linux, it is frequently because the DB queries are faster.
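For anyone who has not used perf top, a minimal sketch of the two invocations most relevant to this story follows. The wrapper function names are made up for the example, and perf usually needs root (or a relaxed `perf_event_paranoid` setting) to see every process.

```shell
#!/bin/bash
# Sketch of perf top wrappers. Function names are hypothetical.

# Live, system-wide view of the symbols using the most CPU time.
hot_symbols() {
    perf top
}

# Same view restricted to a comma-separated list of process IDs in $1.
hot_symbols_for() {
    perf top -p "$1"
}
```

To watch only the viz rendering processes, one could run `hot_symbols_for "$(pgrep -d, vizqlserver)"`, where `pgrep -d,` joins the matching PIDs with commas. A hotspot such as strcmp then shows up at the top of the list with its share of samples.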
I also want to plug another great performance analysis tool in common use on Linux: the flamegraph. We have proactively used flamegraphs to see if hotspots stand out. One of our earliest uses found that we were spending about 7% of all of our time creating ICU date formatters over and over again. The fix to cache them was minimal, and the before and after flamegraph screenshot shows how on the left we are spending time calling TabICU::ICUSimpleDateFormat, while that call tower is gone in the image on the right.
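A flamegraph can be produced from perf samples with the stackcollapse-perf.pl and flamegraph.pl scripts from the open source FlameGraph project. The sketch below assumes those scripts have been checked out into the directory named by `FLAMEGRAPH_DIR`; that variable, its default, and the function name are assumptions for this example, not part of either tool.

```shell
#!/bin/bash
# Sketch: sample the whole system and render a flamegraph.
# $1: seconds to sample, $2: output SVG file.
# FLAMEGRAPH_DIR (assumed) points at a checkout of the FlameGraph scripts.
make_flamegraph() {
    local seconds="$1" out="$2" fg="${FLAMEGRAPH_DIR:-./FlameGraph}"
    # Sample all CPUs at 99 Hz with call stacks; writes perf.data in the
    # current directory. Usually needs root.
    perf record -F 99 -a -g -- sleep "$seconds" || return 1
    # Fold the recorded stacks into one line per unique stack, then render.
    perf script | "$fg/stackcollapse-perf.pl" | "$fg/flamegraph.pl" > "$out"
}
```

For example, `make_flamegraph 30 before.svg` captures thirty seconds of activity. Running it again after a fix and comparing the two SVG files side by side gives the kind of before and after view described above.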
Identifying and fixing performance problems on Windows is also very feasible and good tools definitely exist.
Does Tableau Desktop run on Linux?
Tableau Desktop absolutely runs on Linux.
The effort to make Tableau Desktop run on Linux was minimal. Everything "just worked," which is a testament to the code base already running on both Windows and macOS. We use Tableau Desktop internally for testing all the time. It's the easiest way to verify DB connectivity or check how a viz renders without running a Server.
Unfortunately (and I don't speak for the company, it's just my personal opinion), we will never release Tableau Desktop for Linux because there is not a lot of demand for it and it would be very difficult to support given the variety of graphics cards and libraries that exist for Linux systems.
Porting Tableau Server to Linux was a gratifying group effort. As with the Mac port, we needed to educate the entire company on how to build, test, and debug on Linux. IT needed to expand our build grid and provide developers with Linux desktop systems. Our entire support organization needed to be trained on Linux. New documentation specific to Linux was written. Management has been very supportive because they understand the importance of a Linux Server in the enterprise market today, but porting the entire company to Linux is an ongoing process.