I design learning resources at Tableau, which means that I find myself working with messy data sets a lot. When I got back from maternity leave in January, one of the first things I did was try to catch up on all of the newest Tableau Prep Builder features. I’ve been using Prep Builder since the beta phases and I still find hidden, powerful features in each new release.
If you’re new to Tableau Prep Builder or just want a refresher, dig into these four features to start solving some of your toughest data prep problems.
1. Find your next step with smart recommendations
I downloaded this messy data set of radio station song and artist play counts and brought it into Prep Builder. The first thing I wanted to fix was a time column: the data is a date/time field, and all I really needed was the date. I changed the data type to String and braced myself to figure out what calculation I needed to write to make my modification. But then I saw the lightbulb icon in the user interface and thought, “What’s that?” I hovered and saw it was a recommendation. Sure, I’ll take a recommendation.
WHAT?! That is precisely what I wanted to do!! How did it know?? How did this software read my mind and make it so easy?? I’ve already switched my mentality from “how do I do this” to “what is it that I need to do?” That is a game changer. I’m now on the lookout for that lovely lightbulb, wondering what else Prep Builder can figure out for me.
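For anyone curious what this kind of date/time-to-date cleanup looks like when written by hand, here is a minimal Python sketch. It is not what the recommendation does under the hood. The `date_only` helper, the sample timestamp, and the `YYYY-MM-DD HH:MM:SS` format are all assumptions for illustration, since the real column's format may differ.

```python
from datetime import datetime


def date_only(timestamp: str, fmt: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Return just the date portion of a date/time string.

    The default format is an assumption; pass a different `fmt`
    if the column stores its timestamps another way.
    """
    return datetime.strptime(timestamp, fmt).date().isoformat()


# Hypothetical sample value, not taken from the radio station data set.
print(date_only("2019-01-15 08:30:00"))  # prints 2019-01-15
```

Having to know the exact format string up front is the kind of detail the recommendation saved me from working out.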
2. Rename fields in the Input step (clean anywhere)
Next I decided to tackle another data set I’ve been wanting to work with for a while. But it has dozens of fields with terrible names, and I always give up prepping it because there’s so much work involved in figuring out which column is which. It would be great, I thought to myself, if I could rename all the fields in the Input step, when they’re in a list format instead of in cards in the Profile Pane. If only I could just double-click to rename... oh wait, now I can!
Now I finally feel like I can work with this data set without tearing my hair out.
The ability to rename fields in the Input step is part of a new paradigm where I can clean anywhere in the flow. If I’m working in a Join step, I can make edits right in the Join Clause pane. Regardless of where I am in the flow, I can make changes where I see the need rather than creating a separate Clean step. I love how this keeps me in the flow (get it? get it??) of my data prep.
3. Document your process with descriptions on changes
One bad habit I have when prepping data is getting really carried away and not stopping to document what I’m doing and why. This means that when I come back to the .tfl later, I often have no idea why I made the changes. What I need is the ability to say “I replaced the newline character (\n) with a tilde (~) because you can’t split on a \n, but that’s what I needed to do. If I swap the \n for a ~, I can then split on the ~ instead!” and have it right there in the flow. Descriptions on Changes lets me do exactly that! I simply right-clicked on the Change and selected Add Description. Voila! Now, when I go back to this flow, I can jump right in without re-familiarizing myself with each change.
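The swap-then-split trick in that description can be sketched in a few lines of Python. This is only an illustration of the same two steps, not how Prep Builder implements them. The `split_on_newlines` helper and the sample value are hypothetical. (Python can split on \n directly, so the swap is kept here purely to mirror the workaround.)

```python
from typing import List


def split_on_newlines(value: str, placeholder: str = "~") -> List[str]:
    """Swap each newline for a placeholder, then split on the placeholder.

    Caveat: if the value already contains the placeholder character,
    it will be split there too, so pick a character the data never uses.
    """
    swapped = value.replace("\n", placeholder)
    return swapped.split(placeholder)


# Hypothetical sample value with embedded newlines.
print(split_on_newlines("Song A\nSong B\nSong C"))
# prints ['Song A', 'Song B', 'Song C']
```

The caveat in the docstring is exactly the sort of reasoning worth capturing in a description on the Change.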
4. Save time by setting specific Data Roles
Before Tableau Prep Builder, I’d encounter data sets where a field should have had data type validation during the data collection step but didn’t, like being able to enter an invalid email address into an email address field. I’d have to write a calculation to look for the @ symbol, find the values where that calculation was false, and go back and edit them in the original field. It was a pain.
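To give a sense of that manual approach, here is a rough Python sketch of the "look for the @ symbol" check. The `find_invalid_emails` helper and the sample values are made up for illustration, and checking for an @ is a deliberately crude test, not real email validation.

```python
from typing import List


def find_invalid_emails(values: List[str]) -> List[str]:
    """Return the values that fail a naive 'contains an @' check."""
    return [value for value in values if "@" not in value]


# Hypothetical sample values.
emails = ["ana@example.com", "not-an-email", "bob.example.com"]
print(find_invalid_emails(emails))
# prints ['not-an-email', 'bob.example.com']
```

Even with the list of bad values in hand, I still had to go back and fix each one in the original field.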
Now, I can set the data role to Email (or URL, or a geographic role) and Prep Builder can then show me just the invalid email addresses. I can modify or remove the invalid values right there in the same field. BOOM. DONE.
These are just a sample of the multitude of amazing features in Tableau Prep Builder. With all the new features, I find myself getting from messy to clean data faster and faster. When I need new example data sets for projects, I seek out interesting ones for the excuse to use Prep Builder, rather than getting sad and moving to a boring-but-usable data set. I’m still delighted by how great that feels. I invite you to find your own messy data sets and see what you can uncover!
P.S. Looking for fun, public data sets to explore in Tableau Prep Builder? Read this blog post from Jacob Olsufka for a list of resources.