People-friendly data is often captured and recorded in a wide format with many columns. Machine-readable data is better in a tall format, with fewer columns and more rows.
Tableau prefers data to be formatted in a machine-readable format. To reshape your data for easier analysis in Tableau, you can perform a pivot. In Tableau, pivoting means transposing data from a crosstab format into a columnar format–from wide, short tables into thin, tall tables. Tableau Prep makes pivoting visual, allowing you to see how your data changes with every step. In this post, we’ll show how to perform multiple pivots in Tableau Prep to shape your data for analysis.
The challenge: performing multiple pivots
The original data.
When we analyze data in Tableau, we need it to be tidy. Each column should have a single unique meaning. So what do you do when your field meanings aren’t unique? When multiple columns means the same thing, you can use pivot. But what if you have multiple groups fields?
For example, in this data set, there are three different bestseller lists: Young Adult, Hardcover Fiction, and Paperback Nonfiction. However, the data is structured in a way that there are two columns per list, one for author and one for title. Therefore, there are three columns containing information about authors and three columns for titles. When there are multiple columns containing the same information, this is a sign the data should be pivoted.
For analysis, the ideal structure for this data would be individual columns for: Rank, Author, Title, and Bestseller List.
The image above shows the desired end state.
However, if we pivot the data as it is, we wind up with author information interspersed in the same column as title information, rather than a column for each piece of information. We need to pivot both author and title information, but we can’t pivot them together.
If the data is pivoted simultaneously, author and title information is interspersed in the same column.
The solution: coordinated pivots
To get around this, we need to perform coordinated pivots. Essentially, we need to pivot the columns containing author information and do a separate pivot for the columns containing title information. Then we need to rejoin those columns back together.
In Tableau Prep, we’ll do this by branching the flow so we can work with author and title data separately.
- On one branch of the flow, we’ll remove the title columns and perform a pivot on the author columns, ending up with a column for Bestseller List and Author.
- On another branch, we’ll remove the author columns and pivot the titles, ending up with a column for Bestseller List and Title.
- Then we can rejoin those two branches by doing a join on Rank and Bestseller List, ensuring that the correct title and authors are brought back together.
Performing coordinated pivots in Tableau Prep
Let’s go through it piece by piece.
- First, we’ll click the plus icon on the input step and select Add Step.
- In this new clean step, we’ll remove the columns that have to do with title. In the profile pane, simply click on each Title column’s card in turn and click Remove Field.
- When we’re left with just the Rank and Author columns, click the plus icon and select Add Pivot. Now we’ll actually pivot those columns. In the field list to the left, multiselect the three Author fields. Drag those three fields to the drop area under Pivot1 Values.
- In the Pivot Results area, we see the Rank column, untouched, and a new column for Author (taken from the shared value in each column’s name) and Pivot1 Names. We can rename this Bestseller List.
- Back in the flow pane, click the plus by the input step again, and select Add Branch.
- On this cleaning step, remove the three author fields using the same actions as in #2.
- Click the plus after this step and add a pivot. Repeat #3 with the three title fields. Rename the Pivot2 Names column Bestseller List.
- Now we need to bring the author and title information back together. In the flow pane, drag one of the pivot steps onto the other, letting go in the New Join drop area.
- The join defaults to a join clause of Rank and Rank. But we also need to take into account the fact there are three books of rank 1, one from each of the bestseller lists. We can add another join clause by clicking the plus icon next to the Applied Join Clauses area. Create the join clause for Bestseller Lists.
- In the flow pane, add a step after the join step.
- Remove the redundant fields brought in from the join—Bestseller List-1 and Rank-1
This data is ready for analysis in Tableau! We can add an output step and run the flow to generate the cleaned data set.
The final output.