9 data blending tips from #data14de

By Andy Cotgreave 2014/05/27

Data blending is the ability to bring data from multiple data sources into one Tableau view, without the need for any special coding. Do you do data blending? Or wish you knew more about it? Here are 9 tips from one of our Tableau sessions delivered at the Tableau Conference in Munich this week.

Bethany Lyons is one of our product consultants in EMEA. She's an expert on blending, table calculations and delivering high-energy conference sessions (she will be repeating these sessions in London, so register to get a chance to see her in action). Here are some of her tips. For more resources, follow the links at the end of this post.

1. In the data window, the blue tick indicates the Primary data source (Sales in the image below). An orange tick indicates a secondary data source (Promotions).

2. Blending runs a query on each data source and joins the aggregated results of those queries.

3. The first field you drop onto a worksheet defines the primary datasource. Any other data source you use will be a secondary data source. The secondary source fields are shown on shelves with the orange tick marks.

4. A default blend is equivalent to a left outer join. However, by switching which data source is primary, or by filtering nulls, it is possible to emulate left, right and inner joins.

5. You cannot emulate a full outer join using blending.

6. Sometimes you see NULL values when you blend. This is when there is no corresponding row based on the join key in the secondary data source. In the example below, Patient is in the primary datasource. Illness is from the secondary data source. Sam and Will have no data in the secondary data source.

7. If you do not override the default choice, Tableau will automatically switch the blend field to the most appropriate for the view.

8. You can ask new business questions in a view just by changing the blend fields.

9. The * is often a cause of frustration but needn't be. What it means is that the primary and secondary data sources do not share data that enables you to create the view you have attempted to draw. The * is an indication that your data sources do not contain enough information to blend and display the view you have created.

Bethany went into much more detail than this list! If you want more information on blending, check out these resources:
Watch our data blending tutorial
Read a knowledge base article on choosing between joining or blending data.

The slides and recording of Bethany's talk are available to conference attendees.


Submitted by ramu k. on

what is the main difference between join concept and data blending? if u don't mine please explain with examples?

Thanks & Regards

Submitted by Michael Mixon (not verified) on

One other tip is that you can join on calculated fields, not just native fields. I use this all the time when the fields I want to join on don't line up cleanly (e.g. in one connection APAC and China are separate values, whereas in the other connection they are combined into a value called "APAC+China"). So I can create a calculated field in the first connection to combine APAC and China into a new values (e.g. "APAC+China") and then blend on that new calculated field.

Submitted by Alex B. on

Both are ways to include *columns* from multiple tables into one resulting (wider) table. The main difference is *where* the two tables are combined. Traditional database joins are performed by the data source (the server). Blending is performed by Tableau (the client).

You specify how to join tables by editing the data source, selecting the multiple tables option and then defining the join keys -- or by providing custom SQL. You specify how to blend query results by using the edit relationships menu item on the data menu. Server side joins are more efficient in general.

One useful aspect of data blending is that you can combine data from different data sources even if they reside on different servers. So blends can be a good solution if you need to combine columns from a spreadsheet with columns from a database table that has some key in common. The alternative would be to import your spreadsheet into a table on your database server so that you can join the two tables. Sometimes that is difficult to arrange.

Neither approach is designed to append rows from multiple tables, such as appending the May data to the (identically formatted) April data to make a table with rows from both April and May. For that you need a SQL UNION ALL clause (or need to append to a Tableau extract)

Submitted by Lari M. on

Ramu, one way to think of the difference between joins and blends is that joining is done from within a single data source (joining two tables from the same database, or two sheets from the same Excel file). Blending is done across sources (a table from a database with a sheet from an Excel file).

Submitted by Ryan L. on

JOINING must be always done at the most granular level and is done on the database server side
but BLENDING is flexible.

To set up BLENDING,
Add pairing columns from 2 data sources in the 'Edit Relationship' Dialog box, they are all potential BLENDING options (Country, State, County, City) which can be used.
You can use all BLENDING pairing columns in one worksheet(view) but you can also just use one single BLENDING pairing column (State) in another worksheet of the same workbook by unlinking others.

With that said, JOINING is global to the workbook while BLENDING is local for every individual worksheet of the same workbook.

Submitted by Dan Audette (not verified) on

I am definitely a fan of data blending. One piece of advice though.....extract. When connecting to certain data sources, the only way to get data blending to correctly work is to extract both data sources and then data blend the extracts. Trust me, this will save a lot of time and heartache.

Submitted by Laurent (not verified) on

I've tried everything I could to use combine datasets (stored in txt or csv) from different years into a single one, but no luck so far. I'm using Tableau 8,2 Professional Edition!

Thank you so much

Submitted by Mohit Agarwal (not verified) on

Is it necessary to include the linking column in the report? I'd like to link two data sources by a common field but I want to group the data at a higher level. Any time I remove the linking column, I receive an error.
Here's a simple example of what I'm trying. I have a spreadsheet with a list of hospitals and doctors that work there. Another that has doctors and their patients. I blend the data on doctor but I want to show total number of patients that have been to a hospital. Right now I have to include Hospital -> Doctor -> Patient Count and I want Hospital -> Patient Count.

In response to Laurent's question (#7) blending isn't used to add more rows to a result set, but to add more columns. You can't UNION result sets with blending. See response #3.


Submitted by Srinivas N. on

Can Tableau blend data from SQL Server DB and Impala? How this works w.r.t performance? Basically I want to maintain all my Dimension data and Aggregate data into SQL Server and Transaction data (daily) in Tableau

Submitted by John H (not verified) on

Why can't we do full outer join? This is a big limitation for us.