Data blending is the ability to bring data from multiple data sources into one Tableau view, without the need for any special coding. Do you do data blending? Or wish you knew more about it? Here are 9 tips from one of our Tableau sessions delivered at the Tableau Conference in Munich this week.

Bethany Lyons is one of our product consultants in EMEA. She's an expert on blending, table calculations and delivering high-energy conference sessions (she will be repeating these sessions in London, so register to get a chance to see her in action). Here are some of her tips. For more resources, follow the links at the end of this post.

1. In the data window, the blue tick indicates the Primary data source (Sales in the image below). An orange tick indicates a secondary data source (Promotions).

2. Blending runs a query on each data source and joins the aggregated results of those queries.

3. The first field you drop onto a worksheet defines the primary datasource. Any other data source you use will be a secondary data source. The secondary source fields are shown on shelves with the orange tick marks.

4. A default blend is equivalent to a left outer join. However, by switching which data source is primary, or by filtering nulls, it is possible to emulate left, right and inner joins.

5. You cannot emulate a full outer join using blending.

6. Sometimes you see NULL values when you blend. This is when there is no corresponding row based on the join key in the secondary data source. In the example below, Patient is in the primary datasource. Illness is from the secondary data source. Sam and Will have no data in the secondary data source.

7. If you do not override the default choice, Tableau will automatically switch the blend field to the most appropriate for the view.

8. You can ask new business questions in a view just by changing the blend fields.

9. The * is often a cause of frustration but needn't be. What it means is that the primary and secondary data sources do not share data that enables you to create the view you have attempted to draw. The * is an indication that your data sources do not contain enough information to blend and display the view you have created.

Bethany went into much more detail than this list! If you want more information on blending, check out these resources:
Watch our data blending tutorial
Read a knowledge base article on choosing between joining or blending data.

The slides and recording of Bethany's talk are available to conference attendees.

您可能会感兴趣的内容…

评论

One other tip is that you can join on calculated fields, not just native fields. I use this all the time when the fields I want to join on don't line up cleanly (e.g. in one connection APAC and China are separate values, whereas in the other connection they are combined into a value called "APAC+China"). So I can create a calculated field in the first connection to combine APAC and China into a new values (e.g. "APAC+China") and then blend on that new calculated field.

Ramu, one way to think of the difference between joins and blends is that joining is done from within a single data source (joining two tables from the same database, or two sheets from the same Excel file). Blending is done across sources (a table from a database with a sheet from an Excel file).

I am definitely a fan of data blending. One piece of advice though.....extract. When connecting to certain data sources, the only way to get data blending to correctly work is to extract both data sources and then data blend the extracts. Trust me, this will save a lot of time and heartache.

I've tried everything I could to use combine datasets (stored in txt or csv) from different years into a single one, but no luck so far. I'm using Tableau 8,2 Professional Edition!

Thank you so much

Is it necessary to include the linking column in the report? I'd like to link two data sources by a common field but I want to group the data at a higher level. Any time I remove the linking column, I receive an error.
Here's a simple example of what I'm trying. I have a spreadsheet with a list of hospitals and doctors that work there. Another that has doctors and their patients. I blend the data on doctor but I want to show total number of patients that have been to a hospital. Right now I have to include Hospital -> Doctor -> Patient Count and I want Hospital -> Patient Count.

In response to Laurent's question (#7) blending isn't used to add more rows to a result set, but to add more columns. You can't UNION result sets with blending. See response #3.

Thanks,
Mohit

Why can't we do full outer join? This is a big limitation for us.


订阅我们的博客