# Scatter Plots

### Understanding and Using Scatter Plots

A scatter plot displays each data point on a chart at the position where its two measures intersect. Scatter plots make it easy to analyze the relationship between two numeric variables, as they display all data points in the same view. The x-axis (horizontal line) and y-axis (vertical line) each represent one field. Data points appear as dots or symbols positioned along the x- and y-axes.

Scatter plots, like line charts, use Cartesian coordinates to visualize data. A Cartesian coordinate system uses numeric coordinates to locate each point on a chart. Because scatter plots place two fields on the x- and y-axes, they make complex data easier to interpret.

### How To Read Scatter Plots

A scatter plot uses two fields to show the relationship between pairs of variables in a single chart. In general, the x-axis shows the chart’s independent variable, and the y-axis shows the chart’s dependent variable. An independent variable is not changed by the other variables in your measurement. A dependent variable, by contrast, is affected by the independent variable. A scatter plot’s purpose is to show how the dependent variable changes as the independent variable changes.

First, to read a scatter plot, make sure you understand what the independent variable (x-axis) and the dependent variable (y-axis) are measuring.

Next, examine the chart to see whether you can identify a correlation between the two fields. If the variables correlate, the points will fall along a line or curve. The stronger the correlation, the more tightly the data points follow that line or curve. Common correlations you can find using scatter plots include positive, negative, and null.

A **null correlation** means there is no clear correlation between the variables.

A **positive correlation** is when both variables move in the same direction. This means you see a positive correlation when the variables increase together or decrease together.

A **negative correlation** is when the variables move in opposite directions. This means you see a negative correlation when one variable increases as the other decreases.
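The three categories above can also be checked numerically. The following is a minimal sketch, not a definitive method: it computes the Pearson correlation coefficient, which only measures how closely points follow a straight line (it can miss a relationship that follows a curve). The function names and the `threshold` cut-off of 0.3 are assumptions chosen for illustration, and the sample data is invented.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient for paired values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized lists with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


def describe_correlation(r, threshold=0.3):
    """Label r as positive, negative, or null.

    The threshold is an illustrative assumption, not a standard value.
    """
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "null"


# Invented sample data: x values paired with three different y series.
x_values = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]
scattered = [3, 1, 4, 1, 3]

for label, ys in [("rising", rising), ("falling", falling), ("scattered", scattered)]:
    r = pearson_r(x_values, ys)
    print(label, round(r, 2), describe_correlation(r))
```

A coefficient near +1 or -1 corresponds to points that hug a straight line tightly, while a value near 0 corresponds to no clear linear pattern.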

Scatter plots help identify correlations between variables. But it’s important to remember that correlation does not equal causation. Scatter plots don’t necessarily answer why a variable changes.

Finally, locate individual data points on the chart and see where they fall on the axes. Compare them with the other points to see how the dependent variable changes with the independent variable.

Looking at individual data points after examining the overall pattern can help you better understand the relationship.

### What Type of Analysis Do Scatter Plots Support?

Scatter plots support finding correlations between two variables. For example, a simple scatter plot might reveal whether the level of precipitation affects the number of umbrellas sold on a given day.

### When and How to Use Scatter Plots for Visual Analysis

You can use scatter plots to investigate whether there is a relationship between two variables. Doing so can show if one variable is a good predictor of another.
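One way to judge whether a variable is a good predictor is to fit a straight line through the points and measure how much of the variation that line explains. The sketch below is an illustration under assumptions rather than part of the original guidance: it uses ordinary least squares, the names `fit_line` and `r_squared` are hypothetical, and the precipitation and umbrella figures are invented.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized lists with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variation in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    if ss_total == 0:
        raise ValueError("r squared is undefined when all y values are identical")
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_residual / ss_total


# Invented data: daily precipitation (mm) and umbrellas sold that day.
precipitation_mm = [0, 2, 5, 8, 12, 15]
umbrellas_sold = [3, 5, 9, 14, 18, 25]

slope, intercept = fit_line(precipitation_mm, umbrellas_sold)
fit_quality = r_squared(precipitation_mm, umbrellas_sold, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r_squared={fit_quality:.2f}")
```

An `r_squared` value close to 1 suggests the x variable predicts the y variable well with a straight line. A low value means either there is little relationship or the relationship is not linear, which is why looking at the scatter plot itself still matters.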

For example, a scatter plot can help you see if there’s a connection between an ice cream shop's sales and the average daily temperature. In this scenario, the average daily temperature is your independent variable (x-axis), and your daily ice cream sales, on a scale starting from zero, are the dependent variable (y-axis). Each data point on this chart would represent one day.

Plot each point by its average daily temperature and daily ice cream sales. After placing all the data points, you can look to see if there’s a correlation between your ice cream sales and the temperature. As mentioned before, a correlation on a scatter plot doesn't prove causation. But if you see more ice cream sales when the temperature is higher, then you could infer a relationship between these two variables.
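The ice cream scenario could be drawn roughly as follows. This is a sketch under stated assumptions: the daily records are invented, the function names are hypothetical, and the plotting step assumes the third-party matplotlib library is installed (it is imported only when the plotting function is called).

```python
# Invented data: (average daily temperature in °C, ice cream sales in dollars).
# Each tuple represents one day.
daily_records = [
    (14.2, 215), (16.4, 325), (11.9, 185), (15.2, 332),
    (18.5, 406), (22.1, 522), (19.4, 412), (25.1, 614),
    (23.4, 544), (18.1, 421), (22.6, 445), (17.2, 408),
]


def split_axes(records):
    """Separate (x, y) pairs into x-axis and y-axis value lists."""
    temperatures = [temperature for temperature, _ in records]
    sales = [amount for _, amount in records]
    return temperatures, sales


def plot_ice_cream_sales(records):
    """Draw one dot per day and return the figure (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily; third-party dependency

    temperatures, sales = split_axes(records)
    fig, ax = plt.subplots()
    # One consistent color and one consistent marker shape for every point.
    ax.scatter(temperatures, sales, color="tab:blue", marker="o")
    ax.set_xlabel("Average daily temperature (°C)")  # independent variable
    ax.set_ylabel("Daily ice cream sales ($)")       # dependent variable
    ax.set_ylim(bottom=0)  # start the sales scale at zero
    ax.set_title("Ice cream sales vs. temperature")
    return fig
```

Calling `plot_ice_cream_sales(daily_records)` and then `matplotlib.pyplot.show()` would display the chart. Keeping the data preparation in `split_axes` separate from the drawing step makes it easy to reuse the same values for other checks.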

### Great Examples of Scatter Plots

This chart looks at the correlation between sales and profits made by a store. You can tell from the chart that higher sales do not necessarily equate to higher profit.

- Each point is clearly labeled.
- There are not too many marks on the view, so viewers can see the points with minimal overlap.
- There is one consistent color.
- There is one consistent shape.

### Bad Examples of Scatter Plots and Alternatives

#### POOR EXAMPLE

This chart looks at the correlation between sales and profits, but its format makes the scatter plot hard to interpret.

- This scatter plot uses too many different shapes.
- This scatter plot uses too many different colors.
- The chart visually overwhelms the user with too much information.

#### BETTER ALTERNATIVE

A better alternative would be a side-by-side bar chart. The profits for each of the subcategories are visible, as well as the subcategories that made no profit. Chairs and phones clearly have the highest sales. Meanwhile, copiers have the highest profit.

### Sources

https://www.tableau.com/learn/whitepapers/which-chart-or-graph-is-right-for-you

https://datavizcatalogue.com/methods/scatterplot.html

https://www.carbondesignsystem.com/data-visualization/basic-charts#scatter

https://chartio.com/learn/charts/what-is-a-scatter-plot/

https://corporatefinanceinstitute.com/resources/knowledge/other/scatter-plot/

https://mste.illinois.edu/courses/ci330ms/youtsey/scatterinfo.html

https://www.itl.nist.gov/div898/handbook/eda/section3/eda33q.htm

https://www.statisticssolutions.com/directory-of-statistical-analyses-cluster-analysis/

https://help.tableau.com/current/pro/desktop/en-us/clustering.htm