Finding Outliers in a Scatter Diagram
- David Gurney
Last updated 2/4/2019
An outlier in a scatter diagram is a data point that lies at the maximum distance from the regression line. If two or more data points share that maximum distance, they are all outliers.
Outliers are marked in each scatter diagram generated below. Move the "size" slider to select a new sample size. Check the "Show Distances" box to display line segments that are perpendicular to the regression line and run from each data point to the line. The length of each segment is the distance from that point to the regression line, and this distance is shown next to the point.
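The definition above can be sketched in a few lines of Python. This is only an illustration, not the code behind the diagrams on this page: it assumes the regression line is the ordinary least-squares line y = mx + b, and it measures the perpendicular distance from a point (x, y) to that line as |mx - y + b| / sqrt(m^2 + 1). The function names, the sample data, and the tolerance used to detect ties are all hypothetical.

```python
import math

def regression_line(points):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx  # assumes the x-values are not all equal
    return slope, mean_y - slope * mean_x

def perpendicular_distances(points, slope, intercept):
    """Perpendicular distance from each point to the line y = slope*x + intercept."""
    denom = math.sqrt(slope ** 2 + 1)
    return [abs(slope * x - y + intercept) / denom for x, y in points]

def find_outliers(points, tol=1e-9):
    """Return every point whose distance to the regression line is the maximum."""
    slope, intercept = regression_line(points)
    distances = perpendicular_distances(points, slope, intercept)
    largest = max(distances)
    # Points tied for the maximum distance (within tol) are all reported.
    return [p for p, d in zip(points, distances)
            if math.isclose(d, largest, abs_tol=tol)]

# Hypothetical sample: four points on a line and one far above it.
sample = [(1, 1), (2, 2), (3, 9), (4, 4), (5, 5)]
print(find_outliers(sample))
```

For this sample the least-squares line is y = x + 1.2, and the only point at the maximum distance is (3, 9). A sample such as (1, 2), (2, 4), (3, 6), (4, 8), (5, 20) gives a tie: both (4, 8) and (5, 20) are the same maximum distance from the line y = 4x - 4, so both are returned.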
Usually it is not necessary to find the distance from every point to the regression line. One can often identify the outliers just by looking, or by comparing the distances to the regression line for only the few points that appear farthest from it.
Updated January 29, 2018