Find Outliers With Box Plots

The box and whisker plot uses four descriptive statistics to visualize outliers and data anomalies. Three of them are "percentiles": the 25th percentile, 50th percentile, and 75th percentile. If the 25th percentile of a dataset is $100.23, then 25% of the values in the dataset fall below $100.23. Likewise, if the 75th percentile is $365.00, then 75% of the values fall below $365.00. The 50th percentile, in this case $220, splits the data in half: half the values fall below $220 and the other half fall above it. In business statistics, the 50th percentile is called the "median" of a dataset.
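
As a rough illustration of the three percentiles, the sketch below computes them with Python's standard library. The dollar amounts are hypothetical and chosen only so the arithmetic is easy to follow; they are not the dataset behind the $100.23, $220, and $365.00 figures. Percentile conventions also differ between tools, so a spreadsheet may return slightly different values than the "inclusive" method assumed here.

```python
import statistics

# Hypothetical dollar amounts, invented purely for illustration.
amounts = [50.0, 100.0, 150.0, 200.0, 250.0, 300.0, 350.0, 400.0, 450.0]

# n=4 asks for the three cut points that split the data into quarters.
# method="inclusive" is an assumption: it interpolates between data points
# and treats the list as the complete dataset rather than a sample.
p25, p50, p75 = statistics.quantiles(amounts, n=4, method="inclusive")

print(f"25th percentile: ${p25:.2f}")
print(f"50th percentile (median): ${p50:.2f}")
print(f"75th percentile: ${p75:.2f}")
```

For this list the 50th percentile matches statistics.median(amounts), which is the sense in which the 50th percentile is the median.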

A box and whisker graph uses the 25th and 75th percentile values to calculate the interquartile range (IQR) statistic for a dataset. The IQR is calculated by subtracting the 25th percentile ($100.23) from the 75th percentile ($365.00): $365.00 - $100.23 = $264.77 is the IQR for this dataset. The IQR statistic is used to calculate the top and bottom whiskers of the box plot graph. Any data point that is greater than the top whisker value on a box and whisker plot is a "High Outlier", also called a data anomaly in business statistics. Values that are below the bottom whisker are "Low Outlier" data anomalies.
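
The text above does not say how far the whiskers sit from the box, so the sketch below assumes the common convention of placing them 1.5 times the IQR beyond the 25th and 75th percentiles. The multiplier k, the function names whisker_limits and classify_outliers, and the sample amounts are all hypothetical; the template itself may use a different rule.

```python
import statistics


def whisker_limits(q1, q3, k=1.5):
    """Return (iqr, bottom, top) for the given quartiles.

    k=1.5 is an assumed multiplier, not a value stated in the article.
    """
    iqr = q3 - q1
    return iqr, q1 - k * iqr, q3 + k * iqr


def classify_outliers(values, k=1.5):
    """Split values into (low_outliers, high_outliers) using whisker limits."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    _iqr, bottom, top = whisker_limits(q1, q3, k)
    low = [v for v in values if v < bottom]
    high = [v for v in values if v > top]
    return low, high


# Percentile values quoted in the article.
iqr, bottom, top = whisker_limits(100.23, 365.00)
print(f"IQR: ${iqr:.2f}")

# Hypothetical amounts with one unusually large value.
amounts = [50.0, 100.0, 150.0, 200.0, 250.0, 300.0, 350.0, 400.0, 450.0, 1200.0]
low_outliers, high_outliers = classify_outliers(amounts)
print("Low outliers:", low_outliers)
print("High outliers:", high_outliers)
```

Under this assumed rule, a bottom limit below zero simply means that no non-negative dollar amount can be flagged as a Low Outlier.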

The free outlier and anomaly detection template works on one column of data in your spreadsheet. The Scorecard outlier and anomaly detection template analyzes 3 columns of data at once. The Time Series outlier and anomaly detection template automatically finds patterns and trends in 12 columns of data using the same outlier and anomaly detection technique as the free template. Read the free case study to learn more.