Boxplot and IQR
Learn how to read boxplots, calculate IQR, and detect outliers in data.
Are mean and variance enough to summarize data?
In statistics, mean and variance are commonly used to summarize datasets.
However, these values alone do not always provide the full picture, especially regarding skewness or the presence of outliers.
For example:
- The mean is highly sensitive to extreme values (outliers).
- The variance shows overall spread but does not reveal where most data points are concentrated.
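As a small illustration of the first point, the sketch below compares the mean and the median before and after one extreme value is added. The numbers are made up for this example and do not come from a real dataset.

```python
from statistics import mean, median

# Hypothetical values chosen only to illustrate the effect of one extreme point.
values = [2, 4, 5, 7, 8]
with_outlier = values + [100]

# The mean moves a lot when the extreme value is added...
print("mean:  ", mean(values), "->", mean(with_outlier))

# ...while the median barely changes.
print("median:", median(values), "->", median(with_outlier))
```

A single extreme value pulls the mean far away from where most of the data sits, while the median stays close to the bulk of the values.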
To address this, we often use a boxplot.
What is a boxplot?
A boxplot is a visual summary of a data distribution. It shows the median, the quartiles, the spread of the data, and potential outliers in a single diagram.
A boxplot consists of:
- Box: represents the middle 50% of the data (from Q1 to Q3).
- Line inside the box: the median (Q2).
- Whiskers: typically extend to the minimum and maximum values, excluding outliers.
- Dots (●): outliers that fall outside the whiskers.
Quartiles and IQR
To find the quartiles, first sort the data in ascending order. The quartiles then split the sorted data into four parts:
- Q1 (first quartile): the 25th percentile
- Q2 (median): the 50th percentile
- Q3 (third quartile): the 75th percentile
The interquartile range (IQR) is:
IQR = Q3 - Q1
This measures the spread of the middle 50% of the data.
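The definitions above can be sketched in a few lines of Python. Several conventions exist for computing quartiles, and they can give slightly different values on small datasets. This sketch assumes the "median of halves" convention, in which the overall median is left out of both halves when the number of points is odd. The function names `quartiles` and `iqr` are illustrative, not part of any library.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (an assumption)."""
    xs = sorted(data)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data points")
    half = n // 2
    lower = xs[:half]
    # For an odd number of points, skip the middle element in the upper half too.
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return median(lower), median(xs), median(upper)

def iqr(data):
    """Interquartile range: Q3 - Q1."""
    q1, _, q3 = quartiles(data)
    return q3 - q1

# Hypothetical data, given unsorted on purpose.
sample = [7, 1, 5, 3, 8, 2, 6, 4]
print(quartiles(sample))
print(iqr(sample))
```

Sorting inside the function means callers do not have to sort their data first.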
Defining outliers
A common rule defines outliers using the IQR:
- Lower bound: Q1 - 1.5 × IQR
- Upper bound: Q3 + 1.5 × IQR
Any point outside this range is considered an outlier and shown as a dot beyond the whiskers.
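The rule above can be sketched as follows. The helper names (`iqr_bounds`, `split_outliers`, `whisker_ends`) and the sample data are hypothetical. The quartiles are passed in as arguments so that the sketch does not depend on any particular quartile convention.

```python
def iqr_bounds(q1, q3, k=1.5):
    """Return (lower, upper) bounds: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def split_outliers(data, q1, q3, k=1.5):
    """Split data into (points within the bounds, outliers)."""
    low, high = iqr_bounds(q1, q3, k)
    inside = [x for x in data if low <= x <= high]
    outliers = [x for x in data if x < low or x > high]
    return inside, outliers

def whisker_ends(data, q1, q3, k=1.5):
    """Whiskers reach the smallest and largest non-outlier points."""
    inside, _ = split_outliers(data, q1, q3, k)
    return min(inside), max(inside)

# Hypothetical data with one extreme value; Q1 = 12 and Q3 = 18 are assumed
# to have been computed beforehand.
data = [10, 12, 13, 15, 16, 18, 50]
print(iqr_bounds(12, 18))
print(split_outliers(data, 12, 18)[1])
print(whisker_ends(data, 12, 18))
```

The multiplier `k` defaults to 1.5 to match the rule stated above. It is exposed as a parameter only so the effect of a stricter or looser threshold can be explored.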
Example
Dataset:
[2, 4, 5, 7, 8, 10, 15, 18, 20]
- Data is already sorted.
- Median (Q2) = 8
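- Q1 = 4.5 (median of the lower half [2, 4, 5, 7], i.e. the average of 4 and 5)
- Q3 = 16.5 (median of the upper half [10, 15, 18, 20], i.e. the average of 15 and 18)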
- IQR = 16.5 - 4.5 = 12
Outlier detection:
- Lower bound = 4.5 - 1.5 × 12 = -13.5
- Upper bound = 16.5 + 1.5 × 12 = 34.5
All points fall within this range, so no outliers exist.
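The worked example can be checked with Python's standard library. This sketch uses `statistics.quantiles` with `method="exclusive"`, which for this particular dataset reproduces the hand-computed quartiles. That agreement is not guaranteed in general, because quartile conventions differ. The `"inclusive"` method is shown for comparison and gives different quartiles for the same data.

```python
from statistics import quantiles

data = [2, 4, 5, 7, 8, 10, 15, 18, 20]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(data, n=4, method="exclusive")
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower or x > upper]

print("Q1, Q2, Q3:", q1, q2, q3)
print("IQR:", iqr)
print("bounds:", lower, upper)
print("outliers:", outliers)

# A different convention yields different quartiles for the same data.
q1_inc, _, q3_inc = quantiles(data, n=4, method="inclusive")
print("inclusive Q1, Q3:", q1_inc, q3_inc)
```

Because tools may default to different conventions, it is worth checking which one a given plotting or statistics package uses before comparing its quartiles with hand calculations.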
Interactive Demo
In the demo below, you can choose from preset datasets to explore boxplots.
The tool will automatically calculate quartiles, IQR, and highlight outliers.
How to read this vertical boxplot:
- The box contains the middle 50% of the data (from Q1 to Q3)
- The thick line inside the box shows the median (Q2)
- The whiskers extend to the furthest non-outlier points
- Red dots represent outliers more than 1.5 × IQR beyond the quartiles
- Blue dots on the right show all individual data points
- Quartile labels are positioned on the left with connecting dotted lines