Histogram Bin Size
An introduction to frequency distributions, histograms, and how to choose an appropriate bin width for visualization.
What is a Frequency Distribution?
When analyzing data, we often want to know how frequently each value (or range of values) appears.
A frequency distribution table organizes data into intervals (called classes or bins) and counts how many data points fall into each.
Steps to create one:
- Find the range of the data (the minimum and maximum values)
- Divide the range into several intervals (bins)
- Count how many observations fall into each interval
Example: Suppose we have test scores ranging from 0 to 100. If we use intervals of 10 points each, with a boundary score such as 10 counted in the higher interval:
- 0–10 points: 2 students
- 10–20 points: 5 students
- 20–30 points: 8 students
… and so on.
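The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `frequency_table` and the sample scores are hypothetical, and the sketch assumes each bin includes its lower edge but not its upper edge, except the last bin, which also includes the maximum.

```python
import math

def frequency_table(values, low, high, width):
    """Count values per bin [start, start + width); the last bin also includes `high`."""
    n_bins = math.ceil((high - low) / width)
    counts = [0] * n_bins
    for v in values:
        if not low <= v <= high:
            continue  # skip anything outside the stated range
        # Clamp so that v == high lands in the last bin instead of overflowing.
        index = min(int((v - low) // width), n_bins - 1)
        counts[index] += 1
    return [(low + i * width, low + (i + 1) * width, c) for i, c in enumerate(counts)]

# Hypothetical scores, invented for illustration only.
scores = [5, 10, 12, 27, 33, 48, 55, 55, 61, 70, 78, 84, 90, 100]
table = frequency_table(scores, low=0, high=100, width=10)
for start, end, count in table:
    print(f"{start:>3}-{end:<3}: {count}")
```

Under this convention a score of exactly 10 is counted in the 10–20 bin, and a score of 100 in the 90–100 bin.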
What is a Histogram?
A histogram is the graphical representation of a frequency distribution.
On the horizontal axis, we place the intervals (bins), and on the vertical axis, the frequencies.
Unlike a bar chart (which is used for categorical data), histograms show continuous intervals, so the bars touch each other with no gaps.
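As a rough illustration of how a frequency table turns into a picture, the sketch below prints a text histogram for the first three bins of the test-score example. The helper name `text_histogram` is hypothetical, and for simplicity the bins run down the page rather than along a horizontal axis.

```python
def text_histogram(edges, counts, symbol="#"):
    """Return one line per bin: the interval followed by a bar of `symbol`s."""
    lines = []
    for i, count in enumerate(counts):
        label = f"[{edges[i]}, {edges[i + 1]})"
        lines.append(f"{label:>10} | {symbol * count}")
    return lines

# Bin edges and counts taken from the test-score example (2, 5, and 8 students).
edges = [0, 10, 20, 30]
counts = [2, 5, 8]
lines = text_histogram(edges, counts)
print("\n".join(lines))
```

Each bar's length equals the frequency of its bin, and adjacent bins share an edge, which is why the bars of a drawn histogram touch.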
Why Bin Size Matters
When drawing a histogram, we must decide the bin width (the size of each interval).
This choice dramatically affects how the distribution looks.
1. If the bin width is too large
- The histogram looks overly smooth and hides important features.
- Example: Grouping 0–100 into one bin shows only the total count and nothing about the shape of the distribution.
2. If the bin width is too small
- The histogram looks too jagged.
- Random noise in the data may overshadow the overall pattern.
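Both failure modes can be seen by counting the same data with three different bin widths. The sketch below is illustrative only: `bin_counts` is a hypothetical helper, the scores are invented to form two clusters (around 30 and around 70), and all values are assumed to lie within the stated range.

```python
import math

def bin_counts(values, low, high, width):
    """Counts per bin of the given width; the last bin also includes `high`."""
    n_bins = math.ceil((high - low) / width)
    counts = [0] * n_bins
    for v in values:
        index = min(int((v - low) // width), n_bins - 1)
        counts[index] += 1
    return counts

# Hypothetical scores with two clusters, invented for illustration.
scores = [22, 25, 28, 29, 31, 33, 34, 38, 62, 66, 68, 69, 71, 72, 75, 79]

too_wide = bin_counts(scores, 0, 100, 100)   # one bin: only the total survives
moderate = bin_counts(scores, 0, 100, 10)    # ten bins: both clusters are visible
too_narrow = bin_counts(scores, 0, 100, 1)   # 100 bins: every count is 0 or 1

print(too_wide)
print(moderate)
print(max(too_narrow))
```

With a width of 100 the two clusters merge into a single count, with a width of 10 they appear as two separate groups of bins, and with a width of 1 the histogram degenerates into isolated spikes.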
Rules for Choosing Bin Width
There are several guidelines in statistics to help choose bin width.
Sturges’ Rule
For $n$ data points, the recommended number of bins is:
$$k = 1 + \log_2 n$$
Step-by-step (example with $n = 100$):
- Compute $\log_2 100 \approx 6.64$
- Add 1: $k \approx 7.64$
- Round to a convenient integer: about 8 bins
If the data range is $R$ (maximum minus minimum), a corresponding bin width is:
$$h = \frac{R}{k}$$
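A minimal sketch of Sturges’ rule in Python follows. The function names are hypothetical, the sample size of 100 and range of 100 are chosen only for illustration, and rounding up with `math.ceil` is one assumed way of picking "a convenient integer".

```python
import math

def sturges_bins(n):
    """Number of bins from Sturges' rule, 1 + log2(n), rounded up (an assumption)."""
    if n < 1:
        # log2 is undefined for n = 0, so the rule gives no meaningful answer.
        raise ValueError("Sturges' rule needs at least one data point")
    return math.ceil(1 + math.log2(n))

def bin_width(data_range, k):
    """Bin width obtained by dividing the data range into k equal bins."""
    return data_range / k

k = sturges_bins(100)   # 1 + log2(100) is about 7.64, rounded up to 8
h = bin_width(100, k)   # range of 100 split into 8 bins gives 12.5
print(k, h)
```

The explicit check for an empty sample is worth keeping: without any data the logarithm has no finite value and the rule cannot suggest a bin count.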
Freedman–Diaconis Rule
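This rule sets the bin width as:
$$h = \frac{2 \cdot \mathrm{IQR}}{n^{1/3}}$$
where $\mathrm{IQR} = Q_3 - Q_1$ is the interquartile range.
Step-by-step:
- Find $Q_1$ (25th percentile) and $Q_3$ (75th percentile)
- Compute $\mathrm{IQR} = Q_3 - Q_1$
- Compute $n^{1/3}$ (the cube root of sample size)
- Plug into the formula to get $h$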
- Optionally, compute the number of bins $k = \lceil R / h \rceil$, where $R$ is the data range (maximum minus minimum)
Because the bin width depends on the IQR rather than on the extreme values, this method is robust to outliers and works well for skewed data.
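The Freedman–Diaconis steps can be sketched with Python's standard library. This is an illustration under stated assumptions: the function name and the sample data are hypothetical, and quartiles are computed with `statistics.quantiles(..., method="inclusive")`, which is only one of several common quartile conventions.

```python
import math
import statistics

def freedman_diaconis_width(values):
    """Bin width h = 2 * IQR / n^(1/3)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return 2 * iqr / len(values) ** (1 / 3)

# Hypothetical sample with one large outlier (30), invented for illustration.
data = [2, 4, 4, 5, 6, 7, 8, 9, 30]

h = freedman_diaconis_width(data)       # Q1 = 4, Q3 = 8, so IQR = 4
data_range = max(data) - min(data)      # 28
# If the IQR were 0, h would be 0 and this division would fail;
# a real implementation would need a fallback for that case.
k = math.ceil(data_range / h)
print(round(h, 3), k)
```

Note that the outlier does not affect the width, which depends only on the quartiles, but it does stretch the range and therefore increases the number of bins.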
Practical Advice
- Start with Sturges’ rule as a quick guess
- Check whether the histogram reveals the main shape without being too noisy
- Adjust as needed (a slightly larger or smaller bin width)
- Prefer Freedman–Diaconis when there are outliers or heavy skew
The goal is not to blindly follow formulas, but to choose bins that best reveal the structure of the data.
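To compare the two rules side by side, the sketch below computes both suggestions for a small sample containing one extreme outlier. The function name `suggest_bins`, the sample, and the fallback to Sturges’ count when the IQR is zero are all assumptions made for illustration.

```python
import math
import statistics

def suggest_bins(values):
    """Return bin counts and widths suggested by Sturges and Freedman-Diaconis."""
    n = len(values)
    data_range = max(values) - min(values)
    sturges_k = math.ceil(1 + math.log2(n))
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    fd_width = 2 * (q3 - q1) / n ** (1 / 3)
    # Assumed fallback: reuse the Sturges count if the IQR (and so the width) is 0.
    fd_k = math.ceil(data_range / fd_width) if fd_width > 0 else sturges_k
    return {
        "sturges_bins": sturges_k,
        "sturges_width": data_range / sturges_k,
        "fd_bins": fd_k,
        "fd_width": fd_width,
    }

# Hypothetical sample: values between 10 and 19 plus a single outlier at 100.
sample = [10, 11, 12, 12, 13, 13, 14, 14, 15, 15, 16, 16, 17, 18, 19, 100]
result = suggest_bins(sample)
print(result)
```

For this sample Sturges’ rule suggests 5 bins of width 18, so all 15 ordinary values fall into the first bin, while Freedman–Diaconis suggests a width of roughly 2.8 that resolves the bulk of the data. Neither answer is final; both are starting points to be checked against the plot.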
Interactive Demo
Use the tool below to experiment with bin widths and see how the shape of the histogram changes:
[Interactive histogram: Effect of Bin Size. The widget displays the current number of bins alongside the values suggested by Sturges’ rule, 1 + log₂(n), and the Freedman–Diaconis rule, 2×IQR/n^(1/3).]
What to Look For:
- Too few bins: the histogram looks overly smooth and important features are hidden
- Too many bins: the histogram looks jagged and noise overshadows the pattern
- Just right: a clear shape is visible without excessive noise. Try the recommended values!
Key Takeaways
- Frequency distribution: organizes data into intervals
- Histogram: a graphical representation of the frequency distribution
- Bin width: too large hides details; too small exaggerates noise
- Use Sturges’ or Freedman–Diaconis as a starting point, then adjust pragmatically