Histogram Bin Size

August 17, 2025

An introduction to frequency distributions, histograms, and how to choose an appropriate bin width for visualization.

Tags: Histogram, Data Visualization, Beginner

What is a Frequency Distribution?

When analyzing data, we often want to know how frequently each value (or range of values) appears.
A frequency distribution table organizes data into intervals (called classes or bins) and counts how many data points fall into each.

Steps to create one:

Check the range of the data
Divide the range into several intervals (bins)
Count how many observations fall into each interval

Example: Suppose we have test scores ranging from 0 to 100. If we use intervals of 10 points each:

0 to under 10 points: 2 students
10 to under 20 points: 5 students
20 to under 30 points: 8 students
… and so on.
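The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `frequency_table` and the sample scores are made up for this example, and it assumes each bin includes its lower edge but not its upper edge (except the last bin, which also includes the maximum).

```python
def frequency_table(values, low, high, width):
    """Count values in bins [low, low+width), [low+width, low+2*width), ...

    The last bin is closed on the right so that a value equal to `high`
    is still counted.
    """
    n_bins = int((high - low) // width)
    counts = [0] * n_bins
    for v in values:
        if not low <= v <= high:
            continue  # skip values outside the stated range
        idx = min(int((v - low) // width), n_bins - 1)
        counts[idx] += 1
    return [(low + i * width, low + (i + 1) * width, c)
            for i, c in enumerate(counts)]

# Hypothetical scores chosen so the first three bins hold 2, 5 and 8 students
scores = [3, 8,
          12, 15, 15, 18, 19,
          21, 22, 24, 25, 25, 27, 28, 29]

table = frequency_table(scores, low=0, high=100, width=10)
for start, end, count in table[:3]:
    print(f"{start} to under {end}: {count} students")
```

Making the bin edges half-open avoids the question of which bin a score of exactly 10 or 20 belongs to.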

What is a Histogram?

A histogram is the graphical representation of a frequency distribution.
On the horizontal axis, we place the intervals (bins), and on the vertical axis, the frequencies.

Unlike a bar chart (which is used for categorical data), histograms show continuous intervals, so the bars touch each other with no gaps.
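To make the link between the table and the picture concrete, here is a rough text-only sketch that draws one row of `#` characters per bin. The helper `render_histogram` and the counts are hypothetical; a real plot would normally come from a plotting library, with bins on the horizontal axis as described above.

```python
def render_histogram(bins):
    """Return one text line per bin: interval label, then one '#' per observation."""
    lines = []
    for low, high, count in bins:
        lines.append(f"{low:>3}-{high:<3} | {'#' * count} ({count})")
    return lines

# Hypothetical frequency table: (bin start, bin end, count)
table = [(0, 10, 2), (10, 20, 5), (20, 30, 8), (30, 40, 4)]
print("\n".join(render_histogram(table)))
```

Here the bars run sideways, but the idea is the same: bar length is proportional to the frequency in each interval.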

Why Bin Size Matters

When drawing a histogram, we must decide the bin width (the size of each interval).
This choice dramatically affects how the distribution looks.

1. If the bin width is too large

The histogram looks overly smooth and hides important features.
Example: Grouping 0–100 into one bin shows only the overall count, but no distribution shape.

2. If the bin width is too small

The histogram looks too jagged.
Random noise in the data may overshadow the overall pattern.
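The two extremes can be seen numerically with a small experiment. The sketch below is only illustrative: the data are synthetic (two made-up groups of scores centred near 35 and 75), and `bin_counts` is a hypothetical helper that splits the range 0 to 100 into `k` equal-width bins.

```python
import random

def bin_counts(values, low, high, k):
    """Count values in k equal-width bins covering [low, high]."""
    width = (high - low) / k
    counts = [0] * k
    for v in values:
        idx = min(int((v - low) / width), k - 1)  # v == high goes in the last bin
        counts[idx] += 1
    return counts

random.seed(0)
# Synthetic bimodal scores, clipped to the range 0..100
data = [min(100.0, max(0.0, random.gauss(35, 8))) for _ in range(150)]
data += [min(100.0, max(0.0, random.gauss(75, 8))) for _ in range(150)]

for k in (1, 10, 100):
    counts = bin_counts(data, 0, 100, k)
    print(f"bins={k:3d}  width={100 / k:5.1f}  "
          f"tallest bar={max(counts):3d}  empty bins={counts.count(0)}")
```

With a single bin the two groups are indistinguishable; with 100 bins each bar holds only a handful of points, so bar heights mostly reflect chance.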

Rules for Choosing Bin Width

There are several guidelines in statistics to help choose bin width.

Sturges’ Rule

For $n$ data points, the recommended number of bins $k$ is:

$$k = 1 + \log_2(n)$$

Step-by-step (example with $n=100$):

Compute $\log_2(100)\approx 6.64$
Add 1: $k \approx 7.64$
Round to a convenient integer: about 8 bins

If the data range is $R = \max(x)-\min(x)$, a corresponding bin width is:

$$h = \frac{R}{k}$$
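The calculation can be sketched in Python as follows. The function names are made up for this example, and rounding up with `math.ceil` is one possible choice for "a convenient integer", not the only one.

```python
import math

def sturges_bins(n):
    """Number of bins from Sturges' rule, rounded up to a whole bin."""
    return math.ceil(1 + math.log2(n))

def sturges_width(data):
    """Bin width h = R / k, where R is the range of the data."""
    k = sturges_bins(len(data))
    data_range = max(data) - min(data)
    return data_range / k

print(sturges_bins(100))  # 1 + log2(100) is about 7.64, rounded up to 8

# Hypothetical data: 100 evenly spaced scores from 0 to 99
scores = list(range(100))
print(sturges_width(scores))  # range 99 divided by 8 bins
```

Because $k$ grows only with the logarithm of $n$, doubling the sample size adds just one bin.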

Freedman–Diaconis Rule

This rule sets the bin width $h$ as:

$$h = \frac{2 \cdot IQR}{n^{1/3}}$$

where $IQR = Q_3 - Q_1$ is the interquartile range.

Step-by-step:

Find $Q_1$ (25th percentile) and $Q_3$ (75th percentile)
Compute $IQR = Q_3 - Q_1$
Compute $n^{1/3}$ (the cube root of sample size)
Plug into the formula to get $h$
Optionally, compute $k \approx R/h$

Because the IQR depends only on the middle half of the data, this rule is robust to outliers and tends to work well for skewed data.
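The steps above might look like this in Python. This is a sketch under a few assumptions: the function names are invented for illustration, quartiles are computed with the standard library's `statistics.quantiles` using its "inclusive" method (other quartile conventions give slightly different values), and the sample data are made up.

```python
import math
import statistics

def freedman_diaconis_width(data):
    """Bin width h = 2 * IQR / n^(1/3)."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return 2 * iqr / len(data) ** (1 / 3)

def freedman_diaconis_bins(data):
    """Approximate bin count k = R / h, rounded up."""
    h = freedman_diaconis_width(data)
    if h == 0:
        raise ValueError("IQR is zero; the rule gives no usable width")
    return math.ceil((max(data) - min(data)) / h)

# Hypothetical data: the integers 1..27, then the same with one extreme outlier
clean = list(range(1, 28))
with_outlier = list(range(1, 27)) + [1000]

print(freedman_diaconis_width(clean))
print(freedman_diaconis_width(with_outlier))  # same width: quartiles did not move
print(freedman_diaconis_bins(with_outlier))   # but the range, and so k, grows
```

The outlier leaves $h$ unchanged because $Q_1$ and $Q_3$ are unaffected, although the number of bins still rises since the range $R$ is much larger.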

Practical Advice

Start with Sturges’ rule as a quick guess
Check whether the histogram reveals the main shape without being too noisy
Adjust as needed (slightly larger or smaller $h$)
Prefer Freedman–Diaconis when there are outliers or heavy skew

The goal is not to blindly follow formulas, but to choose bins that best reveal the structure of the data.
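As a rough illustration of why the two rules are only starting points, the sketch below applies both to the same skewed sample. Everything here is an assumption made for the example: the data are synthetic (exponentially distributed, generated with a fixed seed), the function names are hypothetical, and falling back to Sturges' rule when the IQR is zero is just one possible choice.

```python
import math
import random
import statistics

def sturges_bins(data):
    """Sturges' rule, rounded up to a whole number of bins."""
    return math.ceil(1 + math.log2(len(data)))

def freedman_diaconis_bins(data):
    """Freedman-Diaconis width converted to a bin count k = R / h."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    h = 2 * (q3 - q1) / len(data) ** (1 / 3)
    if h == 0:
        return sturges_bins(data)  # fallback chosen for this sketch only
    return math.ceil((max(data) - min(data)) / h)

random.seed(42)
# Synthetic right-skewed sample: most values are small, a few are large
data = [random.expovariate(1.0) for _ in range(1000)]

k_sturges = sturges_bins(data)
k_fd = freedman_diaconis_bins(data)
print(f"Sturges: {k_sturges} bins, Freedman-Diaconis: {k_fd} bins")
```

When the two suggestions differ this much, it is worth drawing the histogram with both and judging which one shows the shape more clearly.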

Key Takeaways

Frequency distribution: organizes data into intervals (bins) and counts the observations in each
Histogram: a graphical representation of the frequency distribution
Bin width: too large hides details; too small exaggerates noise
Use Sturges’ or Freedman–Diaconis as a starting point, then adjust pragmatically