Histogram Bin Size

An introduction to frequency distributions, histograms, and how to choose an appropriate bin width for visualization.

Tags: Histogram, Data Visualization, Beginner

What is a Frequency Distribution?

When analyzing data, we often want to know how frequently each value (or range of values) appears.
A frequency distribution table organizes data into intervals (called classes or bins) and counts how many data points fall into each.

Steps to create one:

  1. Find the range of the data (its minimum and maximum values)
  2. Divide the range into several intervals (bins)
  3. Count how many observations fall into each interval

Example: Suppose we have test scores ranging from 0 to 100. If we use intervals of 10 points each, counting a boundary score such as 10 in the higher interval (and a score of 100 in the last one):

  • 0–10 points: 2 students
  • 10–20 points: 5 students
  • 20–30 points: 8 students
    … and so on.
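
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `frequency_table`, the boundary convention (each bin includes its lower bound, and the last bin also includes the maximum), and the score list are all assumptions made for the example. The scores are made up so that the first three counts match the ones listed above.

```python
import math

def frequency_table(values, low, high, width):
    """Return (start, end, count) rows for equal-width bins.

    Assumed convention: each bin includes its lower bound and excludes its
    upper bound, except the last bin, which also includes `high`.
    """
    n_bins = math.ceil((high - low) / width)
    counts = [0] * n_bins
    for v in values:
        if not (low <= v <= high):
            continue  # skip values outside the chosen range
        # Clamp so that v == high lands in the last bin instead of past it.
        index = min(int((v - low) // width), n_bins - 1)
        counts[index] += 1
    return [(low + i * width, low + (i + 1) * width, c)
            for i, c in enumerate(counts)]

# Made-up scores, chosen so the first three counts are 2, 5 and 8.
scores = [3, 8, 10, 12, 15, 17, 19, 21, 22, 23, 25, 25, 27, 28, 29, 100]

table = frequency_table(scores, low=0, high=100, width=10)
for start, end, count in table:
    print(f"{start}-{end} points: {count} students")
```

Making the boundary rule explicit matters: without it, a score of exactly 10 could be counted in either of two neighbouring intervals.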

What is a Histogram?

A histogram is the graphical representation of a frequency distribution.
On the horizontal axis, we place the intervals (bins), and on the vertical axis, the frequencies.

Unlike a bar chart (which is used for categorical data), histograms show continuous intervals, so the bars touch each other with no gaps.
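
To make the link between the table and the picture concrete, here is a rough text-only sketch that turns bin counts into bars. It is only an approximation of a real histogram: the bars run sideways (bins on the vertical axis), and the function name `text_histogram` is hypothetical. The edges and counts are the first three bins of the test-score example.

```python
def text_histogram(edges, counts, symbol="#"):
    """Return one text row per bin: the interval, then a bar of `count` symbols."""
    rows = []
    for i, count in enumerate(counts):
        label = f"{edges[i]:>3}-{edges[i + 1]:<3}"
        rows.append(f"{label} | {symbol * count}")
    return "\n".join(rows)

# Bin edges and counts from the test-score example (first three bins only).
edges = [0, 10, 20, 30]
counts = [2, 5, 8]

chart = text_histogram(edges, counts)
print(chart)
```

Because neighbouring bins share an edge (10 ends one bin and starts the next), the rows describe one continuous axis, which is why histogram bars are drawn without gaps.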

Why Bin Size Matters

When drawing a histogram, we must decide the bin width (the size of each interval).
This choice dramatically affects how the distribution looks.

1. If the bin width is too large

  • The histogram looks overly smooth and hides important features.
  • Example: Grouping 0–100 into one bin shows only the overall count, but no distribution shape.

2. If the bin width is too small

  • The histogram looks too jagged.
  • Random noise in the data may overshadow the overall pattern.
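
The two failure modes can be seen by binning the same data with different widths. The sketch below uses synthetic data (200 values drawn from a normal distribution with a fixed seed and clipped to 0–100); the distribution, the seed, and the helper name `bin_counts` are assumptions chosen purely for illustration.

```python
import math
import random

def bin_counts(values, low, high, width):
    """Count values per equal-width bin; the last bin also includes `high`."""
    n_bins = math.ceil((high - low) / width)
    counts = [0] * n_bins
    for v in values:
        index = min(int((v - low) // width), n_bins - 1)
        counts[index] += 1
    return counts

# Synthetic scores: normal(60, 12), clipped to the range 0..100.
rng = random.Random(0)
data = [min(max(rng.gauss(60, 12), 0.0), 100.0) for _ in range(200)]

summary = {}
for width in (100, 10, 1):
    counts = bin_counts(data, 0, 100, width)
    summary[width] = counts
    print(f"width={width:>3}: {len(counts):>3} bins, "
          f"tallest bar={max(counts)}, empty bins={counts.count(0)}")
```

With a width of 100 everything collapses into a single bar, while a width of 1 spreads 200 points over 100 bins, so each bar holds only a handful of observations and chance fluctuations dominate.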

Rules for Choosing Bin Width

There are several guidelines in statistics to help choose bin width.

Sturges’ Rule

For $n$ data points, the recommended number of bins $k$ is:

$$k = 1 + \log_2(n)$$

Step-by-step (example with $n = 100$):

  1. Compute $\log_2(100) \approx 6.64$
  2. Add 1: $k \approx 7.64$
  3. Round to a convenient integer: about 8 bins

If the data range is $R = \max(x) - \min(x)$, a corresponding bin width is:

$$h = \frac{R}{k}$$
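
The calculation above is easy to check in code. In this sketch the function names are hypothetical, and rounding up with `math.ceil` is one assumed reading of "round to a convenient integer". The range of 100 assumes scores spanning 0 to 100, as in the earlier example.

```python
import math

def sturges_bins(n):
    """Number of bins from Sturges' rule, rounded up (an assumed rounding choice)."""
    if n < 1:
        raise ValueError("Sturges' rule needs at least one data point")
    return math.ceil(1 + math.log2(n))

def width_from_bins(data_range, k):
    """Bin width h = R / k for a data range R and k bins."""
    return data_range / k

k = sturges_bins(100)          # 1 + log2(100) is about 7.64, rounded up
h = width_from_bins(100, k)    # assumes the data span 0 to 100, so R = 100
print(f"k = {k} bins, h = {h} points per bin")
```

The guard against `n < 1` is there because the logarithm is undefined for an empty dataset, so the rule cannot give a meaningful answer in that case.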

Freedman–Diaconis Rule

This rule sets the bin width $h$ as:

$$h = \frac{2 \cdot IQR}{n^{1/3}}$$

where $IQR = Q_3 - Q_1$ is the interquartile range.

Step-by-step:

  1. Find $Q_1$ (25th percentile) and $Q_3$ (75th percentile)
  2. Compute $IQR = Q_3 - Q_1$
  3. Compute $n^{1/3}$ (the cube root of the sample size)
  4. Plug into the formula to get $h$
  5. Optionally, compute $k \approx R/h$

Because the IQR depends only on the middle half of the data, this method is robust to outliers and works well for skewed data.
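
The five steps can be sketched with Python's standard `statistics` module. Several details here are assumptions rather than part of the rule itself: the function names are hypothetical, the quartiles use the `"inclusive"` interpolation method (other quartile conventions give slightly different widths), and the bin count is rounded up.

```python
import math
import statistics

def freedman_diaconis_width(values):
    """Bin width h = 2 * IQR / n^(1/3)."""
    n = len(values)
    # Steps 1-2: quartiles and IQR (quartile convention is an assumption).
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero, so the rule gives a zero bin width")
    # Steps 3-4: divide by the cube root of the sample size.
    return 2 * iqr / n ** (1 / 3)

def bins_from_width(values, h):
    """Step 5: approximate number of bins k = R / h, rounded up."""
    data_range = max(values) - min(values)
    return math.ceil(data_range / h)

sample = [1, 2, 3, 4, 5, 6, 7, 8]   # tiny made-up dataset
h = freedman_diaconis_width(sample)
print(f"h = {h:.2f}, k = {bins_from_width(sample, h)}")
```

The zero-IQR check covers data where at least half of the values are identical; in that situation the formula yields a width of zero and another rule has to be used instead.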

Practical Advice

  • Start with Sturges’ rule as a quick guess
  • Check whether the histogram reveals the main shape without being too noisy
  • Adjust as needed (slightly larger or smaller $h$)
  • Prefer Freedman–Diaconis when there are outliers or heavy skew

The goal is not to blindly follow formulas, but to choose bins that best reveal the structure of the data.
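
To illustrate why Freedman–Diaconis is preferred when outliers are present, the sketch below computes both bin widths for a made-up dataset (the values 0 to 99) with and without a single extreme value of 1000. The helper names, the rounding-up in Sturges' rule, and the `"inclusive"` quartile method are assumptions for this example.

```python
import math
import statistics

def sturges_width(values):
    """Bin width R / k with k = 1 + log2(n), rounded up."""
    k = math.ceil(1 + math.log2(len(values)))
    return (max(values) - min(values)) / k

def fd_width(values):
    """Freedman-Diaconis bin width 2 * IQR / n^(1/3)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return 2 * (q3 - q1) / len(values) ** (1 / 3)

base = [float(x) for x in range(100)]   # made-up data: 0, 1, ..., 99
with_outlier = base + [1000.0]          # the same data plus one extreme value

for name, data in (("no outlier", base), ("one outlier", with_outlier)):
    print(f"{name:>11}: Sturges h={sturges_width(data):7.2f}, "
          f"Freedman-Diaconis h={fd_width(data):6.2f}")
```

The Sturges width depends on the full range, so one extreme value stretches it roughly tenfold here and pushes nearly all of the data into the first bin. The Freedman–Diaconis width depends on the IQR and barely moves.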

Key Takeaways

  • Frequency distribution: organizes data into intervals
  • Histogram: a graphical representation of the frequency distribution
  • Bin width: too large hides details; too small exaggerates noise
  • Use Sturges’ or Freedman–Diaconis as a starting point, then adjust pragmatically