Variance and Standard Deviation

August 17, 2025

A beginner-friendly guide to variance and standard deviation: definitions, calculation formulas, unbiased estimation, and why we square deviations.

Beginner, Variance, Standard Deviation

Variance and Standard Deviation: From Formulas to Why We Square

Variance and standard deviation are the central measures for describing how spread out data are.
In this article, we’ll cover:

Definitions and intuition
Hand calculations and the shortcut formula
Why sample variance uses $n-1$ (Bessel’s correction)
Why deviations are squared instead of absolute values
Interactive demos to build intuition

0. Notation

Data points: $x_1,\dots,x_n$
Sample mean: $\displaystyle \bar{x}=\frac{1}{n}\sum_{i=1}^n x_i$
Population variance: $\sigma^2=\mathrm{Var}(X)=\mathbb{E}\!\big[(X-\mu)^2\big]$, where $\mu=\mathbb{E}[X]$ is the population mean
Sample variance (unbiased): $\displaystyle s^2=\frac{1}{n-1}\sum_{i=1}^n (x_i-\bar{x})^2$
Standard deviation: $\sigma=\sqrt{\sigma^2}$ (population), $s=\sqrt{s^2}$ (sample)

1. Intuition: What are Variance and Standard Deviation?

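The deviations $x_i-\bar{x}$ always sum to zero, so their plain average says nothing about spread. Squaring makes every term non-negative, and averaging the squares (with the $n-1$ divisor explained in Section 4) gives the sample variance:

$$s^2=\frac{1}{n-1}\sum_{i=1}^n (x_i-\bar{x})^2.$$

Variance is expressed in squared units. Taking the square root returns to the original unit (meters, seconds, dollars, …) and gives the standard deviation. In short: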

Variance = average squared distance from the mean
Standard deviation = typical distance from the mean
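
The idea can be sketched in a few lines of Python. The data set below is made up for illustration (the heights and variable names are assumptions, not taken from the demos); it shows that raw deviations cancel out while squared deviations do not, and that the square root brings the result back to the original unit.

```python
import math

# Hypothetical heights in centimeters (illustrative values only).
heights_cm = [150, 160, 170, 180, 190]

n = len(heights_cm)
mean = sum(heights_cm) / n                     # sample mean
deviations = [x - mean for x in heights_cm]    # signed distances from the mean
squared = [d ** 2 for d in deviations]         # always non-negative

sum_of_deviations = sum(deviations)            # positive and negative parts cancel
sample_variance = sum(squared) / (n - 1)       # units: cm^2
sample_std = math.sqrt(sample_variance)        # units: cm

print(sum_of_deviations)       # 0.0
print(sample_variance)         # 250.0
print(round(sample_std, 2))    # 15.81
```

The variance of 250 is in square centimeters, which is hard to interpret directly; the standard deviation of about 15.81 cm can be compared with the data themselves.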

Interactive Demo ①: Small vs. Large Variance (Histogram + ±σ bands)

This demo renders a responsive histogram (D3) for three presets (Low, Medium, and High variance), each computed with the sample variance (denominator $n-1$).
It overlays a red mean line (labelled “$\mu$” for readability; numerically equal to $\bar{x}$ here) and shaded $\pm1\sigma$ and $\pm2\sigma$ bands to visualize typical spread.
It also lists raw data and live stats, so you can link the shape of the histogram to the numbers.

Interactive Demo: Small vs. Large Variance

Low Variance (n = 18)

Raw Data: [4.6, 4.7, 4.8, 4.8, 4.9, 4.9, 4.9, 5.0, 5.0, 5.0, 5.0, 5.1, 5.1, 5.1, 5.2, 5.2, 5.3, 5.4]

Mean (x̄): 5.00
Variance (s²): 0.04
Std Dev (s): 0.21

[Figure: distribution histogram with mean line and ±1σ, ±2σ bands]

Key Observations:

• Low variance → data clustered tightly around the mean (narrow bell)
• High variance → data spread widely from the mean (wide, flat bell)
• The smooth curve approximates the normal distribution shape
• Standard deviation bands show typical spread ranges
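
The statistics shown for the Low Variance preset can be reproduced offline. The sketch below uses Python's standard `statistics` module on the raw data listed above; the helper `share_within` is a hypothetical name introduced here to count how many points fall inside each band, and is not part of the demo itself.

```python
import statistics

# Raw data for the "Low Variance" preset, copied from the demo above.
low_variance = [4.6, 4.7, 4.8, 4.8, 4.9, 4.9, 4.9, 5.0, 5.0, 5.0,
                5.0, 5.1, 5.1, 5.1, 5.2, 5.2, 5.3, 5.4]

mean = statistics.mean(low_variance)
s2 = statistics.variance(low_variance)   # sample variance, denominator n - 1
s = statistics.stdev(low_variance)       # square root of the sample variance

def share_within(data, center, half_width):
    """Fraction of points lying within center ± half_width."""
    inside = [x for x in data if abs(x - center) <= half_width]
    return len(inside) / len(data)

print(f"mean = {mean:.2f}, s^2 = {s2:.2f}, s = {s:.2f}")
print(f"within ±1s: {share_within(low_variance, mean, s):.0%}")
print(f"within ±2s: {share_within(low_variance, mean, 2 * s):.0%}")
```

The first line should match the rounded values in the demo (5.00, 0.04, 0.21); the other two show how much of this small sample sits inside each shaded band.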

2. Shortcut Formula for Variance

Expanding the square gives a useful computational shortcut:

$$\sum_{i=1}^n (x_i-\bar{x})^2 = \sum_{i=1}^n x_i^2 - n\,\bar{x}^{\,2}. \tag{Shortcut Formula}$$

Derivation (step by step):

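Expand each term: $(x_i-\bar{x})^2=x_i^2-2x_i\bar{x}+\bar{x}^2$
Sum over $i$: $\sum (x_i-\bar{x})^2=\sum x_i^2 -2\bar{x}\sum x_i + n\bar{x}^2$
Substitute $\sum x_i=n\bar{x}$: the middle term becomes $-2n\bar{x}^2$, leaving $\sum x_i^2-2n\bar{x}^2+n\bar{x}^2=\sum x_i^2-n\bar{x}^2$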

Thus,

$$s^2=\frac{1}{n-1}\Big(\sum x_i^2-n\bar{x}^2\Big).$$
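
As a quick check that the two forms agree, here is a minimal sketch with both written as plain Python functions. The function names and the sample data are hypothetical, chosen only so the arithmetic comes out in whole numbers.

```python
def variance_definition(xs):
    """Sample variance from the definition: sum of squared deviations / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def variance_shortcut(xs):
    """Sample variance from the shortcut: (sum of squares - n * mean^2) / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return (sum(x * x for x in xs) - n * mean ** 2) / (n - 1)

data = [3, 7, 7, 19]   # hypothetical values chosen for easy arithmetic
print(variance_definition(data))   # 48.0
print(variance_shortcut(data))     # 48.0
```

The shortcut is mainly a convenience for hand calculation. In floating-point code it subtracts two numbers that can be large and nearly equal when the mean is big relative to the spread, so the definitional form or a library routine is usually the safer choice.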

3. Worked Example

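Data: $2,4,4,4,5,5,7,9$ ($n=8$)

Mean: $\bar{x}=40/8=5$
Deviations $x_i-\bar{x}$: $-3,-1,-1,-1,0,0,2,4$
Squared deviations: $9,1,1,1,0,0,4,16$
Sum of squared deviations: $32$ (shortcut check: $\sum x_i^2-n\bar{x}^2=232-200=32$)
Sample variance: $s^2=32/7\approx4.57$
Standard deviation: $s=\sqrt{32/7}\approx2.14$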
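
The same numbers can be checked with Python's standard `statistics` module. This is only a verification sketch; the population variance (`pvariance`, which divides by $n$) is included for comparison and is not part of the worked example itself.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
squared_devs = [(x - mean) ** 2 for x in data]   # 9, 1, 1, 1, 0, 0, 4, 16
ss = sum(squared_devs)                           # sum of squared deviations

s2 = statistics.variance(data)        # divides by n - 1 = 7
s = statistics.stdev(data)            # square root of s2
pop_var = statistics.pvariance(data)  # divides by n = 8, for comparison

print(f"mean = {mean:.2f}, sum of squares = {ss:.2f}")   # mean = 5.00, sum of squares = 32.00
print(f"s^2 = {s2:.2f}, s = {s:.2f}")                    # s^2 = 4.57, s = 2.14
print(f"population variance = {pop_var:.2f}")            # population variance = 4.00
```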

4. Why Divide by $n-1$? (Unbiased Estimation)

If we divide by $n$:

$$\frac{1}{n}\sum_{i=1}^n (x_i-\bar{x})^2,$$

the estimator is biased low. For independent observations with common variance $\sigma^2$, its expectation is

$$\mathbb{E}\!\left[\frac{1}{n}\sum_{i=1}^n (X_i-\bar{X})^2\right]=\frac{n-1}{n}\sigma^2.$$

Dividing by $n-1$ instead gives

$$s^2=\frac{1}{n-1}\sum_{i=1}^n (x_i-\bar{x})^2,$$

which satisfies $\mathbb{E}[s^2]=\sigma^2$, so $s^2$ is an unbiased estimator of $\sigma^2$.
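
One way to see where the factor $(n-1)/n$ comes from, sketched under the assumption that $X_1,\dots,X_n$ are independent draws with mean $\mu$ and finite variance $\sigma^2$. Measuring deviations from $\mu$ instead of $\bar{X}$ gives the identity

$$\sum_{i=1}^n (X_i-\bar{X})^2=\sum_{i=1}^n (X_i-\mu)^2-n(\bar{X}-\mu)^2.$$

Taking expectations, with $\mathbb{E}[(X_i-\mu)^2]=\sigma^2$ and $\mathbb{E}[(\bar{X}-\mu)^2]=\mathrm{Var}(\bar{X})=\sigma^2/n$:

$$\mathbb{E}\!\left[\sum_{i=1}^n (X_i-\bar{X})^2\right]=n\sigma^2-n\cdot\frac{\sigma^2}{n}=(n-1)\sigma^2.$$

Dividing by $n$ therefore leaves $\frac{n-1}{n}\sigma^2$, while dividing by $n-1$ leaves exactly $\sigma^2$. The missing $\sigma^2$ is the price of centering on $\bar{X}$, which sits closer to the sample than $\mu$ does.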

Interactive Demo ②: $n$ vs. $n-1$ (Biased vs. Unbiased)

This widget draws random samples from a fixed “population,” then shows side by side:

variance with denominator $n$ (biased),
variance with denominator $n-1$ (unbiased), and
the true population variance.

Any single sample can land above or below the truth, but averaged over many samples the $n$-denominator version underestimates the population variance, especially for small $n$, while the $n-1$ version lines up with it as theory predicts.

Interactive Demo: Understanding Biased vs Unbiased Variance

Default settings: Sample Size (n) = 8, Population Mean (μ) = 5, Population Std (σ) = 1.5

[Interactive widget: current sample data, sample histogram, sample mean (x̄), and step-by-step calculations]

Key Observations:

Why n-1? When we use the sample mean x̄ to calculate deviations, we lose one degree of freedom. The biased estimator (÷n) systematically underestimates the population variance, especially for small samples.

Bessel's Correction: Dividing by (n-1) instead of n corrects this bias. The unbiased estimator's expected value equals the true population variance: E[s²] = σ².

Try different sample sizes: Notice how the bias is more pronounced with smaller samples (n=3-10) but becomes negligible as n grows large. The histogram shows how your sample compares to the true population distribution.
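
For readers without access to the widget, the experiment can be approximated with a short simulation. This is a sketch under stated assumptions: the population is taken to be normal with the widget's default values ($\mu=5$, $\sigma=1.5$, $n=8$), and the function name `simulate`, the trial count, and the seed are choices made here rather than details of the demo.

```python
import random

def simulate(n=8, mu=5.0, sigma=1.5, trials=20_000, seed=0):
    """Average the n- and (n-1)-denominator variances over many normal samples."""
    rng = random.Random(seed)           # fixed seed so runs are repeatable
    biased_sum = 0.0
    unbiased_sum = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = sum(sample) / n
        ss = sum((x - mean) ** 2 for x in sample)   # sum of squared deviations
        biased_sum += ss / n
        unbiased_sum += ss / (n - 1)
    return biased_sum / trials, unbiased_sum / trials

biased_avg, unbiased_avg = simulate()
true_var = 1.5 ** 2                     # sigma^2 = 2.25
print(f"true variance      : {true_var:.3f}")
print(f"average with n     : {biased_avg:.3f}  (theory: {7 / 8 * true_var:.3f})")
print(f"average with n - 1 : {unbiased_avg:.3f}")
```

With $n=8$ the $n$-denominator average should settle near $\tfrac{7}{8}\cdot 2.25\approx 1.969$, while the $n-1$ average should settle near $2.25$. Raising `n` in the call shrinks the gap between the two.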

5. Why Do We Square Deviations?

Compatible with the mean (least squares): minimizing $\sum(x_i-m)^2$ over $m$ yields $m=\bar{x}$, whereas minimizing $\sum|x_i-m|$ yields the median.
Clean algebra: the identity $\mathrm{Var}(X)=\mathbb{E}[X^2]-(\mathbb{E}[X])^2$ relies on squaring.
Smooth and differentiable: the square is differentiable everywhere, which matters for optimization and regression (the absolute value has a kink at $0$).
Geometric meaning: squared deviations are squared Euclidean distances, so orthogonal decompositions (ANOVA, regression) behave like the Pythagorean theorem.
Additivity for independent sums: if $X\perp Y$, then $\mathrm{Var}(X+Y)=\mathrm{Var}(X)+\mathrm{Var}(Y)$.
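
The link between squaring and the mean can be checked numerically. The sketch below uses a crude grid search (an illustrative choice, not an efficient method) over candidate centers $m$; the data are the worked-example values plus one made-up outlier, so that the mean and the median differ visibly.

```python
import statistics

# Worked-example data plus one made-up outlier (41).
data = [2, 4, 4, 4, 5, 5, 7, 9, 41]

def sum_squared(m):
    """Total squared distance from the candidate center m."""
    return sum((x - m) ** 2 for x in data)

def sum_absolute(m):
    """Total absolute distance from the candidate center m."""
    return sum(abs(x - m) for x in data)

# Candidate centers from 0.00 to 41.00 in steps of 0.01.
candidates = [i / 100 for i in range(0, 4101)]

best_sq = min(candidates, key=sum_squared)
best_abs = min(candidates, key=sum_absolute)

print(best_sq, statistics.mean(data))      # 9.0 9
print(best_abs, statistics.median(data))   # 5.0 5
```

The squared criterion is pulled toward the outlier and lands on the mean; the absolute criterion stays at the median.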

(Caveat: squaring is sensitive to outliers; use robust alternatives like the median and MAD when needed.)
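
To illustrate the caveat, here is a minimal sketch comparing the standard deviation with the MAD, read here as the median absolute deviation (an assumption, since the abbreviation is not spelled out). The helper `mad` and both data sets are hypothetical.

```python
import statistics

def mad(xs):
    """Median absolute deviation: median of |x - median(xs)|."""
    center = statistics.median(xs)
    return statistics.median([abs(x - center) for x in xs])

clean = [4, 5, 5, 6, 6, 6, 7, 7, 8]
dirty = [4, 5, 5, 6, 6, 6, 7, 7, 80]   # last value replaced by a made-up outlier

print(round(statistics.stdev(clean), 2), mad(clean))   # 1.22 1
print(round(statistics.stdev(dirty), 2), mad(dirty))   # 24.77 1
```

A single extreme value inflates the standard deviation roughly twentyfold in this example, while the MAD does not move.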

6. Summary

Variance is the average squared deviation from the mean; the standard deviation is its square root, expressed in the original units.
The shortcut formula $\sum x_i^2-n\bar{x}^2$ simplifies hand calculations.
Dividing by $n-1$ gives an unbiased estimate of the population variance.
Squaring offers algebraic, geometric, and optimization advantages.
