Least Squares Regression

A gentle explanation of simple linear regression, combining the mathematical derivation with a visual demo of best-fit lines.

Tags: Regression, Least Squares

What Does “Closest Line” Mean?

In simple linear regression, we fit a line of the form:

y_i = \beta_0 + \beta_1 x_i + \varepsilon_i

Here, \beta_0 is the intercept, \beta_1 is the slope, and \varepsilon_i is the error term.
Given data points (x_i, y_i), the goal of least squares is to find the line that is as close as possible to all of these points, in the sense that the sum of squared vertical distances from each point to the line is minimized.

Defining the Objective Function

We measure the quality of a candidate line by the sum of squared errors:

S(\beta_0, \beta_1) = \sum_{i=1}^n \varepsilon_i^2 = \sum_{i=1}^n (y_i - (\beta_0 + \beta_1 x_i))^2

Our task: find the \beta_0, \beta_1 that minimize S.
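
To make the objective concrete, here is a minimal Python sketch; the data arrays and the sum_squared_errors helper are illustrative, not taken from the demo:

```python
import numpy as np

def sum_squared_errors(beta0, beta1, x, y):
    """Evaluate S(beta0, beta1): the sum of squared vertical distances."""
    residuals = y - (beta0 + beta1 * x)
    return np.sum(residuals ** 2)

# Illustrative data points
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

# A worse candidate line yields a larger S than a better one
print(sum_squared_errors(0.0, 1.0, x, y))  # crude guess:  5.31
print(sum_squared_errors(1.0, 1.0, x, y))  # closer fit:   0.11
```

Comparing candidate lines this way is exactly what the interactive demo below does in real time.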

Derivation: Formula for Slope and Intercept

Step 1: Partial Derivatives for Minimization

Take the partial derivatives of S with respect to \beta_0 and \beta_1, and set them to zero.

With respect to \beta_0:

\frac{\partial S}{\partial \beta_0} = -2 \sum_{i=1}^n (y_i - \beta_0 - \beta_1 x_i) = 0

With respect to \beta_1:

\frac{\partial S}{\partial \beta_1} = -2 \sum_{i=1}^n x_i (y_i - \beta_0 - \beta_1 x_i) = 0
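
As a sanity check, these analytic derivatives can be compared against numerical finite differences; a small sketch reusing the x, y arrays and the sum_squared_errors helper from the previous snippet:

```python
def gradient(beta0, beta1, x, y):
    """Analytic partial derivatives of S from Step 1."""
    r = y - (beta0 + beta1 * x)
    return -2 * np.sum(r), -2 * np.sum(x * r)

# Central finite-difference check at an arbitrary point
b0, b1, h = 0.5, 1.2, 1e-6
num_d0 = (sum_squared_errors(b0 + h, b1, x, y)
          - sum_squared_errors(b0 - h, b1, x, y)) / (2 * h)
num_d1 = (sum_squared_errors(b0, b1 + h, x, y)
          - sum_squared_errors(b0, b1 - h, x, y)) / (2 * h)
print(gradient(b0, b1, x, y))  # analytic
print((num_d0, num_d1))        # numerical; should agree closely
```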

Step 2: Normal Equations

From these two conditions we get:

  1. \sum y_i = n \beta_0 + \beta_1 \sum x_i
  2. \sum x_i y_i = \beta_0 \sum x_i + \beta_1 \sum x_i^2

Dividing the first equation by n gives \bar{y} = \beta_0 + \beta_1 \bar{x}; substituting this into the second and simplifying, we obtain:

\beta_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}
\beta_0 = \bar{y} - \beta_1 \bar{x}

where \bar{x} = \tfrac{1}{n} \sum x_i and \bar{y} = \tfrac{1}{n} \sum y_i.
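
These formulas translate directly into code. A minimal sketch (the helper name least_squares_fit is illustrative), reusing the data arrays from the earlier snippets and cross-checking against NumPy's polynomial fit:

```python
def least_squares_fit(x, y):
    """Closed-form OLS slope and intercept from the normal equations."""
    x_bar, y_bar = x.mean(), y.mean()
    beta1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1

beta0, beta1 = least_squares_fit(x, y)
print(beta0, beta1)
print(np.polyfit(x, y, deg=1))  # returns [slope, intercept]; should match
```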

Visual Demo of the Intuition

In the visualization below, you can drag the points around and see:

  • The regression line updating in real time
  • Vertical red lines showing distances from each point to the line
  • As you drag the points, notice how these red distances, and in particular their sum of squares, change

The line that minimizes the sum of these squared red distances is the least squares solution.

Interactive Least Squares Regression

[Interactive figure: draggable data points, the live regression line with its equation, red dashed vertical segments marking the errors, the mean point (x̄, ȳ), and a running sum-of-squared-errors readout. Drag the points to see how the regression line adjusts to minimize the sum of squared errors.]

Key Insights:

  • The regression line always passes through the mean point (x̄, ȳ), which follows directly from \beta_0 = \bar{y} - \beta_1 \bar{x}
  • Red dashed lines show the vertical distances from each point to the line
  • The algorithm minimizes the sum of these squared red distances
  • Try moving points to see how the line responds instantly; a static sketch of the same picture follows below
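
For readers viewing this without the interactive demo, a static version of the same picture can be sketched with matplotlib, continuing from the earlier snippets:

```python
import matplotlib.pyplot as plt

beta0, beta1 = least_squares_fit(x, y)
y_hat = beta0 + beta1 * x

fig, ax = plt.subplots()
ax.scatter(x, y, zorder=3, label="Data points")
ax.plot(x, y_hat, label="Regression line")
# Red dashed segments: vertical distances (errors) from each point to the line
for xi, yi, fi in zip(x, y, y_hat):
    ax.plot([xi, xi], [yi, fi], "r--", linewidth=1)
ax.scatter([x.mean()], [y.mean()], marker="x", s=80, color="black",
           label="Mean point (x̄, ȳ)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
plt.show()
```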

Summary

Least squares regression finds the line that is as close as possible to all data points, in the sense of minimizing the sum of squared vertical distances. The derivation may be algebraic, but the underlying concept is beautifully intuitive and easy to visualize with the demo above.
