Sensitivity, Specificity, and ROC

July 17, 2025

How to evaluate medical test performance using a 2×2 table, ROC curves, and AUC calculations.


Today’s topic: sensitivity, specificity, and the ROC curve!

Let’s start with a basic truth: no medical test is perfect.

Sometimes a test says you’re sick when you’re not — that’s a false positive.
Sometimes it says you’re healthy when you’re actually sick — that’s a false negative.

The 2×2 Table: Four Possible Test Outcomes

To understand how well a test performs, imagine a simple 2×2 table:

	Disease Present	Disease Absent
Test Positive	True Positive (TP)	False Positive (FP)
Test Negative	False Negative (FN)	True Negative (TN)

This leads to four key outcomes:

True Positive (TP): You are sick, and the test detects it.
False Positive (FP): You’re healthy, but the test says you’re sick.
True Negative (TN): You’re healthy, and the test confirms it.
False Negative (FN): You are sick, but the test misses it.
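To make the four outcomes concrete, here is a minimal Python sketch that tallies them from paired observations. The function name `confusion_counts` and the six-patient data are made up for illustration; they are not from any particular library or study.

```python
from collections import Counter


def confusion_counts(has_disease, test_positive):
    """Count TP, FP, TN, FN from two equal-length lists of booleans."""
    if len(has_disease) != len(test_positive):
        raise ValueError("inputs must have the same length")
    counts = Counter()
    for sick, positive in zip(has_disease, test_positive):
        if sick and positive:
            counts["TP"] += 1      # sick, and the test detects it
        elif not sick and positive:
            counts["FP"] += 1      # healthy, but the test says sick
        elif not sick and not positive:
            counts["TN"] += 1      # healthy, and the test confirms it
        else:
            counts["FN"] += 1      # sick, but the test misses it
    return {key: counts[key] for key in ("TP", "FP", "TN", "FN")}


# Hypothetical data for six patients (illustration only).
has_disease = [True, True, True, False, False, False]
test_positive = [True, True, False, True, False, False]
print(confusion_counts(has_disease, test_positive))
# {'TP': 2, 'FP': 1, 'TN': 2, 'FN': 1}
```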

Predictive Values: What Do Test Results Really Mean?

Positive Predictive Value (PPV)

The proportion of positive test results that are true positives:

\text{PPV} = \frac{\text{TP}}{\text{TP} + \text{FP}}

Negative Predictive Value (NPV)

The proportion of negative test results that are true negatives:

\text{NPV} = \frac{\text{TN}}{\text{TN} + \text{FN}}

These help answer practical questions like:

“If I tested positive, what are the chances I actually have it?”
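As a quick check of the two formulas, the sketch below plugs in the counts from this post's threshold 0.50 example (TP = 28, FP = 17, TN = 43, FN = 12). The helper name `predictive_values` is hypothetical, and returning `None` for an empty row is just one possible convention.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (PPV, NPV) from 2x2 counts; None where the denominator is zero."""
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


ppv, npv = predictive_values(tp=28, fp=17, tn=43, fn=12)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
# PPV = 62.2%, NPV = 78.2%
```

So in this example, a person who tests positive has roughly a 62% chance of actually having the disease.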

Sensitivity and Specificity: Focusing on the Test Mechanics

Sensitivity (True Positive Rate)

How well the test detects people who are truly sick:

\text{Sensitivity} = \frac{\text{TP}}{\text{TP} + \text{FN}}

Specificity (True Negative Rate)

How well the test identifies healthy individuals:

\text{Specificity} = \frac{\text{TN}}{\text{TN} + \text{FP}}

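Unlike the predictive values, sensitivity and specificity are each computed within a single column of the 2×2 table (sick people only, or healthy people only), so they describe the test itself rather than what a result means for one patient.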
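One way to see the difference between these "test mechanics" and the predictive values is to keep the test fixed and change who gets tested. The sketch below uses the counts from this post's threshold 0.50 example, then a hypothetical population with ten times as many healthy people; the scaling factor is an assumption chosen purely for illustration.

```python
def sensitivity(tp, fn):
    return tp / (tp + fn)


def specificity(tn, fp):
    return tn / (tn + fp)


def ppv(tp, fp):
    return tp / (tp + fp)


# Counts from the threshold 0.50 example: 40 sick, 60 healthy.
tp, fn, tn, fp = 28, 12, 43, 17
print(f"{sensitivity(tp, fn):.1%}  {specificity(tn, fp):.1%}  {ppv(tp, fp):.1%}")
# 70.0%  71.7%  62.2%

# Hypothetical: same test behavior, but ten times as many healthy people.
tn_10x, fp_10x = tn * 10, fp * 10
print(f"{sensitivity(tp, fn):.1%}  {specificity(tn_10x, fp_10x):.1%}  {ppv(tp, fp_10x):.1%}")
# 70.0%  71.7%  14.1%
```

Sensitivity and specificity stay put, while PPV drops sharply once the disease becomes rarer in the tested group.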

But what if a test gives too many false positives?
You might think: “Why not just raise the cutoff?”

Or if it’s missing real cases: “Why not lower it?”

And that’s the key issue—changing the cutoff shifts both sensitivity and specificity.
Improving one often comes at the cost of the other. This is the tradeoff we need to understand.

The Tradeoff: Adjusting the Test Threshold

Here’s the tricky part: sensitivity and specificity trade off against each other.

Changing the test’s cutoff threshold — the point where a result is called “positive” — shifts the balance:

Lowering the threshold increases sensitivity but reduces specificity.
Raising the threshold does the opposite: higher specificity, lower sensitivity.
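The tradeoff can be sketched in a few lines of Python. This assumes a result counts as positive when the score is at or above the threshold (higher scores suggest disease); the ten scores are invented for illustration.

```python
def sens_spec_at(threshold, scores, has_disease):
    """Return (sensitivity, specificity), calling score >= threshold positive."""
    tp = fp = tn = fn = 0
    for score, sick in zip(scores, has_disease):
        positive = score >= threshold
        if positive and sick:
            tp += 1
        elif positive and not sick:
            fp += 1
        elif not positive and not sick:
            tn += 1
        else:
            fn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical test scores: five sick patients, then five healthy ones.
scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.60, 0.45, 0.35, 0.20, 0.10]
has_disease = [True] * 5 + [False] * 5

for threshold in (0.3, 0.5, 0.7):
    sens, spec = sens_spec_at(threshold, scores, has_disease)
    print(f"threshold {threshold:.1f}: sensitivity {sens:.0%}, specificity {spec:.0%}")
# threshold 0.3: sensitivity 100%, specificity 40%
# threshold 0.5: sensitivity 80%, specificity 80%
# threshold 0.7: sensitivity 40%, specificity 100%
```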

Each threshold setting gives a new 2×2 table. Imagine a slider that controls this threshold. The original post includes an interactive slider for this; the snapshot below shows it at one setting.

Interactive Threshold Slider with 2×2 Table (snapshot at Test Threshold 0.50)

The slider runs from 0.00 (More Sensitive) to 1.00 (More Specific).

2×2 Confusion Matrix

	Disease Present	Disease Absent
Test Positive	True Positive: 28	False Positive: 17
Test Negative	False Negative: 12	True Negative: 43

Calculated Metrics

Sensitivity (True Positive Rate): TP / (TP + FN) = 28 / (28 + 12) = 70.0%
Specificity (True Negative Rate): TN / (TN + FP) = 43 / (43 + 17) = 71.7%
Positive Predictive Value: TP / (TP + FP) = 28 / (28 + 17) = 62.2%
Negative Predictive Value: TN / (TN + FN) = 43 / (43 + 12) = 78.2%

Moving the threshold changes the 2×2 table and every metric computed from it. Lower thresholds increase sensitivity but decrease specificity, and vice versa. This is the fundamental tradeoff in diagnostic testing.

So how do we choose the best threshold?

The ROC Curve: A Visual of All Possible Tradeoffs

To see the entire range of outcomes, we use the ROC curve. For many threshold values, we plot:

x-axis: False Positive Rate (FPR = 1 − Specificity)
y-axis: True Positive Rate (Sensitivity)

A perfect test reaches the top-left corner (FPR = 0, TPR = 1). A random test lies along the diagonal.
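Here is a rough sketch of how the points of an ROC curve can be generated: sweep the threshold from strictest to most lenient and record (FPR, TPR) each time. The function name `roc_points`, the "score at or above threshold is positive" convention, and the ten scores are all assumptions for illustration.

```python
def roc_points(scores, has_disease):
    """Return (FPR, TPR) pairs, one per threshold, ordered by increasing FPR."""
    n_pos = sum(has_disease)
    n_neg = len(has_disease) - n_pos
    # An infinite threshold calls nobody positive, giving the (0, 0) corner;
    # then every distinct score is used as a threshold, strictest first.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        tp = sum(1 for s, sick in zip(scores, has_disease) if s >= t and sick)
        fp = sum(1 for s, sick in zip(scores, has_disease) if s >= t and not sick)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical test scores: five sick patients, then five healthy ones.
scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.60, 0.45, 0.35, 0.20, 0.10]
has_disease = [True] * 5 + [False] * 5

points = roc_points(scores, has_disease)
for fpr, tpr in points:
    print(f"FPR = {fpr:.1f}, TPR = {tpr:.1f}")
```

The list starts at (0, 0), where nothing is called positive, and ends at (1, 1), where everything is.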

The original post includes a live simulation that builds the ROC curve one threshold at a time. The snapshot below shows it at a single threshold.

Live ROC Curve Builder (snapshot at Current Threshold 0.500)

Current Point Metrics

True Positives: 71
False Positives: 18
True Negatives: 102
False Negatives: 9
Sensitivity (TPR): 88.8%
False Positive Rate: 15.0%

Area Under Curve (AUC): 0.960 (perfect test: 1.000, random test: 0.500, current test: Excellent)

About AUC

The AUC represents the probability that the test will correctly rank a randomly chosen positive case higher than a randomly chosen negative case. Here it is calculated using the trapezoidal rule for numerical integration.

ROC Curve (plot): the blue line is the complete ROC curve, the red dot marks the current threshold point, and the gray dashed diagonal is a random classifier (AUC = 0.5).

Each threshold setting generates one point on the ROC curve. The AUC represents the overall discriminative ability of the test: a perfect test would hug the top-left corner (AUC = 1.0), while a random test follows the diagonal (AUC = 0.5).

AUC: Summarizing Test Performance with One Number

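The Area Under the ROC Curve (AUC) condenses the whole curve into one number: 1.0 for a perfect test, 0.5 for a random one. Because the curve is only known at a finite set of thresholds, the area is computed numerically from the ROC points, ordered by increasing FPR: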

Trapezoidal Rule:

Approximate area by summing trapezoids between ROC points:

\text{AUC}_{\text{trap}} = \sum_{i=1}^{n-1} (FPR_{i+1} - FPR_i) \cdot \frac{TPR_{i+1} + TPR_i}{2}

Rectangular Rule (Left Riemann Sum):

Simpler but slightly less accurate, because each strip uses only the TPR at its left edge:

\text{AUC}_{\text{rect}} = \sum_{i=1}^{n-1} (FPR_{i+1} - FPR_i) \cdot TPR_i
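Both rules translate almost directly into code. The sketch below assumes the ROC points are given as (FPR, TPR) pairs already sorted by increasing FPR; the function names and the four-point curve are hypothetical.

```python
def auc_trapezoid(points):
    """Trapezoidal rule over (FPR, TPR) pairs sorted by increasing FPR."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def auc_left_rect(points):
    """Left Riemann sum: each strip uses the TPR at its left edge."""
    return sum((x1 - x0) * y0
               for (x0, y0), (x1, _y1) in zip(points, points[1:]))


# Hypothetical coarse ROC curve with four points (illustration only).
points = [(0.0, 0.0), (0.2, 0.6), (0.5, 0.9), (1.0, 1.0)]
print(f"trapezoidal: {auc_trapezoid(points):.3f}")    # trapezoidal: 0.760
print(f"rectangular: {auc_left_rect(points):.3f}")    # rectangular: 0.630
```

With only a few points on a rising curve, the rectangular estimate comes out noticeably lower. The gap shrinks as more thresholds are sampled.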

In Summary

Use a 2×2 table to define sensitivity, specificity, and predictive values.
Test thresholds create a tradeoff between sensitivity and specificity.
ROC curves visualize this tradeoff, and AUC gives a single-number summary.

Understanding these tools helps clinicians choose and interpret medical tests with clarity and confidence.