ATE, ATT, and ATC
A clear explanation of ATE, ATT, and ATC in causal inference, using a regression-based example with visual illustrations.
Introduction: Who Is the Effect For?
In causal inference, we want to understand how much impact a treatment or intervention (like attending cram school) has. But it’s crucial to clarify who the effect is being measured for.
For example, if test scores increased after attending a cram school, are we referring to:
- All students?
- Only the students who attended?
- Students who didn’t attend, but might have?
These are all different questions. In this post, we’ll explain three key concepts that answer them:
- ATE (Average Treatment Effect): the average effect across the whole population, comparing everyone receiving the treatment with no one receiving it.
- ATT (Average Treatment effect on the Treated): the effect for those who actually received the treatment.
- ATC (Average Treatment effect on the Controls): the effect the treatment would have had on those who didn't receive it.
We’ll walk through an example using regression and clarify when and how each effect is used.
A Simple Example: Does Cram School Raise Test Scores?
Suppose we survey 100 students:
- T (treatment): Did the student attend a cram school? (1 = yes, 0 = no)
- Y (outcome): Final test score
Average test scores:
- Students who attended: 85
- Students who did not attend: 72
A naive conclusion might be: “Cram school increases scores by 13 points!” But that’s not necessarily a causal conclusion.
What Are ATE, ATT, and ATC?
ATE: Average Treatment Effect
“What if everyone attended cram school versus if no one did? How much would the average score differ?”
This is useful when evaluating general policy impact.
ATT: Effect on the Treated
“For students who actually attended, how much better did they do compared to if they hadn’t attended?”
This tells us how much the treatment helped the people who received it.
ATC: Effect on the Untreated
“For students who did not attend, what would have happened if they had?”
This tells us whether the treatment would have been helpful to those who missed it.
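In potential-outcomes notation (an assumption here, since the post does not introduce it formally), write $Y(1)$ for a student's score with cram school and $Y(0)$ for the score without it. The three questions above can then be sketched as:

```latex
\begin{aligned}
\text{ATE} &= E[\,Y(1) - Y(0)\,] \\
\text{ATT} &= E[\,Y(1) - Y(0) \mid T = 1\,] \\
\text{ATC} &= E[\,Y(1) - Y(0) \mid T = 0\,] \\
\text{ATE} &= P(T = 1)\,\text{ATT} + P(T = 0)\,\text{ATC}
\end{aligned}
```

The last line follows from the law of total expectation: the ATE is a weighted average of the ATT and the ATC, with weights given by the share of treated and untreated students.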
Why Differentiate?
The method you use can affect which of these you’re actually estimating.
For example, if students who attend cram school are more motivated or already better-performing, the ATT may differ greatly from the ATE. Being precise about who the effect is for prevents misleading conclusions.
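A small simulation can make this concrete. The sketch below assumes a hypothetical data-generating process in which more motivated students both attend more often and benefit more; every number in it is made up for illustration. Because the simulation knows both potential outcomes for each student, it can compute the true ATE, ATT, and ATC and compare them with the naive difference in group averages.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (illustrative numbers only).
motivation = rng.normal(0.0, 1.0, n)                    # confounder
p_attend = 1.0 / (1.0 + np.exp(-1.5 * motivation))      # motivated students attend more often
t = rng.random(n) < p_attend                            # True = attended cram school

y0 = 72.0 + 5.0 * motivation + rng.normal(0.0, 3.0, n)  # score without cram school
effect = 8.0 + 2.0 * motivation                         # individual effect grows with motivation
y1 = y0 + effect                                        # score with cram school

y = np.where(t, y1, y0)                                 # only one potential outcome is observed

ate = effect.mean()                  # averaged over everyone
att = effect[t].mean()               # averaged over attendees
atc = effect[~t].mean()              # averaged over non-attendees
naive = y[t].mean() - y[~t].mean()   # observed gap between the two groups

print(f"ATE={ate:.2f} ATT={att:.2f} ATC={atc:.2f} naive={naive:.2f}")
```

In this setup the ATT exceeds the ATE, which exceeds the ATC, and the naive gap is larger than all three because it also includes the attendees' head start in motivation.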
Estimating ATE via Simple Regression
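If attendance were as good as randomly assigned, we could estimate the ATE using a simple regression model:
$$Y_i = \beta_0 + \beta_1 T_i + \varepsilon_i$$
Where:
- $Y_i$ is student $i$'s test score
- $T_i$ indicates cram school attendance (1 or 0)
- $\beta_1$ is the average effect of attending (≈ ATE)
Here, the estimate $\hat{\beta}_1$ is simply the difference in average scores between the two groups (13 points in our example). It estimates the ATE only when the groups are comparable; otherwise it also absorbs pre-existing differences between them.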
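As a quick check on what this regression measures, the sketch below fits Y on an intercept and T by ordinary least squares using NumPy. The six scores are hypothetical, chosen so that the group averages match the 85 and 72 from the example. With a single binary regressor, the coefficient on T equals the raw difference in group means.

```python
import numpy as np

# Hypothetical scores for six students (illustrative only).
t = np.array([1, 1, 1, 0, 0, 0], dtype=float)           # 1 = attended cram school
y = np.array([88.0, 85.0, 82.0, 75.0, 72.0, 69.0])      # final test scores

# Design matrix with an intercept column and the treatment indicator.
X = np.column_stack([np.ones_like(t), t])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta0_hat, beta1_hat = coef

diff_in_means = y[t == 1].mean() - y[t == 0].mean()

print(round(beta0_hat, 2), round(beta1_hat, 2), diff_in_means)
```

The intercept recovers the untreated average (72) and the slope recovers the 13-point gap, so the regression by itself does nothing to remove confounding.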
Estimating ATT and ATC: It’s Not So Simple
ATT and ATC are effects for specific subgroups.
For ATT, we ask: “How would students who attended have done if they hadn’t?” But we can’t directly observe that. Worse, students who choose to attend cram school often share characteristics:
- Higher prior achievement
- More motivation
- Supportive home environment
These confounding variables (covariates) affect both the decision to attend and the test scores.
Why Adjust for Covariates?
To fairly compare treated and untreated students, we must find untreated students who resemble the treated group in background:
- Similar prior test scores
- Similar family or study environments
This isn’t just about statistical fairness. It’s also about choosing which group serves as the reference population:
- Take the treated group as the reference and find comparable untreated students → you estimate ATT
- Take the untreated group as the reference and find comparable treated students → you estimate ATC
The choice of reference group shapes the interpretation of the effect.
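One simple way to see how the reference group changes the answer is stratification (subclassification), a technique the post does not name and which is used here only as an illustration. The sketch below assumes a single hypothetical covariate, a low/high prior-achievement stratum, and eight made-up students. It computes the treated-minus-untreated gap within each stratum and then weights the strata by whichever group is the reference.

```python
import numpy as np

# Hypothetical students: prior-achievement stratum (0 = low, 1 = high),
# attendance t, and final score y. All values are made up for illustration.
stratum = np.array([0, 0, 0, 0, 1, 1, 1, 1])
t = np.array([1, 0, 0, 0, 1, 1, 1, 0])
y = np.array([76.0, 70.0, 69.0, 71.0, 91.0, 92.0, 93.0, 80.0])

def stratified_effect(reference_mask):
    """Average the within-stratum gaps, weighting each stratum by the
    number of reference-group students it contains."""
    gaps, weights = [], []
    for s in np.unique(stratum):
        in_s = stratum == s
        gap = y[in_s & (t == 1)].mean() - y[in_s & (t == 0)].mean()
        gaps.append(gap)
        weights.append((in_s & reference_mask).sum())
    return np.average(gaps, weights=weights)

att = stratified_effect(t == 1)                       # weight by treated students
atc = stratified_effect(t == 0)                       # weight by untreated students
ate = stratified_effect(np.ones_like(t, dtype=bool))  # weight by all students

print(att, atc, ate)  # 10.5 7.5 9.0
```

The within-stratum gaps are 6 points (low) and 12 points (high). Most attendees are in the high stratum, so the ATT leans toward 12, while the ATC leans toward 6. This reading is only causal under the assumption that students within a stratum are otherwise comparable.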
Using Regression with Interaction Terms
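One way to adjust for covariates is regression with interactions:
$$Y_i = \beta_0 + \beta_1 T_i + \beta_2 X_i + \beta_3 T_i X_i + \varepsilon_i$$
- $X_i$ is a covariate (e.g., prior achievement)
- $\beta_3 T_i X_i$ lets the treatment effect vary by background
This model captures how treatment effects differ by individual characteristics: the estimated effect for a student with covariate value $X_i$ is $\beta_1 + \beta_3 X_i$.
To estimate ATT, average this effect over the covariate values of those who actually received treatment. For ATC, average it over the untreated group’s covariate values. Averaging over everyone gives the ATE.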
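The sketch below illustrates this recipe with NumPy on simulated data. It assumes a hypothetical setup with one standardized covariate (prior achievement) and a true effect that grows with it. It fits a regression on the treatment indicator, the covariate, and their product, then averages each student's predicted effect over the relevant group. Variable names and numbers are illustrative, not taken from the post.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical simulation: x is standardized prior achievement.
x = rng.normal(0.0, 1.0, n)
# Students with higher prior achievement are more likely to attend.
t = (rng.random(n) < 1.0 / (1.0 + np.exp(-x))).astype(float)
# Assumed true model: the effect of attending is 8 + 2 * x.
y = 72.0 + 8.0 * t + 5.0 * x + 2.0 * t * x + rng.normal(0.0, 3.0, n)

# Fit y on [1, t, x, t * x] by ordinary least squares.
design = np.column_stack([np.ones(n), t, x, t * x])
b0, b1, b2, b3 = np.linalg.lstsq(design, y, rcond=None)[0]

# Predicted effect for each student, then average over the target group.
effect_hat = b1 + b3 * x
ate_hat = effect_hat.mean()           # everyone's covariates
att_hat = effect_hat[t == 1].mean()   # treated students' covariates
atc_hat = effect_hat[t == 0].mean()   # untreated students' covariates

print(f"ATE={ate_hat:.2f} ATT={att_hat:.2f} ATC={atc_hat:.2f}")
```

The same fitted coefficients yield three different numbers, and the only thing that changes is whose covariates are averaged over. The estimates are trustworthy only if the model is correctly specified and the covariate captures the relevant confounding.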
Interactive Visualization
This interactive tool visualizes the differences between ATE, ATT, and ATC using a simple simulation.
The tool's default view, for a population of 100 students (observed average score: 85 points for attendees, 72 points for non-attendees), shows the ATE panel:
- Key question: “What if everyone attended versus no one attended?”
- Scenario A, everyone attends cram school: average score would be about 80 points.
- Scenario B, no one attends cram school: average score would be about 72 points.
- Difference: +8.2 points, the population-wide effect of the policy.
Quick comparison of the three effects as displayed by the tool:
- ATE (all students in the population): +8.2 points
- ATT (students who attended cram school): +13 points
- ATC (students who didn't attend cram school): +9 points
Summary
- ATE, ATT, and ATC help us define who the effect is for.
- Different analytical methods estimate different effects — so we must be clear about our target.
- Estimating ATT and ATC requires accounting for covariates and deciding which group to compare against.
- Regression models with interaction terms allow flexible modeling of treatment effects across subgroups.
Understanding these distinctions helps make causal analysis more honest and insightful.