DoWhy Tutorial 03: Backdoor Adjustment And Confounding
This notebook teaches the most common observational causal design: estimating a treatment effect after adjusting for observed common causes. The key idea is the backdoor path. A backdoor path is a noncausal path from treatment to outcome that begins with an arrow pointing into the treatment, typically by going backward into a common cause.
We will build a dataset where treatment assignment is confounded, show why the naive comparison is biased, use the causal graph to choose an adjustment set, estimate the effect with DoWhy, and then study failure cases: partial adjustment, bad controls, colliders, and unobserved confounding.
Learning Goals
By the end of this notebook, you should be able to:
Define confounding in terms of common causes and backdoor paths.
Explain why treated and untreated groups can differ even before treatment.
Identify a valid adjustment set for a total-effect question.
Estimate the same effect with manual regression and DoWhy.
Recognize partial adjustment, mediator adjustment, collider adjustment, and unobserved confounding as different failure modes.
Write a clear backdoor-adjustment summary with assumptions and limitations.
The Backdoor Idea In One Picture
Suppose A is treatment and Y is outcome. If a pre-treatment variable C causes both A and Y, then there is a path:
A <- C -> Y
That path creates a relationship between treatment and outcome even if treatment had no causal effect. Backdoor adjustment tries to block such paths by comparing treated and untreated units that are similar in C.
The goal is not to control for everything. The goal is to control for the right pre-treatment common causes.
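The picture above can be reproduced in a few lines of simulation. The sketch below is illustrative only (names and coefficients are invented here, not taken from this notebook's dataset): C causes both A and Y, A has zero true effect on Y, yet the naive group comparison finds a large one, and adjusting for C removes it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
C = rng.normal(size=n)                           # common cause
A = (C + rng.normal(size=n) > 0).astype(float)   # treatment depends on C only
Y = 2.0 * C + rng.normal(size=n)                 # outcome depends on C only; A has zero effect

# Naive comparison: difference in group means, biased by the backdoor path A <- C -> Y.
naive = Y[A == 1].mean() - Y[A == 0].mean()

# Backdoor adjustment: OLS of Y on [1, A, C] blocks the path through C.
X = np.column_stack([np.ones(n), A, C])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
adjusted = beta[1]

print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")  # naive far from 0, adjusted near 0
```

The naive difference is driven entirely by the fact that treated units have higher C; once C is held fixed, the apparent effect disappears.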
Setup
This setup cell imports the packages used in the notebook, creates output folders, fixes a random seed, and suppresses known third-party compatibility warnings. It follows the same structure as the earlier DoWhy tutorials so students can focus on the causal content.
from pathlib import Path
import os
import platform
import sys
import warnings

START_DIR = Path.cwd().resolve()
PROJECT_ROOT = next(
    (candidate for candidate in [START_DIR, *START_DIR.parents] if (candidate / "pyproject.toml").exists()),
    START_DIR,
)
NOTEBOOK_DIR = PROJECT_ROOT / "notebooks" / "tutorials" / "dowhy"
OUTPUT_DIR = NOTEBOOK_DIR / "outputs"
FIGURE_DIR = OUTPUT_DIR / "figures"
TABLE_DIR = OUTPUT_DIR / "tables"
CACHE_DIR = PROJECT_ROOT / ".cache" / "matplotlib"
for directory in [OUTPUT_DIR, FIGURE_DIR, TABLE_DIR, CACHE_DIR]:
    directory.mkdir(parents=True, exist_ok=True)

os.environ.setdefault("MPLCONFIGDIR", str(CACHE_DIR))

warnings.filterwarnings("default")
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=PendingDeprecationWarning)
warnings.filterwarnings("ignore", category=FutureWarning)
warnings.filterwarnings("ignore", message=".*IProgress not found.*")
warnings.filterwarnings("ignore", message=".*setParseAction.*deprecated.*")
warnings.filterwarnings("ignore", message=".*copy keyword is deprecated.*")
warnings.filterwarnings("ignore", message=".*disp.*iprint.*L-BFGS-B.*")
warnings.filterwarnings("ignore", module="dowhy.causal_estimators.regression_estimator")
warnings.filterwarnings("ignore", module="sklearn.linear_model._logistic")
warnings.filterwarnings("ignore", module="seaborn.categorical")
warnings.filterwarnings("ignore", module="pydot.dot_parser")

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import networkx as nx
import statsmodels.formula.api as smf
import dowhy
from dowhy import CausalModel

RANDOM_SEED = 23
rng = np.random.default_rng(RANDOM_SEED)
sns.set_theme(style="whitegrid", context="notebook")

print(f"Python executable: {sys.executable}")
print(f"Python version: {platform.python_version()}")
print(f"DoWhy version: {getattr(dowhy, '__version__', 'unknown')}")
print(f"Notebook directory: {NOTEBOOK_DIR}")
print(f"Output directory: {OUTPUT_DIR}")
The notebook is ready if this cell prints a DoWhy version. All outputs from this notebook use a 03_ prefix.
The Causal Question
We will use a product-style teaching example. A platform exposes some users to a feature, and we want to know whether the exposure increases future user value.
The causal question is:
What is the total average effect of feature_exposure on weekly_value?
The word total means we include all causal pathways from exposure to weekly value, including pathways through post-exposure activity.
Roles And Timing
Backdoor adjustment depends heavily on timing. This table lists each variable, when it is measured, and whether it is appropriate to adjust for when estimating the total effect.
role_table = pd.DataFrame(
    [
        {"variable": "feature_exposure", "role": "treatment", "timing": "treatment time",
         "total_effect_adjustment_guidance": "treatment, not a control"},
        {"variable": "weekly_value", "role": "outcome", "timing": "future outcome window",
         "total_effect_adjustment_guidance": "outcome, not a control"},
        {"variable": "user_engagement", "role": "observed confounder", "timing": "pre-treatment",
         "total_effect_adjustment_guidance": "adjust"},
        {"variable": "prior_sessions", "role": "observed confounder", "timing": "pre-treatment",
         "total_effect_adjustment_guidance": "adjust"},
        {"variable": "account_age_weeks", "role": "observed confounder", "timing": "pre-treatment",
         "total_effect_adjustment_guidance": "adjust"},
        {"variable": "baseline_value", "role": "observed confounder", "timing": "pre-treatment",
         "total_effect_adjustment_guidance": "adjust"},
        {"variable": "post_exposure_activity", "role": "mediator / bad control for total effect",
         "timing": "post-treatment",
         "total_effect_adjustment_guidance": "do not adjust for total effect"},
        {"variable": "support_ticket", "role": "collider / bad control", "timing": "post-treatment",
         "total_effect_adjustment_guidance": "do not adjust"},
        {"variable": "treatment_probability", "role": "simulation diagnostic",
         "timing": "known only because this is simulated",
         "total_effect_adjustment_guidance": "do not use as a real observed column"},
    ]
)
role_table.to_csv(TABLE_DIR / "03_variable_roles_and_timing.csv", index=False)
role_table
| | variable | role | timing | total_effect_adjustment_guidance |
|---|---|---|---|---|
| 0 | feature_exposure | treatment | treatment time | treatment, not a control |
| 1 | weekly_value | outcome | future outcome window | outcome, not a control |
| 2 | user_engagement | observed confounder | pre-treatment | adjust |
| 3 | prior_sessions | observed confounder | pre-treatment | adjust |
| 4 | account_age_weeks | observed confounder | pre-treatment | adjust |
| 5 | baseline_value | observed confounder | pre-treatment | adjust |
| 6 | post_exposure_activity | mediator / bad control for total effect | post-treatment | do not adjust for total effect |
| 7 | support_ticket | collider / bad control | post-treatment | do not adjust |
| 8 | treatment_probability | simulation diagnostic | known only because this is simulated | do not use as a real observed column |
The pre-treatment common causes are the backdoor adjustment variables. The mediator and collider are useful for teaching, but they should not be included as controls for the total effect.
Create A Confounded Dataset
This cell simulates data where the treatment effect is known. Treatment assignment depends on baseline engagement, previous sessions, account age, and baseline value. Those same variables also affect the outcome, so the naive exposed-versus-unexposed comparison will be biased.
The outcome also includes a mediated pathway through post_exposure_activity. That lets us show why controlling for a post-treatment mediator changes the estimand.
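The generating cell itself is not shown in this export. The sketch below is a hypothetical data-generating process consistent with the known effects printed below it (direct effect 1.8, mediated component 0.5 × 0.8 = 0.4, total 2.2); every other coefficient and distribution is invented for illustration, not taken from the notebook.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(23)
n = 5_000

# Pre-treatment covariates (common causes of treatment and outcome).
user_engagement = rng.normal(0.0, 1.0, n)
prior_sessions = rng.poisson(2.0, n)
account_age_weeks = rng.gamma(4.0, 3.0, n)
baseline_value = rng.normal(3.0, 1.0, n)

# Confounded assignment: stronger baseline users are more likely to be exposed.
logit = (-2.0 + 0.8 * user_engagement + 0.2 * prior_sessions
         + 0.02 * account_age_weeks + 0.3 * baseline_value)
treatment_probability = 1.0 / (1.0 + np.exp(-logit))
feature_exposure = rng.binomial(1, treatment_probability)

# Mediator: exposure raises post-exposure activity by 0.5 on average.
post_exposure_activity = 0.5 * feature_exposure + 0.4 * user_engagement + rng.normal(0.0, 1.0, n)

DIRECT_EFFECT = 1.8
MEDIATOR_TO_OUTCOME = 0.8  # mediated component = 0.5 * 0.8 = 0.4, so total = 2.2
weekly_value = (
    DIRECT_EFFECT * feature_exposure
    + MEDIATOR_TO_OUTCOME * post_exposure_activity
    + 1.2 * user_engagement + 0.3 * prior_sessions
    + 0.05 * account_age_weeks + 1.5 * baseline_value
    + rng.normal(0.0, 1.0, n)
)

# Collider: caused by both exposure and the outcome.
support_ticket = ((0.8 * feature_exposure + 0.1 * weekly_value + rng.normal(0.0, 1.0, n)) > 2.5).astype(int)

df = pd.DataFrame({
    "feature_exposure": feature_exposure, "weekly_value": weekly_value,
    "user_engagement": user_engagement, "prior_sessions": prior_sessions,
    "account_age_weeks": account_age_weeks, "baseline_value": baseline_value,
    "post_exposure_activity": post_exposure_activity, "support_ticket": support_ticket,
    "treatment_probability": treatment_probability,
})
```

The key structural features match the prose: assignment depends on the four baseline covariates, the same covariates enter the outcome, and part of the exposure effect flows through the mediator.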
Rows: 5,000
Known direct effect: 1.8000
Known mediated component: 0.4000
Known total effect: 2.2000
| | feature_exposure | weekly_value | user_engagement | prior_sessions | account_age_weeks | baseline_value | post_exposure_activity | support_ticket | treatment_probability |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 9.413987 | 0.553261 | 1 | 7.155994 | 3.637375 | 1.278976 | 0 | 0.641172 |
| 1 | 1 | 10.329205 | 0.217601 | 2 | 21.010390 | 3.269387 | 2.433627 | 0 | 0.637095 |
| 2 | 1 | 7.837427 | -0.057990 | 3 | 7.334888 | 3.320991 | 0.279148 | 0 | 0.529487 |
| 3 | 0 | 1.002724 | -2.318936 | 1 | 9.944195 | 0.331314 | -0.964946 | 0 | 0.094779 |
| 4 | 0 | 7.957376 | 0.431494 | 3 | 6.325859 | 3.002517 | 0.226412 | 0 | 0.606391 |
The generated data include the truth because this is a tutorial. In real observational data, the true effect is unknown; that is why assumptions and diagnostics matter.
Basic Dataset Summary
Before modeling, inspect treatment prevalence, outcome scale, confounder ranges, mediator scale, and collider frequency.
The treatment rate is not extreme, so the data contain both exposed and unexposed units. That does not make the treatment randomized; it only means there is enough raw comparison data to proceed.
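Such a summary takes only a few pandas calls. The toy frame below exists only so the sketch runs standalone; on the notebook's own `df` the same lines apply unchanged (column names assumed from the role table above).

```python
import pandas as pd

# Toy stand-in for the notebook's dataset (columns assumed, values invented).
df = pd.DataFrame({
    "feature_exposure": [1, 0, 1, 0, 1, 1, 0, 0],
    "weekly_value": [9.4, 1.0, 10.3, 7.9, 7.8, 8.8, 3.1, 5.2],
    "support_ticket": [0, 0, 1, 0, 0, 0, 0, 1],
})

treatment_rate = df["feature_exposure"].mean()   # treatment prevalence
outcome_summary = df["weekly_value"].describe()  # outcome scale
collider_rate = df["support_ticket"].mean()      # collider frequency

print(f"treatment rate: {treatment_rate:.2f}")
print(f"support-ticket rate: {collider_rate:.2f}")
print(outcome_summary[["mean", "std", "min", "max"]])
```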
Confirm That Treatment Assignment Is Confounded
The treatment was assigned with higher probability to stronger baseline users. This plot shows imbalance in baseline variables before any outcome modeling.
The exposed group starts out stronger on baseline dimensions. If we simply compare outcomes, we will mix the treatment effect with pre-existing differences.
Quantify Baseline Imbalance
A standardized mean difference compares treated and untreated groups in standard-deviation units. It is a compact way to summarize covariate imbalance before adjustment.
Large absolute standardized mean differences tell us the treated and untreated groups are not directly comparable at baseline. This is the data symptom of confounding.
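The standardized mean difference is easy to compute directly. The helper below is a generic sketch using the pooled-standard-deviation convention (the notebook's own implementation is not shown in this export, and the toy groups here are invented):

```python
import numpy as np

def standardized_mean_difference(x_treated: np.ndarray, x_control: np.ndarray) -> float:
    """(mean_treated - mean_control) / pooled SD, the usual balance diagnostic."""
    pooled_sd = np.sqrt((x_treated.var(ddof=1) + x_control.var(ddof=1)) / 2.0)
    return (x_treated.mean() - x_control.mean()) / pooled_sd

# Toy illustration: a covariate shifted up by 0.5 SD among the treated.
rng = np.random.default_rng(1)
treated = rng.normal(0.5, 1.0, 2_000)
control = rng.normal(0.0, 1.0, 2_000)
smd = standardized_mean_difference(treated, control)
print(f"SMD: {smd:.2f}")  # close to 0.5, a level of imbalance commonly flagged
```

Because the difference is expressed in standard-deviation units, SMDs are comparable across covariates with different scales.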
Draw The Backdoor DAG
The graph below states the causal assumptions for the total-effect question. Baseline variables cause both treatment and outcome, so they create backdoor paths. The post-exposure activity variable is a mediator, and support ticket is a collider.
The blue baseline variables are the adjustment variables. The green mediator and red collider are downstream of treatment, so they are not part of the adjustment set for the total effect.
Path Reasoning
This table translates the graph into path-level decisions. A valid total-effect adjustment set blocks backdoor paths without blocking causal paths.
path_table = pd.DataFrame(
    [
        {"path": "feature_exposure <- user_engagement -> weekly_value", "path_type": "backdoor path",
         "action_for_total_effect": "block by adjusting for user_engagement"},
        {"path": "feature_exposure <- prior_sessions -> weekly_value", "path_type": "backdoor path",
         "action_for_total_effect": "block by adjusting for prior_sessions"},
        {"path": "feature_exposure <- account_age_weeks -> weekly_value", "path_type": "backdoor path",
         "action_for_total_effect": "block by adjusting for account_age_weeks"},
        {"path": "feature_exposure <- baseline_value -> weekly_value", "path_type": "backdoor path",
         "action_for_total_effect": "block by adjusting for baseline_value"},
        {"path": "feature_exposure -> weekly_value", "path_type": "direct causal path",
         "action_for_total_effect": "keep open"},
        {"path": "feature_exposure -> post_exposure_activity -> weekly_value", "path_type": "mediated causal path",
         "action_for_total_effect": "keep open for total effect"},
        {"path": "feature_exposure -> support_ticket <- weekly_value", "path_type": "collider path",
         "action_for_total_effect": "leave closed by not conditioning on support_ticket"},
    ]
)
path_table.to_csv(TABLE_DIR / "03_backdoor_path_reasoning.csv", index=False)
path_table
| | path | path_type | action_for_total_effect |
|---|---|---|---|
| 0 | feature_exposure <- user_engagement -> weekly_value | backdoor path | block by adjusting for user_engagement |
| 1 | feature_exposure <- prior_sessions -> weekly_value | backdoor path | block by adjusting for prior_sessions |
| 2 | feature_exposure <- account_age_weeks -> weekly_value | backdoor path | block by adjusting for account_age_weeks |
| 3 | feature_exposure <- baseline_value -> weekly_value | backdoor path | block by adjusting for baseline_value |
| 4 | feature_exposure -> weekly_value | direct causal path | keep open |
| 5 | feature_exposure -> post_exposure_activity -> weekly_value | mediated causal path | keep open for total effect |
| 6 | feature_exposure -> support_ticket <- weekly_value | collider path | leave closed by not conditioning on support_ticket |
The adjustment set is not a list of all useful predictors. It is the set needed to block noncausal paths while preserving the causal paths that define the total effect.
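The path bookkeeping above can be automated. The sketch below is a simplified backdoor-path finder written for this specific DAG: it enumerates simple paths treating edges as undirected and flags the ones whose first arrow points into the treatment. It is not a full d-separation implementation like the ones inside DoWhy or networkx.

```python
from itertools import chain

# The teaching DAG as a list of directed edges (cause, effect).
EDGES = [
    ("user_engagement", "feature_exposure"), ("user_engagement", "weekly_value"),
    ("prior_sessions", "feature_exposure"), ("prior_sessions", "weekly_value"),
    ("account_age_weeks", "feature_exposure"), ("account_age_weeks", "weekly_value"),
    ("baseline_value", "feature_exposure"), ("baseline_value", "weekly_value"),
    ("feature_exposure", "weekly_value"),
    ("feature_exposure", "post_exposure_activity"), ("post_exposure_activity", "weekly_value"),
    ("feature_exposure", "support_ticket"), ("weekly_value", "support_ticket"),
]

def undirected_paths(start, goal):
    """All simple paths treating edges as undirected, recorded with arrow directions."""
    neighbors = {}
    for a, b in EDGES:
        neighbors.setdefault(a, []).append((b, "->"))
        neighbors.setdefault(b, []).append((a, "<-"))
    stack = [(start, [start], [])]
    while stack:
        node, visited, arrows = stack.pop()
        if node == goal:
            yield visited, arrows
            continue
        for nxt, arrow in neighbors.get(node, []):
            if nxt not in visited:
                stack.append((nxt, visited + [nxt], arrows + [arrow]))

# A backdoor path starts with an arrow INTO the treatment.
backdoor = [
    " ".join(chain.from_iterable(zip(nodes, arrows + [""]))).strip()
    for nodes, arrows in undirected_paths("feature_exposure", "weekly_value")
    if arrows[0] == "<-"
]
for p in sorted(backdoor):
    print(p)
```

On this graph the finder recovers exactly the four confounder paths listed in the table, one per pre-treatment common cause.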
Naive Association Versus Backdoor Adjustment
We now estimate the treatment coefficient several ways. The naive model has no controls. The valid backdoor model controls for the observed pre-treatment common causes. The bad-control models add post-treatment variables that should not be included for the total effect.
The naive estimate is much larger than the total effect because baseline differences are mixed into the treatment comparison. The valid backdoor estimate should be close to the known total effect. The mediator-adjusted estimate should move toward the direct effect because part of the pathway has been controlled away.
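The three behaviors described above can be reproduced with plain least squares on a small synthetic dataset. This is a self-contained numpy sketch, not the notebook's statsmodels cell; the data and coefficients here are invented, with a true total effect of 2.2 (direct 1.8 plus mediated 0.5 × 0.8 = 0.4) to mirror the tutorial's setup.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 20_000

# Synthetic confounded data (illustrative coefficients).
confounder = rng.normal(size=n)
exposure = (0.9 * confounder + rng.normal(size=n) > 0).astype(float)
mediator = 0.5 * exposure + rng.normal(size=n)
outcome = 1.8 * exposure + 0.8 * mediator + 1.5 * confounder + rng.normal(size=n)

def ols_coef(y, columns):
    """OLS coefficient on the first listed column, after an intercept."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

naive = ols_coef(outcome, [exposure])                                    # no controls: biased upward
valid = ols_coef(outcome, [exposure, confounder])                        # backdoor set: ~ total effect 2.2
mediator_adjusted = ols_coef(outcome, [exposure, confounder, mediator])  # ~ direct effect 1.8

print(f"naive: {naive:.2f}  valid: {valid:.2f}  mediator-adjusted: {mediator_adjusted:.2f}")
```

The naive coefficient absorbs the confounder's influence, the valid specification recovers the total effect, and adding the mediator shifts the estimate toward the direct effect, exactly the pattern the notebook's plot shows.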
Plot The Adjustment Results
This plot compares each specification with the known total and direct effects. It makes the consequences of under-adjustment and over-adjustment easier to see.
fig, ax = plt.subplots(figsize=(11, 5.8))
sns.pointplot(
    data=manual_estimates,
    x="estimate",
    y="specification",
    linestyle="none",
    color="#2563eb",
    ax=ax,
)
for i, row in manual_estimates.reset_index(drop=True).iterrows():
    ax.plot([row["ci_95_lower"], row["ci_95_upper"]], [i, i], color="#2563eb", linewidth=1.5)
ax.axvline(truth["true_total_effect"], color="#111827", linestyle="--", linewidth=1.4, label="Known total effect")
ax.axvline(truth["direct_effect"], color="#92400e", linestyle=":", linewidth=1.8, label="Known direct effect")
ax.set_title("Backdoor Adjustment Versus Under-Adjustment And Bad Controls")
ax.set_xlabel("Coefficient on feature_exposure")
ax.set_ylabel("")
ax.legend(loc="lower right")
plt.tight_layout()
fig.savefig(FIGURE_DIR / "03_backdoor_adjustment_estimates.png", dpi=160, bbox_inches="tight")
plt.show()
The valid backdoor specification is the one that targets the total effect. Adding more controls is not automatically better; controls must match the causal question.
Build The DoWhy Graph
Now we translate the graph into DOT syntax for DoWhy. This graph includes the confounders, the mediated pathway, and the collider, so DoWhy can reason about which variables are common causes.
This graph is the formal version of the design we drew above. DoWhy will use it to identify the effect before estimating it.
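The DOT translation is a plain string. The sketch below is consistent with the edges described above; the variable name `causal_graph_dot` is an assumption here, since the notebook's exact cell is not shown in this export.

```python
# GraphViz DOT encoding of the teaching DAG: four confounders, the direct edge,
# the mediated pathway, and the collider.
causal_graph_dot = """
digraph {
    user_engagement -> feature_exposure; user_engagement -> weekly_value;
    prior_sessions -> feature_exposure; prior_sessions -> weekly_value;
    account_age_weeks -> feature_exposure; account_age_weeks -> weekly_value;
    baseline_value -> feature_exposure; baseline_value -> weekly_value;
    feature_exposure -> weekly_value;
    feature_exposure -> post_exposure_activity; post_exposure_activity -> weekly_value;
    feature_exposure -> support_ticket; weekly_value -> support_ticket;
}
"""
print(causal_graph_dot)
```

A string like this is what gets passed as the `graph` argument when the CausalModel is constructed in the next cell.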
Create A DoWhy CausalModel
This cell creates the DoWhy model from data, treatment, outcome, and graph. We exclude treatment_probability because that is a simulation-only diagnostic rather than an observed production variable.
DoWhy reports the pre-treatment common causes. The mediator and collider are not listed as common causes, which is what we want for the total-effect question.
Identify The Backdoor Estimand
Identification asks whether the causal effect can be expressed from observed data under the graph assumptions. For this graph, DoWhy should identify a backdoor-adjustment estimand.
Estimand type: EstimandType.NONPARAMETRIC_ATE

### Estimand : 1
Estimand name: backdoor
Estimand expression:
d/d[feature_exposure] E[weekly_value | baseline_value, prior_sessions, user_engagement, account_age_weeks]
Estimand assumption 1, Unconfoundedness: If U→{feature_exposure} and U→weekly_value then P(weekly_value|feature_exposure,baseline_value,prior_sessions,user_engagement,account_age_weeks,U) = P(weekly_value|feature_exposure,baseline_value,prior_sessions,user_engagement,account_age_weeks)

### Estimand : 2
Estimand name: iv
No such variable(s) found!

### Estimand : 3
Estimand name: frontdoor
No such variable(s) found!

### Estimand : 4
Estimand name: general_adjustment
Estimand expression:
d/d[feature_exposure] E[weekly_value | baseline_value, prior_sessions, user_engagement, account_age_weeks]
Estimand assumption 1, Unconfoundedness: If U→{feature_exposure} and U→weekly_value then P(weekly_value|feature_exposure,baseline_value,prior_sessions,user_engagement,account_age_weeks,U) = P(weekly_value|feature_exposure,baseline_value,prior_sessions,user_engagement,account_age_weeks)
The printed estimand states the key assumption: after conditioning on the observed common causes, there is no remaining unobserved common cause of treatment and outcome. That assumption is not a statistical output; it is a design claim.
Estimate The Backdoor Effect With DoWhy
Now we estimate the identified estimand using DoWhy’s linear-regression estimator. In this teaching data, the graph is correct and all common causes are observed, so the estimate should be near the known total effect.
dowhy_linear_estimate = backdoor_model.estimate_effect(
    identified_estimand,
    method_name="backdoor.linear_regression",
)
print(dowhy_linear_estimate)
print(f"DoWhy backdoor estimate: {float(dowhy_linear_estimate.value):.4f}")
print(f"Known total effect: {truth['true_total_effect']:.4f}")
print(f"Known direct effect: {truth['direct_effect']:.4f}")
*** Causal Estimate ***
## Identified estimand
Estimand type: EstimandType.NONPARAMETRIC_ATE
### Estimand : 1
Estimand name: backdoor
Estimand expression:
d/d[feature_exposure] E[weekly_value | baseline_value, prior_sessions, user_engagement, account_age_weeks]
Estimand assumption 1, Unconfoundedness: If U→{feature_exposure} and U→weekly_value then P(weekly_value|feature_exposure,baseline_value,prior_sessions,user_engagement,account_age_weeks,U) = P(weekly_value|feature_exposure,baseline_value,prior_sessions,user_engagement,account_age_weeks)
## Realized estimand
b: weekly_value~feature_exposure+baseline_value+prior_sessions+user_engagement+account_age_weeks
Target units: ate
## Estimate
Mean value: 2.208594718775249
DoWhy backdoor estimate: 2.2086
Known total effect: 2.2000
Known direct effect: 1.8000
The DoWhy estimate should align with the valid manual backdoor regression because they are using the same adjustment logic. The value is near the total effect, not the direct effect.
Compare Manual And DoWhy Estimates
This table places the manual estimates and DoWhy estimate together. It is a useful sanity check because DoWhy’s estimator should match the ordinary adjusted regression when the same linear specification is used.
The main lesson is not that DoWhy gives a different regression coefficient. The lesson is that DoWhy wraps the coefficient in a graph, an estimand, and a set of assumptions.
Refute The Backdoor Estimate
Refuters are stress tests. They do not prove the causal effect is correct, but they help catch estimates that behave strangely under simple perturbations.
def scalar_or_nan(value):
    try:
        return float(np.asarray(value).reshape(-1)[0])
    except Exception:
        return np.nan

refuter_specs = [
    (
        "placebo_treatment_refuter",
        {"method_name": "placebo_treatment_refuter", "placebo_type": "permute", "num_simulations": 20},
        "A fake treatment should not reproduce the original effect.",
    ),
    (
        "random_common_cause",
        {"method_name": "random_common_cause", "num_simulations": 20},
        "Adding a random irrelevant common cause should not materially change the effect.",
    ),
    (
        "data_subset_refuter",
        {"method_name": "data_subset_refuter", "subset_fraction": 0.80, "num_simulations": 20},
        "Random subsets should produce estimates in the same neighborhood.",
    ),
]

refuter_rows = []
for label, kwargs, expected_behavior in refuter_specs:
    result = backdoor_model.refute_estimate(identified_estimand, dowhy_linear_estimate, **kwargs)
    payload = getattr(result, "refutation_result", None)
    p_value = np.nan
    if isinstance(payload, dict):
        p_value = scalar_or_nan(payload.get("p_value"))
    new_effect = scalar_or_nan(result.new_effect)
    estimated_effect = scalar_or_nan(result.estimated_effect)
    refuter_rows.append(
        {
            "refuter": label,
            "estimated_effect": estimated_effect,
            "new_effect": new_effect,
            "effect_shift": new_effect - estimated_effect,
            "p_value": p_value,
            "expected_behavior": expected_behavior,
        }
    )

refuter_summary = pd.DataFrame(refuter_rows)
refuter_summary.to_csv(TABLE_DIR / "03_refuter_summary.csv", index=False)
refuter_summary
| | refuter | estimated_effect | new_effect | effect_shift | p_value | expected_behavior |
|---|---|---|---|---|---|---|
| 0 | placebo_treatment_refuter | 2.208595 | -0.001197 | -2.209791 | 0.486609 | A fake treatment should not reproduce the original effect. |
| 1 | random_common_cause | 2.208595 | 2.208530 | -0.000065 | 0.393349 | Adding a random irrelevant common cause should not materially change the effect. |
| 2 | data_subset_refuter | 2.208595 | 2.213936 | 0.005341 | 0.367017 | Random subsets should produce estimates in the same neighborhood. |
The placebo result should move toward zero, while the random-common-cause and subset checks should stay close to the original estimate. These are basic checks, not a replacement for thinking carefully about unobserved confounding.
Visualize Refuter Results
This plot shows the original estimate next to the refuter-generated effects.
The refuter plot should show a clear separation between the original effect and the placebo effect. Stable subset and random-common-cause results are reassuring, but they do not test every possible source of bias.
What If A Confounder Is Unobserved?
Backdoor adjustment only works if the important common causes are observed and measured well enough. This section creates a second dataset with a latent confounder. We will compare naive adjustment, observed-only adjustment, and an oracle adjustment that includes the latent variable.
The oracle model is not available in real work. It is included only to show what unobserved confounding can do.
Observed controls help, but they do not fully remove bias when an important common cause is unobserved. The oracle row shows why measurement matters: a valid adjustment set is not just a graph idea; it requires actual columns that measure the relevant causes.
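The failure mode described above is easy to demonstrate with plain least squares. This is a self-contained numpy sketch rather than the notebook's hidden-confounding cell: the variable names, the true effect of 1.0, and all coefficients are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 20_000
TRUE_EFFECT = 1.0  # illustrative truth for this sketch

observed = rng.normal(size=n)  # measured confounder
latent = rng.normal(size=n)    # unmeasured confounder

exposure = (0.5 * observed + 1.0 * latent + rng.normal(size=n) > 0).astype(float)
outcome = TRUE_EFFECT * exposure + 1.0 * observed + 2.0 * latent + rng.normal(size=n)

def ols_coef(y, columns):
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

naive = ols_coef(outcome, [exposure])
observed_only = ols_coef(outcome, [exposure, observed])    # still biased: latent is missing
oracle = ols_coef(outcome, [exposure, observed, latent])   # unavailable in real work

print(f"naive: {naive:.2f}  observed-only: {observed_only:.2f}  oracle: {oracle:.2f}")
```

The observed-only coefficient improves on the naive one but stays well above the truth, while only the (unrealistic) oracle adjustment recovers it.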
Visualize Hidden Confounding
This plot compares the hidden-confounding estimates to the true effect from the simulation.
fig, ax = plt.subplots(figsize=(10, 4.8))
sns.pointplot(
    data=hidden_confounding_results,
    x="estimate",
    y="specification",
    linestyle="none",
    color="#2563eb",
    ax=ax,
)
for i, row in hidden_confounding_results.reset_index(drop=True).iterrows():
    ax.plot([row["ci_95_lower"], row["ci_95_upper"]], [i, i], color="#2563eb", linewidth=1.5)
ax.axvline(hidden_true_effect, color="#111827", linestyle="--", linewidth=1.4, label="Known true effect")
ax.set_title("Backdoor Adjustment Cannot Fix Unobserved Confounding By Itself")
ax.set_xlabel("Coefficient on feature_exposure")
ax.set_ylabel("")
ax.legend(loc="lower right")
plt.tight_layout()
fig.savefig(FIGURE_DIR / "03_hidden_confounding_results.png", dpi=160, bbox_inches="tight")
plt.show()
The observed-only model remains biased because the true common cause is not fully measured. This is why causal reports should state results as holding "assuming no unobserved confounding" rather than treating adjustment as automatic proof.
Assumption Register
A backdoor analysis should document its assumptions plainly. This table turns the graph into reviewable claims.
assumption_register = pd.DataFrame(
    [
        {"assumption": "All major common causes of exposure and weekly value are measured.",
         "why_it_matters": "Backdoor adjustment fails if important common causes are missing.",
         "diagnostic_or_response": "Use domain review, pre-treatment covariate audits, sensitivity checks, and negative controls where possible."},
        {"assumption": "Adjustment variables are pre-treatment.",
         "why_it_matters": "Post-treatment controls can block causal pathways or open collider paths.",
         "diagnostic_or_response": "Create a variable timing table before modeling."},
        {"assumption": "The chosen covariates block backdoor paths without blocking the total-effect pathway.",
         "why_it_matters": "The estimand should match the causal question.",
         "diagnostic_or_response": "Write path reasoning before estimating the effect."},
        {"assumption": "There is adequate overlap across treatment groups after adjustment.",
         "why_it_matters": "Adjustment becomes unstable when comparable controls or treated units are absent.",
         "diagnostic_or_response": "Inspect propensity overlap and common support in a dedicated weighting/overlap analysis."},
        {"assumption": "The outcome model is adequate for the estimator being used.",
         "why_it_matters": "A valid estimand can still be estimated poorly by a misspecified model.",
         "diagnostic_or_response": "Compare estimators and inspect residual/model diagnostics."},
    ]
)
assumption_register.to_csv(TABLE_DIR / "03_backdoor_assumption_register.csv", index=False)
assumption_register
| | assumption | why_it_matters | diagnostic_or_response |
|---|---|---|---|
| 0 | All major common causes of exposure and weekly value are measured. | Backdoor adjustment fails if important common causes are missing. | Use domain review, pre-treatment covariate audits, sensitivity checks, and negative controls where possible. |
| 1 | Adjustment variables are pre-treatment. | Post-treatment controls can block causal pathways or open collider paths. | Create a variable timing table before modeling. |
| 2 | The chosen covariates block backdoor paths without blocking the total-effect pathway. | The estimand should match the causal question. | Write path reasoning before estimating the effect. |
| 3 | There is adequate overlap across treatment groups after adjustment. | Adjustment becomes unstable when comparable controls or treated units are absent. | Inspect propensity overlap and common support in a dedicated weighting/overlap analysis. |
| 4 | The outcome model is adequate for the estimator being used. | A valid estimand can still be estimated poorly by a misspecified model. | Compare estimators and inspect residual/model diagnostics. |
This table is the honest part of the analysis. It tells a reader where the result is strongest and where judgment or additional diagnostics are still needed.
Final Backdoor Checklist
This checklist is a reusable workflow for backdoor adjustment in future notebooks.
backdoor_checklist = pd.DataFrame(
    [
        {"step": "Define treatment and outcome",
         "question_to_answer": "What intervention and outcome window are being compared?"},
        {"step": "Mark timing",
         "question_to_answer": "Which variables are definitely measured before treatment?"},
        {"step": "Draw common causes",
         "question_to_answer": "Which pre-treatment variables plausibly cause both treatment and outcome?"},
        {"step": "Avoid bad controls",
         "question_to_answer": "Are any proposed controls mediators, colliders, or descendants of treatment?"},
        {"step": "Check baseline imbalance",
         "question_to_answer": "How different are treated and untreated users before treatment?"},
        {"step": "Identify with DoWhy",
         "question_to_answer": "What adjustment set and assumptions does DoWhy print?"},
        {"step": "Estimate and compare",
         "question_to_answer": "How do naive, partial, valid, and bad-control estimates differ?"},
        {"step": "Run refuters",
         "question_to_answer": "Does the estimate behave sensibly under placebo and perturbation checks?"},
        {"step": "State limitations",
         "question_to_answer": "Which unobserved confounders or measurement gaps could still bias the result?"},
    ]
)
backdoor_checklist.to_csv(TABLE_DIR / "03_backdoor_checklist.csv", index=False)
backdoor_checklist
| | step | question_to_answer |
|---|---|---|
| 0 | Define treatment and outcome | What intervention and outcome window are being compared? |
| 1 | Mark timing | Which variables are definitely measured before treatment? |
| 2 | Draw common causes | Which pre-treatment variables plausibly cause both treatment and outcome? |
| 3 | Avoid bad controls | Are any proposed controls mediators, colliders, or descendants of treatment? |
| 4 | Check baseline imbalance | How different are treated and untreated users before treatment? |
| 5 | Identify with DoWhy | What adjustment set and assumptions does DoWhy print? |
| 6 | Estimate and compare | How do naive, partial, valid, and bad-control estimates differ? |
| 7 | Run refuters | Does the estimate behave sensibly under placebo and perturbation checks? |
| 8 | State limitations | Which unobserved confounders or measurement gaps could still bias the result? |
The checklist makes backdoor adjustment less mysterious. The hard part is not calling an estimator; the hard part is defending the adjustment set.
Final Causal Summary
This final table shows how to summarize the tutorial result without overclaiming.
final_summary = pd.DataFrame(
    [
        {"item": "Causal question",
         "summary": "Total average effect of feature exposure on weekly value."},
        {"item": "Valid adjustment variables in this teaching graph",
         "summary": "user_engagement, prior_sessions, account_age_weeks, and baseline_value."},
        {"item": "Known total effect",
         "summary": f"{truth['true_total_effect']:.3f}"},
        {"item": "DoWhy backdoor estimate",
         "summary": f"{float(dowhy_linear_estimate.value):.3f}"},
        {"item": "What the naive estimate does wrong",
         "summary": "It mixes treatment effect with baseline differences between exposed and unexposed users."},
        {"item": "What mediator adjustment does wrong for total effect",
         "summary": "It blocks part of the causal pathway and moves toward a direct-effect-like quantity."},
        {"item": "What collider adjustment can do wrong",
         "summary": "It can open a noncausal path between treatment and outcome."},
        {"item": "Main limitation",
         "summary": "Backdoor adjustment depends on measured common causes; unobserved confounding can remain."},
    ]
)
final_summary.to_csv(TABLE_DIR / "03_final_backdoor_summary.csv", index=False)
final_summary
| | item | summary |
|---|---|---|
| 0 | Causal question | Total average effect of feature exposure on weekly value. |
| 1 | Valid adjustment variables in this teaching graph | user_engagement, prior_sessions, account_age_weeks, and baseline_value. |
| 2 | Known total effect | 2.200 |
| 3 | DoWhy backdoor estimate | 2.209 |
| 4 | What the naive estimate does wrong | It mixes treatment effect with baseline differences between exposed and unexposed users. |
| 5 | What mediator adjustment does wrong for total effect | It blocks part of the causal pathway and moves toward a direct-effect-like quantity. |
| 6 | What collider adjustment can do wrong | It can open a noncausal path between treatment and outcome. |
| 7 | Main limitation | Backdoor adjustment depends on measured common causes; unobserved confounding can remain. |
The final summary names both the estimate and the assumptions. That is the habit to preserve in all observational causal analyses.
Student Exercises
Try these after running the notebook:
Remove baseline_value from the valid adjustment formula and see how much bias returns.
Increase the effect of user_engagement in the treatment assignment equation and watch the naive estimate move farther from the truth.
Increase the effect of post_exposure_activity on weekly_value and see how much mediator adjustment changes the estimate.
Change the support-ticket equation and observe whether collider adjustment becomes more or less damaging.
Rewrite the assumption register for a real dataset you know.
Closing Notes
Backdoor adjustment is powerful when its assumptions are credible: the common causes are observed, pre-treatment, and sufficient to block noncausal paths. It is fragile when analysts under-adjust, over-adjust, condition on colliders, or miss important common causes.
The next tutorial will compare common estimators for the same identified estimand: regression, matching, stratification, and propensity-score methods.