DoWhy Tutorial 02: Causal Graphs, DAGs, And Assumptions
This notebook focuses on the part of causal inference that happens before estimation: writing down assumptions as a graph. DoWhy is most useful when the analyst treats the graph as the causal design, not as a decoration.
We will build a teaching dataset with several important graph roles: confounders, a treatment, a mediator, an outcome, an instrument-like assignment variable, and a collider. Then we will show how different graph assumptions lead to different adjustment logic and different estimates.
Learning Goals
By the end of this notebook, you should be able to:
Explain what a directed acyclic graph, or DAG, represents.
Distinguish confounders, mediators, colliders, instruments, treatments, and outcomes.
Explain why a graph should be written before selecting an estimator.
Use NetworkX to check and visualize a graph.
Use DoWhy to compare how graph assumptions affect common causes, instruments, estimands, and estimates.
Avoid the beginner mistake of adjusting for every available variable.
Why Graphs Come Before Estimators
A causal graph is a compact statement of assumptions about how variables cause each other. The graph answers questions like:
Which variables happen before treatment?
Which variables affect treatment assignment?
Which variables affect the outcome?
Which variables are downstream consequences of treatment?
Which variables should be adjusted for, and which should not?
The estimator comes later. A regression, matching estimator, or weighting estimator cannot decide by itself whether a variable is a confounder, mediator, or collider. That decision comes from the causal graph and domain knowledge.
Setup
This setup mirrors the earlier DoWhy tutorial notebooks. It imports the causal, data, and plotting libraries, sets output folders, fixes the random seed, and suppresses known third-party deprecation warnings so the notebook stays readable for students.
from pathlib import Path
import os
import platform
import sys
import warnings

START_DIR = Path.cwd().resolve()
PROJECT_ROOT = next(
    (candidate for candidate in [START_DIR, *START_DIR.parents] if (candidate / "pyproject.toml").exists()),
    START_DIR,
)
NOTEBOOK_DIR = PROJECT_ROOT / "notebooks" / "tutorials" / "dowhy"
OUTPUT_DIR = NOTEBOOK_DIR / "outputs"
FIGURE_DIR = OUTPUT_DIR / "figures"
TABLE_DIR = OUTPUT_DIR / "tables"
CACHE_DIR = PROJECT_ROOT / ".cache" / "matplotlib"
for directory in [OUTPUT_DIR, FIGURE_DIR, TABLE_DIR, CACHE_DIR]:
    directory.mkdir(parents=True, exist_ok=True)

os.environ.setdefault("MPLCONFIGDIR", str(CACHE_DIR))

warnings.filterwarnings("default")
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=PendingDeprecationWarning)
warnings.filterwarnings("ignore", category=FutureWarning)
warnings.filterwarnings("ignore", message=".*IProgress not found.*")
warnings.filterwarnings("ignore", message=".*setParseAction.*deprecated.*")
warnings.filterwarnings("ignore", message=".*copy keyword is deprecated.*")
warnings.filterwarnings("ignore", message=".*disp.*iprint.*L-BFGS-B.*")
warnings.filterwarnings("ignore", module="dowhy.causal_estimators.regression_estimator")
warnings.filterwarnings("ignore", module="sklearn.linear_model._logistic")
warnings.filterwarnings("ignore", module="seaborn.categorical")
warnings.filterwarnings("ignore", module="pydot.dot_parser")

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import networkx as nx
import statsmodels.formula.api as smf
import dowhy
from dowhy import CausalModel

RANDOM_SEED = 11
rng = np.random.default_rng(RANDOM_SEED)
sns.set_theme(style="whitegrid", context="notebook")

print(f"Python executable: {sys.executable}")
print(f"Python version: {platform.python_version()}")
print(f"DoWhy version: {getattr(dowhy, '__version__', 'unknown')}")
print(f"Notebook directory: {NOTEBOOK_DIR}")
print(f"Output directory: {OUTPUT_DIR}")
The notebook is ready if this cell prints a DoWhy version and the output directory. All generated files in this notebook use a 02_ prefix.
Graph Vocabulary
Before writing code, we need vocabulary. The same column can play different causal roles in different questions, so it is better to define roles relative to a specific treatment and outcome.
This table gives practical definitions used throughout the notebook.
graph_vocabulary = pd.DataFrame(
    [
        {
            "term": "Treatment",
            "shape": "A",
            "plain_language_definition": "The variable whose causal effect we want to estimate.",
            "adjustment_guidance": "Never adjust away the treatment itself when estimating its effect.",
        },
        {
            "term": "Outcome",
            "shape": "Y",
            "plain_language_definition": "The variable we expect the treatment to change.",
            "adjustment_guidance": "The outcome is modeled, not adjusted for as a control.",
        },
        {
            "term": "Confounder",
            "shape": "C -> A and C -> Y",
            "plain_language_definition": "A pre-treatment common cause of treatment and outcome.",
            "adjustment_guidance": "Usually adjust for observed confounders to block backdoor paths.",
        },
        {
            "term": "Mediator",
            "shape": "A -> M -> Y",
            "plain_language_definition": "A post-treatment pathway variable through which part of the effect flows.",
            "adjustment_guidance": "Do not adjust for mediators when estimating the total effect.",
        },
        {
            "term": "Collider",
            "shape": "A -> K <- Y",
            "plain_language_definition": "A variable caused by two other variables on a path.",
            "adjustment_guidance": "Do not condition on colliders unless the estimand specifically requires it.",
        },
        {
            "term": "Instrument",
            "shape": "Z -> A -> Y",
            "plain_language_definition": "A variable that shifts treatment but does not directly affect the outcome except through treatment.",
            "adjustment_guidance": "Useful for IV designs; not the same as a confounder.",
        },
    ]
)
graph_vocabulary.to_csv(TABLE_DIR / "02_graph_vocabulary.csv", index=False)
graph_vocabulary
| | term | shape | plain_language_definition | adjustment_guidance |
|---|---|---|---|---|
| 0 | Treatment | A | The variable whose causal effect we want to estimate. | Never adjust away the treatment itself when estimating its effect. |
| 1 | Outcome | Y | The variable we expect the treatment to change. | The outcome is modeled, not adjusted for as a control. |
| 2 | Confounder | C -> A and C -> Y | A pre-treatment common cause of treatment and outcome. | Usually adjust for observed confounders to block backdoor paths. |
| 3 | Mediator | A -> M -> Y | A post-treatment pathway variable through which part of the effect flows. | Do not adjust for mediators when estimating the total effect. |
| 4 | Collider | A -> K <- Y | A variable caused by two other variables on a path. | Do not condition on colliders unless the estimand specifically requires it. |
| 5 | Instrument | Z -> A -> Y | A variable that shifts treatment but does not directly affect the outcome except through treatment. | Useful for IV designs; not the same as a confounder. |
The most important lesson is that not every predictive variable should become a control. A good predictive model may use mediators and colliders; a causal adjustment set for a total effect usually should not.
Teaching Scenario
We will use a product-style scenario with the following causal question:
What is the total effect of feature_exposure on weekly_value in the following week?
The word total matters. It means we want the effect through all causal pathways, including the pathway where exposure improves satisfaction_depth, which then improves weekly_value.
Create Data With Multiple Graph Roles
This cell simulates data from a known causal system. The true total effect is known because we choose the data-generating equations.
The key roles are:
user_engagement, prior_activity, and account_age_weeks are confounders.
feature_exposure is the treatment.
satisfaction_depth is a mediator.
weekly_value is the outcome.
rollout_batch is an instrument-like assignment shifter.
support_ticket is a collider caused by treatment and outcome.
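The simulation cell is summarized below as a minimal sketch. The distributions, coefficients, and noise scales are illustrative assumptions rather than the original code; they are chosen so the direct effect is 1.15 and the mediated component is 0.93 × 0.75 = 0.6975, matching the printed truths.

# Minimal sketch of a data-generating process consistent with the printed truths.
# All numeric choices below are assumptions for illustration.
n = 5_000
user_engagement = rng.normal(0.5, 1.0, n)        # confounder
prior_activity = rng.poisson(4, n)               # confounder
account_age_weeks = rng.gamma(4.0, 2.0, n)       # confounder
rollout_batch = rng.integers(0, 2, n)            # instrument-like assignment shifter

# Treatment depends on the confounders and the rollout shifter.
logit = (
    -2.0
    + 0.8 * user_engagement
    + 0.25 * prior_activity
    + 0.02 * account_age_weeks
    + 1.0 * rollout_batch
)
treatment_probability = 1.0 / (1.0 + np.exp(-logit))
feature_exposure = rng.binomial(1, treatment_probability)

# Mediator and outcome: 1.15 + 0.93 * 0.75 = 1.8475 is the known total effect.
satisfaction_depth = 0.93 * feature_exposure + rng.normal(0.0, 0.6, n)
weekly_value = (
    1.15 * feature_exposure
    + 0.75 * satisfaction_depth
    + 0.9 * user_engagement
    + 0.6 * prior_activity
    + 0.05 * account_age_weeks
    + rng.normal(0.0, 1.0, n)
)

# Collider: caused by both treatment and outcome.
ticket_logit = -2.5 + 0.8 * feature_exposure + 0.15 * weekly_value
support_ticket = rng.binomial(1, 1.0 / (1.0 + np.exp(-ticket_logit)))

df = pd.DataFrame({
    "feature_exposure": feature_exposure,
    "satisfaction_depth": satisfaction_depth,
    "weekly_value": weekly_value,
    "user_engagement": user_engagement,
    "prior_activity": prior_activity,
    "account_age_weeks": account_age_weeks,
    "rollout_batch": rollout_batch,
    "support_ticket": support_ticket,
    "treatment_probability": treatment_probability,
})
truth = {
    "direct_effect": 1.15,
    "mediated_component": 0.93 * 0.75,
    "true_total_effect": 1.15 + 0.93 * 0.75,
}
print(f"Rows: {len(df):,}")
print(f"Known direct effect: {truth['direct_effect']:.4f}")
print(f"Known mediated component: {truth['mediated_component']:.4f}")
print(f"Known total effect: {truth['true_total_effect']:.4f}")
df.head()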
Rows: 5,000
Known direct effect: 1.1500
Known mediated component: 0.6975
Known total effect: 1.8475
| | feature_exposure | satisfaction_depth | weekly_value | user_engagement | prior_activity | account_age_weeks | rollout_batch | support_ticket | treatment_probability |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | -0.609834 | 1.956259 | 0.034193 | 6 | 13.404290 | 0 | 0 | 0.476061 |
| 1 | 1 | 1.402446 | 8.731601 | 1.359748 | 5 | 9.848849 | 0 | 0 | 0.711362 |
| 2 | 1 | 0.697735 | 7.479900 | 1.224721 | 5 | 7.062851 | 1 | 0 | 0.854157 |
| 3 | 1 | 1.676406 | 4.837179 | -0.510307 | 2 | 5.280177 | 1 | 1 | 0.532017 |
| 4 | 0 | 0.650327 | 2.546708 | -0.297970 | 3 | 4.342803 | 0 | 0 | 0.326358 |
The data contain both pre-treatment variables and post-treatment variables. That mixture is realistic and dangerous: if we adjust for everything in the table, we will usually answer the wrong causal question.
Data Dictionary For The Graph Dataset
This table documents the role and timing of each column. In real work, this table should be built before modeling because timing mistakes are one of the most common sources of causal errors.
data_dictionary = pd.DataFrame(
    [
        {
            "column": "feature_exposure",
            "graph_role": "treatment",
            "timing": "treatment time",
            "use_in_total_effect_adjustment": "treatment, not a control",
        },
        {
            "column": "weekly_value",
            "graph_role": "outcome",
            "timing": "post-treatment outcome window",
            "use_in_total_effect_adjustment": "outcome, not a control",
        },
        {
            "column": "user_engagement",
            "graph_role": "confounder",
            "timing": "pre-treatment",
            "use_in_total_effect_adjustment": "adjust",
        },
        {
            "column": "prior_activity",
            "graph_role": "confounder",
            "timing": "pre-treatment",
            "use_in_total_effect_adjustment": "adjust",
        },
        {
            "column": "account_age_weeks",
            "graph_role": "confounder",
            "timing": "pre-treatment",
            "use_in_total_effect_adjustment": "adjust",
        },
        {
            "column": "rollout_batch",
            "graph_role": "instrument-like assignment shifter",
            "timing": "pre-treatment assignment driver",
            "use_in_total_effect_adjustment": "not needed for backdoor adjustment, useful to know",
        },
        {
            "column": "satisfaction_depth",
            "graph_role": "mediator",
            "timing": "post-treatment, before outcome",
            "use_in_total_effect_adjustment": "do not adjust for total effect",
        },
        {
            "column": "support_ticket",
            "graph_role": "collider",
            "timing": "post-treatment diagnostic event",
            "use_in_total_effect_adjustment": "do not adjust",
        },
        {
            "column": "treatment_probability",
            "graph_role": "simulation diagnostic",
            "timing": "known only because data are simulated",
            "use_in_total_effect_adjustment": "do not use as an observed production column",
        },
    ]
)
data_dictionary.to_csv(TABLE_DIR / "02_graph_data_dictionary.csv", index=False)
data_dictionary
| | column | graph_role | timing | use_in_total_effect_adjustment |
|---|---|---|---|---|
| 0 | feature_exposure | treatment | treatment time | treatment, not a control |
| 1 | weekly_value | outcome | post-treatment outcome window | outcome, not a control |
| 2 | user_engagement | confounder | pre-treatment | adjust |
| 3 | prior_activity | confounder | pre-treatment | adjust |
| 4 | account_age_weeks | confounder | pre-treatment | adjust |
| 5 | rollout_batch | instrument-like assignment shifter | pre-treatment assignment driver | not needed for backdoor adjustment, useful to know |
| 6 | satisfaction_depth | mediator | post-treatment, before outcome | do not adjust for total effect |
| 7 | support_ticket | collider | post-treatment diagnostic event | do not adjust |
| 8 | treatment_probability | simulation diagnostic | known only because data are simulated | do not use as an observed production column |
The adjustment column is the practical takeaway. For the total effect, use pre-treatment common causes. Do not control for the mediator or the collider.
Basic Checks And Role Sanity
The summary below checks treatment prevalence, outcome scale, mediator scale, support-ticket frequency, and the assignment probability range.
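The original checks cell is not shown; below is a hedged sketch that computes the same quantities from the columns defined above.

# Sketch of the role-sanity checks described above (names are assumptions).
checks = pd.Series({
    "treatment_share": df["feature_exposure"].mean(),
    "outcome_mean": df["weekly_value"].mean(),
    "outcome_std": df["weekly_value"].std(),
    "mediator_mean": df["satisfaction_depth"].mean(),
    "support_ticket_share": df["support_ticket"].mean(),
    "min_treatment_probability": df["treatment_probability"].min(),
    "max_treatment_probability": df["treatment_probability"].max(),
})
checks.round(4)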
The treatment is common enough to compare exposed and unexposed users, and the support-ticket collider is present but not universal. Those properties make the examples easier to see.
Show That Treatment Assignment Is Confounded
This plot shows why a graph is needed. Treatment is related to baseline engagement and to the instrument-like rollout variable. Only the baseline variables are common causes of treatment and outcome; rollout shifts assignment but does not directly cause the outcome in the data-generating process.
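The plotting cell is reconstructed below as a sketch. The panel contents follow the description in the next note; the specific plot types, colors, and sample size are assumptions.

# Sketch of the three-panel diagnostic figure (styling choices are assumptions).
fig, axes = plt.subplots(1, 3, figsize=(15, 4.6))

# Left: treatment groups differ on a baseline confounder.
sns.boxplot(data=df, x="feature_exposure", y="user_engagement", ax=axes[0], color="#93c5fd")
axes[0].set_title("Exposure differs by baseline engagement")

# Middle: the instrument-like rollout variable shifts assignment.
sns.barplot(data=df, x="rollout_batch", y="feature_exposure", ax=axes[1], color="#86efac", errorbar=None)
axes[1].set_title("Rollout batch shifts assignment")
axes[1].set_ylabel("Share exposed")

# Right: the mediator strongly predicts the outcome, which makes it tempting to control for.
sns.scatterplot(
    data=df.sample(800, random_state=RANDOM_SEED),
    x="satisfaction_depth", y="weekly_value", hue="feature_exposure",
    ax=axes[2], alpha=0.6,
)
axes[2].set_title("Mediator strongly predicts the outcome")

plt.tight_layout()
fig.savefig(FIGURE_DIR / "02_confounded_assignment.png", dpi=160, bbox_inches="tight")
plt.show()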
The left panel shows confounding, the middle panel shows assignment variation from rollout, and the right panel shows why the mediator is tempting to control for. The graph tells us which temptation is appropriate for the causal question.
Define The Main DAG
This edge list encodes the data-generating graph. It includes direct and mediated treatment effects, observed confounding, an instrument-like rollout shifter, and a collider.
A causal DAG must be acyclic: arrows should not loop back in time. If a graph has a cycle, it may describe equilibrium or feedback behavior, but it is not a DAG in the usual DoWhy effect-identification workflow.
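The edge list below reconstructs the main graph from the roles and paths documented throughout this notebook, and checks acyclicity with NetworkX.

# Edge list for the main teaching graph; node names match the data columns.
main_edges = [
    ("user_engagement", "feature_exposure"),
    ("user_engagement", "weekly_value"),
    ("prior_activity", "feature_exposure"),
    ("prior_activity", "weekly_value"),
    ("account_age_weeks", "feature_exposure"),
    ("account_age_weeks", "weekly_value"),
    ("rollout_batch", "feature_exposure"),
    ("feature_exposure", "satisfaction_depth"),
    ("satisfaction_depth", "weekly_value"),
    ("feature_exposure", "weekly_value"),
    ("feature_exposure", "support_ticket"),
    ("weekly_value", "support_ticket"),
]
main_graph = nx.DiGraph(main_edges)
print(f"Is a DAG: {nx.is_directed_acyclic_graph(main_graph)}")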
Visualize The Main DAG
This cell draws the main teaching graph with explicit arrows. The positions are chosen to communicate timing: baseline variables on the left, treatment in the middle, mediator and outcome on the right, and the collider below.
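A compact sketch of the drawing cell using NetworkX; the positions encode timing as described, and all layout values are assumptions.

# Sketch of the main DAG figure; coordinates are illustrative assumptions.
pos = {
    "user_engagement": (0.0, 2.0),
    "prior_activity": (0.0, 1.0),
    "account_age_weeks": (0.0, 0.0),
    "rollout_batch": (0.0, -1.0),
    "feature_exposure": (1.5, 0.8),
    "satisfaction_depth": (2.6, 1.6),
    "weekly_value": (3.6, 0.8),
    "support_ticket": (2.6, -0.6),
}
fig, ax = plt.subplots(figsize=(11, 6))
nx.draw_networkx(
    main_graph, pos=pos, ax=ax, arrows=True,
    node_color="#e0e7ff", node_size=2600, font_size=8,
    edge_color="#334155", arrowsize=18,
)
ax.set_axis_off()
plt.tight_layout()
fig.savefig(FIGURE_DIR / "02_main_dag.png", dpi=160, bbox_inches="tight")
plt.show()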
This picture is the causal design for the total-effect question. The mediator and collider are downstream of treatment, so controlling for them would change the question or introduce bias.
Inspect Graph Structure Programmatically
NetworkX can help audit graph structure. This cell checks parents, children, ancestors, and descendants for the treatment and outcome.
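A sketch of that audit using standard NetworkX calls:

# Audit parents, children, ancestors, and descendants for treatment and outcome.
for node in ["feature_exposure", "weekly_value"]:
    print(f"Node: {node}")
    print(f"  parents:     {sorted(main_graph.predecessors(node))}")
    print(f"  children:    {sorted(main_graph.successors(node))}")
    print(f"  ancestors:   {sorted(nx.ancestors(main_graph, node))}")
    print(f"  descendants: {sorted(nx.descendants(main_graph, node))}")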
The descendants of treatment include the mediator, the outcome, and the collider. That descendant status is exactly why blindly adjusting for every column would be dangerous for a total-effect question.
Path-Level Reasoning
A DAG is useful because it lets us reason about paths. Some paths are causal paths we want to preserve; others are backdoor paths we want to block; collider paths should usually remain closed.
path_reasoning = pd.DataFrame(
    [
        {
            "path": "feature_exposure -> weekly_value",
            "path_type": "direct causal path",
            "for_total_effect": "keep open",
            "reason": "This is part of the total effect.",
        },
        {
            "path": "feature_exposure -> satisfaction_depth -> weekly_value",
            "path_type": "mediated causal path",
            "for_total_effect": "keep open",
            "reason": "This is also part of the total effect.",
        },
        {
            "path": "feature_exposure <- user_engagement -> weekly_value",
            "path_type": "backdoor path",
            "for_total_effect": "block by adjustment",
            "reason": "Engagement drives both exposure and outcome.",
        },
        {
            "path": "feature_exposure <- prior_activity -> weekly_value",
            "path_type": "backdoor path",
            "for_total_effect": "block by adjustment",
            "reason": "Prior activity drives both exposure and outcome.",
        },
        {
            "path": "feature_exposure <- account_age_weeks -> weekly_value",
            "path_type": "backdoor path",
            "for_total_effect": "block by adjustment",
            "reason": "Account age drives both exposure and outcome.",
        },
        {
            "path": "feature_exposure <- rollout_batch",
            "path_type": "instrument-like assignment path",
            "for_total_effect": "not a confounding path",
            "reason": "Rollout shifts exposure but has no direct outcome arrow in this graph.",
        },
        {
            "path": "feature_exposure -> support_ticket <- weekly_value",
            "path_type": "collider path",
            "for_total_effect": "leave closed",
            "reason": "Conditioning on support_ticket can open a noncausal path.",
        },
    ]
)
path_reasoning.to_csv(TABLE_DIR / "02_path_reasoning.csv", index=False)
path_reasoning
| | path | path_type | for_total_effect | reason |
|---|---|---|---|---|
| 0 | feature_exposure -> weekly_value | direct causal path | keep open | This is part of the total effect. |
| 1 | feature_exposure -> satisfaction_depth -> weekly_value | mediated causal path | keep open | This is also part of the total effect. |
| 2 | feature_exposure <- user_engagement -> weekly_value | backdoor path | block by adjustment | Engagement drives both exposure and outcome. |
| 3 | feature_exposure <- prior_activity -> weekly_value | backdoor path | block by adjustment | Prior activity drives both exposure and outcome. |
| 4 | feature_exposure <- account_age_weeks -> weekly_value | backdoor path | block by adjustment | Account age drives both exposure and outcome. |
| 5 | feature_exposure <- rollout_batch | instrument-like assignment path | not a confounding path | Rollout shifts exposure but has no direct outcome arrow in this graph. |
| 6 | feature_exposure -> support_ticket <- weekly_value | collider path | leave closed | Conditioning on support_ticket can open a noncausal path. |
This table is the bridge between graph drawing and modeling. The adjustment set should block the backdoor paths without blocking the causal paths or opening the collider path.
Manual Adjustment Sets
To make the graph logic concrete, we will estimate several ordinary regressions with different control sets. These are not all valid total-effect estimators; they are examples showing what happens when we adjust for the wrong variables.
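A sketch of those specifications follows; the specification labels and the result columns are assumptions chosen to match the comparison plot below.

# Sketch of the manual control-set comparison using statsmodels OLS.
specs = {
    "No controls": "weekly_value ~ feature_exposure",
    "Confounders only": "weekly_value ~ feature_exposure + user_engagement + prior_activity + account_age_weeks",
    "Confounders + mediator": "weekly_value ~ feature_exposure + user_engagement + prior_activity + account_age_weeks + satisfaction_depth",
    "Confounders + collider": "weekly_value ~ feature_exposure + user_engagement + prior_activity + account_age_weeks + support_ticket",
    "All available columns": "weekly_value ~ feature_exposure + user_engagement + prior_activity + account_age_weeks + satisfaction_depth + support_ticket",
}
rows = []
for name, formula in specs.items():
    fit = smf.ols(formula, data=df).fit()
    lower, upper = fit.conf_int().loc["feature_exposure"]
    rows.append({
        "specification": name,
        "estimate": fit.params["feature_exposure"],
        "ci_95_lower": lower,
        "ci_95_upper": upper,
    })
manual_adjustment_results = pd.DataFrame(rows)
manual_adjustment_results.round(4)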
The confounder-only specification should be close to the known total effect. Adding the mediator pushes the estimate toward the direct effect, because it blocks the mediated path. Adding the collider can distort the estimate by opening a path that should remain closed.
Visualize Adjustment Mistakes
This plot compares the manual regression specifications against the known total and direct effects from the simulation.
plot_manual = manual_adjustment_results.copy()
fig, ax = plt.subplots(figsize=(11, 5.5))
sns.pointplot(
    data=plot_manual,
    x="estimate",
    y="specification",
    linestyle="none",
    color="#2563eb",
    ax=ax,
)
for i, row in plot_manual.reset_index(drop=True).iterrows():
    ax.plot([row["ci_95_lower"], row["ci_95_upper"]], [i, i], color="#2563eb", linewidth=1.5)
ax.axvline(truth["true_total_effect"], color="#111827", linestyle="--", linewidth=1.4, label="Known total effect")
ax.axvline(truth["direct_effect"], color="#92400e", linestyle=":", linewidth=1.8, label="Known direct effect")
ax.set_title("Different Controls Answer Different Causal Questions")
ax.set_xlabel("Coefficient on feature_exposure")
ax.set_ylabel("")
ax.legend(loc="lower right")
plt.tight_layout()
fig.savefig(FIGURE_DIR / "02_adjustment_set_comparison.png", dpi=160, bbox_inches="tight")
plt.show()
The visual makes the warning concrete: using all available controls is not the same as estimating the total effect. The graph determines which controls are appropriate.
Convert The Main Graph To DoWhy DOT Syntax
DoWhy can receive a graph as a DOT string. This function converts an edge list into a simple DOT graph string that we can reuse for several graph variants.
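A minimal version of such a converter is sketched below; the helper name edges_to_dot is an assumption, not DoWhy API.

# Convert an edge list into a DOT digraph string (assumed helper name).
def edges_to_dot(edges):
    lines = ["digraph {"]
    lines += [f"  {source} -> {target};" for source, target in edges]
    lines.append("}")
    return "\n".join(lines)

main_dot = edges_to_dot(main_edges)
print(main_dot)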
This DOT graph is the same design shown in the figure. The next cells will use it to create DoWhy models and compare it to intentionally flawed graph variants.
DoWhy With The Main Graph
This cell creates a DoWhy CausalModel using the main graph. We inspect common causes and instruments before estimating anything.
One practical note: helper methods such as get_common_causes() are useful diagnostics, but the graph and printed estimand remain the clearest statement of assumptions.
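A sketch of the model-construction cell; the names correct_model and main_dot follow from the cells above.

# Build the DoWhy model from the DOT graph and inspect its graph diagnostics.
correct_model = CausalModel(
    data=df,
    treatment="feature_exposure",
    outcome="weekly_value",
    graph=main_dot,
)
print(f"Common causes: {correct_model.get_common_causes()}")
print(f"Instruments:   {correct_model.get_instruments()}")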
DoWhy detects the graph structure and reports candidate common causes and instruments. The important causal design remains: block baseline backdoor paths while preserving the treatment-to-mediator-to-outcome path.
Identify The Effect Under The Main Graph
Now DoWhy identifies the estimand under the main graph. The printed output is intentionally verbose because it exposes the assumptions needed for the effect estimate.
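The identification cell, reconstructed to match the printed output below:

# Identify the estimand under the main graph and print the full assumptions.
correct_estimand = correct_model.identify_effect(proceed_when_unidentifiable=True)
print(correct_estimand)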
Estimand type: EstimandType.NONPARAMETRIC_ATE

### Estimand : 1
Estimand name: backdoor
Estimand expression:
d/d[feature_exposure] E[weekly_value | user_engagement, account_age_weeks, prior_activity]
Estimand assumption 1, Unconfoundedness: If U→{feature_exposure} and U→weekly_value then P(weekly_value|feature_exposure,user_engagement,account_age_weeks,prior_activity,U) = P(weekly_value|feature_exposure,user_engagement,account_age_weeks,prior_activity)

### Estimand : 2
Estimand name: iv
Estimand expression:
E[ d/d[rollout_batch](weekly_value) · (d/d[rollout_batch](feature_exposure))^(-1) ]
Estimand assumption 1, As-if-random: If U→→weekly_value then ¬(U→→{rollout_batch})
Estimand assumption 2, Exclusion: If we remove {rollout_batch}→{feature_exposure}, then ¬({rollout_batch}→weekly_value)

### Estimand : 3
Estimand name: frontdoor
No such variable(s) found!

### Estimand : 4
Estimand name: general_adjustment
Estimand expression:
d/d[feature_exposure] E[weekly_value | user_engagement, account_age_weeks, prior_activity]
Estimand assumption 1, Unconfoundedness: If U→{feature_exposure} and U→weekly_value then P(weekly_value|feature_exposure,user_engagement,account_age_weeks,prior_activity,U) = P(weekly_value|feature_exposure,user_engagement,account_age_weeks,prior_activity)
For this total-effect question, the identified estimand relies on an adjustment strategy that blocks the baseline backdoor paths. It does not ask us to control for the mediator as a normal covariate.
Estimate The Effect Under The Main Graph
Now we estimate the identified effect using DoWhy’s linear-regression estimator. Since the data-generating process is simple and the graph is correct, the estimate should land near the known total effect.
correct_estimate = correct_model.estimate_effect(
    correct_estimand,
    method_name="backdoor.linear_regression",
)
print(correct_estimate)
print(f"DoWhy estimate under main graph: {float(correct_estimate.value):.4f}")
print(f"Known total effect: {truth['true_total_effect']:.4f}")
print(f"Known direct effect: {truth['direct_effect']:.4f}")
*** Causal Estimate ***
## Identified estimand
Estimand type: EstimandType.NONPARAMETRIC_ATE
### Estimand : 1
Estimand name: backdoor
Estimand expression:
d/d[feature_exposure] E[weekly_value | user_engagement, account_age_weeks, prior_activity]
Estimand assumption 1, Unconfoundedness: If U→{feature_exposure} and U→weekly_value then P(weekly_value|feature_exposure,user_engagement,account_age_weeks,prior_activity,U) = P(weekly_value|feature_exposure,user_engagement,account_age_weeks,prior_activity)
## Realized estimand
b: weekly_value~feature_exposure+user_engagement+account_age_weeks+prior_activity
Target units: ate
## Estimate
Mean value: 1.8601640205910202
DoWhy estimate under main graph: 1.8602
Known total effect: 1.8475
Known direct effect: 1.1500
The estimate should be close to the known total effect, not the direct effect. That is exactly what we want because the graph preserved the mediated pathway through satisfaction_depth.
Compare Correct And Flawed Graphs
To see why graph assumptions matter, we will compare four graph specifications:
The main graph.
A graph that omits user_engagement as a confounder.
A graph that incorrectly treats the mediator as a pre-treatment common cause.
A graph that incorrectly treats the collider as a pre-treatment common cause.
Only the first graph matches the data-generating process.
All variants are DAGs, but being acyclic is not enough. A DAG can be internally valid as a graph and still be causally wrong for the system being studied.
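A sketch of the variant definitions follows. Each variant edits the main edge list; the exact edge edits are assumptions consistent with the descriptions above.

# Sketch of the four graph variants; each is a DAG, but only the first matches the DGP.
variant_edges = {
    "Main graph": main_edges,
    # Drop every edge touching user_engagement, as if the confounder were unknown.
    "Missing confounder": [e for e in main_edges if "user_engagement" not in e],
    # Reverse the treatment -> mediator arrow so the mediator looks pre-treatment.
    "Mediator as confounder": [
        e for e in main_edges if e != ("feature_exposure", "satisfaction_depth")
    ] + [("satisfaction_depth", "feature_exposure")],
    # Reverse both collider arrows so the collider looks like a common cause.
    "Collider as confounder": [
        e for e in main_edges
        if e not in {("feature_exposure", "support_ticket"), ("weekly_value", "support_ticket")}
    ] + [("support_ticket", "feature_exposure"), ("support_ticket", "weekly_value")],
}
for name, edges in variant_edges.items():
    assert nx.is_directed_acyclic_graph(nx.DiGraph(edges)), name
print("All four variants are DAGs.")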
Estimate Under Each Graph Variant
This cell runs the same DoWhy workflow under each graph variant. The data and estimator stay fixed; only the graph assumptions change.
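A sketch of that workflow, producing the variant_results table used by the plot below:

# Run the same identify-and-estimate steps under each graph variant.
rows = []
for name, edges in variant_edges.items():
    model = CausalModel(
        data=df,
        treatment="feature_exposure",
        outcome="weekly_value",
        graph=edges_to_dot(edges),
    )
    estimand = model.identify_effect(proceed_when_unidentifiable=True)
    estimate = model.estimate_effect(estimand, method_name="backdoor.linear_regression")
    rows.append({"graph_variant": name, "estimate": float(estimate.value)})
variant_results = pd.DataFrame(rows)
variant_results.round(4)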
The graph variant changes the adjustment logic and therefore the estimate. The missing-confounder graph leaves a backdoor path open. The mediator-as-confounder graph blocks part of the total effect. The collider-as-confounder graph conditions on a post-treatment collider.
Plot The Graph Variant Results
The plot below puts the graph variants next to the known total and direct effects. This is the central lesson of the notebook: graph assumptions change the meaning of the estimate.
fig, ax = plt.subplots(figsize=(11, 5.5))
sns.pointplot(
    data=variant_results,
    x="estimate",
    y="graph_variant",
    linestyle="none",
    color="#2563eb",
    ax=ax,
)
ax.axvline(truth["true_total_effect"], color="#111827", linestyle="--", linewidth=1.4, label="Known total effect")
ax.axvline(truth["direct_effect"], color="#92400e", linestyle=":", linewidth=1.8, label="Known direct effect")
ax.set_title("Changing The Graph Changes The Estimate")
ax.set_xlabel("DoWhy linear-regression estimate")
ax.set_ylabel("")
ax.legend(loc="lower right")
plt.tight_layout()
fig.savefig(FIGURE_DIR / "02_dowhy_graph_variant_results.png", dpi=160, bbox_inches="tight")
plt.show()
The main graph is closest to the total-effect benchmark. The flawed graphs may still produce precise-looking numbers, but those numbers answer a different or biased question.
Draw The Flawed Graphs Side By Side
The next plot shows simplified versions of the three flawed graph ideas. This is useful because graph mistakes are often easier to catch visually than in a formula.
flaw_panels = [
    {
        "title": "Missing Confounder",
        "nodes": {"A": (0.20, 0.50), "Y": (0.80, 0.50), "C": (0.50, 0.82)},
        "labels": {"A": "Feature\nexposure", "Y": "Weekly\nvalue", "C": "Observed\ncontrols"},
        "edges": [("C", "A"), ("C", "Y"), ("A", "Y")],
        "note": "A true common cause is absent from the graph.",
    },
    {
        "title": "Mediator As Control",
        "nodes": {"A": (0.18, 0.50), "M": (0.50, 0.72), "Y": (0.82, 0.50)},
        "labels": {"A": "Feature\nexposure", "M": "Satisfaction\ndepth", "Y": "Weekly\nvalue"},
        "edges": [("A", "M"), ("M", "Y"), ("A", "Y")],
        "note": "Controlling for M blocks part of the total effect.",
    },
    {
        "title": "Conditioning On Collider",
        "nodes": {"A": (0.18, 0.62), "Y": (0.82, 0.62), "K": (0.50, 0.25)},
        "labels": {"A": "Feature\nexposure", "Y": "Weekly\nvalue", "K": "Support\nticket"},
        "edges": [("A", "K"), ("Y", "K"), ("A", "Y")],
        "note": "Conditioning on K opens a noncausal path.",
    },
]

fig, axes = plt.subplots(1, 3, figsize=(16, 5.2))
for ax, panel in zip(axes, flaw_panels):
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.set_axis_off()
    for source, target in panel["edges"]:
        ax.annotate(
            "",
            xy=panel["nodes"][target],
            xytext=panel["nodes"][source],
            arrowprops=dict(
                arrowstyle="-|>",
                color="#334155",
                linewidth=1.4,
                mutation_scale=16,
                shrinkA=28,
                shrinkB=28,
            ),
            zorder=1,
        )
    for node, (x, y) in panel["nodes"].items():
        color = (
            "#e0f2fe" if node == "A"
            else "#dcfce7" if node == "Y"
            else "#fee2e2" if node == "K"
            else "#eef2ff"
        )
        ax.text(
            x, y, panel["labels"][node],
            ha="center", va="center", fontsize=10, fontweight="bold",
            bbox=dict(boxstyle="round,pad=0.42", facecolor=color, edgecolor="#334155", linewidth=1.1),
            zorder=2,
        )
    ax.set_title(panel["title"], pad=12)
    ax.text(0.5, 0.04, panel["note"], ha="center", va="center", fontsize=9.5, color="#475569")
plt.tight_layout()
fig.savefig(FIGURE_DIR / "02_common_graph_mistakes.png", dpi=160, bbox_inches="tight")
plt.show()
These mistakes are common because all three flawed graphs can feel plausible if we think only in predictive terms. Causal graphs force us to ask whether a variable is pre-treatment, post-treatment, or a common effect.
Assumption Documentation Template
A causal graph should be accompanied by written assumptions. The table below is a template for documenting the main arrows and the consequence if each assumption is wrong.
assumption_register = pd.DataFrame(
    [
        {
            "assumption": "Baseline engagement affects exposure and future value.",
            "graph_arrows": "user_engagement -> feature_exposure; user_engagement -> weekly_value",
            "why_it_matters": "Engagement is a confounder and should be adjusted for.",
            "risk_if_wrong": "If omitted, the exposure effect can be overstated.",
        },
        {
            "assumption": "Prior activity affects exposure and future value.",
            "graph_arrows": "prior_activity -> feature_exposure; prior_activity -> weekly_value",
            "why_it_matters": "Prior behavior is a pre-treatment common cause.",
            "risk_if_wrong": "Treatment and control users are not comparable.",
        },
        {
            "assumption": "Exposure changes satisfaction depth, which changes future value.",
            "graph_arrows": "feature_exposure -> satisfaction_depth -> weekly_value",
            "why_it_matters": "This path is part of the total effect.",
            "risk_if_wrong": "Adjusting for satisfaction changes the estimand from total to direct-like.",
        },
        {
            "assumption": "Rollout shifts exposure but does not directly change future value.",
            "graph_arrows": "rollout_batch -> feature_exposure",
            "why_it_matters": "Rollout is instrument-like rather than a confounder.",
            "risk_if_wrong": "A direct rollout effect would need to be represented in the graph.",
        },
        {
            "assumption": "Support ticket is a common effect of exposure and future value.",
            "graph_arrows": "feature_exposure -> support_ticket <- weekly_value",
            "why_it_matters": "Support ticket is a collider and should not be adjusted for in total-effect estimation.",
            "risk_if_wrong": "Conditioning on support can open a noncausal path.",
        },
    ]
)
assumption_register.to_csv(TABLE_DIR / "02_assumption_register.csv", index=False)
assumption_register
| | assumption | graph_arrows | why_it_matters | risk_if_wrong |
|---|---|---|---|---|
| 0 | Baseline engagement affects exposure and future value. | user_engagement -> feature_exposure; user_engagement -> weekly_value | Engagement is a confounder and should be adjusted for. | If omitted, the exposure effect can be overstated. |
| 1 | Prior activity affects exposure and future value. | prior_activity -> feature_exposure; prior_activity -> weekly_value | Prior behavior is a pre-treatment common cause. | Treatment and control users are not comparable. |
| 2 | Exposure changes satisfaction depth, which changes future value. | feature_exposure -> satisfaction_depth -> weekly_value | This path is part of the total effect. | Adjusting for satisfaction changes the estimand from total to direct-like. |
| 3 | Rollout shifts exposure but does not directly change future value. | rollout_batch -> feature_exposure | Rollout is instrument-like rather than a confounder. | A direct rollout effect would need to be represented in the graph. |
| 4 | Support ticket is a common effect of exposure and future value. | feature_exposure -> support_ticket <- weekly_value | Support ticket is a collider and should not be adjusted for in total-effect estimation. | Conditioning on support can open a noncausal path. |
This register is the part of the analysis a reviewer should challenge. A polished causal notebook should make those challenges easy by stating assumptions plainly.
Final Graph Checklist
This final checklist summarizes the graph workflow students should use before estimating effects with DoWhy.
graph_checklist = pd.DataFrame(
    [
        {
            "step": "State the causal question",
            "student_prompt": "Am I estimating a total effect, direct effect, mediated effect, or something else?",
        },
        {
            "step": "Mark variable timing",
            "student_prompt": "Which variables are measured before treatment, at treatment, after treatment, and after outcome?",
        },
        {
            "step": "Classify variable roles",
            "student_prompt": "Which variables are confounders, mediators, colliders, instruments, or outcomes?",
        },
        {
            "step": "Draw the DAG",
            "student_prompt": "Do all arrows follow the assumed time ordering and domain logic?",
        },
        {
            "step": "Check for cycles",
            "student_prompt": "Is this a directed acyclic graph?",
        },
        {
            "step": "List paths",
            "student_prompt": "Which paths should be blocked and which causal paths should stay open?",
        },
        {
            "step": "Choose adjustment variables",
            "student_prompt": "Am I avoiding post-treatment mediators and colliders for a total-effect question?",
        },
        {
            "step": "Use DoWhy to identify",
            "student_prompt": "What estimand and assumptions does DoWhy print before estimation?",
        },
        {
            "step": "Estimate and compare",
            "student_prompt": "Does the estimate change dramatically under plausible alternative graphs?",
        },
        {
            "step": "Document limitations",
            "student_prompt": "Which arrows are strongest, weakest, or least testable?",
        },
    ]
)
graph_checklist.to_csv(TABLE_DIR / "02_graph_checklist.csv", index=False)
graph_checklist
| | step | student_prompt |
|---|---|---|
| 0 | State the causal question | Am I estimating a total effect, direct effect, mediated effect, or something else? |
| 1 | Mark variable timing | Which variables are measured before treatment, at treatment, after treatment, and after outcome? |
| 2 | Classify variable roles | Which variables are confounders, mediators, colliders, instruments, or outcomes? |
| 3 | Draw the DAG | Do all arrows follow the assumed time ordering and domain logic? |
| 4 | Check for cycles | Is this a directed acyclic graph? |
| 5 | List paths | Which paths should be blocked and which causal paths should stay open? |
| 6 | Choose adjustment variables | Am I avoiding post-treatment mediators and colliders for a total-effect question? |
| 7 | Use DoWhy to identify | What estimand and assumptions does DoWhy print before estimation? |
| 8 | Estimate and compare | Does the estimate change dramatically under plausible alternative graphs? |
| 9 | Document limitations | Which arrows are strongest, weakest, or least testable? |
The checklist is deliberately slow. In causal inference, speed usually comes after the graph is clear, not before.
Student Exercises
After running the notebook, try these modifications:
Remove the mediator path feature_exposure -> satisfaction_depth -> weekly_value and rerun the DoWhy estimate.
Add a direct arrow rollout_batch -> weekly_value and decide whether rollout is still instrument-like.
Change the support-ticket equation so the collider is rarer or more common and observe the collider-adjustment estimate.
Create a graph with a cycle and check what NetworkX says.
Write your own assumption register for a real dataset you care about.
Closing Notes
This notebook showed that graph assumptions are not cosmetic. The same data and estimator can produce different answers when the graph changes. The next tutorial will focus more narrowly on backdoor adjustment and confounding, building on the graph vocabulary introduced here.