This notebook starts the EconML tutorial series. The goal is to make sure the local environment works, introduce the main EconML estimator families, and create a small synthetic CATE sandbox that later notebooks can reuse.
EconML is strongest when the causal question is already clear and the hard part is estimating treatment effects with modern machine learning. In particular, it is useful for:
estimating heterogeneous treatment effects;
using flexible nuisance models for outcomes and treatment assignment;
comparing DML, doubly robust, forest, and meta-learner approaches;
ranking units by expected treatment benefit;
turning CATE estimates into treatment-targeting policies.
This notebook is intentionally broad. Later notebooks go deeper into each estimator family.
Learning Goals
By the end of this notebook, you should be able to:
Verify that EconML imports successfully in this environment.
Identify the major EconML estimator families and when each one is useful.
Understand the standard EconML data roles: outcome Y, treatment T, controls W, and effect modifiers X.
Build a reusable synthetic dataset with known CATE ground truth.
Run a small LinearDML smoke test to confirm the installation can fit an estimator.
Read the output of a CATE workflow without confusing it with a graph-identification workflow.
EconML In One Sentence
EconML is an estimation library for treatment effects, especially conditional average treatment effects.
A concise mental model:
Use graph-first workflow tools to clarify assumptions and identification.
Use EconML when you need flexible treatment-effect estimation after the causal question is defined.
This notebook does not try to settle causal identification from a graph. It teaches the estimation layer: how the package is organized, how data should be shaped, and how the first estimator call behaves.
Setup
This cell imports the core packages, sets warning filters, creates output folders, and records whether EconML is available. The warning filters are scoped to common notebook noise so real execution errors still appear.
```python
from pathlib import Path
import os
import warnings
import importlib.metadata as importlib_metadata

# Keep Matplotlib cache files in a writable location during notebook execution.
os.environ.setdefault("MPLCONFIGDIR", "/tmp/matplotlib-ranking-sys")

warnings.filterwarnings("default")
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=PendingDeprecationWarning)
warnings.filterwarnings("ignore", category=FutureWarning)
warnings.filterwarnings("ignore", message=".*IProgress not found.*")
warnings.filterwarnings("ignore", message=".*X does not have valid feature names.*")
warnings.filterwarnings("ignore", message=".*The final model has a nonzero intercept.*")
warnings.filterwarnings("ignore", module="sklearn.linear_model._logistic")

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.metrics import roc_auc_score, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

try:
    import econml
    ECONML_AVAILABLE = True
    ECONML_VERSION = getattr(econml, "__version__", "unknown")
except Exception as exc:
    ECONML_AVAILABLE = False
    ECONML_VERSION = f"import failed: {type(exc).__name__}: {exc}"

RANDOM_SEED = 2026
rng = np.random.default_rng(RANDOM_SEED)

OUTPUT_DIR = Path("outputs")
FIGURE_DIR = OUTPUT_DIR / "figures"
TABLE_DIR = OUTPUT_DIR / "tables"
FIGURE_DIR.mkdir(parents=True, exist_ok=True)
TABLE_DIR.mkdir(parents=True, exist_ok=True)

sns.set_theme(style="whitegrid", context="notebook")
pd.set_option("display.max_columns", 100)
pd.set_option("display.float_format", lambda value: f"{value:,.4f}")

print(f"EconML available: {ECONML_AVAILABLE}")
print(f"EconML version: {ECONML_VERSION}")
print(f"Figures will be saved to: {FIGURE_DIR.resolve()}")
print(f"Tables will be saved to: {TABLE_DIR.resolve()}")
```
EconML available: True
EconML version: 0.16.0
Figures will be saved to: /home/apex/Documents/ranking_sys/notebooks/tutorials/econml/outputs/figures
Tables will be saved to: /home/apex/Documents/ranking_sys/notebooks/tutorials/econml/outputs/tables
The import check confirms that EconML is usable in this environment. All saved artifacts from this notebook use the 00_ prefix.
Package Versions
A causal ML notebook depends on several libraries working together. We record versions so future debugging has a clear starting point.
```python
import sys

packages_to_check = [
    "python", "econml", "numpy", "pandas", "scikit-learn",
    "scipy", "statsmodels", "matplotlib", "seaborn",
]

version_rows = []
for package in packages_to_check:
    if package == "python":
        version = sys.version.split()[0]
    else:
        try:
            version = importlib_metadata.version(package)
        except importlib_metadata.PackageNotFoundError:
            version = "not installed"
    version_rows.append({"package": package, "version": version})

package_versions = pd.DataFrame(version_rows)
package_versions.to_csv(TABLE_DIR / "00_environment_package_versions.csv", index=False)
display(package_versions)
```
| package | version |
| --- | --- |
| python | 3.13.12 |
| econml | 0.16.0 |
| numpy | 2.4.4 |
| pandas | 3.0.2 |
| scikit-learn | 1.6.1 |
| scipy | 1.17.1 |
| statsmodels | 0.14.6 |
| matplotlib | 3.10.9 |
| seaborn | 0.13.2 |
This table is boring in exactly the right way. If a later estimator behaves differently after dependency changes, this snapshot gives us something concrete to compare against.
EconML Capability Check
Next we check whether the major estimator classes import successfully. This is a lightweight environment test, not a claim that every estimator is appropriate for every dataset.
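One lightweight way to run this check is to attempt each import and record the result. This sketch is an assumption about what the (unexported) check cell might contain; the module paths are based on recent EconML releases and may differ in your version:

```python
# Hedged sketch of a capability check: try to import each estimator class
# and record success/failure instead of crashing the notebook.
from importlib import import_module

ESTIMATOR_CLASSES = [
    ("econml.dml", "LinearDML"),
    ("econml.dml", "CausalForestDML"),
    ("econml.dr", "DRLearner"),
    ("econml.metalearners", "TLearner"),
    ("econml.iv.dml", "DMLIV"),
]

capability_rows = []
for module_name, class_name in ESTIMATOR_CLASSES:
    try:
        getattr(import_module(module_name), class_name)
        status = "importable"
    except Exception as exc:  # record the failure mode for debugging
        status = f"failed: {type(exc).__name__}"
    capability_rows.append({"estimator": f"{module_name}.{class_name}", "status": status})

for row in capability_rows:
    print(f"{row['estimator']}: {row['status']}")
```

Recording failures as data rather than raising keeps the environment tour runnable even when an optional estimator family is unavailable.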
The core estimator families are available. That means later notebooks can use the real EconML package instead of fallback examples.
Estimator Family Map
EconML has many estimators. The practical choice usually starts with the treatment type, the desired amount of flexibility, and whether you need direct interpretability or nonlinear CATE discovery.
```python
estimator_family_map = pd.DataFrame(
    [
        {
            "family": "DML estimators",
            "examples": "LinearDML, SparseLinearDML, CausalForestDML",
            "best for": "CATE with observed confounding and flexible nuisance models",
            "main output": "treatment-effect function tau(X)",
        },
        {
            "family": "Doubly robust learners",
            "examples": "DRLearner, ForestDRLearner",
            "best for": "binary or categorical treatment with outcome and propensity nuisance models",
            "main output": "CATE estimates with doubly robust pseudo-outcomes",
        },
        {
            "family": "Meta-learners",
            "examples": "SLearner, TLearner, XLearner",
            "best for": "simple, flexible baselines for heterogeneous effects",
            "main output": "CATE from outcome-model contrasts",
        },
        {
            "family": "IV estimators",
            "examples": "DMLIV, DeepIV, OrthoIV",
            "best for": "endogenous treatment with a valid instrument",
            "main output": "effect estimates identified by instrument variation",
        },
        {
            "family": "Policy tools",
            "examples": "policy learning utilities and CATE ranking workflows",
            "best for": "turning CATE estimates into treatment allocation rules",
            "main output": "targeting rule or policy-value comparison",
        },
    ]
)
estimator_family_map.to_csv(TABLE_DIR / "00_estimator_family_map.csv", index=False)
display(estimator_family_map)
```
| family | examples | best for | main output |
| --- | --- | --- | --- |
| DML estimators | LinearDML, SparseLinearDML, CausalForestDML | CATE with observed confounding and flexible nuisance models | treatment-effect function tau(X) |
| Doubly robust learners | DRLearner, ForestDRLearner | binary or categorical treatment with outcome and propensity nuisance models | CATE estimates with doubly robust pseudo-outcomes |
| Meta-learners | SLearner, TLearner, XLearner | simple, flexible baselines for heterogeneous effects | CATE from outcome-model contrasts |
| IV estimators | DMLIV, DeepIV, OrthoIV | endogenous treatment with a valid instrument | effect estimates identified by instrument variation |
| Policy tools | policy learning utilities and CATE ranking workflows | turning CATE estimates into treatment allocation rules | targeting rule or policy-value comparison |
This map is the first decision aid for the series. The next notebooks unpack these rows one by one with runnable examples.
Standard EconML Data Roles
EconML estimators use a consistent vocabulary. The most important distinction is between controls W and effect modifiers X.
```python
data_role_map = pd.DataFrame(
    [
        {
            "symbol": "Y",
            "role": "outcome",
            "plain meaning": "The result measured after treatment.",
            "example in this notebook": "outcome",
        },
        {
            "symbol": "T",
            "role": "treatment",
            "plain meaning": "The intervention or exposure whose effect we estimate.",
            "example in this notebook": "treatment",
        },
        {
            "symbol": "W",
            "role": "controls / confounders",
            "plain meaning": "Variables used to adjust nuisance models but not necessarily describe heterogeneity.",
            "example in this notebook": "account_tenure",
        },
        {
            "symbol": "X",
            "role": "effect modifiers",
            "plain meaning": "Variables used to model how the treatment effect changes across units.",
            "example in this notebook": "baseline_need, prior_engagement, friction_score, high_need_segment",
        },
        {
            "symbol": "tau(X)",
            "role": "CATE function",
            "plain meaning": "The conditional average treatment effect for units with features X.",
            "example in this notebook": "true_cate and estimated_cate",
        },
    ]
)
data_role_map.to_csv(TABLE_DIR / "00_data_role_map.csv", index=False)
display(data_role_map)
```
| symbol | role | plain meaning | example in this notebook |
| --- | --- | --- | --- |
| Y | outcome | The result measured after treatment. | outcome |
| T | treatment | The intervention or exposure whose effect we estimate. | treatment |
| W | controls / confounders | Variables used to adjust nuisance models but not necessarily describe heterogeneity. | account_tenure |
| X | effect modifiers | Variables used to model how the treatment effect changes across units. | baseline_need, prior_engagement, friction_score, high_need_segment |
| tau(X) | CATE function | The conditional average treatment effect for units with features X. | true_cate and estimated_cate |
The X versus W split matters. X describes where effects vary; W helps remove confounding in the nuisance models.
Workflow Map
A typical EconML analysis has a repeatable shape. This map gives the flow that later notebooks will use.
```python
econml_workflow = pd.DataFrame(
    [
        {
            "step": "Frame causal question",
            "question": "What is the treatment, outcome, population, and estimand?",
            "artifact": "causal question table",
        },
        {
            "step": "Prepare data roles",
            "question": "Which columns are Y, T, X, and W?",
            "artifact": "role map and design matrices",
        },
        {
            "step": "Check overlap",
            "question": "Do treated and control units have comparable covariates?",
            "artifact": "propensity and balance diagnostics",
        },
        {
            "step": "Choose estimator",
            "question": "Do we need linear CATE, nonlinear CATE, DR learning, or IV logic?",
            "artifact": "estimator selection rationale",
        },
        {
            "step": "Fit nuisance models and CATE model",
            "question": "Are outcome and treatment processes modeled well enough for this task?",
            "artifact": "fitted EconML estimator",
        },
        {
            "step": "Validate CATE behavior",
            "question": "Do estimated effects rank, segment, and average sensibly?",
            "artifact": "calibration, segment, and ranking diagnostics",
        },
        {
            "step": "Report decision summary",
            "question": "What should someone do differently, and what assumptions could break it?",
            "artifact": "report-ready summary",
        },
    ]
)
econml_workflow.to_csv(TABLE_DIR / "00_econml_workflow_map.csv", index=False)
display(econml_workflow)
```
| step | question | artifact |
| --- | --- | --- |
| Frame causal question | What is the treatment, outcome, population, and estimand? | causal question table |
| Prepare data roles | Which columns are Y, T, X, and W? | role map and design matrices |
| Check overlap | Do treated and control units have comparable covariates? | propensity and balance diagnostics |
| Choose estimator | Do we need linear CATE, nonlinear CATE, DR learning, or IV logic? | estimator selection rationale |
| Fit nuisance models and CATE model | Are outcome and treatment processes modeled well enough for this task? | fitted EconML estimator |
| Validate CATE behavior | Do estimated effects rank, segment, and average sensibly? | calibration, segment, and ranking diagnostics |
| Report decision summary | What should someone do differently, and what assumptions could break it? | report-ready summary |
The estimator call is only one part of the workflow. The diagnostics and reporting steps are what keep a CATE model from becoming a black box with causal language attached.
Teaching Dataset Design
We now create a synthetic dataset with observed confounding and known heterogeneous treatment effects. Known truth is a teaching luxury: it lets us check whether an estimator is learning the right pattern.
```python
N = 3_000

baseline_need = rng.normal(0, 1, size=N)
prior_engagement = rng.normal(0, 1, size=N)
account_tenure = rng.normal(0, 1, size=N)
friction_score = rng.normal(0, 1, size=N)
high_need_segment = (baseline_need > 0).astype(int)

# Observational treatment assignment. Treatment is not randomized.
propensity = 1 / (
    1
    + np.exp(
        -(
            -0.20
            + 0.70 * baseline_need
            + 0.45 * prior_engagement
            - 0.30 * account_tenure
            + 0.35 * friction_score
        )
    )
)
treatment = rng.binomial(1, propensity, size=N)

# Heterogeneous treatment effect. This is known only because the data are synthetic.
true_cate = (
    0.50
    + 0.35 * high_need_segment
    + 0.20 * prior_engagement
    - 0.15 * friction_score
)

outcome = (
    true_cate * treatment
    + 0.60 * baseline_need
    + 0.35 * prior_engagement
    - 0.25 * account_tenure
    - 0.30 * friction_score
    + rng.normal(0, 0.75, size=N)
)

teaching_df = pd.DataFrame(
    {
        "baseline_need": baseline_need,
        "prior_engagement": prior_engagement,
        "account_tenure": account_tenure,
        "friction_score": friction_score,
        "high_need_segment": high_need_segment,
        "treatment": treatment,
        "outcome": outcome,
        "propensity": propensity,
        "true_cate": true_cate,
    }
)
teaching_df.to_csv(TABLE_DIR / "00_teaching_dataset.csv", index=False)
display(teaching_df.head())
```
| baseline_need | prior_engagement | account_tenure | friction_score | high_need_segment | treatment | outcome | propensity | true_cate |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| -0.7931 | -0.4520 | 0.3610 | 1.5171 | 0 | 0 | -2.0808 | 0.3691 | 0.1820 |
| 0.2406 | -0.3531 | -1.0970 | -0.6711 | 1 | 1 | 2.0381 | 0.4760 | 0.8800 |
| -1.8963 | -0.9423 | -0.4935 | 0.9219 | 0 | 1 | -1.3458 | 0.1853 | 0.1733 |
| 1.3958 | 0.0110 | 0.4890 | 0.1365 | 1 | 1 | 0.8205 | 0.6644 | 0.8317 |
| 0.6383 | 1.1904 | -0.5878 | 1.5456 | 1 | 1 | 2.8091 | 0.8175 | 0.8563 |
The dataset has the ingredients needed for a first EconML tour: binary treatment, continuous outcome, observed confounding, and treatment effects that vary by features.
Data Dictionary
The next table documents each column and how it should be used. Later notebooks will reuse this pattern for their own teaching data.
```python
data_dictionary = pd.DataFrame(
    [
        {
            "column": "baseline_need",
            "role": "effect modifier and confounder",
            "plain meaning": "Pre-treatment need or intent.",
            "included in": "X",
        },
        {
            "column": "prior_engagement",
            "role": "effect modifier and confounder",
            "plain meaning": "Historical engagement before treatment.",
            "included in": "X",
        },
        {
            "column": "account_tenure",
            "role": "control confounder",
            "plain meaning": "Pre-treatment account maturity.",
            "included in": "W",
        },
        {
            "column": "friction_score",
            "role": "effect modifier and confounder",
            "plain meaning": "Pre-treatment friction or difficulty.",
            "included in": "X",
        },
        {
            "column": "high_need_segment",
            "role": "effect modifier",
            "plain meaning": "Binary segment derived from baseline need.",
            "included in": "X",
        },
        {
            "column": "treatment",
            "role": "binary treatment",
            "plain meaning": "Whether the unit received the intervention.",
            "included in": "T",
        },
        {
            "column": "outcome",
            "role": "outcome",
            "plain meaning": "Post-treatment outcome to improve.",
            "included in": "Y",
        },
        {
            "column": "propensity",
            "role": "known treatment probability for teaching",
            "plain meaning": "True assignment probability from the simulator.",
            "included in": "diagnostics only",
        },
        {
            "column": "true_cate",
            "role": "known treatment-effect truth for teaching",
            "plain meaning": "Unit-level conditional effect from the simulator.",
            "included in": "diagnostics only",
        },
    ]
)
data_dictionary.to_csv(TABLE_DIR / "00_teaching_data_dictionary.csv", index=False)
display(data_dictionary)
```
| column | role | plain meaning | included in |
| --- | --- | --- | --- |
| baseline_need | effect modifier and confounder | Pre-treatment need or intent. | X |
| prior_engagement | effect modifier and confounder | Historical engagement before treatment. | X |
| account_tenure | control confounder | Pre-treatment account maturity. | W |
| friction_score | effect modifier and confounder | Pre-treatment friction or difficulty. | X |
| high_need_segment | effect modifier | Binary segment derived from baseline need. | X |
| treatment | binary treatment | Whether the unit received the intervention. | T |
| outcome | outcome | Post-treatment outcome to improve. | Y |
| propensity | known treatment probability for teaching | True assignment probability from the simulator. | diagnostics only |
| true_cate | known treatment-effect truth for teaching | Unit-level conditional effect from the simulator. | diagnostics only |
The last two columns would not exist in real data. They are included here only so the tutorial can check whether estimators recover the known effect pattern.
Basic Dataset Summary
Before fitting anything, summarize treatment rate, outcome scale, and the true CATE distribution. This gives us a baseline expectation for later estimator outputs.
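A summary cell along these lines could produce that baseline. The sketch below runs on a small stand-in frame (labeled as such in the comments) because the real cell would operate on the teaching_df built earlier:

```python
import numpy as np
import pandas as pd

# Stand-in for teaching_df; the notebook's cell summarizes the real frame instead.
rng = np.random.default_rng(0)
n = 1_000
demo_df = pd.DataFrame(
    {
        "treatment": rng.binomial(1, 0.45, size=n),
        "outcome": rng.normal(0.3, 1.0, size=n),
        "true_cate": rng.normal(0.7, 0.2, size=n),
    }
)

dataset_summary = pd.DataFrame(
    [
        {"metric": "rows", "value": float(len(demo_df))},
        {"metric": "treatment rate", "value": demo_df["treatment"].mean()},
        {"metric": "outcome mean", "value": demo_df["outcome"].mean()},
        {"metric": "outcome std", "value": demo_df["outcome"].std()},
        {"metric": "true ATE (mean of true_cate)", "value": demo_df["true_cate"].mean()},
    ]
)
print(dataset_summary)
```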
The groups are imbalanced, which means a raw difference in outcomes would not be a clean causal estimate. EconML estimators still need a credible adjustment design.
Plot Treatment Imbalance
The plot uses standardized mean differences so all features share the same scale. The dashed lines at +/-0.1 are rough balance guides.
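The standardized mean difference for a feature is the treated-minus-control mean gap divided by the pooled standard deviation. A sketch of the computation and plot, run here on a stand-in frame rather than the notebook's teaching_df:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless-safe backend
import matplotlib.pyplot as plt

# Stand-in data with confounded assignment; the notebook uses teaching_df's features.
rng = np.random.default_rng(1)
n = 2_000
features = rng.normal(size=(n, 3))
demo_df = pd.DataFrame(features, columns=["baseline_need", "prior_engagement", "friction_score"])
demo_df["treatment"] = rng.binomial(1, 1 / (1 + np.exp(-features[:, 0])))

def standardized_mean_difference(values, treated_mask):
    treated, control = values[treated_mask], values[~treated_mask]
    pooled_sd = np.sqrt((treated.var(ddof=1) + control.var(ddof=1)) / 2)
    return (treated.mean() - control.mean()) / pooled_sd

treated_mask = demo_df["treatment"].to_numpy().astype(bool)
feature_names = ["baseline_need", "prior_engagement", "friction_score"]
smd = {f: standardized_mean_difference(demo_df[f].to_numpy(), treated_mask) for f in feature_names}

fig, ax = plt.subplots(figsize=(6, 3))
ax.barh(list(smd), list(smd.values()))
for guide in (-0.1, 0.1):
    ax.axvline(guide, linestyle="--", color="gray")  # rough balance guides
ax.set_xlabel("standardized mean difference")
plt.close(fig)
```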
The imbalance is deliberate. This gives the EconML estimators real nuisance-model work to do instead of estimating effects from randomized treatment.
Propensity Overlap Check
Overlap matters for CATE estimation. If treatment assignment is nearly deterministic in some region, the estimator has little evidence for the missing treatment state there.
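One common way to quantify this is to fit a propensity model on the confounders and check its AUC: near 0.5 suggests assignment is close to random, while high values confirm observational assignment and raise overlap concerns. A sketch on stand-in data (the notebook would fit on teaching_df's confounders):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Stand-in confounded assignment, mirroring the simulator's logistic form.
rng = np.random.default_rng(2)
n = 2_000
features = rng.normal(size=(n, 4))
logits = -0.2 + features @ np.array([0.7, 0.45, -0.3, 0.35])
treatment = rng.binomial(1, 1 / (1 + np.exp(-logits)))

propensity_model = LogisticRegression(max_iter=1_000).fit(features, treatment)
predicted_propensity = propensity_model.predict_proba(features)[:, 1]
auc = roc_auc_score(treatment, predicted_propensity)
print(f"propensity AUC: {auc:.3f}")  # well above 0.5 indicates observational assignment
```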
The propensity model is predictive, which confirms observational assignment. The next plot checks whether treated and control units still overlap enough to compare.
Plot Propensity Overlap
A healthy CATE workflow should inspect overlap before trusting individualized or segment-level effects.
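An overlap inspection typically overlays propensity histograms for the two groups; regions where one group is nearly absent are where individualized estimates deserve the least trust. A stand-in sketch (the notebook would plot cross-fitted propensity scores from teaching_df):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless-safe backend
import matplotlib.pyplot as plt

# Stand-in propensities and assignments for illustration only.
rng = np.random.default_rng(3)
n = 2_000
propensity = 1 / (1 + np.exp(-rng.normal(scale=1.2, size=n)))
treatment = rng.binomial(1, propensity)

fig, ax = plt.subplots(figsize=(6, 3))
bins = np.linspace(0, 1, 25)
ax.hist(propensity[treatment == 1], bins=bins, alpha=0.5, density=True, label="treated")
ax.hist(propensity[treatment == 0], bins=bins, alpha=0.5, density=True, label="control")
ax.set_xlabel("propensity score")
ax.legend()
plt.close(fig)

# Extreme tails with only one group present signal weak overlap.
print(f"treated min propensity: {propensity[treatment == 1].min():.3f}")
print(f"control max propensity: {propensity[treatment == 0].max():.3f}")
```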
The overlap is usable for a teaching example. There are treated and control units across much of the propensity range, though the tails still deserve caution.
Train/Test Split For The Smoke Test
We split the data so the first EconML estimator is evaluated on rows it did not fit. This is not a full benchmark, but it is better than only checking in-sample output.
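A split cell consistent with the smoke test might look like the sketch below. The role-column names (OUTCOME_COLUMN, X_COLUMNS, and so on) and the index names train_idx/test_idx are assumptions chosen to match how the smoke-test cell uses them; the frame here is a small stand-in for teaching_df:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in for teaching_df; the notebook splits the real frame built earlier.
rng = np.random.default_rng(4)
n = 1_000
teaching_df = pd.DataFrame(
    {
        "baseline_need": rng.normal(size=n),
        "prior_engagement": rng.normal(size=n),
        "account_tenure": rng.normal(size=n),
        "friction_score": rng.normal(size=n),
        "high_need_segment": rng.integers(0, 2, size=n),
        "treatment": rng.integers(0, 2, size=n),
        "outcome": rng.normal(size=n),
    }
)

# Assumed role assignments matching the data dictionary.
OUTCOME_COLUMN = "outcome"
TREATMENT_COLUMN = "treatment"
X_COLUMNS = ["baseline_need", "prior_engagement", "friction_score", "high_need_segment"]
W_COLUMNS = ["account_tenure"]

# Stratify on treatment so both splits keep similar treatment rates.
train_idx, test_idx = train_test_split(
    np.arange(len(teaching_df)),
    test_size=0.3,
    stratify=teaching_df[TREATMENT_COLUMN],
    random_state=2026,
)
print(f"train rows: {len(train_idx)}, test rows: {len(test_idx)}")
```

Stratifying on the treatment column is what keeps the train and test treatment rates close, which in turn makes the smoke-test metrics easier to compare.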
The train and test splits have similar treatment rates and true ATE values. That makes the smoke-test metrics easier to read.
LinearDML Smoke Test
LinearDML is a good first estimator because it uses nuisance models for treatment and outcome, then fits a linear final-stage treatment-effect model over X. This cell proves the environment can fit an actual EconML estimator.
```python
from econml.dml import LinearDML

Y = teaching_df[OUTCOME_COLUMN].to_numpy()
T = teaching_df[TREATMENT_COLUMN].to_numpy()
X = teaching_df[X_COLUMNS].to_numpy()
W = teaching_df[W_COLUMNS].to_numpy()

linear_dml = LinearDML(
    model_y=RandomForestRegressor(
        n_estimators=80,
        min_samples_leaf=20,
        random_state=RANDOM_SEED,
    ),
    model_t=RandomForestClassifier(
        n_estimators=80,
        min_samples_leaf=20,
        random_state=RANDOM_SEED,
    ),
    discrete_treatment=True,
    cv=3,
    random_state=RANDOM_SEED,
)

linear_dml.fit(
    Y[train_idx],
    T[train_idx],
    X=X[train_idx],
    W=W[train_idx],
)

estimated_cate_test = linear_dml.effect(X[test_idx])
estimated_ate_test = linear_dml.ate(X[test_idx])
print(f"Estimated test ATE from LinearDML: {estimated_ate_test:.4f}")
```
Estimated test ATE from LinearDML: 0.7452
The estimator fit and produced an ATE. That confirms the core EconML workflow is operational in this environment.
Smoke-Test Metrics Against Known Truth
Because the synthetic data include true_cate, we can check whether the first estimator learned the broad effect pattern. These metrics are a teaching diagnostic, not something available in real data.
```python
true_cate_test = teaching_df.iloc[test_idx]["true_cate"].to_numpy()

smoke_metrics = pd.DataFrame(
    [
        {
            "metric": "test true ATE",
            "value": true_cate_test.mean(),
            "reading": "average ground-truth CATE in the test split",
        },
        {
            "metric": "LinearDML estimated ATE",
            "value": float(estimated_ate_test),
            "reading": "average estimated CATE in the test split",
        },
        {
            "metric": "CATE correlation with truth",
            "value": float(np.corrcoef(estimated_cate_test, true_cate_test)[0, 1]),
            "reading": "ranking alignment between estimated and true CATE",
        },
        {
            "metric": "CATE RMSE",
            "value": float(np.sqrt(mean_squared_error(true_cate_test, estimated_cate_test))),
            "reading": "average estimation error on the CATE scale",
        },
    ]
)
smoke_metrics.to_csv(TABLE_DIR / "00_lineardml_smoke_metrics.csv", index=False)
display(smoke_metrics)
```
| metric | value | reading |
| --- | --- | --- |
| test true ATE | 0.6882 | average ground-truth CATE in the test split |
| LinearDML estimated ATE | 0.7452 | average estimated CATE in the test split |
| CATE correlation with truth | 0.9738 | ranking alignment between estimated and true CATE |
| CATE RMSE | 0.0900 | average estimation error on the CATE scale |
The CATE correlation is the main smoke-test signal here: the estimator is learning the ranking pattern in the heterogeneous effects. Later notebooks will improve and compare estimators more systematically.
Inspect Predicted CATE Rows
A few row-level examples make the output tangible. The estimated CATE is model output; the true CATE is visible only because this is a synthetic tutorial.
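A preview along these lines makes the comparison concrete. The arrays below are illustrative stand-ins; in the notebook they would come from linear_dml.effect(...) and the test split's true_cate column:

```python
import numpy as np
import pandas as pd

# Stand-in arrays for illustration; not the notebook's fitted estimates.
rng = np.random.default_rng(5)
true_cate_test = rng.normal(0.7, 0.25, size=5)
estimated_cate_test = true_cate_test + rng.normal(0, 0.1, size=5)

cate_preview = pd.DataFrame(
    {
        "estimated_cate": estimated_cate_test,
        "true_cate": true_cate_test,
        "abs_error": np.abs(estimated_cate_test - true_cate_test),
    }
)
print(cate_preview.round(4))
```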
The row preview shows what EconML gives you: an estimated treatment effect for each feature row. Those row-level estimates should be summarized carefully rather than treated as perfect individual truths.
Plot Estimated Versus True CATE
The scatter plot checks whether high true-effect units tend to receive high estimated effects. This is a stronger diagnostic than only comparing ATE values.
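A sketch of that diagnostic, using stand-in arrays in place of the notebook's fitted estimates: scatter estimated against true CATE and draw the perfect-recovery diagonal.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless-safe backend
import matplotlib.pyplot as plt

# Stand-in estimates; the notebook uses estimated_cate_test from LinearDML.
rng = np.random.default_rng(6)
true_cate_test = rng.normal(0.7, 0.25, size=400)
estimated_cate_test = true_cate_test + rng.normal(0, 0.1, size=400)

fig, ax = plt.subplots(figsize=(4, 4))
ax.scatter(true_cate_test, estimated_cate_test, s=10, alpha=0.5)
lims = [
    float(min(true_cate_test.min(), estimated_cate_test.min())),
    float(max(true_cate_test.max(), estimated_cate_test.max())),
]
ax.plot(lims, lims, color="black", linestyle="--")  # perfect-recovery diagonal
ax.set_xlabel("true CATE")
ax.set_ylabel("estimated CATE")
plt.close(fig)

correlation = np.corrcoef(true_cate_test, estimated_cate_test)[0, 1]
print(f"CATE correlation: {correlation:.3f}")
```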
The points follow the diagonal closely enough for an environment smoke test. The later estimator-specific notebooks will ask harder questions about calibration and uncertainty.
Segment-Level CATE Summary
CATE estimates are often easier to communicate by segment than row by row. Here we compare the true and estimated effects for the high-need segment.
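The segment comparison is a groupby over the segment indicator. The sketch below builds stand-in effects with the same high-need pattern as the simulator; the notebook would group the real true and estimated CATE columns:

```python
import numpy as np
import pandas as pd

# Stand-in effects mirroring the simulator's +0.35 high-need bump.
rng = np.random.default_rng(7)
n = 1_000
segment = rng.integers(0, 2, size=n)
true_cate = 0.5 + 0.35 * segment + rng.normal(0, 0.1, size=n)
estimated_cate = true_cate + rng.normal(0, 0.1, size=n)

segment_summary = (
    pd.DataFrame(
        {"high_need_segment": segment, "true_cate": true_cate, "estimated_cate": estimated_cate}
    )
    .groupby("high_need_segment")[["true_cate", "estimated_cate"]]
    .mean()
)
print(segment_summary.round(4))
```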
The segment summary recovers the intended pattern: higher-need units have larger expected treatment effects. This is the kind of summary that often matters more than individual row estimates.
Plot Segment Effects
The bar chart shows how segment-level estimates compare with the synthetic truth.
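A grouped bar chart is one simple way to draw that comparison. The means below are illustrative stand-ins; the notebook would plot the segment-summary values computed from the fitted estimator:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless-safe backend
import matplotlib.pyplot as plt

# Illustrative stand-in segment means, not the notebook's computed values.
segments = ["low need", "high need"]
true_means = np.array([0.52, 0.87])
estimated_means = np.array([0.55, 0.84])

x = np.arange(len(segments))
width = 0.35
fig, ax = plt.subplots(figsize=(5, 3))
ax.bar(x - width / 2, true_means, width, label="true")
ax.bar(x + width / 2, estimated_means, width, label="estimated")
ax.set_xticks(x)
ax.set_xticklabels(segments)
ax.set_ylabel("mean CATE")
ax.legend()
plt.close(fig)
```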
The segment plot is a preview of the policy-targeting notebooks. CATE models become useful when their rankings and segment patterns support better treatment decisions.
What The Smoke Test Does And Does Not Prove
A successful LinearDML run confirms the package works and that the workflow is coherent. It does not prove every causal assumption in a real analysis.
```python
smoke_test_boundaries = pd.DataFrame(
    [
        {
            "what it shows": "EconML is installed and importable",
            "what it does not show": "The environment is tuned for every advanced estimator.",
        },
        {
            "what it shows": "LinearDML can fit the synthetic teaching data",
            "what it does not show": "LinearDML is the best estimator for every CATE problem.",
        },
        {
            "what it shows": "The estimated CATE ranking matches known truth in this simulation",
            "what it does not show": "Real-data CATE estimates are individually precise.",
        },
        {
            "what it shows": "Observed confounding and overlap diagnostics are visible",
            "what it does not show": "Unmeasured confounding has been solved.",
        },
    ]
)
smoke_test_boundaries.to_csv(TABLE_DIR / "00_smoke_test_boundaries.csv", index=False)
display(smoke_test_boundaries)
```
| what it shows | what it does not show |
| --- | --- |
| EconML is installed and importable | The environment is tuned for every advanced estimator. |
| LinearDML can fit the synthetic teaching data | LinearDML is the best estimator for every CATE problem. |
| The estimated CATE ranking matches known truth in this simulation | Real-data CATE estimates are individually precise. |
| Observed confounding and overlap diagnostics are visible | Unmeasured confounding has been solved. |
This boundary table is important. EconML estimates effects; the credibility of those effects still depends on design, assumptions, overlap, and diagnostics.
Notebook Roadmap
The rest of the EconML series will build from this environment tour toward more advanced estimators and decision workflows.
```python
notebook_roadmap = pd.DataFrame(
    [
        {"notebook": "01", "topic": "CATE foundations", "main skill": "understand ATE versus CATE"},
        {"notebook": "02", "topic": "Double machine learning basics", "main skill": "understand nuisance residualization"},
        {"notebook": "03", "topic": "LinearDML and SparseLinearDML", "main skill": "fit interpretable CATE models"},
        {"notebook": "04", "topic": "CausalForestDML", "main skill": "fit nonlinear forest-based CATE"},
        {"notebook": "05", "topic": "DRLearner", "main skill": "use doubly robust pseudo-outcomes"},
        {"notebook": "06", "topic": "Meta-learners", "main skill": "compare S, T, and X learners"},
        {"notebook": "07", "topic": "Policy targeting", "main skill": "turn CATE into treatment rules"},
        {"notebook": "08", "topic": "Explanation and segments", "main skill": "summarize what drives effect heterogeneity"},
        {"notebook": "09", "topic": "Uncertainty", "main skill": "use intervals and uncertainty-aware decisions"},
        {"notebook": "10", "topic": "Multiple and continuous treatments", "main skill": "move beyond binary treatment"},
        {"notebook": "11", "topic": "IV estimators", "main skill": "handle endogenous treatment with instruments"},
        {"notebook": "12", "topic": "Panel or longitudinal extensions", "main skill": "reason about repeated observations"},
        {"notebook": "13", "topic": "Estimator benchmark", "main skill": "compare estimators on known truth"},
        {"notebook": "14", "topic": "End-to-end case study", "main skill": "combine estimation, diagnostics, targeting, and reporting"},
        {"notebook": "15", "topic": "Pitfalls and debugging", "main skill": "avoid leakage, weak overlap, and overclaimed CATE"},
    ]
)
notebook_roadmap.to_csv(TABLE_DIR / "00_econml_notebook_roadmap.csv", index=False)
display(notebook_roadmap)
```
| notebook | topic | main skill |
| --- | --- | --- |
| 01 | CATE foundations | understand ATE versus CATE |
| 02 | Double machine learning basics | understand nuisance residualization |
| 03 | LinearDML and SparseLinearDML | fit interpretable CATE models |
| 04 | CausalForestDML | fit nonlinear forest-based CATE |
| 05 | DRLearner | use doubly robust pseudo-outcomes |
| 06 | Meta-learners | compare S, T, and X learners |
| 07 | Policy targeting | turn CATE into treatment rules |
| 08 | Explanation and segments | summarize what drives effect heterogeneity |
| 09 | Uncertainty | use intervals and uncertainty-aware decisions |
| 10 | Multiple and continuous treatments | move beyond binary treatment |
| 11 | IV estimators | handle endogenous treatment with instruments |
| 12 | Panel or longitudinal extensions | reason about repeated observations |
| 13 | Estimator benchmark | compare estimators on known truth |
| 14 | End-to-end case study | combine estimation, diagnostics, targeting, and reporting |
| 15 | Pitfalls and debugging | avoid leakage, weak overlap, and overclaimed CATE |
The sequence starts with concepts and ends with applied reporting. That shape mirrors how CATE modeling should be learned: assumptions first, estimators second, decisions last.
Troubleshooting Checklist
The final table gives quick fixes for common EconML setup and workflow issues.
```python
troubleshooting_checklist = pd.DataFrame(
    [
        {
            "symptom": "EconML import fails",
            "likely cause": "package not installed or Python/dependency mismatch",
            "first check": "run `uv add econml` and verify the Python version supported by the installed EconML release",
        },
        {
            "symptom": "estimator fit fails with shape errors",
            "likely cause": "Y, T, X, or W have inconsistent row counts or unexpected dimensions",
            "first check": "print array shapes before calling `.fit()`",
        },
        {
            "symptom": "CATE estimates are noisy or extreme",
            "likely cause": "weak overlap, small sample size, or overly flexible final model",
            "first check": "plot propensity overlap and summarize CATE by segment",
        },
        {
            "symptom": "treatment model predicts treatment almost perfectly",
            "likely cause": "poor overlap or leakage into treatment features",
            "first check": "inspect propensity distributions by treatment group",
        },
        {
            "symptom": "estimated CATE ranking looks implausible",
            "likely cause": "bad effect modifiers, leakage, or nuisance model misspecification",
            "first check": "compare segment summaries and run simpler baseline estimators",
        },
        {
            "symptom": "policy targeting looks too good",
            "likely cause": "evaluating policy on the same data used to learn CATE",
            "first check": "use held-out data or doubly robust policy evaluation where possible",
        },
    ]
)
troubleshooting_checklist.to_csv(TABLE_DIR / "00_troubleshooting_checklist.csv", index=False)
display(troubleshooting_checklist)
```
| symptom | likely cause | first check |
| --- | --- | --- |
| EconML import fails | package not installed or Python/dependency mismatch | run `uv add econml` and verify the Python version supported by the installed EconML release |
| estimator fit fails with shape errors | Y, T, X, or W have inconsistent row counts or unexpected dimensions | print array shapes before calling `.fit()` |
| CATE estimates are noisy or extreme | weak overlap, small sample size, or overly flexible final model | plot propensity overlap and summarize CATE by segment |
| treatment model predicts treatment almost perfectly | poor overlap or leakage into treatment features | inspect propensity distributions by treatment group |
| estimated CATE ranking looks implausible | bad effect modifiers, leakage, or nuisance model misspecification | compare segment summaries and run simpler baseline estimators |
| policy targeting looks too good | evaluating policy on the same data used to learn CATE | use held-out data or doubly robust policy evaluation where possible |
The checklist is deliberately practical. Many EconML issues are data-role, shape, overlap, or leakage issues rather than exotic estimator failures.
Final Summary
This environment tour confirmed that EconML is installed, mapped the major estimator families, created a reusable heterogeneous-effect teaching dataset, and ran a first LinearDML smoke test.
Key takeaways:
EconML is mainly an estimation toolkit for treatment effects, especially CATE.
The standard data roles are Y, T, X, and W.
CATE work needs overlap diagnostics, nuisance-model thinking, and careful reporting.
A working estimator call is only the beginning; later notebooks will focus on estimator choice, diagnostics, uncertainty, and treatment policies.
The next notebook introduces CATE foundations and potential-outcomes language before moving deeper into EconML estimators.