EconML Tutorial 07: Policy Learning And Treatment Targeting
This notebook moves from estimating treatment effects to using them for decisions.
A CATE model answers:
How much benefit do we expect from treatment for this unit?
A policy answers:
Which units should we actually treat?
Those are related but not identical. A policy must contend with limited budgets, treatment costs, fairness or support constraints, and uncertainty. This notebook shows how to turn CATE estimates into treatment rules, compare policies, and use EconML's policy learners to learn decision rules directly.
The lesson uses a synthetic truth-known setting so we can evaluate policy value exactly. In real work, policy value usually needs an experiment, randomized holdout, or careful off-policy evaluation.
Learning Goals
By the end of this notebook, you should be able to:
distinguish CATE estimation from treatment policy selection;
define net benefit after treatment cost;
turn CATE estimates into threshold and budgeted targeting rules;
compute true policy value in a simulation;
compare random, treat-all, threshold, top-k, and oracle policies;
fit EconML DRPolicyTree and DRPolicyForest;
compare direct policy learners with CATE-ranking policies;
inspect treatment rates, segment targeting, regret, and support risks;
explain why offline policy decisions need uncertainty and overlap checks.
CATE Versus Policy
A treatment-effect estimate is a score. A policy is an action rule.
For binary treatment, a simple policy can be written as:
policy(X) = 1 if estimated_net_CATE(X) > threshold else 0
If there is no budget constraint, a simple rule treats units with positive estimated net benefit. If there is a budget constraint, the rule may treat only the top k% of units ranked by estimated net benefit.
The key evaluation quantity in this notebook is policy gain relative to treating nobody:
policy_gain = mean(policy(X) * true_net_CATE(X))
A good policy has high positive gain, treats a defensible share of the population, and avoids overreliance on noisy or unsupported regions.
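A minimal sketch of both rule families and the gain computation, using small hypothetical arrays (the names scores and true_net_cate are local to this sketch):

import numpy as np

# Hypothetical example arrays: estimated and true net CATE for 8 units.
scores = np.array([0.4, -0.1, 0.2, 0.9, -0.3, 0.05, 0.6, -0.5])
true_net_cate = np.array([0.3, -0.2, 0.1, 1.0, -0.4, -0.1, 0.5, -0.6])

# Threshold policy: treat every unit with positive estimated net benefit.
threshold_action = (scores > 0).astype(int)

# Budgeted top-k policy: treat only the highest-scoring 25 percent.
k = int(0.25 * len(scores))
top_k_action = np.zeros(len(scores), dtype=int)
top_k_action[np.argsort(scores)[::-1][:k]] = 1

# Policy gain relative to treating nobody (truth-known evaluation).
print("threshold gain:", np.mean(threshold_action * true_net_cate))
print("top-k gain:    ", np.mean(top_k_action * true_net_cate))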
Tutorial Flow
The notebook follows this path:
Create a confounded dataset with true potential outcomes and treatment cost.
Define true net CATE and oracle policy value.
Fit CATE models that estimate net treatment benefit.
Convert CATE scores into threshold and budgeted policies.
Fit direct EconML policy learners.
Compare policy gain, regret, treatment rate, and segment targeting.
Inspect policy trees and feature importances.
Evaluate support and uncertainty risks.
Close with a practical policy-learning checklist.
Setup
This cell imports packages, creates output folders, fixes a random seed, and checks whether the EconML estimators needed for the notebook are available.
from pathlib import Path
import os
import warnings
import importlib.metadata as importlib_metadata

# Keep Matplotlib cache files in a writable location during notebook execution.
os.environ.setdefault("MPLCONFIGDIR", "/tmp/matplotlib-ranking-sys")

warnings.filterwarnings("default")
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=PendingDeprecationWarning)
warnings.filterwarnings("ignore", category=FutureWarning)
warnings.filterwarnings("ignore", message=".*IProgress not found.*")
warnings.filterwarnings("ignore", message=".*X does not have valid feature names.*")
warnings.filterwarnings("ignore", message=".*The final model has a nonzero intercept.*")
warnings.filterwarnings("ignore", message=".*Co-variance matrix is underdetermined.*")
warnings.filterwarnings("ignore", module="sklearn.linear_model._logistic")

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.tree import plot_tree
from IPython.display import display
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import brier_score_loss, log_loss, mean_squared_error, roc_auc_score
from sklearn.model_selection import KFold, StratifiedKFold, cross_val_predict, train_test_split

try:
    import econml
    from econml.dml import CausalForestDML, LinearDML
    from econml.dr import DRLearner
    from econml.policy import DRPolicyTree, DRPolicyForest

    ECONML_AVAILABLE = True
    ECONML_VERSION = getattr(econml, "__version__", "unknown")
except Exception as exc:
    ECONML_AVAILABLE = False
    ECONML_VERSION = f"import failed: {type(exc).__name__}: {exc}"

RANDOM_SEED = 2026
rng = np.random.default_rng(RANDOM_SEED)

OUTPUT_DIR = Path("outputs")
FIGURE_DIR = OUTPUT_DIR / "figures"
TABLE_DIR = OUTPUT_DIR / "tables"
FIGURE_DIR.mkdir(parents=True, exist_ok=True)
TABLE_DIR.mkdir(parents=True, exist_ok=True)

sns.set_theme(style="whitegrid", context="notebook")
pd.set_option("display.max_columns", 140)
pd.set_option("display.float_format", lambda value: f"{value:,.4f}")

print(f"EconML available: {ECONML_AVAILABLE}")
print(f"EconML version: {ECONML_VERSION}")
print(f"Figures will be saved to: {FIGURE_DIR.resolve()}")
print(f"Tables will be saved to: {TABLE_DIR.resolve()}")
EconML available: True
EconML version: 0.16.0
Figures will be saved to: /home/apex/Documents/ranking_sys/notebooks/tutorials/econml/outputs/figures
Tables will be saved to: /home/apex/Documents/ranking_sys/notebooks/tutorials/econml/outputs/tables
What this shows: the notebook is ready if EconML imports successfully. The output files use the 07_ prefix so they are easy to separate from earlier tutorial artifacts.
Policy Objects In This Lesson
The next table names the policy strategies we will compare. Some are simple score-based rules; others are learned directly by EconML policy estimators.
policy_strategy_map = pd.DataFrame(
    [
        {
            "policy family": "Treat nobody",
            "how it works": "Set treatment to 0 for every unit",
            "why include it": "Baseline value for policy gain",
        },
        {
            "policy family": "Treat everybody",
            "how it works": "Set treatment to 1 for every unit",
            "why include it": "Shows whether treatment is beneficial on average after cost",
        },
        {
            "policy family": "CATE threshold",
            "how it works": "Treat when estimated net CATE is above zero",
            "why include it": "Natural rule without a fixed budget",
        },
        {
            "policy family": "Budgeted top-k",
            "how it works": "Treat the top share of units ranked by estimated net CATE",
            "why include it": "Matches constrained treatment capacity",
        },
        {
            "policy family": "DRPolicyTree / DRPolicyForest",
            "how it works": "Learn a decision rule directly from observed outcomes, treatment, X, and W",
            "why include it": "Uses EconML's direct policy-learning tools",
        },
        {
            "policy family": "Oracle",
            "how it works": "Treat using true net CATE",
            "why include it": "Upper benchmark available only in simulation",
        },
    ]
)
policy_strategy_map.to_csv(TABLE_DIR / "07_policy_strategy_map.csv", index=False)
display(policy_strategy_map)
| policy family | how it works | why include it |
| --- | --- | --- |
| Treat nobody | Set treatment to 0 for every unit | Baseline value for policy gain |
| Treat everybody | Set treatment to 1 for every unit | Shows whether treatment is beneficial on average after cost |
| CATE threshold | Treat when estimated net CATE is above zero | Natural rule without a fixed budget |
| Budgeted top-k | Treat the top share of units ranked by estimated net CATE | Matches constrained treatment capacity |
| DRPolicyTree / DRPolicyForest | Learn a decision rule directly from observed outcomes, treatment, X, and W | Uses EconML's direct policy-learning tools |
| Oracle | Treat using true net CATE | Upper benchmark available only in simulation |
What this shows: a policy comparison should include simple baselines. A complicated learner is only useful if it improves over clear rules like treat nobody, treat everybody, and top-k targeting.
Synthetic Teaching Data
The dataset below has a binary treatment, observed confounding, heterogeneous treatment effects, and an explicit treatment cost.
The observed outcome is a gross benefit. We create a net outcome by subtracting treatment cost from treated rows:

net_outcome = outcome - treatment_cost * treatment
The policy problem is to maximize expected net outcome, not just gross outcome. That distinction matters because some units may have positive gross treatment effects but negative net effects after cost.
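The generating process can be sketched as follows. The coefficients, noise scale, and the TREATMENT_COST constant are illustrative assumptions, not calibrated values; the sketch only shows one way to build a teaching_df with the observed and oracle fields listed in the field dictionary below.

# Illustrative data-generating sketch (assumed coefficients).
N = 20_000
TREATMENT_COST = 0.5  # assumed constant cost per treated unit

X_obs = pd.DataFrame(
    rng.normal(size=(N, 6)),
    columns=["baseline_need", "prior_engagement", "friction_score",
             "content_affinity", "price_sensitivity", "region_risk"],
)
X_obs["high_need_segment"] = (X_obs["baseline_need"] > 0.8).astype(int)
W_obs = pd.DataFrame(
    rng.normal(size=(N, 6)),
    columns=["trust_score", "recency_gap", "account_tenure",
             "seasonality_index", "device_stability", "traffic_intensity"],
)

# Confounded assignment: need and trust raise the chance of treatment.
logit = -0.4 + 0.9 * X_obs["baseline_need"] + 0.6 * W_obs["trust_score"]
propensity = 1.0 / (1.0 + np.exp(-logit))
treatment = rng.binomial(1, propensity)

# Heterogeneous gross effect; subtracting cost makes some net effects negative.
gross_cate = 0.6 + 0.8 * X_obs["baseline_need"] - 0.7 * X_obs["friction_score"]
mu0 = 1.0 + 0.5 * X_obs["prior_engagement"] + 0.4 * W_obs["trust_score"]
mu1 = mu0 + gross_cate
outcome = mu0 + treatment * gross_cate + rng.normal(scale=1.0, size=N)

teaching_df = pd.concat([X_obs, W_obs], axis=1)
teaching_df["treatment"] = treatment
teaching_df["outcome"] = outcome
teaching_df["net_outcome"] = outcome - TREATMENT_COST * treatment
teaching_df["propensity"] = propensity
teaching_df["mu0"] = mu0
teaching_df["mu1"] = mu1
teaching_df["gross_cate"] = gross_cate
teaching_df["true_net_cate"] = gross_cate - TREATMENT_COST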
What this shows: policy learning is being framed as net value maximization. The gross CATE can be positive while the net CATE is negative if the treatment cost is larger than expected benefit.
Field Dictionary
This table clarifies which fields are observed in a real analysis and which are oracle fields available only because we simulated the data.
effect_modifier_cols = [
    "baseline_need",
    "prior_engagement",
    "friction_score",
    "content_affinity",
    "price_sensitivity",
    "region_risk",
    "high_need_segment",
]
control_cols = [
    "trust_score",
    "recency_gap",
    "account_tenure",
    "seasonality_index",
    "device_stability",
    "traffic_intensity",
]
all_observed_covariates = effect_modifier_cols + control_cols
true_driver_cols = effect_modifier_cols.copy()

field_rows = []
for col in effect_modifier_cols:
    field_rows.append(
        {
            "column": col,
            "role": "X policy/CATE feature",
            "observed_in_real_analysis": "yes",
            "description": "Pre-treatment feature used for treatment-effect ranking and policy decisions.",
            "true_net_cate_driver": "yes" if col in true_driver_cols else "no",
        }
    )
for col in control_cols:
    field_rows.append(
        {
            "column": col,
            "role": "W/control or support feature",
            "observed_in_real_analysis": "yes",
            "description": "Pre-treatment feature used for nuisance adjustment and support diagnostics.",
            "true_net_cate_driver": "no",
        }
    )
for col, role, description in [
    ("treatment", "treatment", "Binary intervention indicator."),
    ("outcome", "observed outcome", "Observed gross post-treatment outcome."),
    ("net_outcome", "observed net outcome", "Observed outcome after subtracting treatment cost for treated rows."),
    ("propensity", "oracle", "True treatment probability from the simulated assignment process."),
    ("mu0", "oracle", "True conditional mean outcome under control."),
    ("mu1", "oracle", "True conditional mean gross outcome under treatment."),
    ("gross_cate", "oracle", "Known gross individual treatment effect."),
    ("true_net_cate", "oracle", "Known treatment effect after subtracting treatment cost."),
]:
    field_rows.append(
        {
            "column": col,
            "role": role,
            "observed_in_real_analysis": "yes" if role in ["treatment", "observed outcome", "observed net outcome"] else "no",
            "description": description,
            "true_net_cate_driver": "not applicable",
        }
    )

field_dictionary = pd.DataFrame(field_rows)
field_dictionary.to_csv(TABLE_DIR / "07_field_dictionary.csv", index=False)
display(field_dictionary)
| column | role | observed_in_real_analysis | description | true_net_cate_driver |
| --- | --- | --- | --- | --- |
| baseline_need | X policy/CATE feature | yes | Pre-treatment feature used for treatment-effect ranking and policy decisions. | yes |
| prior_engagement | X policy/CATE feature | yes | Pre-treatment feature used for treatment-effect ranking and policy decisions. | yes |
| friction_score | X policy/CATE feature | yes | Pre-treatment feature used for treatment-effect ranking and policy decisions. | yes |
| content_affinity | X policy/CATE feature | yes | Pre-treatment feature used for treatment-effect ranking and policy decisions. | yes |
| price_sensitivity | X policy/CATE feature | yes | Pre-treatment feature used for treatment-effect ranking and policy decisions. | yes |
| region_risk | X policy/CATE feature | yes | Pre-treatment feature used for treatment-effect ranking and policy decisions. | yes |
| high_need_segment | X policy/CATE feature | yes | Pre-treatment feature used for treatment-effect ranking and policy decisions. | yes |
| trust_score | W/control or support feature | yes | Pre-treatment feature used for nuisance adjustment and support diagnostics. | no |
| recency_gap | W/control or support feature | yes | Pre-treatment feature used for nuisance adjustment and support diagnostics. | no |
| account_tenure | W/control or support feature | yes | Pre-treatment feature used for nuisance adjustment and support diagnostics. | no |
| seasonality_index | W/control or support feature | yes | Pre-treatment feature used for nuisance adjustment and support diagnostics. | no |
| device_stability | W/control or support feature | yes | Pre-treatment feature used for nuisance adjustment and support diagnostics. | no |
| traffic_intensity | W/control or support feature | yes | Pre-treatment feature used for nuisance adjustment and support diagnostics. | no |
| treatment | treatment | yes | Binary intervention indicator. | not applicable |
| outcome | observed outcome | yes | Observed gross post-treatment outcome. | not applicable |
| net_outcome | observed net outcome | yes | Observed outcome after subtracting treatment cost for treated rows. | not applicable |
| propensity | oracle | no | True treatment probability from the simulated assignment process. | not applicable |
| mu0 | oracle | no | True conditional mean outcome under control. | not applicable |
| mu1 | oracle | no | True conditional mean gross outcome under treatment. | not applicable |
| gross_cate | oracle | no | Known gross individual treatment effect. | not applicable |
| true_net_cate | oracle | no | Known treatment effect after subtracting treatment cost. | not applicable |
What this shows: the fitted models should use only observed pre-treatment features, treatment, and net outcome. Oracle fields are reserved for policy evaluation in the tutorial.
Basic Shape And Net Effect Scale
Before fitting any model, we summarize treatment rate, gross treatment effects, and net treatment effects.
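A sketch of the summary, assuming the teaching_df built in the generating sketch above:

# Treatment rate plus the scale of gross and net effects.
shape_summary = pd.DataFrame(
    {
        "treatment_rate": [teaching_df["treatment"].mean()],
        "mean_gross_cate": [teaching_df["gross_cate"].mean()],
        "mean_net_cate": [teaching_df["true_net_cate"].mean()],
        "share_positive_net_cate": [(teaching_df["true_net_cate"] > 0).mean()],
    }
)
display(shape_summary)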
What this shows: treatment is not automatically worth applying to everyone. The share of positive net CATE defines the approximate size of the unconstrained oracle policy.
Net CATE Distribution
This plot shows who has positive or negative true net benefit. The vertical zero line is the natural threshold for an unconstrained policy.
fig, ax = plt.subplots(figsize=(10, 5))
sns.histplot(teaching_df["true_net_cate"], bins=45, kde=True, color="#2563eb", ax=ax)
ax.axvline(0, color="#111827", linewidth=1.5, linestyle="--", label="break-even")
ax.axvline(teaching_df["true_net_cate"].mean(), color="#dc2626", linewidth=2, label="mean net CATE")
ax.set_title("True Net CATE Distribution")
ax.set_xlabel("True Net CATE")
ax.set_ylabel("Rows")
ax.legend()
plt.tight_layout()
fig.savefig(FIGURE_DIR / "07_true_net_cate_distribution.png", dpi=160, bbox_inches="tight")
plt.show()
What this shows: some units are below the break-even line. Good targeting should avoid treating them when possible, especially under a limited budget.
Raw Treated-Versus-Control Net Difference
A raw net-outcome difference is not a policy estimate. It mixes treatment effect, treatment cost, and selection into treatment.
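A sketch of the raw comparison, kept here for contrast with the adjusted estimates later:

# Naive comparison: difference in mean net outcome between treated and control.
naive_diff = (
    teaching_df.loc[teaching_df["treatment"] == 1, "net_outcome"].mean()
    - teaching_df.loc[teaching_df["treatment"] == 0, "net_outcome"].mean()
)
print(f"Raw treated-minus-control net difference: {naive_diff:.4f}")
print(f"Mean true net CATE: {teaching_df['true_net_cate'].mean():.4f}")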
What this shows: treated and untreated rows differ in baseline features and true net benefit. Policy learning needs adjustment, not raw group comparisons.
Covariate Balance Check
Standardized mean differences show how different treated and untreated groups are before modeling.
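A sketch of the computation, using the pooled-standard-deviation form of the standardized mean difference:

# Standardized mean difference for each observed covariate.
treated = teaching_df[teaching_df["treatment"] == 1]
control = teaching_df[teaching_df["treatment"] == 0]
smd_rows = []
for col in all_observed_covariates:
    pooled_sd = np.sqrt(0.5 * (treated[col].var() + control[col].var()))
    smd_rows.append(
        {"feature": col,
         "smd": (treated[col].mean() - control[col].mean()) / pooled_sd}
    )
smd_table = pd.DataFrame(smd_rows).sort_values("smd", key=np.abs, ascending=False)
display(smd_table)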
What this shows: the policy problem is observational, so learned policies should be treated as candidates for evaluation rather than automatically deployable rules.
Propensity Overlap
Policy learning can fail in regions where one action is rarely observed. The next table summarizes treatment rates and net effects by true propensity bucket.
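A sketch of the bucket summary. It uses the oracle propensity column, which is only possible in a simulation; in real data the buckets would come from an estimated propensity model.

# Treatment rate and net effects by true-propensity bucket (oracle field).
overlap_table = (
    teaching_df.assign(propensity_bucket=pd.qcut(teaching_df["propensity"], 5))
    .groupby("propensity_bucket", observed=True)
    .agg(
        rows=("treatment", "size"),
        treatment_rate=("treatment", "mean"),
        mean_net_cate=("true_net_cate", "mean"),
    )
)
display(overlap_table)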
What this shows: buckets where treatment is very rare or very common carry little evidence about the untaken action. A policy that concentrates its recommendations in those weak-overlap buckets is extrapolating and deserves extra scrutiny.
Nuisance Diagnostics
Before fitting policy models, we check whether treatment and net outcome are predictable from observed pre-treatment features.
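A sketch of the two checks, using cross-validated predictions with the models imported in the setup cell:

# How predictable are treatment and net outcome from observed features?
X_all = teaching_df[all_observed_covariates]
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=RANDOM_SEED)
prop_hat = cross_val_predict(
    RandomForestClassifier(n_estimators=200, random_state=RANDOM_SEED),
    X_all, teaching_df["treatment"], cv=cv, method="predict_proba",
)[:, 1]
print(f"Propensity AUC: {roc_auc_score(teaching_df['treatment'], prop_hat):.3f}")

y_hat = cross_val_predict(
    RandomForestRegressor(n_estimators=200, random_state=RANDOM_SEED),
    X_all, teaching_df["net_outcome"], cv=5,
)
print(f"Net outcome MSE: {mean_squared_error(teaching_df['net_outcome'], y_hat):.3f}")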
What this shows: assignment is predictable, so this is an observational policy problem. The policy learners need nuisance adjustment rather than simple outcome comparisons.
Fit Net CATE Models
We fit three CATE models on net outcome:
LinearDML as a readable baseline;
CausalForestDML as a flexible CATE model with intervals;
DRLearner with a forest final model as a doubly robust pseudo-outcome approach.
Each model estimates net benefit from treatment, because the outcome already subtracts treatment cost.
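A minimal sketch of the three fits. The nuisance models and hyperparameters are illustrative choices, not tuned values, and cate_estimates is a sketch-local table of per-row scores.

X = teaching_df[effect_modifier_cols]
W = teaching_df[control_cols]
T = teaching_df["treatment"]
Y = teaching_df["net_outcome"]  # cost already subtracted, so CATE = net benefit

linear_dml = LinearDML(
    model_y=RandomForestRegressor(n_estimators=200, random_state=RANDOM_SEED),
    model_t=RandomForestClassifier(n_estimators=200, random_state=RANDOM_SEED),
    discrete_treatment=True, cv=3, random_state=RANDOM_SEED,
)
linear_dml.fit(Y, T, X=X, W=W)

causal_forest = CausalForestDML(
    model_y=RandomForestRegressor(n_estimators=200, random_state=RANDOM_SEED),
    model_t=RandomForestClassifier(n_estimators=200, random_state=RANDOM_SEED),
    discrete_treatment=True, cv=3, n_estimators=500, random_state=RANDOM_SEED,
)
causal_forest.fit(Y, T, X=X, W=W)

dr_learner = DRLearner(
    model_regression=RandomForestRegressor(n_estimators=200, random_state=RANDOM_SEED),
    model_propensity=RandomForestClassifier(n_estimators=200, random_state=RANDOM_SEED),
    model_final=RandomForestRegressor(n_estimators=200, random_state=RANDOM_SEED),
    cv=3, random_state=RANDOM_SEED,
)
dr_learner.fit(Y, T, X=X, W=W)

cate_estimates = pd.DataFrame({
    "linear_dml": np.asarray(linear_dml.effect(X)).ravel(),
    "causal_forest": np.asarray(causal_forest.effect(X)).ravel(),
    "dr_learner": np.asarray(dr_learner.effect(X)).ravel(),
})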
What this shows: policy learning begins with score quality. The best targeting model is not necessarily the model with the smallest ATE error; ranking quality matters heavily.
CATE Recovery Plot
The scatter plot compares estimated net CATE with true net CATE for the three CATE models.
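A sketch of the recovery plot, assuming the cate_estimates table from the fitting sketch above:

# Scatter of estimated versus true net CATE for each model.
fig, axes = plt.subplots(1, 3, figsize=(15, 5), sharex=True, sharey=True)
for ax, model_name in zip(axes, cate_estimates.columns):
    ax.scatter(teaching_df["true_net_cate"], cate_estimates[model_name], s=6, alpha=0.3)
    ax.axhline(0, color="#111827", linewidth=1)
    ax.axvline(0, color="#111827", linewidth=1)
    ax.set_title(model_name)
    ax.set_xlabel("True net CATE")
axes[0].set_ylabel("Estimated net CATE")
plt.tight_layout()
plt.show()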
What this shows: the zero lines matter for policy. Points in the wrong quadrant represent units where the model would make the wrong treat-or-do-not-treat decision under a threshold rule.
Build Score-Based Policies
This cell turns estimated net CATE scores into policy actions (a sketch follows the list):
threshold policy: treat if estimated net CATE is above zero;
conservative policy: treat if the causal forest lower interval is above zero;
top-k policies: treat the highest-scoring 20 percent under each model;
oracle policies: use true net CATE, available only in simulation.
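A minimal sketch of these conversions. score_policies is a sketch-local dictionary mapping policy names to 0/1 action arrays, and the interval rule assumes the causal_forest fit from the earlier sketch.

# Turn scores into actions.
lb, ub = causal_forest.effect_interval(X, alpha=0.1)  # 90% intervals
score_policies = {}
for model_name in cate_estimates.columns:
    scores = cate_estimates[model_name].to_numpy()
    score_policies[f"{model_name}_threshold"] = (scores > 0).astype(int)
    k = int(0.2 * len(scores))  # budgeted top 20 percent
    action = np.zeros(len(scores), dtype=int)
    action[np.argsort(scores)[::-1][:k]] = 1
    score_policies[f"{model_name}_top20"] = action

# Conservative rule: treat only when the lower interval bound is above zero.
score_policies["causal_forest_conservative"] = (np.asarray(lb).ravel() > 0).astype(int)

# Oracle rules, available only because the simulation exposes true net CATE.
true_scores = teaching_df["true_net_cate"].to_numpy()
score_policies["oracle_threshold"] = (true_scores > 0).astype(int)
oracle_top = np.zeros(len(true_scores), dtype=int)
oracle_top[np.argsort(true_scores)[::-1][: int(0.2 * len(true_scores))]] = 1
score_policies["oracle_top20"] = oracle_top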
What this shows: threshold policies and top-k policies can have very different treatment rates. Budgeted policies are easier to compare because they treat the same share of the population.
Policy Value Function
In this truth-known simulation, policy gain over treating nobody is:
mean(policy_action * true_net_CATE)
This is not available in real observational data. Real policy value requires a credible evaluation design, such as an experiment or off-policy evaluation.
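A sketch of the evaluation, reusing the score_policies dictionary. The simple baselines from the strategy map are added here so every policy is scored the same way, and policy_value_table feeds the plot below.

truth = teaching_df["true_net_cate"].to_numpy()

# Simple baselines (treating nobody has gain 0 by definition).
score_policies["treat_everybody"] = np.ones(len(truth), dtype=int)
score_policies["random_20_percent"] = rng.binomial(1, 0.2, size=len(truth))

policy_value_table = pd.DataFrame(
    [
        {
            "policy": name,
            "treatment_rate": float(np.mean(action)),
            "true_policy_gain_vs_treat_none": float(np.mean(action * truth)),
        }
        for name, action in score_policies.items()
    ]
)
display(policy_value_table.sort_values("true_policy_gain_vs_treat_none", ascending=False))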
What this shows: policy value depends on both ranking and treatment rate. A conservative policy may have high precision among treated units but lower total gain because it treats fewer rows.
Policy Value Plot
This plot ranks score-based policies by true policy gain. The oracle rows are benchmarks, not deployable real-world policies.
plot_policy_values = policy_value_table.sort_values("true_policy_gain_vs_treat_none", ascending=True)
fig, ax = plt.subplots(figsize=(11, 6))
sns.barplot(
    data=plot_policy_values,
    x="true_policy_gain_vs_treat_none",
    y="policy",
    color="#34d399",
    ax=ax,
)
ax.axvline(0, color="#111827", linewidth=1)
ax.set_title("True Policy Gain Versus Treating Nobody")
ax.set_xlabel("Average True Net Gain")
ax.set_ylabel("Policy")
plt.tight_layout()
fig.savefig(FIGURE_DIR / "07_score_policy_value.png", dpi=160, bbox_inches="tight")
plt.show()
What this shows: the best feasible score-based policy should get close to the oracle benchmark while treating a realistic share of the population and avoiding negative-benefit selections.
Budget Curve
Instead of fixing one budget, we can examine policy gain across budget levels. This is useful when treatment capacity is uncertain.
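A sketch of the budget sweep. budget_curve matches the columns used by the plot below, and the random ranking is a sketch-local baseline.

# Policy gain across budget fractions for each ranking score.
rows = []
rankings = {name: cate_estimates[name].to_numpy() for name in cate_estimates.columns}
rankings["oracle"] = teaching_df["true_net_cate"].to_numpy()
rankings["random"] = rng.permutation(len(teaching_df)).astype(float)
for budget in np.arange(0.05, 1.0001, 0.05):
    k = int(budget * len(teaching_df))
    for name, scores in rankings.items():
        action = np.zeros(len(scores), dtype=int)
        action[np.argsort(scores)[::-1][:k]] = 1
        rows.append(
            {
                "policy_score": name,
                "budget_fraction": budget,
                "true_policy_gain": float(np.mean(action * truth)),
            }
        )
budget_curve = pd.DataFrame(rows)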
What this shows: a single top-k number can hide how policies behave as budget changes. Budget curves show whether a ranking remains useful beyond the very top slice.
Budget Curve Plot
The plot shows how true policy gain changes as the treatment budget increases.
fig, ax = plt.subplots(figsize=(10, 5))
sns.lineplot(
    data=budget_curve,
    x="budget_fraction",
    y="true_policy_gain",
    hue="policy_score",
    marker="o",
    linewidth=2,
    ax=ax,
)
ax.set_title("Policy Gain Across Treatment Budgets")
ax.set_xlabel("Budget Fraction Treated")
ax.set_ylabel("Average True Net Gain")
ax.yaxis.set_major_formatter(lambda x, _: f"{x:.3f}")
plt.tight_layout()
fig.savefig(FIGURE_DIR / "07_budget_curve.png", dpi=160, bbox_inches="tight")
plt.show()
What this shows: the oracle curve is the upper bound. A useful model stays clearly above random selection across the budget range where policy decisions are likely to be made.
Fit Direct EconML Policy Learners
EconML also includes direct policy learners. Here we fit:
DRPolicyTree: a shallow, interpretable policy tree;
DRPolicyForest: an ensemble policy model.
Both are fit on observed net outcome, treatment, X, and W. The learned action is a direct recommendation rather than a post-hoc threshold on CATE estimates.
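A sketch of the two fits, reusing Y, T, X, and W from the CATE-fitting sketch. The nuisance models and tree hyperparameters are illustrative.

# Fit the direct policy learners on net outcome.
policy_tree = DRPolicyTree(
    model_regression=RandomForestRegressor(n_estimators=200, random_state=RANDOM_SEED),
    model_propensity=RandomForestClassifier(n_estimators=200, random_state=RANDOM_SEED),
    max_depth=3, min_samples_leaf=50, random_state=RANDOM_SEED,
)
policy_tree.fit(Y, T, X=X, W=W)

policy_forest = DRPolicyForest(
    model_regression=RandomForestRegressor(n_estimators=200, random_state=RANDOM_SEED),
    model_propensity=RandomForestClassifier(n_estimators=200, random_state=RANDOM_SEED),
    n_estimators=200, max_depth=4, min_samples_leaf=50, random_state=RANDOM_SEED,
)
policy_forest.fit(Y, T, X=X, W=W)

# predict(X) returns the recommended action for each row.
tree_action = np.asarray(policy_tree.predict(X)).ravel()
forest_action = np.asarray(policy_forest.predict(X)).ravel()
print("tree treatment rate:  ", tree_action.mean())
print("forest treatment rate:", forest_action.mean())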
What this shows: direct policy learners output actions, not CATE scores. They are attractive when the final object needs to be a decision rule, especially an interpretable tree.
Combined Policy Comparison
Now we compare the strongest score-based rules with direct policy learners in one table.
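A sketch of the combined table, reusing policy_value_table and the direct-learner actions. combined_policy_comparison feeds the plot below.

# One table for score-based rules and direct policy learners.
combined_policy_comparison = pd.concat(
    [
        policy_value_table.assign(policy_type="score-based"),
        pd.DataFrame(
            [
                {"policy": "dr_policy_tree",
                 "treatment_rate": float(tree_action.mean()),
                 "true_policy_gain_vs_treat_none": float(np.mean(tree_action * truth)),
                 "policy_type": "direct learner"},
                {"policy": "dr_policy_forest",
                 "treatment_rate": float(forest_action.mean()),
                 "true_policy_gain_vs_treat_none": float(np.mean(forest_action * truth)),
                 "policy_type": "direct learner"},
            ]
        ),
    ],
    ignore_index=True,
)
display(combined_policy_comparison.sort_values("true_policy_gain_vs_treat_none", ascending=False))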
What this shows: direct policy learners and CATE-threshold policies answer the same decision problem in different ways. Their treatment rates may differ, so both value and action rate should be reported.
Combined Policy Plot
The plot compares score-based policies and direct policy learners by true net gain.
plot_combined = combined_policy_comparison.sort_values("true_policy_gain_vs_treat_none", ascending=True)
fig, ax = plt.subplots(figsize=(11, 6))
sns.barplot(
    data=plot_combined,
    x="true_policy_gain_vs_treat_none",
    y="policy",
    hue="policy_type",
    dodge=False,
    ax=ax,
)
ax.axvline(0, color="#111827", linewidth=1)
ax.set_title("Score-Based Policies Versus Direct Policy Learners")
ax.set_xlabel("Average True Net Gain")
ax.set_ylabel("Policy")
plt.tight_layout()
fig.savefig(FIGURE_DIR / "07_combined_policy_comparison.png", dpi=160, bbox_inches="tight")
plt.show()
What this shows: policy choice is not only about which method is most sophisticated. A simpler rule can be competitive if the CATE score ranks units well.
Policy Feature Importance
Direct policy learners can expose feature importance. This table shows which features the policy tree and policy forest used most when forming decisions.
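A sketch of the importance table. The feature_importances() method on the fitted policy learners is assumed from recent EconML releases; treat the exact signature as an assumption.

# Feature importances from the fitted policy learners (API assumed).
# These describe the fitted rule, not ground truth.
importance_table = pd.DataFrame(
    {
        "feature": effect_modifier_cols,
        "policy_tree": policy_tree.feature_importances(),
        "policy_forest": policy_forest.feature_importances(),
    }
).sort_values("policy_forest", ascending=False)
display(importance_table)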
What this shows: feature importance describes the fitted policy model, not the truth by itself. It helps explain which variables drove action recommendations.
Policy Feature Importance Plot
The plot compares the most important policy features across tree and forest policy learners.
What this shows: the tree and forest may emphasize different features. A shallow tree is easier to explain; a forest can average over more decision patterns.
Policy Tree Visualization
A shallow policy tree is valuable because it can be inspected directly. The next cell plots the learned tree structure.
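A sketch of the visualization. The plot helper on DRPolicyTree and its arguments are assumptions based on recent EconML releases.

# Visualize the learned decision rule (plot helper assumed available).
fig, ax = plt.subplots(figsize=(14, 7))
policy_tree.plot(ax=ax, feature_names=effect_modifier_cols,
                 treatment_names=["no treat", "treat"])
plt.tight_layout()
plt.show()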
What this shows: the tree is an interpretable decision rule. It should still be evaluated by policy value and support; readability alone does not make a policy credible.
Segment-Level Policy Behavior
A policy can have good overall value while concentrating treatment in particular segments. The next table summarizes treatment rates and true gain by segment for several policies.
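A sketch of the segment summary, using high_need_segment as the audited segment and the actions built in earlier sketches:

# Treatment rate and realized true gain by segment for several policies.
seg = teaching_df["high_need_segment"]
segment_rows = []
for name, action in {
    "causal_forest_threshold": score_policies["causal_forest_threshold"],
    "causal_forest_top20": score_policies["causal_forest_top20"],
    "policy_tree": tree_action,
    "policy_forest": forest_action,
}.items():
    for segment_value in sorted(seg.unique()):
        mask = (seg == segment_value).to_numpy()
        segment_rows.append(
            {
                "policy": name,
                "high_need_segment": segment_value,
                "treatment_rate": float(np.mean(np.asarray(action)[mask])),
                "mean_true_gain": float(np.mean(np.asarray(action)[mask] * truth[mask])),
            }
        )
segment_policy_table = pd.DataFrame(segment_rows)
display(segment_policy_table)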
What this shows: segment behavior is part of policy reporting. A policy that gains value by ignoring or over-targeting certain segments may need additional review.
Segment Treatment Rate Plot
This plot compares how often each policy treats each segment.
What this shows: policies encode priorities. Segment treatment-rate plots make those priorities explicit and easier to audit.
Support-Aware Policy Diagnostics
A high-value policy can still be risky if it selects many rows from weak-overlap regions. The next table summarizes propensity and interval width among selected rows.
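A sketch of the support summary, reusing the causal-forest intervals from the score-policy sketch. The 0.1/0.9 propensity cutoffs are illustrative flags for extremeness.

# Support profile of the rows each policy selects: extreme true propensity
# and wide causal-forest intervals flag risky selections.
interval_width = np.asarray(ub).ravel() - np.asarray(lb).ravel()
prop = teaching_df["propensity"].to_numpy()
support_rows = []
for name, action in {
    "causal_forest_threshold": score_policies["causal_forest_threshold"],
    "policy_forest": forest_action,
}.items():
    selected = np.asarray(action).astype(bool)
    support_rows.append(
        {
            "policy": name,
            "selected_rows": int(selected.sum()),
            "share_extreme_propensity": float(
                np.mean((prop[selected] < 0.1) | (prop[selected] > 0.9))
            ),
            "mean_interval_width": float(interval_width[selected].mean()),
        }
    )
display(pd.DataFrame(support_rows))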
What this shows: policy value should be reported alongside support diagnostics. A policy that relies on extreme-propensity rows may need experimental validation before deployment.
Threshold Sensitivity
A zero threshold is natural for net benefit, but analysts may choose a higher threshold to be conservative. This cell evaluates causal-forest threshold policies across several thresholds.
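A sketch of the sweep; the candidate thresholds are illustrative values.

# Evaluate causal-forest threshold policies across candidate thresholds.
cf_scores = cate_estimates["causal_forest"].to_numpy()
threshold_rows = []
for threshold in [0.0, 0.1, 0.2, 0.3, 0.5]:
    action = (cf_scores > threshold).astype(int)
    selected = action.astype(bool)
    threshold_rows.append(
        {
            "threshold": threshold,
            "treatment_rate": float(action.mean()),
            "true_policy_gain": float(np.mean(action * truth)),
            "mean_gain_among_treated": (
                float(truth[selected].mean()) if selected.any() else np.nan
            ),
        }
    )
threshold_sensitivity = pd.DataFrame(threshold_rows)
display(threshold_sensitivity)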
What this shows: raising the threshold usually treats fewer rows with higher average benefit among selected units. The best threshold depends on policy goals, costs, and risk tolerance.
Threshold Sensitivity Plot
The plot shows the tradeoff between treatment rate and true policy gain as the threshold changes.
What this shows: policy thresholds are business and risk decisions, not purely statistical choices. The curve makes the tradeoff visible.
Policy Learning Guidance
This table summarizes when to use different policy approaches.
policy_guidance = pd.DataFrame(
    [
        {
            "situation": "No fixed budget and treatment cost is known",
            "reasonable policy": "Treat if estimated net CATE is above zero",
            "watchout": "Point estimates near zero are fragile; consider uncertainty or a margin.",
        },
        {
            "situation": "Fixed treatment capacity",
            "reasonable policy": "Top-k ranking by estimated net CATE",
            "watchout": "Budget curves should be checked instead of relying on one k value.",
        },
        {
            "situation": "Need an interpretable action rule",
            "reasonable policy": "DRPolicyTree or shallow tree over CATE scores",
            "watchout": "Interpretability can cost value; compare against score-based policies.",
        },
        {
            "situation": "Need stronger predictive action performance",
            "reasonable policy": "DRPolicyForest or flexible CATE ranking",
            "watchout": "The rule may be harder to explain and still needs support checks.",
        },
        {
            "situation": "Offline observational data only",
            "reasonable policy": "Treat learned policy as a candidate for evaluation",
            "watchout": "Real deployment should use experiments or valid off-policy evaluation.",
        },
    ]
)
policy_guidance.to_csv(TABLE_DIR / "07_policy_guidance.csv", index=False)
display(policy_guidance)
| situation | reasonable policy | watchout |
| --- | --- | --- |
| No fixed budget and treatment cost is known | Treat if estimated net CATE is above zero | Point estimates near zero are fragile; consider uncertainty or a margin. |
| Fixed treatment capacity | Top-k ranking by estimated net CATE | Budget curves should be checked instead of relying on one k value. |
| Need an interpretable action rule | DRPolicyTree or shallow tree over CATE scores | Interpretability can cost value; compare against score-based policies. |
| Need stronger predictive action performance | DRPolicyForest or flexible CATE ranking | The rule may be harder to explain and still needs support checks. |
| Offline observational data only | Treat learned policy as a candidate for evaluation | Real deployment should use experiments or valid off-policy evaluation. |
What this shows: policy choice depends on operational constraints. The same CATE model can lead to different action rules under different costs, budgets, and risk tolerances.
Policy Learning Checklist
Before presenting a treatment policy, it is worth checking the items below.
policy_checklist = pd.DataFrame(
    [
        {"check": "Treatment and outcome are clearly defined",
         "why_it_matters": "A policy acts on a specific intervention and optimizes a specific response."},
        {"check": "Treatment cost is included or explicitly justified",
         "why_it_matters": "Positive gross effects can become negative net effects after cost."},
        {"check": "All features are pre-treatment",
         "why_it_matters": "Policy rules must be available before deciding treatment."},
        {"check": "Overlap is inspected",
         "why_it_matters": "Unsupported regions make action recommendations extrapolative."},
        {"check": "CATE ranking quality is evaluated",
         "why_it_matters": "Targeting depends on ranking more than average effect alone."},
        {"check": "Policy value is compared with simple baselines",
         "why_it_matters": "A learned policy should beat treat-none, treat-all, random, and simple top-k rules."},
        {"check": "Treatment rate and budget are reported",
         "why_it_matters": "Policy value depends on how many units are treated."},
        {"check": "Segment-level action rates are audited",
         "why_it_matters": "Overall gain can hide uneven treatment allocation."},
        {"check": "Uncertainty or conservative margins are considered",
         "why_it_matters": "Policies based on noisy effects can over-treat borderline units."},
        {"check": "Deployment requires evaluation",
         "why_it_matters": "Offline policy value from observational data is not enough by itself."},
    ]
)
policy_checklist.to_csv(TABLE_DIR / "07_policy_learning_checklist.csv", index=False)
display(policy_checklist)
| check | why_it_matters |
| --- | --- |
| Treatment and outcome are clearly defined | A policy acts on a specific intervention and optimizes a specific response. |
| Treatment cost is included or explicitly justified | Positive gross effects can become negative net effects after cost. |
| All features are pre-treatment | Policy rules must be available before deciding treatment. |
| Overlap is inspected | Unsupported regions make action recommendations extrapolative. |
| CATE ranking quality is evaluated | Targeting depends on ranking more than average effect alone. |
| Policy value is compared with simple baselines | A learned policy should beat treat-none, treat-all, random, and simple top-k rules. |
| Treatment rate and budget are reported | Policy value depends on how many units are treated. |
| Segment-level action rates are audited | Overall gain can hide uneven treatment allocation. |
| Uncertainty or conservative margins are considered | Policies based on noisy effects can over-treat borderline units. |
| Deployment requires evaluation | Offline policy value from observational data is not enough by itself. |
What this shows: policy learning is a decision workflow, not just a model-fitting exercise. Good reporting includes value, support, uncertainty, and action-distribution diagnostics.
Summary
This notebook turned CATE estimates into treatment policies.
The main takeaways are:
policy learning is about choosing actions, not only estimating effects;
treatment cost should be included when defining net benefit;
threshold policies and budgeted top-k policies answer different operational problems;
direct EconML policy learners can learn action rules without first exposing CATE scores;
policy value should be compared against simple baselines and oracle benchmarks when available;
segment treatment rates, support diagnostics, and uncertainty checks are essential for responsible policy reporting;
real-world deployment requires prospective evaluation or valid off-policy evaluation.
The next tutorial can focus on interpreting CATE models more deeply with feature importance, SHAP-style explanations, and segment-level summaries.