Sensitivity and Final Report

This notebook closes the interference and spillover workflow.

The analysis studied a recommendation setting where items compete for limited slate attention. A lower-ranked focal movie was randomly promoted in simulated MovieLens slates. That promotion helped the focal item, but it also shifted visibility away from other movies. The central question was not only whether the promoted item gained clicks, but whether the full slate gained value after accounting for displaced competitors.

This final notebook packages the work into portfolio-ready artifacts: final effect tables, report figures, and markdown summaries, all saved under the writeup folder.

The main conclusion is that promoted-item gains alone are misleading under interference. In this simulation, the focal item gained engagement, but competitor losses more than offset that gain, producing a negative total slate effect.

1. Environment and Paths

This cell imports the tools used for final reporting and sets up the writeup folders. The notebook searches upward for the processed interference outputs so it can run from either JupyterLab or command-line execution.

from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from IPython.display import display

sns.set_theme(style="whitegrid", context="notebook")
pd.set_option("display.max_columns", 140)
pd.set_option("display.max_rows", 120)
pd.set_option("display.float_format", lambda value: f"{value:,.4f}")

candidate_roots = [Path.cwd(), *Path.cwd().parents]
PROJECT_DIR = next(
    root for root in candidate_roots
    if (root / "data" / "processed" / "movielens_interference_observed_effects.csv").exists()
)

PROCESSED_DIR = PROJECT_DIR / "data" / "processed"
NOTEBOOK_DIR = PROJECT_DIR / "notebooks" / "interference_spillover_effects"
WRITEUP_DIR = NOTEBOOK_DIR / "writeup"
FIGURE_DIR = WRITEUP_DIR / "figures"
TABLE_DIR = WRITEUP_DIR / "tables"
FIGURE_DIR.mkdir(parents=True, exist_ok=True)
TABLE_DIR.mkdir(parents=True, exist_ok=True)

WRITEUP_DIR, FIGURE_DIR, TABLE_DIR
(PosixPath('/home/apex/Documents/ranking_sys/notebooks/projects/project_4_interference_spillover_effects/writeup'),
 PosixPath('/home/apex/Documents/ranking_sys/notebooks/projects/project_4_interference_spillover_effects/writeup/figures'),
 PosixPath('/home/apex/Documents/ranking_sys/notebooks/projects/project_4_interference_spillover_effects/writeup/tables'))

The writeup folders keep final report artifacts colocated with the notebooks. Figures go into writeup/figures, tables go into writeup/tables, and markdown summaries go into the writeup root.

2. Load Final Inputs

This cell loads all outputs needed for the final report. These tables come from the earlier notebooks: setup, exposure mapping, randomized estimation, decomposition, and advanced modeling.

paths = {
    "setup_readiness": PROCESSED_DIR / "movielens_interference_setup_readiness.csv",
    "exposure_readiness": PROCESSED_DIR / "movielens_interference_exposure_readiness.csv",
    "observed_effects": PROCESSED_DIR / "movielens_interference_observed_effects.csv",
    "direct_indirect_total": PROCESSED_DIR / "movielens_interference_direct_indirect_total_effects.csv",
    "additivity_checks": PROCESSED_DIR / "movielens_interference_additivity_checks.csv",
    "spillover_sensitivity": PROCESSED_DIR / "movielens_interference_spillover_definition_sensitivity.csv",
    "product_summary": PROCESSED_DIR / "movielens_interference_product_summary.csv",
    "recommendations": PROCESSED_DIR / "movielens_interference_decomposition_recommendations.csv",
    "advanced_metrics": PROCESSED_DIR / "movielens_interference_advanced_model_metrics.csv",
    "advanced_aipw": PROCESSED_DIR / "movielens_interference_advanced_aipw_estimates.csv",
    "advanced_policy": PROCESSED_DIR / "movielens_interference_advanced_policy_targeting.csv",
    "advanced_takeaways": PROCESSED_DIR / "movielens_interference_advanced_takeaways.csv",
    "advanced_heterogeneity": PROCESSED_DIR / "movielens_interference_advanced_heterogeneity.csv",
    "cluster_risk": PROCESSED_DIR / "movielens_interference_cluster_risk.csv",
    "position_risk": PROCESSED_DIR / "movielens_interference_position_risk.csv",
}

loaded = {name: pd.read_csv(path) for name, path in paths.items()}

load_index = pd.DataFrame(
    {
        "artifact": list(paths.keys()),
        "path": [str(path) for path in paths.values()],
        "rows": [len(loaded[name]) for name in paths],
    }
)

display(load_index)
artifact path rows
0 setup_readiness /home/apex/Documents/ranking_sys/data/processe... 7
1 exposure_readiness /home/apex/Documents/ranking_sys/data/processe... 8
2 observed_effects /home/apex/Documents/ranking_sys/data/processe... 5
3 direct_indirect_total /home/apex/Documents/ranking_sys/data/processe... 12
4 additivity_checks /home/apex/Documents/ranking_sys/data/processe... 3
5 spillover_sensitivity /home/apex/Documents/ranking_sys/data/processe... 5
6 product_summary /home/apex/Documents/ranking_sys/data/processe... 5
7 recommendations /home/apex/Documents/ranking_sys/data/processe... 4
8 advanced_metrics /home/apex/Documents/ranking_sys/data/processe... 4
9 advanced_aipw /home/apex/Documents/ranking_sys/data/processe... 3
10 advanced_policy /home/apex/Documents/ranking_sys/data/processe... 5
11 advanced_takeaways /home/apex/Documents/ranking_sys/data/processe... 4
12 advanced_heterogeneity /home/apex/Documents/ranking_sys/data/processe... 21
13 cluster_risk /home/apex/Documents/ranking_sys/data/processe... 5
14 position_risk /home/apex/Documents/ranking_sys/data/processe... 3

The load index is a compact inventory of the full workflow. If a final table looks surprising, this index makes it easy to trace which upstream notebook produced the source data.

3. Build the Final Main Effects Table

This cell cleans the main randomized-estimator output. It converts estimates to report-friendly units and keeps the key quantities: direct focal effect, spillover effects, and total slate effect.

observed_effects = loaded["observed_effects"].copy()

final_effects = observed_effects[
    [
        "contrast",
        "outcome",
        "estimate",
        "cluster_se",
        "ci_95_lower",
        "ci_95_upper",
        "treated_mean",
        "control_mean",
        "treated_n",
        "control_n",
        "clusters",
    ]
].copy()
final_effects["estimate_per_1000"] = final_effects["estimate"] * 1000
final_effects["ci_95_lower_per_1000"] = final_effects["ci_95_lower"] * 1000
final_effects["ci_95_upper_per_1000"] = final_effects["ci_95_upper"] * 1000
final_effects["report_note"] = np.select(
    [
        final_effects["contrast"].eq("Direct focal item"),
        final_effects["contrast"].str.contains("spillover", case=False, na=False),
        final_effects["contrast"].eq("Total slate"),
    ],
    [
        "Promoted focal item gain.",
        "Competitor or non-focal item displacement.",
        "Net slate-level value after direct and spillover effects.",
    ],
    default="Supporting contrast.",
)

final_effects_output = TABLE_DIR / "final_main_effects.csv"
final_effects.to_csv(final_effects_output, index=False)

display(final_effects)
contrast outcome estimate cluster_se ci_95_lower ci_95_upper treated_mean control_mean treated_n control_n clusters estimate_per_1000 ci_95_lower_per_1000 ci_95_upper_per_1000 report_note
0 Direct focal item Observed simulated click 0.1716 0.0157 0.1409 0.2023 0.3435 0.1719 1505 1495 3000 171.6152 140.9221 202.3083 Promoted focal item gain.
1 Same-cluster competitor spillover Observed simulated click -0.0577 0.0084 -0.0743 -0.0412 0.1522 0.2100 4309 4139 2456 -57.7146 -74.2530 -41.1761 Competitor or non-focal item displacement.
2 Displaced-item spillover Observed simulated click -0.0568 0.0057 -0.0679 -0.0457 0.1818 0.2386 11277 11133 3000 -56.7841 -67.8706 -45.6975 Competitor or non-focal item displacement.
3 All non-focal slate spillover Observed simulated click -0.0436 0.0046 -0.0526 -0.0346 0.1705 0.2141 16555 16445 3000 -43.5851 -52.6154 -34.5549 Competitor or non-focal item displacement.
4 Total slate Observed total simulated clicks -0.3078 0.0538 -0.4132 -0.2024 2.2193 2.5271 1505 1495 3000 -307.8212 -413.2036 -202.4388 Net slate-level value after direct and spillov...

This is the core result table. The focal item gains clicks, but the spillover rows are negative and the total slate effect is negative. That is the central evidence that item-level reporting would overstate product value.

4. Plot the Final Main Effects

This figure turns the main effects table into a clean final-report chart. It uses the original units for each contrast: item-level click-rate differences for item rows and total simulated clicks for the slate-level row.

plot_effects = final_effects.sort_values("estimate").copy()

fig, ax = plt.subplots(figsize=(11, 5.5))
sns.pointplot(
    data=plot_effects,
    x="estimate",
    y="contrast",
    linestyle="none",  # seaborn 0.13+ replacement for the deprecated join=False
    errorbar=None,
    color="tab:blue",
    ax=ax,
)
for y_pos, (_, row) in enumerate(plot_effects.reset_index(drop=True).iterrows()):
    ax.errorbar(
        x=row["estimate"],
        y=y_pos,
        xerr=[[row["estimate"] - row["ci_95_lower"]], [row["ci_95_upper"] - row["estimate"]]],
        fmt="none",
        color="black",
        capsize=3,
    )
ax.axvline(0, color="black", linewidth=1)
ax.set_title("Final Randomized Estimates: Direct, Spillover, and Total Effects")
ax.set_xlabel("Estimated effect")
ax.set_ylabel("")
plt.tight_layout()
main_effects_figure = FIGURE_DIR / "25_final_main_effects.png"
fig.savefig(main_effects_figure, dpi=160, bbox_inches="tight")
plt.show()

The chart makes the contrast visible in one glance. The direct focal effect is positive, but every spillover contrast is negative and the net slate effect is negative. That is the practical signal that attention was reallocated rather than expanded.

5. Final Direct, Indirect, and Total Decomposition

This cell extracts the observed-click decomposition from the formal decomposition notebook. The table reports effects per promoted slate and per 1,000 promoted slates so the numbers can be used directly in a written summary.

decomposition = loaded["direct_indirect_total"].copy()
observed_decomposition = decomposition.query("outcome_type == 'observed_clicks'").copy()
observed_decomposition["component_label"] = observed_decomposition["effect_family"] + ": " + observed_decomposition["component"]

component_order = [
    "Direct: Focal item",
    "Indirect: Same-cluster competitors",
    "Indirect: Other competitors",
    "Total: Full slate",
]
observed_decomposition["component_order"] = observed_decomposition["component_label"].map(
    {label: i for i, label in enumerate(component_order)}
)
observed_decomposition = observed_decomposition.sort_values("component_order")

observed_decomposition_output = TABLE_DIR / "final_direct_indirect_total_decomposition.csv"
observed_decomposition.to_csv(observed_decomposition_output, index=False)

display(observed_decomposition[
    [
        "effect_family",
        "component",
        "estimate_per_slate",
        "ci_95_lower",
        "ci_95_upper",
        "estimate_per_1000_promoted_slates",
        "ci_95_lower_per_1000",
        "ci_95_upper_per_1000",
    ]
])
effect_family component estimate_per_slate ci_95_lower ci_95_upper estimate_per_1000_promoted_slates ci_95_lower_per_1000 ci_95_upper_per_1000
0 Direct Focal item 0.1716 0.1409 0.2023 171.6152 140.8990 202.3315
1 Indirect Same-cluster competitors -0.1454 -0.2015 -0.0892 -145.3905 -201.5497 -89.2313
2 Indirect Other competitors -0.3340 -0.4260 -0.2421 -334.0459 -425.9571 -242.1347
3 Total Full slate -0.3078 -0.4132 -0.2025 -307.8212 -413.1899 -202.4525

This decomposition is the most important final-report table. It shows the accounting: the direct focal gain is positive, but same-cluster and other-competitor losses combine into a larger negative indirect effect, so the full slate loses value.
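The accounting above can be verified directly: the direct and indirect components should sum to the total slate effect, up to rounding in the displayed table. A minimal check, using the per-1,000-promoted-slates estimates copied from the decomposition table:

```python
# Additivity check on the reported decomposition (clicks per 1,000 promoted
# slates, copied from the table above).
direct = 171.6152
indirect_same_cluster = -145.3905
indirect_other = -334.0459
total = -307.8212

# The implied total from the components should match the directly
# estimated total slate effect up to display rounding.
implied_total = direct + indirect_same_cluster + indirect_other
assert abs(implied_total - total) < 0.01
print(f"implied total: {implied_total:,.4f} vs estimated: {total:,.4f}")
```

This mirrors the additivity checks from the decomposition notebook: the components reconcile exactly at the displayed precision.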

6. Plot the Final Decomposition

This figure shows the same decomposition in clicks per 1,000 promoted slates. Positive bars are gains, negative bars are losses, and the total bar summarizes the net product effect.

fig, ax = plt.subplots(figsize=(11, 5.5))
colors = ["tab:green" if value >= 0 else "tab:red" for value in observed_decomposition["estimate_per_1000_promoted_slates"]]
sns.barplot(
    data=observed_decomposition,
    x="estimate_per_1000_promoted_slates",
    y="component_label",
    order=component_order,
    hue="component_label",
    palette=dict(zip(observed_decomposition["component_label"], colors)),
    legend=False,
    ax=ax,
)
for y_pos, (_, row) in enumerate(observed_decomposition.reset_index(drop=True).iterrows()):
    ax.errorbar(
        x=row["estimate_per_1000_promoted_slates"],
        y=y_pos,
        xerr=[
            [row["estimate_per_1000_promoted_slates"] - row["ci_95_lower_per_1000"]],
            [row["ci_95_upper_per_1000"] - row["estimate_per_1000_promoted_slates"]],
        ],
        fmt="none",
        color="black",
        capsize=3,
    )
ax.axvline(0, color="black", linewidth=1)
ax.set_title("Final Decomposition per 1,000 Promoted Slates")
ax.set_xlabel("Change in simulated clicks per 1,000 promoted slates")
ax.set_ylabel("")
plt.tight_layout()
decomposition_figure = FIGURE_DIR / "26_final_direct_indirect_total_decomposition.png"
fig.savefig(decomposition_figure, dpi=160, bbox_inches="tight")
plt.show()

The figure is the clearest final visual for the project. It shows why the promoted item’s gain is not sufficient for product decision-making: the competitors lose more than the focal item gains.

7. Sensitivity Across Spillover Definitions

This cell packages the spillover sensitivity analysis. The goal is to check whether the spillover conclusion depends on one narrow definition of competition.

spillover_sensitivity = loaded["spillover_sensitivity"].copy()
spillover_sensitivity["estimate_per_1000_rows"] = spillover_sensitivity["estimate"] * 1000
spillover_sensitivity["ci_95_lower_per_1000_rows"] = spillover_sensitivity["ci_95_lower"] * 1000
spillover_sensitivity["ci_95_upper_per_1000_rows"] = spillover_sensitivity["ci_95_upper"] * 1000
spillover_sensitivity["direction"] = np.where(spillover_sensitivity["estimate"] < 0, "negative", "positive")

spillover_sensitivity_output = TABLE_DIR / "final_spillover_definition_sensitivity.csv"
spillover_sensitivity.to_csv(spillover_sensitivity_output, index=False)

display(spillover_sensitivity)
definition estimate cluster_se ci_95_lower ci_95_upper treated_mean control_mean treated_n control_n clusters description estimate_per_1000_rows ci_95_lower_per_1000_rows ci_95_upper_per_1000_rows direction
0 All non-focal items -0.0436 0.0046 -0.0526 -0.0346 0.1705 0.2141 16555 16445 3000 Every non-focal item in the slate. -43.5851 -52.6154 -34.5549 negative
1 Same-cluster competitors -0.0577 0.0084 -0.0743 -0.0412 0.1522 0.2100 4309 4139 2456 Non-focal items with the same primary-genre cl... -57.7146 -74.2530 -41.1761 negative
2 Displaced items -0.0568 0.0057 -0.0679 -0.0457 0.1818 0.2386 11277 11133 3000 Non-focal items above the focal item that shif... -56.7841 -67.8706 -45.6975 negative
3 Same-cluster displaced substitutes -0.0657 0.0104 -0.0860 -0.0454 0.1610 0.2268 2937 2809 2182 Same-cluster competitors that are also mechani... -65.7224 -86.0311 -45.4137 negative
4 Near-position displaced items -0.0330 0.0082 -0.0491 -0.0168 0.1604 0.1933 4515 4485 3000 Displaced items within three positions above t... -32.9567 -49.1192 -16.7941 negative

Every tested spillover definition is negative in this simulation. That makes the displacement conclusion stronger: it is not just an artifact of using one particular competitor definition.
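The claim can be stated as a concrete check: every definition should have a negative point estimate and a 95% CI whose upper bound sits below zero. A small sketch with the estimates and CI bounds copied from the sensitivity table above:

```python
import pandas as pd

# Sign-consistency check across spillover definitions (values copied from
# the sensitivity table above).
sensitivity = pd.DataFrame({
    "definition": [
        "All non-focal items",
        "Same-cluster competitors",
        "Displaced items",
        "Same-cluster displaced substitutes",
        "Near-position displaced items",
    ],
    "estimate": [-0.0436, -0.0577, -0.0568, -0.0657, -0.0330],
    "ci_95_upper": [-0.0346, -0.0412, -0.0457, -0.0454, -0.0168],
})

# Every point estimate is negative and every 95% CI excludes zero.
assert (sensitivity["estimate"] < 0).all()
assert (sensitivity["ci_95_upper"] < 0).all()
print("All spillover definitions are negative with CIs excluding zero.")
```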

8. Plot Spillover Sensitivity

This figure compares spillover definitions with confidence intervals. It helps show which competitor definitions are broad and which are stricter.

spillover_plot = spillover_sensitivity.sort_values("estimate").copy()

fig, ax = plt.subplots(figsize=(11, 5.5))
sns.pointplot(
    data=spillover_plot,
    x="estimate",
    y="definition",
    linestyle="none",  # seaborn 0.13+ replacement for the deprecated join=False
    errorbar=None,
    color="tab:purple",
    ax=ax,
)
for y_pos, (_, row) in enumerate(spillover_plot.reset_index(drop=True).iterrows()):
    ax.errorbar(
        x=row["estimate"],
        y=y_pos,
        xerr=[[row["estimate"] - row["ci_95_lower"]], [row["ci_95_upper"] - row["estimate"]]],
        fmt="none",
        color="black",
        capsize=3,
    )
ax.axvline(0, color="black", linewidth=1)
ax.set_title("Final Sensitivity: Spillover Definition")
ax.set_xlabel("Estimated click-rate effect")
ax.set_ylabel("")
plt.tight_layout()
spillover_figure = FIGURE_DIR / "27_final_spillover_sensitivity.png"
fig.savefig(spillover_figure, dpi=160, bbox_inches="tight")
plt.show()

The sensitivity figure supports a cautious but clear claim: under several plausible definitions of competitor exposure, spillover effects are negative. That gives the final recommendation more weight.

9. Advanced Modeling Summary

This cell packages the advanced modeling results: outcome-model metrics, model-assisted AIPW estimates, counterfactual targeting, and the final takeaways. These results show that flexible models can help target safer promotions, but they do not overturn the randomized finding.

advanced_metrics = loaded["advanced_metrics"].copy()
advanced_aipw = loaded["advanced_aipw"].copy()
advanced_policy = loaded["advanced_policy"].copy()
advanced_takeaways = loaded["advanced_takeaways"].copy()

advanced_summary = advanced_takeaways.assign(section="takeaway")

advanced_metrics.to_csv(TABLE_DIR / "final_advanced_model_metrics.csv", index=False)
advanced_aipw.to_csv(TABLE_DIR / "final_advanced_aipw_estimates.csv", index=False)
advanced_policy.to_csv(TABLE_DIR / "final_advanced_policy_targeting.csv", index=False)
advanced_takeaways.to_csv(TABLE_DIR / "final_advanced_takeaways.csv", index=False)

print("Advanced model metrics")
display(advanced_metrics)
print("Model-assisted estimates")
display(advanced_aipw)
print("Policy targeting")
display(advanced_policy)
print("Advanced takeaways")
display(advanced_takeaways)
Advanced model metrics
model split rmse mae r2
0 LightGBM train 0.8958 0.7144 0.6281
1 LightGBM test 1.4425 1.1341 0.0813
2 XGBoost train 1.1111 0.8913 0.4278
3 XGBoost test 1.4054 1.1033 0.1280
Model-assisted estimates
estimator estimate se ci_95_lower ci_95_upper reference
0 Randomized difference in means -0.3078 NaN NaN NaN Observed total simulated clicks
1 Cross-fitted LightGBM AIPW -0.3036 0.0505 -0.4024 -0.2047 Observed total simulated clicks
2 Oracle mean promotion lift -0.3158 0.0019 -0.3195 -0.3120 Known simulation lift
Policy targeting
policy selected_slates coverage mean_predicted_lift_selected mean_oracle_lift_selected oracle_lift_per_1000_selected_slates oracle_lift_per_1000_eligible_slates
0 Promote all eligible slates 3000 1.0000 -0.2388 -0.3158 -315.7549 -315.7549
1 Promote top 50% by predicted net lift 1500 0.5000 -0.1648 -0.3091 -309.0693 -154.5346
2 Promote top 25% by predicted net lift 750 0.2500 -0.1213 -0.3065 -306.4754 -76.6188
3 Promote only predicted-positive slates 30 0.0100 0.0901 -0.3654 -365.3709 -3.6537
4 Oracle top 25% benchmark 750 0.2500 -0.2284 -0.1953 -195.3376 -48.8344
Advanced takeaways
area finding why_it_matters
0 Outcome modeling XGBoost had the best held-out RMSE at 1.405 cl... Flexible models can summarize how slate compos...
1 Conditional effects XGBoost had CATE RMSE 0.161 versus the oracle ... The model can be used to rank slates by predic...
2 Model-assisted estimation Cross-fitted AIPW estimated -0.304 total click... Model-assisted estimates should agree with the...
3 Policy targeting The best evaluated policy was 'Promote only pr... Targeting can reduce displacement harm compare...

The advanced-model story is nuanced. XGBoost had the best held-out prediction performance, and the model-assisted AIPW estimate closely matched the randomized estimator. Targeting reduced harm mainly by promoting fewer slates, which reinforces the product lesson: promotions should be gated by expected net slate value.
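To make the AIPW agreement concrete, here is a minimal sketch of an AIPW estimator under a known randomized design (assignment probability e = 0.5). This is illustrative synthetic data with simple per-arm linear outcome models, not the project's cross-fitted LightGBM pipeline on the MovieLens simulation:

```python
import numpy as np

# Minimal AIPW sketch under randomization (e = 0.5). Synthetic data: the
# true average treatment effect is set to -0.3 by construction.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)                    # a slate-level covariate
t = rng.binomial(1, 0.5, size=n)          # randomized promotion indicator
y = 2.5 + 0.5 * x - 0.3 * t + rng.normal(scale=0.5, size=n)

e = 0.5                                   # known propensity from the design
# Stand-in outcome models: one linear fit per arm (a real pipeline would
# cross-fit a flexible learner such as LightGBM here).
b1 = np.polyfit(x[t == 1], y[t == 1], 1)
b0 = np.polyfit(x[t == 0], y[t == 0], 1)
mu1 = np.polyval(b1, x)
mu0 = np.polyval(b0, x)

# AIPW pseudo-outcome: outcome-model contrast plus inverse-propensity
# corrections using each arm's residuals.
psi = (mu1 - mu0) + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)
ate = psi.mean()
se = psi.std(ddof=1) / np.sqrt(n)
print(f"AIPW ATE: {ate:.3f} (se {se:.3f})")
```

Under randomization the propensity is known, so the AIPW estimate is centered on the same quantity as the difference in means; the outcome model only tightens the standard error. That is why the cross-fitted AIPW estimate (-0.304) lands so close to the randomized estimate (-0.308).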

10. Plot Advanced Policy Targeting

This figure compares policy rules using oracle expected lift per 1,000 eligible slates. It rewards policies for selecting better slates but also accounts for how many slates they promote.

policy_plot = advanced_policy.sort_values("oracle_lift_per_1000_eligible_slates").copy()

fig, ax = plt.subplots(figsize=(11, 5.5))
colors = ["tab:green" if value >= 0 else "tab:red" for value in policy_plot["oracle_lift_per_1000_eligible_slates"]]
sns.barplot(
    data=policy_plot,
    x="oracle_lift_per_1000_eligible_slates",
    y="policy",
    hue="policy",
    palette=dict(zip(policy_plot["policy"], colors)),
    legend=False,
    ax=ax,
)
ax.axvline(0, color="black", linewidth=1)
ax.set_title("Final Advanced Modeling Check: Targeted Promotion Policies")
ax.set_xlabel("Oracle lift per 1,000 eligible slates")
ax.set_ylabel("")
plt.tight_layout()
policy_figure = FIGURE_DIR / "28_final_policy_targeting.png"
fig.savefig(policy_figure, dpi=160, bbox_inches="tight")
plt.show()

The targeting figure shows that model-based selection can reduce harm compared with promoting every eligible slate. It also shows that the simulation contains few broadly positive promotion opportunities, so restraint is part of the recommended policy.
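The two oracle-lift columns in the policy table are linked by a simple identity: lift per 1,000 eligible slates equals lift per 1,000 selected slates times coverage. A quick check with the values copied from the policy-targeting table above:

```python
# Coverage identity check: per-eligible lift = per-selected lift x coverage.
# Rows copied from the policy-targeting table above.
rows = [
    # (coverage, lift_per_1000_selected, lift_per_1000_eligible)
    (1.0000, -315.7549, -315.7549),  # promote all eligible slates
    (0.5000, -309.0693, -154.5346),  # top 50% by predicted net lift
    (0.2500, -306.4754, -76.6188),   # top 25% by predicted net lift
    (0.0100, -365.3709, -3.6537),    # predicted-positive slates only
    (0.2500, -195.3376, -48.8344),   # oracle top 25% benchmark
]
for coverage, per_selected, per_eligible in rows:
    assert abs(per_selected * coverage - per_eligible) < 0.01
print("Coverage identity holds for every policy row.")
```

The identity makes the restraint story explicit: the predicted-positive policy still picks harmful slates on average (-365.4 per 1,000 selected), but its 1% coverage shrinks the aggregate harm to -3.7 per 1,000 eligible slates.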

11. Limitations Table

This cell writes a limitations table. The project is methodologically useful, but the final report should be explicit that MovieLens ratings are not real exposure logs and that the promotion process is simulated.

limitations = pd.DataFrame(
    [
        {
            "limitation": "MovieLens contains ratings, not true recommendation impressions.",
            "impact": "The analysis demonstrates an interference workflow but does not estimate real production promotion effects.",
            "mitigation": "Use real slate impression logs or run a randomized slate experiment when available.",
        },
        {
            "limitation": "Promotion assignment and outcomes are simulated.",
            "impact": "The numerical estimates reflect the assumed data-generating process.",
            "mitigation": "Treat the results as a methodology demonstration and validate with online experiments.",
        },
        {
            "limitation": "Genres are coarse substitute clusters.",
            "impact": "Some substitutes may be missed and some same-genre items may not actually compete.",
            "mitigation": "Use learned item embeddings, co-watch patterns, or richer metadata for production analysis.",
        },
        {
            "limitation": "Only one promotion design is simulated.",
            "impact": "Different promotion probabilities, slate sizes, or ranking policies could change the net effect.",
            "mitigation": "Run sensitivity analyses over assignment rules and promotion intensities.",
        },
        {
            "limitation": "Advanced models are evaluated against simulated oracle lift.",
            "impact": "Policy targeting results depend on the simulated outcome mechanism.",
            "mitigation": "Use randomized or valid off-policy evaluation before deploying targeted promotion rules.",
        },
    ]
)

limitations_output = TABLE_DIR / "final_limitations.csv"
limitations.to_csv(limitations_output, index=False)
display(limitations)
limitation impact mitigation
0 MovieLens contains ratings, not true recommend... The analysis demonstrates an interference work... Use real slate impression logs or run a random...
1 Promotion assignment and outcomes are simulated. The numerical estimates reflect the assumed da... Treat the results as a methodology demonstrati...
2 Genres are coarse substitute clusters. Some substitutes may be missed and some same-g... Use learned item embeddings, co-watch patterns...
3 Only one promotion design is simulated. Different promotion probabilities, slate sizes... Run sensitivity analyses over assignment rules...
4 Advanced models are evaluated against simulate... Policy targeting results depend on the simulat... Use randomized or valid off-policy evaluation ...

The limitations make the final report more credible. They separate the transferable causal workflow from the specific numeric simulation results, which should not be presented as real-world platform lift.

12. Final Recommendations

This cell combines the earlier decomposition recommendations with final-report wording. The recommendations are framed as decision rules for recommendation systems where items compete for attention.

recommendations = loaded["recommendations"].copy()
final_recommendations = pd.concat(
    [
        recommendations,
        pd.DataFrame(
            [
                {
                    "decision_area": "Experiment design",
                    "recommendation": "Randomize at the slate, cluster, or market level when interference is plausible.",
                    "evidence": "Item-level treatment assignment can miss displacement across neighboring items.",
                },
                {
                    "decision_area": "Advanced targeting",
                    "recommendation": "Use ML targeting only after validating net slate value against a randomized or off-policy benchmark.",
                    "evidence": "The model-assisted AIPW estimate matched the randomized estimate, while policy targeting mainly helped by avoiding many harmful promotions.",
                },
            ]
        ),
    ],
    ignore_index=True,
)

recommendations_output = TABLE_DIR / "final_recommendations.csv"
final_recommendations.to_csv(recommendations_output, index=False)
display(final_recommendations)
decision_area recommendation evidence
0 Item-level reporting Do not report promoted-item gain alone as the ... The focal item gains 171.6 clicks per 1,000 pr...
1 Slate-level metric Use total slate effect as the primary decision... The net observed slate effect is -307.8 simula...
2 Spillover monitoring Track same-cluster and displaced-item outcomes... Competitor movement more than offsets the dire...
3 Policy design Treat large rank jumps as higher-risk interven... Promotion changes final positions for multiple...
4 Experiment design Randomize at the slate, cluster, or market lev... Item-level treatment assignment can miss displ...
5 Advanced targeting Use ML targeting only after validating net sla... The model-assisted AIPW estimate matched the r...

The recommendations are deliberately operational. The final answer is not simply “promotion is bad.” The stronger lesson is that promotion decisions need slate-level metrics and spillover monitoring whenever items compete for scarce attention.

13. Final Executive Summary and Resume Bullets

This cell writes two markdown files: a final project summary and concise resume bullets. The summary is suitable for a portfolio writeup, while the bullets are written in a compact resume style.

direct_gain = final_effects.query("contrast == 'Direct focal item'")["estimate"].iloc[0]
total_effect = final_effects.query("contrast == 'Total slate'")["estimate"].iloc[0]
same_cluster_effect = final_effects.query("contrast == 'Same-cluster competitor spillover'")["estimate"].iloc[0]
all_non_focal_effect = final_effects.query("contrast == 'All non-focal slate spillover'")["estimate"].iloc[0]

direct_per_1000 = observed_decomposition.query("effect_family == 'Direct'")["estimate_per_1000_promoted_slates"].iloc[0]
same_cluster_per_1000 = observed_decomposition.query("component == 'Same-cluster competitors'")["estimate_per_1000_promoted_slates"].iloc[0]
other_per_1000 = observed_decomposition.query("component == 'Other competitors'")["estimate_per_1000_promoted_slates"].iloc[0]
total_per_1000 = observed_decomposition.query("effect_family == 'Total'")["estimate_per_1000_promoted_slates"].iloc[0]

aipw_estimate = advanced_aipw.query("estimator == 'Cross-fitted LightGBM AIPW'")["estimate"].iloc[0]
best_model_row = advanced_metrics.query("split == 'test'").sort_values("rmse").iloc[0]
best_policy_row = advanced_policy.sort_values("oracle_lift_per_1000_eligible_slates", ascending=False).iloc[0]

summary_text = f"""# Final Summary: Interference and Spillover Effects in Recommendation Slates

## Question

This analysis studies what happens when a lower-ranked item is promoted inside a recommendation slate. The causal issue is interference: promoting one item can affect other items because slate attention is limited.

## Data and Design

MovieLens 32M was used as a realistic user-item preference dataset. Since MovieLens does not contain real impression logs or promotion assignments, the workflow constructed simulated recommendation slates, selected focal items, randomized focal promotion at the slate level, and generated outcomes from an explicit competition model.

## Main Finding

The promoted focal item gained engagement, but the full slate lost value after competitor displacement was included.

- Direct focal item effect: {direct_gain:.3f} simulated click-rate lift.
- Same-cluster competitor spillover: {same_cluster_effect:.3f}.
- All non-focal spillover: {all_non_focal_effect:.3f}.
- Total slate effect: {total_effect:.3f} simulated clicks per slate.

In product units, the focal item gained {direct_per_1000:,.1f} simulated clicks per 1,000 promoted slates, while same-cluster competitors changed by {same_cluster_per_1000:,.1f} and other competitors changed by {other_per_1000:,.1f}. The net total slate effect was {total_per_1000:,.1f} simulated clicks per 1,000 promoted slates.

## Advanced Modeling

Flexible outcome models were used to predict slate-level outcomes and estimate conditional net promotion effects. {best_model_row['model']} had the best held-out RMSE at {best_model_row['rmse']:.3f} clicks per slate. A cross-fitted LightGBM AIPW estimate was {aipw_estimate:.3f}, close to the randomized total-effect estimate of {total_effect:.3f}. The best evaluated targeting rule was `{best_policy_row['policy']}`, with {best_policy_row['oracle_lift_per_1000_eligible_slates']:.1f} oracle lift per 1,000 eligible slates.

## Recommendation

Do not evaluate promotion policies using promoted-item gains alone. When items compete for visibility, report slate-level total effects and monitor substitute or displaced-item spillovers. Advanced models can help target safer promotions, but they should be validated against randomized or valid off-policy benchmarks.

## Limitations

The numerical results are from a simulation built on MovieLens ratings, not a real production experiment. Genres are coarse substitute clusters, and the outcome mechanism is assumed. The value of the work is the transferable causal workflow: define interference, randomize at the right level, decompose direct and indirect effects, and evaluate policies by net slate value.
"""

resume_bullets = f"""# Resume Bullets: Interference and Spillover Effects

- Built an end-to-end causal inference workflow for recommendation slate interference using MovieLens 32M, simulated randomized promotions, and slate-level outcome construction.
- Estimated direct, spillover, and total effects under item competition; found a +{direct_per_1000:,.1f} focal-click gain but a {total_per_1000:,.1f} net slate-click change per 1,000 promoted slates after competitor displacement.
- Implemented slate-clustered estimators, bootstrap checks, direct/indirect/total decomposition, and sensitivity analyses across same-cluster, displaced-item, and all-non-focal spillover definitions.
- Trained LightGBM and XGBoost outcome models for conditional net-effect prediction, compared model-assisted AIPW against randomized estimates, and evaluated targeted promotion policies.
"""

summary_path = WRITEUP_DIR / "final_project_summary.md"
resume_path = WRITEUP_DIR / "resume_bullets.md"
summary_path.write_text(summary_text)
resume_path.write_text(resume_bullets)

print(summary_text)
print("\nSaved markdown files:")
print(summary_path)
print(resume_path)
# Final Summary: Interference and Spillover Effects in Recommendation Slates

## Question

This analysis studies what happens when a lower-ranked item is promoted inside a recommendation slate. The causal issue is interference: promoting one item can affect other items because slate attention is limited.

## Data and Design

MovieLens 32M was used as a realistic user-item preference dataset. Since MovieLens does not contain real impression logs or promotion assignments, the workflow constructed simulated recommendation slates, selected focal items, randomized focal promotion at the slate level, and generated outcomes from an explicit competition model.

## Main Finding

The promoted focal item gained engagement, but the full slate lost value after competitor displacement was included.

- Direct focal item effect: 0.172 simulated click-rate lift.
- Same-cluster competitor spillover: -0.058.
- All non-focal spillover: -0.044.
- Total slate effect: -0.308 simulated clicks per slate.

In product units, the focal item gained 171.6 simulated clicks per 1,000 promoted slates, while same-cluster competitors changed by -145.4 and other competitors changed by -334.0. The net total slate effect was -307.8 simulated clicks per 1,000 promoted slates.

## Advanced Modeling

Flexible outcome models were used to predict slate-level outcomes and estimate conditional net promotion effects. XGBoost had the best held-out RMSE at 1.405 clicks per slate. A cross-fitted LightGBM AIPW estimate was -0.304, close to the randomized total-effect estimate of -0.308. The best evaluated targeting rule was `Promote only predicted-positive slates`, with -3.7 oracle lift per 1,000 eligible slates.

## Recommendation

Do not evaluate promotion policies using promoted-item gains alone. When items compete for visibility, report slate-level total effects and monitor substitute or displaced-item spillovers. Advanced models can help target safer promotions, but they should be validated against randomized or valid off-policy benchmarks.

## Limitations

The numerical results are from a simulation built on MovieLens ratings, not a real production experiment. Genres are coarse substitute clusters, and the outcome mechanism is assumed. The value of the work is the transferable causal workflow: define interference, randomize at the right level, decompose direct and indirect effects, and evaluate policies by net slate value.


Saved markdown files:
/home/apex/Documents/ranking_sys/notebooks/projects/project_4_interference_spillover_effects/writeup/final_project_summary.md
/home/apex/Documents/ranking_sys/notebooks/projects/project_4_interference_spillover_effects/writeup/resume_bullets.md

The markdown outputs are the primary portfolio artifacts. The summary explains the causal story in plain language, while the resume bullets compress the work into concrete technical accomplishments.
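The write-then-verify pattern behind these artifacts can be sketched in isolation. This is a minimal stand-in, not the notebook's exact cell: a temporary directory replaces `WRITEUP_DIR`, and the summary text is a placeholder.

```python
import tempfile
from pathlib import Path

# Write a markdown artifact, then read it back to confirm the round trip
# before treating the file as a deliverable. Directory and text are
# placeholders for WRITEUP_DIR and the notebook's summary_text.
with tempfile.TemporaryDirectory() as tmp:
    summary_path = Path(tmp) / "final_project_summary.md"
    summary_text = "# Final Summary\n\nTotal slate effect: -0.308 clicks per slate.\n"
    summary_path.write_text(summary_text, encoding="utf-8")
    assert summary_path.read_text(encoding="utf-8") == summary_text
    print(f"wrote {summary_path.name} ({summary_path.stat().st_size} bytes)")
```

Reading the file back immediately after writing is cheap insurance that the artifact on disk matches the text the notebook rendered.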

14. Artifact Index

This final cell writes an index of the key tables, figures, and markdown files. With the index, the polished outputs can be located without opening every notebook.

selected_figures = [
    FIGURE_DIR / "25_final_main_effects.png",
    FIGURE_DIR / "26_final_direct_indirect_total_decomposition.png",
    FIGURE_DIR / "27_final_spillover_sensitivity.png",
    FIGURE_DIR / "28_final_policy_targeting.png",
    FIGURE_DIR / "21_advanced_feature_importance.png",
    FIGURE_DIR / "24_advanced_heterogeneity_segments.png",
]

selected_tables = [
    TABLE_DIR / "final_main_effects.csv",
    TABLE_DIR / "final_direct_indirect_total_decomposition.csv",
    TABLE_DIR / "final_spillover_definition_sensitivity.csv",
    TABLE_DIR / "final_advanced_model_metrics.csv",
    TABLE_DIR / "final_advanced_aipw_estimates.csv",
    TABLE_DIR / "final_advanced_policy_targeting.csv",
    TABLE_DIR / "final_advanced_takeaways.csv",
    TABLE_DIR / "final_limitations.csv",
    TABLE_DIR / "final_recommendations.csv",
]

selected_markdown = [
    WRITEUP_DIR / "final_project_summary.md",
    WRITEUP_DIR / "resume_bullets.md",
]

artifact_rows = []
for artifact_type, files in [
    ("figure", selected_figures),
    ("table", selected_tables),
    ("markdown", selected_markdown),
]:
    for path in files:
        artifact_rows.append(
            {
                "artifact_type": artifact_type,
                "file_name": path.name,
                "path": str(path),
                "exists": path.exists(),
            }
        )

artifact_index = pd.DataFrame(artifact_rows)
artifact_index_output = TABLE_DIR / "artifact_index.csv"
artifact_index.to_csv(artifact_index_output, index=False)

display(artifact_index)
print(f"Artifact index saved to: {artifact_index_output}")
artifact_type file_name path exists
0 figure 25_final_main_effects.png /home/apex/Documents/ranking_sys/notebooks/int... True
1 figure 26_final_direct_indirect_total_decomposition.png /home/apex/Documents/ranking_sys/notebooks/int... True
2 figure 27_final_spillover_sensitivity.png /home/apex/Documents/ranking_sys/notebooks/int... True
3 figure 28_final_policy_targeting.png /home/apex/Documents/ranking_sys/notebooks/int... True
4 figure 21_advanced_feature_importance.png /home/apex/Documents/ranking_sys/notebooks/int... True
5 figure 24_advanced_heterogeneity_segments.png /home/apex/Documents/ranking_sys/notebooks/int... True
6 table final_main_effects.csv /home/apex/Documents/ranking_sys/notebooks/int... True
7 table final_direct_indirect_total_decomposition.csv /home/apex/Documents/ranking_sys/notebooks/int... True
8 table final_spillover_definition_sensitivity.csv /home/apex/Documents/ranking_sys/notebooks/int... True
9 table final_advanced_model_metrics.csv /home/apex/Documents/ranking_sys/notebooks/int... True
10 table final_advanced_aipw_estimates.csv /home/apex/Documents/ranking_sys/notebooks/int... True
11 table final_advanced_policy_targeting.csv /home/apex/Documents/ranking_sys/notebooks/int... True
12 table final_advanced_takeaways.csv /home/apex/Documents/ranking_sys/notebooks/int... True
13 table final_limitations.csv /home/apex/Documents/ranking_sys/notebooks/int... True
14 table final_recommendations.csv /home/apex/Documents/ranking_sys/notebooks/int... True
15 markdown final_project_summary.md /home/apex/Documents/ranking_sys/notebooks/int... True
16 markdown resume_bullets.md /home/apex/Documents/ranking_sys/notebooks/int... True
Artifact index saved to: /home/apex/Documents/ranking_sys/notebooks/projects/project_4_interference_spillover_effects/writeup/tables/artifact_index.csv

The artifact index closes the workflow. The final report can now be reviewed from the notebook, the markdown summary, or the saved tables and figures in the writeup folder.
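A reviewer consuming the index only needs pandas. The sketch below uses an inlined stand-in CSV (hypothetical rows) in place of `TABLE_DIR / "artifact_index.csv"` and flags any artifact whose `exists` column is False:

```python
from io import StringIO

import pandas as pd

# Stand-in for the saved artifact_index.csv; in the project this would be
# pd.read_csv(TABLE_DIR / "artifact_index.csv").
csv_text = """artifact_type,file_name,path,exists
figure,25_final_main_effects.png,writeup/figures/25_final_main_effects.png,True
markdown,final_project_summary.md,writeup/final_project_summary.md,True
"""
index = pd.read_csv(StringIO(csv_text))

# Any row with exists == False points at an artifact that needs regenerating.
missing = index.loc[~index["exists"], "file_name"].tolist()
print(f"{len(index)} artifacts indexed, {len(missing)} missing: {missing}")
```

Because `read_csv` parses the `True`/`False` strings into a boolean column, the `~index["exists"]` mask works directly without any string comparison.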

Final Takeaway

This workflow demonstrates why interference matters in recommendation systems. A promotion can look successful at the item level while hurting the slate after competitor displacement is counted. The strongest final recommendation is to evaluate promotion policies with slate-level total effects, not just promoted-item outcomes.
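The decomposition quoted in the summary can be checked directly: the per-1,000-slate components reported above sum to the net total, which is exactly the item-level-versus-slate-level contrast this takeaway describes.

```python
# Per-1,000-promoted-slate components reported in the final summary.
direct_focal = 171.6      # focal-item click gain
same_cluster = -145.4     # same-cluster competitor displacement
other_non_focal = -334.0  # displacement among remaining non-focal items

# The item-level gain is positive, but the slate-level total is negative.
total = direct_focal + same_cluster + other_non_focal
print(f"net slate effect per 1,000 promoted slates: {total:+.1f}")  # -307.8
```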