02. Pricing and Promotions

Pricing is one of the highest-leverage causal inference problems in industry.

A small change in price can affect demand, revenue, margin, inventory, customer acquisition, retention, and brand perception. Promotions add another layer: a discount can increase purchases today while reducing full-price demand tomorrow.

This notebook treats pricing and promotions as causal decision problems. We will use simulations to study:

why naive price elasticity estimates can be badly biased,
how randomized price variation and instruments help identify demand response,
why revenue lift is not the same as profit lift,
how discount depth changes customer behavior,
how promotions can create stockpiling and post-promotion dips,
how to translate estimates into a pricing recommendation.

Learning Goals

By the end of this notebook, you should be able to:

Define causal price elasticity and distinguish it from predictive demand modeling.
Explain why observed prices are often endogenous.
Estimate elasticity from observational panel data and diagnose bias.
Use instrumental-variable logic for pricing when valid cost shocks are available.
Analyze a randomized discount-depth experiment.
Evaluate promotions using contribution profit, not only revenue.
Detect pull-forward and stockpiling using distributed-lag effects.
Build a professional pricing readout with assumptions, estimates, risks, and decision guidance.

1. Setup

We will use pandas, statsmodels, linearmodels, scipy, seaborn, and Graphviz.

import warnings

warnings.filterwarnings("ignore")

from IPython.display import Markdown, display
from graphviz import Digraph
from linearmodels.iv import IV2SLS
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.special import expit
from scipy.stats import norm
import seaborn as sns
import statsmodels.formula.api as smf


rng = np.random.default_rng(20260430)

sns.set_theme(style="whitegrid", context="notebook")
plt.rcParams["figure.figsize"] = (10, 5)
plt.rcParams["axes.spines.top"] = False
plt.rcParams["axes.spines.right"] = False
plt.rcParams["figure.dpi"] = 130


def dollars(x, digits=0):
    return f"${x:,.{digits}f}"


def pct(x, digits=1):
    return f"{100 * x:.{digits}f}%"


def ratio(x, digits=2):
    return f"{x:.{digits}f}"


def styled_table(df, money_cols=None, pct_cols=None, ratio_cols=None):
    money_cols = money_cols or []
    pct_cols = pct_cols or []
    ratio_cols = ratio_cols or []

    fmt = {}
    for col in money_cols:
        fmt[col] = lambda v: dollars(v, 2) if abs(v) < 100 else dollars(v, 0)
    for col in pct_cols:
        fmt[col] = lambda v: pct(v, 1)
    for col in ratio_cols:
        fmt[col] = lambda v: f"{v:,.3f}"

    return df.style.format(fmt)

2. The Causal Question in Pricing

A predictive demand model asks:

Given the observed price and context, how many units will we sell?

A causal pricing model asks:

What would demand have been if we had set a different price?

That counterfactual distinction matters. In potential-outcome notation:

\[ Y_i(p) = \text{outcome for unit } i \text{ if price were set to } p \]

For a price change from $p_0$ to $p_1$, the causal effect is:

\[ E[Y(p_1)-Y(p_0)] \]

For continuous prices, a common estimand is own-price elasticity:

\[ \epsilon = \frac{\partial Q/Q}{\partial P/P} = \frac{\partial \log Q}{\partial \log P} \]

If $\epsilon=-2$, then a 1% price increase decreases quantity demanded by approximately 2%, holding other demand drivers fixed.

For business decisions, elasticity is only an intermediate object. The decision objective is usually profit:

\[ \Pi(p) = (p-c)Q(p) \]

where $c$ is marginal cost. A price can increase revenue while decreasing profit, or decrease units while increasing profit. The causal estimate must be connected to the decision metric.

3. Why Pricing Is Hard: Prices Are Usually Endogenous

Observed prices are often chosen by the business in response to expected demand.

Examples:

A retailer raises prices when demand is strong.
A marketplace surges prices when supply is tight.
A brand discounts when inventory is high.
A competitor’s promotion triggers a defensive price cut.
A store manager marks down slow-moving products.

If we regress sales on observed prices, the coefficient can mix two forces:

the causal effect of price on demand,
the business rule that sets prices higher or lower when demand conditions change.

This is the classic price endogeneity problem. Demand estimation papers often focus on this issue because price can be correlated with unobserved demand shocks. In an industry project, this is the first thing to look for before believing an elasticity estimate.

dot = Digraph("pricing_dag", format="svg")
dot.attr(rankdir="LR", bgcolor="transparent")
dot.attr("node", shape="box", style="rounded,filled", color="#3B4252", fillcolor="#EEF2F7", fontname="DejaVu Sans")
dot.attr("edge", color="#5E6C84", fontname="DejaVu Sans")

dot.node("cost", "Cost shocks\nsupplier costs,\nshipping, FX")
dot.node("demand", "Demand shocks\nseasonality, trend,\ncompetitors, events")
dot.node("price", "Observed price")
dot.node("quantity", "Quantity sold")
dot.node("revenue", "Revenue")
dot.node("profit", "Contribution profit")
dot.node("promo", "Promotion\nfeature, display,\ncoupon")
dot.node("future", "Future demand\nstockpiling,\nreference price")

dot.edge("cost", "price")
dot.edge("demand", "price", label="pricing rule")
dot.edge("demand", "quantity")
dot.edge("price", "quantity", label="causal effect")
dot.edge("quantity", "revenue")
dot.edge("price", "revenue")
dot.edge("quantity", "profit")
dot.edge("price", "profit")
dot.edge("cost", "profit")
dot.edge("promo", "quantity")
dot.edge("promo", "future")
dot.edge("future", "quantity", label="later periods")

dot

Hendel and Nevo (2006) study sales and consumer inventory and emphasize that temporary price reductions can create large sales increases partly because consumers stockpile for future consumption. Mela, Gupta, and Lehmann (1997) and Pauwels, Hanssens, and Siddarth (2002) study longer-run promotion and advertising effects on brand choice and category behavior. These papers are useful reminders that pricing and promotions can have dynamic consequences, not just same-day effects.

The rest of this notebook turns these ideas into a practical workflow.

4. Running Example: Everyday Price Elasticity

Imagine an e-commerce retailer selling multiple products across stores or regions. The pricing team has historical product-store-week data:

price,
unit cost,
units sold,
revenue,
contribution margin,
product, store, and week identifiers.

The team wants to estimate price elasticity and decide whether current prices should move up or down.

We will simulate the data with a known true elasticity. The crucial feature is that price is not randomly assigned. The pricing system raises prices when expected demand is high.

def simulate_pricing_panel(n_products=14, n_stores=16, n_weeks=52, seed=101):
    local_rng = np.random.default_rng(seed)

    products = [f"SKU_{i:02d}" for i in range(n_products)]
    stores = [f"Store_{j:02d}" for j in range(n_stores)]

    base_price = pd.Series(local_rng.uniform(18, 95, n_products), index=products)
    base_cost_ratio = pd.Series(local_rng.uniform(0.42, 0.66, n_products), index=products)
    product_demand = pd.Series(local_rng.normal(0.0, 0.55, n_products), index=products)
    store_demand = pd.Series(local_rng.normal(0.0, 0.32, n_stores), index=stores)
    store_markup = pd.Series(local_rng.normal(0.0, 0.035, n_stores), index=stores)

    true_elasticity = -1.75
    rows = []

    for product in products:
        for store in stores:
            persistent_demand = local_rng.normal(0, 0.15)
            for week in range(n_weeks):
                seasonal = 0.22 * np.sin(2 * np.pi * week / 26) + 0.08 * np.cos(2 * np.pi * week / 13)
                demand_shock = 0.65 * persistent_demand + local_rng.normal(0, 0.35)
                cost_shock = local_rng.normal(0, 0.08)

                unit_cost = base_price[product] * base_cost_ratio[product] * np.exp(cost_shock)

                # The pricing rule is endogenous: prices rise when expected demand is high.
                log_price = (
                    np.log(base_price[product])
                    + 0.55 * cost_shock
                    + 0.24 * demand_shock
                    + store_markup[store]
                    + local_rng.normal(0, 0.035)
                )
                price = np.exp(log_price)

                log_units = (
                    6.2
                    + product_demand[product]
                    + store_demand[store]
                    + seasonal
                    + 0.85 * demand_shock
                    + true_elasticity * (np.log(price) - np.log(base_price[product]))
                    + local_rng.normal(0, 0.18)
                )
                units = np.maximum(np.exp(log_units), 1)
                revenue = price * units
                contribution_profit = (price - unit_cost) * units

                rows.append(
                    {
                        "product": product,
                        "store": store,
                        "week": week,
                        "product_store": f"{product}_{store}",
                        "price": price,
                        "unit_cost": unit_cost,
                        "units": units,
                        "revenue": revenue,
                        "contribution_profit": contribution_profit,
                        "demand_shock_unobserved": demand_shock,
                        "true_elasticity": true_elasticity,
                    }
                )

    df = pd.DataFrame(rows)
    df["log_price"] = np.log(df["price"])
    df["log_cost"] = np.log(df["unit_cost"])
    df["log_units"] = np.log(df["units"])
    return df


pricing = simulate_pricing_panel()

display(pricing.head())
print(f"Rows: {len(pricing):,}")
print(f"Products: {pricing['product'].nunique()}")
print(f"Stores: {pricing['store'].nunique()}")
print(f"Weeks: {pricing['week'].nunique()}")
print(f"True elasticity: {pricing['true_elasticity'].iloc[0]:.2f}")

	product	store	week	product_store	price	unit_cost	units	revenue	contribution_profit	demand_shock_unobserved	true_elasticity	log_price	log_cost	log_units
0	SKU_00	Store_00	0	SKU_00_Store_00	73.571826	48.423691	2757.993889	202910.646804	69358.403391	-0.167774	-1.75	4.298262	3.879989	7.922259
1	SKU_00	Store_00	1	SKU_00_Store_00	85.994920	61.370613	1619.714798	139287.244724	39884.354111	-0.421172	-1.75	4.454288	4.116931	7.390005
2	SKU_00	Store_00	2	SKU_00_Store_00	90.308709	57.209390	2190.240268	197797.771107	72495.462250	0.427364	-1.75	4.503234	4.046718	7.691767
3	SKU_00	Store_00	3	SKU_00_Store_00	107.225114	67.507646	1769.406999	189724.867859	70276.366358	0.301128	-1.75	4.674930	4.212241	7.478400
4	SKU_00	Store_00	4	SKU_00_Store_00	77.702545	60.110874	1390.946855	108080.109963	24469.078669	-0.526568	-1.75	4.352888	4.096191	7.237740

Rows: 11,648
Products: 14
Stores: 16
Weeks: 52
True elasticity: -1.75

The unobserved demand shock is included because this is a simulation. In real data, that column is exactly what we do not observe. The price endogeneity problem comes from this hidden demand shock influencing both price and units.

fig, axes = plt.subplots(1, 2, figsize=(12, 4.5))

sns.scatterplot(
    data=pricing.sample(1200, random_state=4),
    x="log_price",
    y="log_units",
    hue="demand_shock_unobserved",
    palette="viridis",
    alpha=0.65,
    ax=axes[0],
)
axes[0].set_title("Observed price-quantity relationship")
axes[0].set_xlabel("Log price")
axes[0].set_ylabel("Log units")
axes[0].legend(title="Demand shock", loc="best", fontsize=8)

sns.regplot(
    data=pricing.sample(1800, random_state=8),
    x="demand_shock_unobserved",
    y="log_price",
    scatter_kws={"alpha": 0.25, "s": 15},
    line_kws={"color": "#C0392B"},
    ax=axes[1],
)
axes[1].set_title("The pricing rule raises prices in high-demand states")
axes[1].set_xlabel("Unobserved demand shock")
axes[1].set_ylabel("Log price")

plt.tight_layout()
plt.show()

The second plot is the danger. Price is correlated with an unobserved demand driver. If we omit that demand driver, the estimated price coefficient can be much less negative than the true causal elasticity.

5. Naive and Fixed-Effect Elasticity Estimates

Start with two observational regressions:

naive log-log demand,
log-log demand with product, store, and week fixed effects.

The log-log coefficient on price is interpreted as elasticity:

\[ \log(Q_{gst}) = \alpha + \beta \log(P_{gst}) + \varepsilon_{gst} \]

With fixed effects:

\[ \log(Q_{gst}) = \alpha_g + \alpha_s + \lambda_t + \beta \log(P_{gst}) + \varepsilon_{gst} \]

Fixed effects remove stable product/store differences and common week shocks. They do not automatically remove product-store-week demand shocks that the pricing system observed and acted on.

naive_model = smf.ols("log_units ~ log_price", data=pricing).fit(
    cov_type="cluster", cov_kwds={"groups": pricing["product_store"]}
)

fe_model = smf.ols(
    "log_units ~ log_price + C(product) + C(store) + C(week)",
    data=pricing,
).fit(cov_type="cluster", cov_kwds={"groups": pricing["product_store"]})

elasticity_comparison = pd.DataFrame(
    [
        {
            "model": "True causal elasticity",
            "estimate": pricing["true_elasticity"].iloc[0],
            "std_error": np.nan,
            "ci_low": np.nan,
            "ci_high": np.nan,
        },
        {
            "model": "Naive OLS",
            "estimate": naive_model.params["log_price"],
            "std_error": naive_model.bse["log_price"],
            "ci_low": naive_model.conf_int().loc["log_price", 0],
            "ci_high": naive_model.conf_int().loc["log_price", 1],
        },
        {
            "model": "Product/store/week FE",
            "estimate": fe_model.params["log_price"],
            "std_error": fe_model.bse["log_price"],
            "ci_low": fe_model.conf_int().loc["log_price", 0],
            "ci_high": fe_model.conf_int().loc["log_price", 1],
        },
    ]
)

display(styled_table(elasticity_comparison, ratio_cols=["estimate", "std_error", "ci_low", "ci_high"]))

	model	estimate	std_error	ci_low	ci_high
0	True causal elasticity	-1.750	nan	nan	nan
1	Naive OLS	-0.471	0.115	-0.695	-0.246
2	Product/store/week FE	0.718	0.023	0.672	0.763

fig, ax = plt.subplots(figsize=(9, 4.5))

plot_elast = elasticity_comparison.dropna(subset=["estimate"]).copy()
plot_elast["y"] = np.arange(len(plot_elast))

ax.errorbar(
    plot_elast["estimate"],
    plot_elast["y"],
    xerr=[
        plot_elast["estimate"] - plot_elast["ci_low"].fillna(plot_elast["estimate"]),
        plot_elast["ci_high"].fillna(plot_elast["estimate"]) - plot_elast["estimate"],
    ],
    fmt="o",
    color="#2F80ED",
    capsize=4,
)
ax.axvline(pricing["true_elasticity"].iloc[0], color="#C0392B", linestyle="--", label="True elasticity")
ax.axvline(0, color="#6B7280", linewidth=1)
ax.set_yticks(plot_elast["y"])
ax.set_yticklabels(plot_elast["model"])
ax.set_title("Naive observational estimates are biased toward zero")
ax.set_xlabel("Elasticity estimate")
ax.legend()
plt.tight_layout()
plt.show()

The fixed-effect model improves the comparison but still does not fully solve the problem. If the pricing system observes a demand forecast that the analyst does not observe, price remains correlated with the regression error.

This is why pricing teams try to create or find exogenous price variation.

6. Instrumental-Variable Logic for Pricing

One way to estimate causal elasticity is to use a variable that shifts price but does not directly shift demand. Cost shocks are a common candidate:

supplier cost changes,
shipping cost changes,
exchange-rate shocks,
input commodity prices,
platform fee changes.

The logic is:

\[ Z \rightarrow P \rightarrow Q \]

where $Z$ shifts price, but affects quantity only through price after controls.

The exclusion restriction is strong. A cost shock is not valid if it changes product availability, shipping speed, quality, assortment, advertising, or customer expectations. Still, the example is useful because it shows the structure of the argument.

first_stage = smf.ols(
    "log_price ~ log_cost + C(product) + C(store) + C(week)",
    data=pricing,
).fit(cov_type="cluster", cov_kwds={"groups": pricing["product_store"]})

iv_model = IV2SLS.from_formula(
    "log_units ~ 1 + C(product) + C(store) + C(week) + [log_price ~ log_cost]",
    data=pricing,
).fit(cov_type="clustered", clusters=pricing["product_store"])

iv_summary = pd.DataFrame(
    [
        {
            "model": "True causal elasticity",
            "estimate": pricing["true_elasticity"].iloc[0],
            "std_error": np.nan,
            "ci_low": np.nan,
            "ci_high": np.nan,
        },
        {
            "model": "FE OLS",
            "estimate": fe_model.params["log_price"],
            "std_error": fe_model.bse["log_price"],
            "ci_low": fe_model.conf_int().loc["log_price", 0],
            "ci_high": fe_model.conf_int().loc["log_price", 1],
        },
        {
            "model": "IV with cost instrument",
            "estimate": iv_model.params["log_price"],
            "std_error": iv_model.std_errors["log_price"],
            "ci_low": iv_model.conf_int().loc["log_price", "lower"],
            "ci_high": iv_model.conf_int().loc["log_price", "upper"],
        },
    ]
)

display(styled_table(iv_summary, ratio_cols=["estimate", "std_error", "ci_low", "ci_high"]))
print(f"First-stage coefficient on log cost: {first_stage.params['log_cost']:.3f}")
print(f"First-stage t-statistic: {first_stage.tvalues['log_cost']:.1f}")

	model	estimate	std_error	ci_low	ci_high
0	True causal elasticity	-1.750	nan	nan	nan
1	FE OLS	0.718	0.023	0.672	0.763
2	IV with cost instrument	-1.720	0.074	-1.865	-1.576

First-stage coefficient on log cost: 0.558
First-stage t-statistic: 50.8

The IV estimate is much closer to the true elasticity in this simulation because the instrument creates price variation independent of the unobserved demand shock.

In real pricing work, the most important part of an IV analysis is not the software command. It is the credibility of the instrument:

Is the instrument strongly related to price?
Could it affect demand through any other path?
Does it change product availability or customer experience?
Is the estimate local to a subset of prices, products, or stores?

7. From Elasticity to Price Decisions

Suppose we have an elasticity estimate and want to decide whether to adjust price for a product. A constant-elasticity demand approximation is:

\[ Q(p) = A p^\epsilon \]

The profit curve is:

\[ \Pi(p) = (p-c)A p^\epsilon \]

We can calibrate $A$ using current price and current quantity, then compare the price implied by different elasticity estimates.

def price_response_curve(current_price, current_units, unit_cost, elasticity, price_grid):
    A = current_units / (current_price ** elasticity)
    units = A * (price_grid ** elasticity)
    revenue = price_grid * units
    profit = (price_grid - unit_cost) * units
    return pd.DataFrame(
        {
            "price": price_grid,
            "units": units,
            "revenue": revenue,
            "profit": profit,
            "elasticity": elasticity,
        }
    )


target = pricing.query("product == 'SKU_03'").copy()
current_price = target["price"].mean()
current_units = target["units"].mean()
unit_cost = target["unit_cost"].mean()
price_grid = np.linspace(current_price * 0.65, current_price * 1.45, 120)

curves = []
for label, elasticity in [
    ("Naive OLS elasticity", naive_model.params["log_price"]),
    ("FE OLS elasticity", fe_model.params["log_price"]),
    ("IV elasticity", iv_model.params["log_price"]),
    ("True elasticity", pricing["true_elasticity"].iloc[0]),
]:
    curve = price_response_curve(current_price, current_units, unit_cost, elasticity, price_grid)
    curve["model"] = label
    curves.append(curve)

price_curves = pd.concat(curves, ignore_index=True)
best_prices = (
    price_curves.loc[price_curves.groupby("model")["profit"].idxmax(), ["model", "price", "units", "revenue", "profit"]]
    .sort_values("model")
    .reset_index(drop=True)
)

display(
    styled_table(
        best_prices,
        money_cols=["price", "revenue", "profit"],
        ratio_cols=["units"],
    )
)

	model	price	units	revenue	profit
0	FE OLS elasticity	$93.44	345.857	$32,316	$21,178
1	IV elasticity	$76.97	195.094	$15,017	$8,735
2	Naive OLS elasticity	$93.44	222.398	$20,780	$13,618
3	True elasticity	$75.24	201.962	$15,196	$8,692

fig, ax = plt.subplots(figsize=(11, 5))
sns.lineplot(data=price_curves, x="price", y="profit", hue="model", ax=ax)
ax.axvline(current_price, color="#6B7280", linestyle="--", label="Current average price")
ax.axvline(unit_cost, color="#C0392B", linestyle=":", label="Average unit cost")
ax.set_title("Elasticity estimates imply different profit-maximizing prices")
ax.set_xlabel("Candidate price")
ax.set_ylabel("Predicted contribution profit")
ax.legend()
plt.tight_layout()
plt.show()

The naive estimate tends to make demand look too insensitive to price. That can lead to overly aggressive price increases. A biased elasticity estimate is not just a statistical issue; it changes the recommended price.

8. Randomized Discount-Depth Experiment

Now switch from everyday price elasticity to promotions.

Suppose a subscription business wants to offer a first-order discount. The team can randomize eligible customers to:

no discount,
10% discount,
20% discount,
30% discount.

The outcome should not be only conversion or revenue. A deeper discount can increase orders while reducing margin per order.

The treatment is multi-valued:

\[ D_i \in \{0, 0.10, 0.20, 0.30\} \]

The causal object is a dose-response curve:

\[ E[Y_i(d)] \]

for each discount depth $d$.

def simulate_discount_experiment(n=80_000, seed=222):
    local_rng = np.random.default_rng(seed)

    discount_depth = local_rng.choice([0.00, 0.10, 0.20, 0.30], size=n, p=[0.25, 0.25, 0.25, 0.25])
    list_price = local_rng.lognormal(mean=np.log(74), sigma=0.20, size=n)
    unit_cost = list_price * local_rng.uniform(0.38, 0.52, size=n)
    prior_visits = local_rng.poisson(2.6, size=n)
    loyalty = local_rng.beta(2.8, 2.0, size=n)
    price_sensitivity = local_rng.beta(2.0, 2.6, size=n)
    urgency = local_rng.beta(2.4, 3.0, size=n)

    baseline = (
        -3.7
        + 0.24 * prior_visits
        + 1.35 * urgency
        + 0.45 * loyalty
        - 0.15 * price_sensitivity
    )

    # Discounts lift conversion more for price-sensitive customers, less for loyal customers.
    treatment_logit_lift = (
        5.2 * discount_depth * price_sensitivity
        + 1.4 * discount_depth * urgency
        - 1.0 * discount_depth * loyalty
        - 2.2 * np.maximum(discount_depth - 0.20, 0)  # deeper discounts create lower-quality conversion.
    )
    purchase_prob = expit(baseline + treatment_logit_lift)
    purchased = local_rng.binomial(1, purchase_prob)

    paid_price = list_price * (1 - discount_depth)
    units = purchased
    revenue = paid_price * units
    contribution_profit = (paid_price - unit_cost) * units

    return pd.DataFrame(
        {
            "customer_id": np.arange(n),
            "discount_depth": discount_depth,
            "list_price": list_price,
            "paid_price": paid_price,
            "unit_cost": unit_cost,
            "prior_visits": prior_visits,
            "loyalty": loyalty,
            "price_sensitivity": price_sensitivity,
            "urgency": urgency,
            "purchased": purchased,
            "revenue": revenue,
            "contribution_profit": contribution_profit,
        }
    )


discounts = simulate_discount_experiment()

display(discounts.head())
print(f"Rows: {len(discounts):,}")
print("Randomized discount shares:")
print(discounts["discount_depth"].value_counts(normalize=True).sort_index().to_string())

	customer_id	discount_depth	list_price	paid_price	unit_cost	prior_visits	loyalty	price_sensitivity	urgency	purchased	revenue	contribution_profit
0	0	0.1	57.386994	51.648294	25.109448	3	0.475016	0.267409	0.618212	0	0.00000	0.000000
1	1	0.2	88.779029	71.023223	41.006452	1	0.768788	0.653649	0.254488	0	0.00000	0.000000
2	2	0.2	100.339596	80.271677	42.976153	1	0.548399	0.644847	0.253644	0	0.00000	0.000000
3	3	0.3	89.587797	62.711458	41.887416	2	0.544176	0.881218	0.440993	0	0.00000	0.000000
4	4	0.3	97.480915	68.236640	46.244734	3	0.194301	0.247010	0.831915	1	68.23664	21.991907

Rows: 80,000
Randomized discount shares:
discount_depth
0.0    0.249075
0.1    0.252850
0.2    0.250663
0.3    0.247413

Because discount depth is randomized, simple group means estimate the causal dose-response curve. We should still check balance on pre-treatment covariates.

balance_discount = (
    discounts.groupby("discount_depth")
    .agg(
        customers=("customer_id", "size"),
        prior_visits=("prior_visits", "mean"),
        loyalty=("loyalty", "mean"),
        price_sensitivity=("price_sensitivity", "mean"),
        urgency=("urgency", "mean"),
        list_price=("list_price", "mean"),
    )
    .reset_index()
)

display(
    styled_table(
        balance_discount,
        money_cols=["list_price"],
        pct_cols=["discount_depth"],
        ratio_cols=["prior_visits", "loyalty", "price_sensitivity", "urgency"],
    )
)

	discount_depth	customers	prior_visits	loyalty	price_sensitivity	urgency	list_price
0	0.0%	19926	2.602	0.585	0.435	0.447	$75.46
1	10.0%	20228	2.604	0.581	0.435	0.446	$75.38
2	20.0%	20053	2.613	0.585	0.434	0.445	$75.43
3	30.0%	19793	2.608	0.583	0.435	0.447	$75.37

discount_readout = (
    discounts.groupby("discount_depth")
    .agg(
        customers=("customer_id", "size"),
        conversion_rate=("purchased", "mean"),
        revenue_per_customer=("revenue", "mean"),
        profit_per_customer=("contribution_profit", "mean"),
        avg_paid_price=("paid_price", "mean"),
    )
    .reset_index()
)

control = discount_readout.loc[discount_readout["discount_depth"].eq(0)].iloc[0]
discount_readout["incremental_revenue_per_customer"] = (
    discount_readout["revenue_per_customer"] - control["revenue_per_customer"]
)
discount_readout["incremental_profit_per_customer"] = (
    discount_readout["profit_per_customer"] - control["profit_per_customer"]
)

display(
    styled_table(
        discount_readout,
        money_cols=[
            "revenue_per_customer",
            "profit_per_customer",
            "avg_paid_price",
            "incremental_revenue_per_customer",
            "incremental_profit_per_customer",
        ],
        pct_cols=["discount_depth", "conversion_rate"],
    )
)

	discount_depth	customers	conversion_rate	revenue_per_customer	profit_per_customer	avg_paid_price	incremental_revenue_per_customer	incremental_profit_per_customer
0	0.0%	19926	9.6%	$7.23	$3.97	$75.46	$0.00	$0.00
1	10.0%	20228	12.6%	$8.58	$4.29	$67.84	$1.34	$0.31
2	20.0%	20053	15.1%	$9.12	$4.01	$60.35	$1.88	$0.03
3	30.0%	19793	15.7%	$8.26	$2.94	$52.76	$1.02	$-1.03

fig, axes = plt.subplots(1, 3, figsize=(14, 4.5))

sns.lineplot(data=discount_readout, x="discount_depth", y="conversion_rate", marker="o", ax=axes[0], color="#2F80ED")
axes[0].set_title("Conversion rises with discount depth")
axes[0].set_xlabel("Discount depth")
axes[0].set_ylabel("Conversion rate")
axes[0].xaxis.set_major_formatter(lambda x, pos: f"{100*x:.0f}%")
axes[0].yaxis.set_major_formatter(lambda x, pos: f"{100*x:.1f}%")

sns.lineplot(data=discount_readout, x="discount_depth", y="revenue_per_customer", marker="o", ax=axes[1], color="#27AE60")
axes[1].set_title("Revenue response")
axes[1].set_xlabel("Discount depth")
axes[1].set_ylabel("Revenue per customer")
axes[1].xaxis.set_major_formatter(lambda x, pos: f"{100*x:.0f}%")

sns.lineplot(data=discount_readout, x="discount_depth", y="profit_per_customer", marker="o", ax=axes[2], color="#C0392B")
axes[2].set_title("Profit can peak before revenue")
axes[2].set_xlabel("Discount depth")
axes[2].set_ylabel("Contribution profit per customer")
axes[2].xaxis.set_major_formatter(lambda x, pos: f"{100*x:.0f}%")

plt.tight_layout()
plt.show()

The promotion decision should be made on contribution profit and strategic constraints, not conversion alone. A 30% discount may generate more purchases, but those purchases can be less profitable.

This is a common industry mistake: optimizing promotions on conversion rate or gross revenue while ignoring margin.

9. Heterogeneous Discount Response

Promotions often work best for specific customer segments. In this simulation, customers differ in price sensitivity and loyalty.

We can use the randomized experiment to estimate segment-level treatment effects. This is not yet a causal ML model; it is a simple subgroup readout.

segmented = discounts.copy()
segmented["price_sensitivity_segment"] = pd.qcut(
    segmented["price_sensitivity"],
    q=3,
    labels=["Low price sensitivity", "Medium price sensitivity", "High price sensitivity"],
)
segmented["loyalty_segment"] = pd.qcut(
    segmented["loyalty"],
    q=3,
    labels=["Low loyalty", "Medium loyalty", "High loyalty"],
)

segment_profit = (
    segmented.groupby(["price_sensitivity_segment", "discount_depth"], observed=True)
    .agg(profit_per_customer=("contribution_profit", "mean"), conversion_rate=("purchased", "mean"))
    .reset_index()
)

control_segment = segment_profit.query("discount_depth == 0")[
    ["price_sensitivity_segment", "profit_per_customer"]
].rename(columns={"profit_per_customer": "control_profit"})

segment_profit = segment_profit.merge(control_segment, on="price_sensitivity_segment")
segment_profit["incremental_profit"] = segment_profit["profit_per_customer"] - segment_profit["control_profit"]

display(
    styled_table(
        segment_profit,
        money_cols=["profit_per_customer", "control_profit", "incremental_profit"],
        pct_cols=["discount_depth", "conversion_rate"],
    )
)

	price_sensitivity_segment	discount_depth	profit_per_customer	conversion_rate	control_profit	incremental_profit
0	Low price sensitivity	0.0%	$3.98	9.6%	$3.98	$0.00
1	Low price sensitivity	10.0%	$3.95	11.7%	$3.98	$-0.03
2	Low price sensitivity	20.0%	$3.37	12.8%	$3.98	$-0.62
3	Low price sensitivity	30.0%	$2.09	11.2%	$3.98	$-1.90
4	Medium price sensitivity	0.0%	$4.14	9.9%	$4.14	$0.00
5	Medium price sensitivity	10.0%	$4.27	12.5%	$4.14	$0.13
6	Medium price sensitivity	20.0%	$3.91	14.7%	$4.14	$-0.23
7	Medium price sensitivity	30.0%	$2.85	15.3%	$4.14	$-1.30
8	High price sensitivity	0.0%	$3.80	9.1%	$3.80	$0.00
9	High price sensitivity	10.0%	$4.64	13.7%	$3.80	$0.84
10	High price sensitivity	20.0%	$4.76	17.9%	$3.80	$0.97
11	High price sensitivity	30.0%	$3.89	20.7%	$3.80	$0.09

fig, ax = plt.subplots(figsize=(10.5, 5))
sns.lineplot(
    data=segment_profit,
    x="discount_depth",
    y="incremental_profit",
    hue="price_sensitivity_segment",
    marker="o",
    ax=ax,
)
ax.axhline(0, color="#6B7280", linewidth=1)
ax.set_title("Profit response differs by price sensitivity")
ax.set_xlabel("Discount depth")
ax.set_ylabel("Incremental profit per customer")
ax.xaxis.set_major_formatter(lambda x, pos: f"{100*x:.0f}%")
ax.legend(title="")
plt.tight_layout()
plt.show()

This is the bridge from pricing experiments to personalization:

broad discounts may be wasteful for customers who would buy anyway,
targeted discounts may be profitable for price-sensitive customers,
aggressive discounts can train customers to wait for promotions.

The next layer would be a CATE or policy-learning model trained on randomized discount data.

10. Promotion Timing and Stockpiling

For storable products, promotion-week sales can overstate incremental demand. Customers may buy earlier than they otherwise would have, then buy less after the promotion.

Hendel and Nevo (2006) explicitly study sales and consumer inventory, emphasizing that temporary price reductions can create both current sales spikes and demand accumulation or stockpiling behavior. In industry, this means we should inspect cumulative effects, not only same-week lift.

def simulate_stockpiling_panel(n_stores=30, n_weeks=60, seed=333):
    local_rng = np.random.default_rng(seed)
    rows = []
    stores = [f"Store_{i:02d}" for i in range(n_stores)]
    store_scale = pd.Series(local_rng.lognormal(mean=7.2, sigma=0.25, size=n_stores), index=stores)

    promo_weeks = {12, 13, 28, 42}
    for store in stores:
        local_scale = store_scale[store]
        for week in range(n_weeks):
            promotion = int(week in promo_weeks)
            lag1 = int((week - 1) in promo_weeks)
            lag2 = int((week - 2) in promo_weeks)
            season = 0.12 * np.sin(2 * np.pi * week / 26) + 0.05 * np.cos(2 * np.pi * week / 13)
            noise = local_rng.normal(0, 0.08)

            # Promotions create a current spike and then post-promotion demand dips.
            log_units = (
                np.log(local_scale)
                + season
                + 0.48 * promotion
                - 0.18 * lag1
                - 0.10 * lag2
                + noise
            )
            units = np.exp(log_units)
            base_price = 12.0
            price = base_price * (0.78 if promotion else 1.0)
            unit_cost = 5.2

            rows.append(
                {
                    "store": store,
                    "week": week,
                    "promotion": promotion,
                    "promo_lag1": lag1,
                    "promo_lag2": lag2,
                    "price": price,
                    "units": units,
                    "revenue": price * units,
                    "contribution_profit": (price - unit_cost) * units,
                }
            )

    df = pd.DataFrame(rows)
    df["log_units"] = np.log(df["units"])
    return df


stockpile = simulate_stockpiling_panel()

weekly_stockpile = (
    stockpile.groupby("week")
    .agg(units=("units", "mean"), promotion=("promotion", "max"))
    .reset_index()
)

display(stockpile.head())

	store	week	price	units	revenue	contribution_profit	log_units
0	Store_00	0	12.0	1119.652637	13435.831641	7613.637930	7.020774
1	Store_00	1	12.0	1338.838339	16066.060073	9104.100708	7.199558
2	Store_00	2	12.0	1054.374796	12652.497549	7169.748611	6.960703
3	Store_00	3	12.0	1055.424772	12665.097268	7176.888452	6.961699
4	Store_00	4	12.0	1238.167522	14858.010263	8419.539149	7.121388

fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(weekly_stockpile["week"], weekly_stockpile["units"], marker="o", color="#2F80ED")
for promo_week in weekly_stockpile.loc[weekly_stockpile["promotion"].eq(1), "week"]:
    ax.axvspan(promo_week - 0.4, promo_week + 0.4, color="#F2C94C", alpha=0.35)
ax.set_title("Promotion weeks spike, then demand partially falls back")
ax.set_xlabel("Week")
ax.set_ylabel("Average units per store")
plt.tight_layout()
plt.show()

same_week_model = smf.ols(
    "log_units ~ promotion + C(store) + C(week)",
    data=stockpile,
).fit(cov_type="cluster", cov_kwds={"groups": stockpile["store"]})

distributed_lag_model = smf.ols(
    "log_units ~ promotion + promo_lag1 + promo_lag2 + C(store) + C(week)",
    data=stockpile,
).fit(cov_type="cluster", cov_kwds={"groups": stockpile["store"]})

lag_readout = pd.DataFrame(
    [
        {
            "effect": "Same-week promotion effect",
            "estimate_log_points": distributed_lag_model.params["promotion"],
            "approx_percent_effect": np.exp(distributed_lag_model.params["promotion"]) - 1,
        },
        {
            "effect": "Week +1 pull-forward effect",
            "estimate_log_points": distributed_lag_model.params["promo_lag1"],
            "approx_percent_effect": np.exp(distributed_lag_model.params["promo_lag1"]) - 1,
        },
        {
            "effect": "Week +2 pull-forward effect",
            "estimate_log_points": distributed_lag_model.params["promo_lag2"],
            "approx_percent_effect": np.exp(distributed_lag_model.params["promo_lag2"]) - 1,
        },
    ]
)

same_week_only = same_week_model.params["promotion"]
distributed_total = (
    distributed_lag_model.params["promotion"]
    + distributed_lag_model.params["promo_lag1"]
    + distributed_lag_model.params["promo_lag2"]
)

display(styled_table(lag_readout, pct_cols=["approx_percent_effect"], ratio_cols=["estimate_log_points"]))
print(f"Same-week-only log effect: {same_week_only:.3f}")
print(f"Distributed 3-week cumulative log effect: {distributed_total:.3f}")

	effect	estimate_log_points	approx_percent_effect
0	Same-week promotion effect	0.376	45.6%
1	Week +1 pull-forward effect	-0.149	-13.9%
2	Week +2 pull-forward effect	-0.148	-13.8%

Same-week-only log effect: 0.346
Distributed 3-week cumulative log effect: 0.078

The same-week promotion coefficient captures the spike. The distributed-lag model also estimates the post-promotion dips. For storable products, the cumulative effect can be much smaller than the week-of-promotion effect.

This distinction matters for:

grocery and consumer packaged goods,
subscription annual plans,
travel and ticketing,
durable goods,
any category where customers can shift purchase timing.

11. Common Pricing and Promotion Threats

A pricing analysis should include a threat model. The table below is a practical checklist.

pricing_threats = pd.DataFrame(
    [
        {
            "threat": "Price endogeneity",
            "symptom": "Observed price increases during high-demand periods.",
            "diagnostic": "Plot prices against demand forecasts, events, and inventory.",
            "mitigation": "Randomize prices, use valid instruments, or exploit policy rules.",
        },
        {
            "threat": "Stockouts",
            "symptom": "Low price appears to reduce sales because inventory runs out.",
            "diagnostic": "Track availability, stockout flags, and lost sales.",
            "mitigation": "Model censored demand or exclude stockout windows.",
        },
        {
            "threat": "Competitor reaction",
            "symptom": "Competitors match prices after treatment starts.",
            "diagnostic": "Monitor competitor prices and market share.",
            "mitigation": "Short tests, geo tests, or models with competitor controls.",
        },
        {
            "threat": "Customer interference",
            "symptom": "Customers share coupons or compare prices across accounts.",
            "diagnostic": "Measure redemption leakage and cross-account behavior.",
            "mitigation": "Randomize at household, geo, or market level.",
        },
        {
            "threat": "Reference price effects",
            "symptom": "Future full-price conversion declines after repeated discounts.",
            "diagnostic": "Track long-run conversion and willingness to pay.",
            "mitigation": "Limit discount frequency and measure delayed outcomes.",
        },
        {
            "threat": "Margin mix shift",
            "symptom": "Sales lift comes from low-margin products or expensive fulfillment.",
            "diagnostic": "Use contribution margin by order, not only revenue.",
            "mitigation": "Optimize on profit and guardrail product mix.",
        },
        {
            "threat": "Fairness or regulatory risk",
            "symptom": "Personalized prices create unacceptable disparities.",
            "diagnostic": "Audit price exposure by protected and sensitive segments.",
            "mitigation": "Use policy constraints and legal review.",
        },
    ]
)

display(pricing_threats)

	threat	symptom	diagnostic	mitigation
0	Price endogeneity	Observed price increases during high-demand pe...	Plot prices against demand forecasts, events, ...	Randomize prices, use valid instruments, or ex...
1	Stockouts	Low price appears to reduce sales because inve...	Track availability, stockout flags, and lost s...	Model censored demand or exclude stockout wind...
2	Competitor reaction	Competitors match prices after treatment starts.	Monitor competitor prices and market share.	Short tests, geo tests, or models with competi...
3	Customer interference	Customers share coupons or compare prices acro...	Measure redemption leakage and cross-account b...	Randomize at household, geo, or market level.
4	Reference price effects	Future full-price conversion declines after re...	Track long-run conversion and willingness to pay.	Limit discount frequency and measure delayed o...
5	Margin mix shift	Sales lift comes from low-margin products or e...	Use contribution margin by order, not only rev...	Optimize on profit and guardrail product mix.
6	Fairness or regulatory risk	Personalized prices create unacceptable dispar...	Audit price exposure by protected and sensitiv...	Use policy constraints and legal review.

12. Power Planning for Discount Tests

For randomized discount experiments, we can do a simple minimum detectable effect calculation on profit per customer.

For two equal-sized groups:

\[ \text{MDE} = (z_{1-\alpha/2}+z_{\text{power}})\sqrt{\frac{2\sigma^2}{n}} \]

For unequal groups, use:

\[ \text{MDE} = (z_{1-\alpha/2}+z_{\text{power}})\sqrt{\frac{\sigma^2}{n_t}+\frac{\sigma^2}{n_c}} \]

def mde_two_group(sd, n_treat, n_control, alpha=0.05, power=0.80):
    return (norm.ppf(1 - alpha / 2) + norm.ppf(power)) * sd * np.sqrt(1 / n_treat + 1 / n_control)


profit_sd = discounts["contribution_profit"].std(ddof=1)
power_rows = []
for total_n in [10_000, 25_000, 50_000, 100_000, 250_000]:
    n_per_arm = total_n // 4
    mde = mde_two_group(profit_sd, n_per_arm, n_per_arm)
    power_rows.append(
        {
            "total_customers": total_n,
            "customers_per_arm": n_per_arm,
            "mde_profit_per_customer": mde,
            "mde_total_profit_if_scaled_to_all": mde * total_n,
        }
    )

power_discount = pd.DataFrame(power_rows)

display(
    styled_table(
        power_discount,
        money_cols=["mde_profit_per_customer", "mde_total_profit_if_scaled_to_all"],
    )
)

	total_customers	customers_per_arm	mde_profit_per_customer	mde_total_profit_if_scaled_to_all
0	10000	2500	$0.83	$8,302
1	25000	6250	$0.53	$13,126
2	50000	12500	$0.37	$18,564
3	100000	25000	$0.26	$26,253
4	250000	62500	$0.17	$41,510

If the MDE is larger than the profit difference that would change the decision, the experiment is underpowered for the business question. In that case, the team should either increase sample size, reduce the number of arms, extend the measurement window, or accept that the test is exploratory.

13. Decision Memo Template

A pricing readout should make the decision explicit. Below is a concise example using the simulated discount experiment.

best_discount = discount_readout.loc[discount_readout["profit_per_customer"].idxmax()]
control_profit = discount_readout.loc[discount_readout["discount_depth"].eq(0), "profit_per_customer"].iloc[0]
best_lift = best_discount["profit_per_customer"] - control_profit

recommended_depth = best_discount["discount_depth"]
eligible_customers = 600_000
expected_incremental_profit = best_lift * eligible_customers

memo = f'''
### Pricing and Promotion Readout

**Decision:** Select discount depth for the next eligible-customer promotion.

**Design:** Randomized discount-depth experiment with four arms: 0%, 10%, 20%, and 30%.

**Primary outcome:** Contribution profit per eligible customer.

**Best observed arm:** {pct(recommended_depth, 0)} discount.

**Profit per customer in best arm:** {dollars(best_discount["profit_per_customer"], 2)}.

**Incremental profit per customer versus no discount:** {dollars(best_lift, 2)}.

**Expected incremental profit if scaled to {eligible_customers:,} eligible customers:** {dollars(expected_incremental_profit, 0)}.

**Recommendation:** Use the profit-maximizing discount depth for price-sensitive segments, but avoid broad deep discounting until delayed outcomes are measured.

**Caveats:** Monitor future full-price conversion, coupon leakage, product mix, customer fairness, and support or fulfillment costs.
'''

display(Markdown(memo))

Pricing and Promotion Readout

Decision: Select discount depth for the next eligible-customer promotion.

Design: Randomized discount-depth experiment with four arms: 0%, 10%, 20%, and 30%.

Primary outcome: Contribution profit per eligible customer.

Best observed arm: 10% discount.

Profit per customer in best arm: $4.29.

Incremental profit per customer versus no discount: $0.31.

Expected incremental profit if scaled to 600,000 eligible customers: $188,271.

Recommendation: Use the profit-maximizing discount depth for price-sensitive segments, but avoid broad deep discounting until delayed outcomes are measured.

Caveats: Monitor future full-price conversion, coupon leakage, product mix, customer fairness, and support or fulfillment costs.

The recommendation is deliberately not “send the deepest discount to everyone.” A mature pricing decision combines the immediate causal estimate with strategic guardrails.

14. Practical Workflow

A professional pricing or promotion analysis usually follows this sequence:

Define the decision: raise price, lower price, discount, target, or pause.
Define the estimand: elasticity, profit lift, conversion lift, retention, or long-run value.
Identify the source of price variation: randomization, cost shock, policy rule, rollout, or observational panel.
Audit endogeneity: why did prices vary historically?
Choose the unit of analysis: customer, SKU, store, market, or week.
Include margin, inventory, and stockout measures.
Estimate short-run and delayed effects.
Report uncertainty and decision-relevant MDE.
Translate estimates into price, discount, or targeting recommendations.
Monitor post-launch guardrails.

Hands-On Extensions

Try extending this notebook in the following ways:

Add competitor prices to the pricing panel.
Simulate stockouts and show how observed sales understate demand.
Replace the constant-elasticity profit curve with a nonparametric demand curve.
Add heterogeneous elasticity by product category.
Estimate a causal forest for discount response using the randomized discount data.
Compare customer-level and geo-level randomization for personalized pricing constraints.
Build a dashboard that shows revenue, margin, and long-run guardrails side by side.

Key Takeaways

Price elasticity is causal: it asks how demand would change under a counterfactual price.
Observed prices are often endogenous because firms set prices in response to demand conditions.
Fixed effects help, but they do not solve endogeneity when unobserved demand shocks drive price.
Valid instruments or randomized price variation can identify causal demand response.
Promotion evaluation should use contribution profit, not only conversion or revenue.
Discount depth can have heterogeneous effects across customer segments.
Promotions can pull demand forward, so cumulative effects matter.
A professional pricing readout connects estimates to price decisions, margin, uncertainty, and operational risks.

References

Anderson, E. T., & Simester, D. (2003). Effects of $9 price endings on retail sales: Evidence from field experiments. Quantitative Marketing and Economics, 1(1), 93-110. https://doi.org/10.1023/A:1023581927405

Hendel, I., & Nevo, A. (2006). Sales and consumer inventory. The RAND Journal of Economics, 37(3), 543-561. https://doi.org/10.1111/j.1756-2171.2006.tb00030.x

Mela, C. F., Gupta, S., & Lehmann, D. R. (1997). The long-term impact of promotion and advertising on consumer brand choice. Journal of Marketing Research, 34(2), 248-261. https://doi.org/10.1177/002224379703400205

Pauwels, K., Hanssens, D. M., & Siddarth, S. (2002). The long-term effects of price promotions on category incidence, brand choice, and purchase quantity. Journal of Marketing Research, 39(4), 421-439. https://doi.org/10.1509/jmkr.39.4.421.19114

Qiang, S., & Bayati, M. (2016). Dynamic pricing with demand covariates. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.2765257