Time-series causal discovery asks a different question from ordinary tabular discovery. In tabular data, rows are usually treated as exchangeable samples from the same joint distribution. In time-series data, row order is part of the data-generating process. A variable at time t-1 may cause another variable at time t, and the direction of time gives us real background knowledge that is not available in a purely cross-sectional dataset.
This notebook introduces the time-series workflow available in causal-learn and related LiNGAM time-series tools shipped with the same package family. We will simulate a small dynamic system with known lagged and contemporaneous causes, then compare several ways to discover the structure:
pairwise Granger testing for two variables;
multivariate Granger-Lasso for several variables and lags;
VAR-LiNGAM for lagged effects plus same-time non-Gaussian effects;
a deliberately naive i.i.d. PC run to show what can go wrong when temporal dependence is ignored.
The examples are intentionally small. A serious time-series causal discovery project should also study stationarity, seasonality, missing time points, irregular sampling, delayed effects, interventions, and domain constraints. For large lagged conditional-independence workflows, a dedicated time-series discovery library such as Tigramite is often the better primary tool. This notebook focuses on what causal-learn can teach directly.
Estimated runtime: about 1-3 minutes on a typical laptop. The repeated stability cells are the slowest part because they refit Granger-Lasso several times.
Learning Goals
By the end of this notebook, you should be able to:
explain the difference between contemporaneous, lagged, and auto-regressive edges;
reshape a multivariate time series into a lagged supervised-learning table;
run causal-learn’s Granger tools and read the coefficient layout correctly;
run VAR-LiNGAM and separate instantaneous effects from lagged effects;
evaluate discovered lagged edges against a known synthetic truth;
understand why ordinary i.i.d. graph discovery is risky on autocorrelated rows.
Notebook Flow
We will move in the same order a real analysis should move:
Set up imports, output paths, and plotting style.
Define the true dynamic data-generating process.
Simulate and inspect a multivariate time series.
Draw the true temporal graph so the estimators have a clear target.
Build lagged design matrices to make the time-series framing concrete.
Run pairwise Granger and multivariate Granger-Lasso.
Run VAR-LiNGAM to recover same-time and lagged effects.
Compare results, tune thresholds, and run simple stability checks.
Show why naive i.i.d. PC can be misleading for time series.
Save a compact reporting checklist and artifact manifest.
Time-Series Causal Discovery Theory
A time-series causal graph usually has nodes like X(t), X(t-1), and X(t-2) rather than only X. That indexing changes the meaning of an edge.
A lagged edge such as match(t-1) -> engagement(t) says earlier match quality helps predict later engagement after accounting for other lagged variables. Lagged edges are often easier to orient because causes must come before effects in time.
A same-time edge such as match(t) -> engagement(t) says there is an instantaneous structural relation inside the same time step. Same-time edges are harder because the timestamp alone does not order the two variables. VAR-LiNGAM uses non-Gaussian residual structure to estimate these same-time directions.
An auto-regressive edge such as engagement(t-1) -> engagement(t) says the process has memory. Auto-regression is not a nuisance detail. If we ignore it, ordinary tabular methods may mistake persistence for causal dependence between variables.
Granger Causality In Plain Language
Granger causality is predictive causality over time. Variable X Granger-causes variable Y if past values of X improve prediction of current Y, after using past values of Y and possibly other variables.
That is useful, but it is not magic. Granger causality can be distorted by hidden common causes, omitted variables, synchronized measurement, nonstationarity, and insufficient lag length. In this notebook we use Granger methods as discovery tools, then evaluate them against a known synthetic truth.
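The restricted-versus-unrestricted regression behind a Granger test can be sketched with plain numpy. This is a teaching sketch under our own function names, not the causal-learn implementation: we compare the residual sum of squares of a model using only Y's past against a model that also includes X's past.

```python
import numpy as np

def granger_f_stat(x, y, max_lag=2):
    """F statistic: do past values of x improve prediction of current y
    beyond y's own past? Minimal sketch; real tools also report p-values,
    handle lag selection, and run diagnostics."""
    n = len(y)
    y_t = y[max_lag:]
    # Lagged features aligned so row t holds values at t-1 .. t-max_lag.
    y_lags = np.column_stack([y[max_lag - k: n - k] for k in range(1, max_lag + 1)])
    x_lags = np.column_stack([x[max_lag - k: n - k] for k in range(1, max_lag + 1)])
    ones = np.ones((len(y_t), 1))

    def rss(design):
        beta, *_ = np.linalg.lstsq(design, y_t, rcond=None)
        resid = y_t - design @ beta
        return resid @ resid

    rss_restricted = rss(np.hstack([ones, y_lags]))    # y's own past only
    rss_full = rss(np.hstack([ones, y_lags, x_lags]))  # plus x's past
    df_num = max_lag
    df_den = len(y_t) - (1 + 2 * max_lag)
    return ((rss_restricted - rss_full) / df_num) / (rss_full / df_den)

# Toy system where x drives y with a one-step lag.
rng = np.random.default_rng(0)
x = rng.laplace(size=800)
y = np.zeros(800)
for t in range(1, 800):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.laplace(scale=0.5)
print(granger_f_stat(x, y))  # large: past x helps predict current y
print(granger_f_stat(y, x))  # small: past y adds little for current x
```

The asymmetry of the two statistics is the whole point of the test: prediction gain in one temporal direction but not the other.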
VAR-LiNGAM In Plain Language
VAR-LiNGAM combines two ideas:
a vector autoregression, which models how lagged variables affect current variables;
LiNGAM, which uses linear non-Gaussian residuals to orient same-time causal effects.
The fitted object returns an array of adjacency matrices. Matrix 0 is the same-time matrix. Matrix 1 is lag one, matrix 2 is lag two, and so on. As in the other LiNGAM notebooks, the convention is row = target and column = source.
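The row = target, column = source convention is easy to get backwards, so here is a toy illustration with hand-written arrays standing in for a fitted model's matrices. The variable names and coefficients are illustrative, not the notebook's fitted values.

```python
import numpy as np

variables = ["match", "engagement"]
# Stand-in for a fitted model's adjacency matrices:
# index 0 = same-time effects, index 1 = lag-one effects.
adjacency = [
    np.array([[0.0, 0.0],
              [0.6, 0.0]]),  # same-time: engagement(t) <- match(t)
    np.array([[0.4, 0.0],
              [0.0, 0.3]]),  # lag one: each variable <- its own past
]

edge_lines = []
for lag, matrix in enumerate(adjacency):
    # np.nonzero returns (row_indices, col_indices): row = target, col = source.
    for target_idx, source_idx in zip(*np.nonzero(matrix)):
        source_time = "t" if lag == 0 else f"t-{lag}"
        edge_lines.append(
            f"{variables[source_idx]}({source_time}) -> {variables[target_idx]}(t)"
        )
print("\n".join(edge_lines))
```

Reading matrix entry `[1, 0]` as "edge into variable 1 from variable 0" is the habit to build before interpreting any LiNGAM output.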
Assumptions And Caveats
The cleanest version of the workflow assumes evenly spaced observations, no missing timestamps, approximate stationarity, enough data for the selected lag length, and no important omitted time-varying confounders. It also assumes the effect is captured reasonably well by linear lagged relationships when using Granger-Lasso or VAR-LiNGAM.
When those assumptions are not plausible, do not treat the learned graph as a final causal answer. Treat it as a candidate structure that needs sensitivity checks, domain review, and ideally experimental or quasi-experimental support.
Setup
This first code cell imports the packages used throughout the notebook, creates output folders, and pins plotting defaults. The MPLCONFIGDIR line avoids noisy matplotlib cache warnings in restricted notebook environments.
The output confirms that the notebook is writing into the shared tutorial output folder. Every saved artifact in this notebook starts with 13_, which makes it easy to separate these files from earlier tutorial outputs.
Package Versions
Version information is part of reproducible causal discovery. Small implementation details can matter for graph search, regularization, and numerical thresholds, so we save the versions used to execute this notebook.
These versions are not just bookkeeping. Time-series discovery combines statistical tests, regression solvers, and graph conventions, so saving the environment makes later debugging much easier.
Define The Dynamic System
We will simulate six variables with a simple product-analytics flavor:
need: underlying user need or demand pressure;
intent: short-term intent signal;
match: quality of the content or item match;
engagement: current depth of use;
renewal: future value or retention pressure;
support: support burden or friction signal.
The true system has self-memory, cross-lag effects, and a few same-time effects. The self-memory terms make the series autocorrelated. The cross-lag effects are the main Granger-style causal targets. The same-time effects give VAR-LiNGAM something that plain Granger-Lasso cannot fully represent.
This table is the ground truth used for evaluation. Self-lag memory is real in the simulation, but the main discovery score focuses on cross-variable effects because those are the edges analysts usually mean by causal discovery across variables.
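The ground-truth specification behind this table is a handful of constants that the simulation cell consumes. The constant names below match the ones used by the simulation code; the coefficient values are placeholders chosen to match the prose (for example, negative effects of support on later renewal and of engagement on later support), not the exact numbers used to generate the tutorial dataset.

```python
import numpy as np

# Illustrative ground-truth constants; values are placeholders, not the
# exact coefficients used in the original notebook run.
VARIABLES = ["need", "intent", "match", "engagement", "renewal", "support"]
N_VARIABLES = len(VARIABLES)
VAR_INDEX = {name: i for i, name in enumerate(VARIABLES)}

# Self-memory (auto-regressive) strength for each variable.
SELF_LAG_STRENGTH = {name: 0.30 for name in VARIABLES}

# Cross-variable lagged effects: the main Granger-style discovery targets.
CROSS_LAG_EDGES = [
    {"source": "need", "target": "intent", "lag": 1, "coefficient": 0.35},
    {"source": "intent", "target": "match", "lag": 1, "coefficient": 0.30},
    {"source": "match", "target": "engagement", "lag": 1, "coefficient": 0.35},
    {"source": "engagement", "target": "support", "lag": 1, "coefficient": -0.25},
    {"source": "support", "target": "renewal", "lag": 2, "coefficient": -0.20},
]

# Same-time structural effects, applied in a fixed acyclic order.
CONTEMPORANEOUS_EDGES = [
    {"source": "match", "target": "engagement", "coefficient": 0.40},
    {"source": "engagement", "target": "renewal", "coefficient": 0.35},
    {"source": "engagement", "target": "support", "coefficient": 0.30},
]
```

Keeping the truth in plain dictionaries like this makes the later evaluation step trivial: the estimators' edge tables are compared against these records, lag labels included.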
Simulation Function
This function creates the actual time series. It uses Laplace noise rather than Gaussian noise so VAR-LiNGAM has the non-Gaussian residual structure it needs for same-time orientation.
The code applies same-time effects in a fixed acyclic order: match can affect engagement, and engagement can affect renewal and support during the same timestamp.
```python
def build_temporal_matrices():
    """Return true same-time, lag-one, and lag-two coefficient matrices.

    Matrix convention: row = target variable, column = source variable.
    """
    lag_one = np.zeros((N_VARIABLES, N_VARIABLES))
    lag_two = np.zeros((N_VARIABLES, N_VARIABLES))
    same_time = np.zeros((N_VARIABLES, N_VARIABLES))
    for variable, value in SELF_LAG_STRENGTH.items():
        lag_one[VAR_INDEX[variable], VAR_INDEX[variable]] = value
    for edge in CROSS_LAG_EDGES:
        matrix = lag_one if edge["lag"] == 1 else lag_two
        matrix[VAR_INDEX[edge["target"]], VAR_INDEX[edge["source"]]] = edge["coefficient"]
    for edge in CONTEMPORANEOUS_EDGES:
        same_time[VAR_INDEX[edge["target"]], VAR_INDEX[edge["source"]]] = edge["coefficient"]
    return same_time, lag_one, lag_two


def simulate_time_series(n_steps=1_200, burn_in=120, seed=RANDOM_SEED, noise_scale=0.60):
    """Simulate a stationary linear structural time series with non-Gaussian shocks."""
    rng = np.random.default_rng(seed)
    same_time, lag_one, lag_two = build_temporal_matrices()
    values = np.zeros((n_steps + burn_in, N_VARIABLES))
    for t in range(2, n_steps + burn_in):
        # Lagged structural component plus non-Gaussian shocks.
        base = lag_one @ values[t - 1] + lag_two @ values[t - 2]
        current = base + rng.laplace(loc=0.0, scale=noise_scale, size=N_VARIABLES)
        # Same-time effects are applied in a known acyclic order.
        current[VAR_INDEX["engagement"]] += same_time[VAR_INDEX["engagement"], VAR_INDEX["match"]] * current[VAR_INDEX["match"]]
        current[VAR_INDEX["renewal"]] += same_time[VAR_INDEX["renewal"], VAR_INDEX["engagement"]] * current[VAR_INDEX["engagement"]]
        current[VAR_INDEX["support"]] += same_time[VAR_INDEX["support"], VAR_INDEX["engagement"]] * current[VAR_INDEX["engagement"]]
        values[t] = current
    frame = pd.DataFrame(values[burn_in:], columns=VARIABLES)
    frame.insert(0, "time", np.arange(len(frame)))
    return frame


time_series_df = simulate_time_series()
time_series_df.to_csv(DATASET_DIR / f"{NOTEBOOK_PREFIX}_synthetic_time_series.csv", index=False)
display(time_series_df.head())
```
```
   time      need    intent     match  engagement   renewal   support
0     0 -0.285133  0.277932  2.887705    0.986429  1.117906  0.885094
1     1  0.413857 -0.507032  0.902884    0.358165  0.804738  1.329916
2     2  0.160326 -0.248611 -1.384983    1.552417  0.050189 -1.493999
3     3 -0.369289 -1.298811 -0.483863    0.010198  0.853096 -2.225829
4     4 -1.029062 -0.573211 -1.947375   -0.248455 -0.050569 -0.846873
```
The first rows are already past the burn-in period, so the values are sampled from the stable part of the process rather than from the all-zero initialization.
Basic Data Audit
Before running discovery, inspect scale, missingness, and rough distribution shape. A time-series algorithm can fail for mundane reasons like missing timestamps, exploding values, or a variable with almost no variance.
The audit confirms that the simulated data has no missing values or skipped timestamps. The variables have different natural scales, so we will standardize them before regularized regression and VAR-LiNGAM.
Plot A Slice Of The Series
A line plot is not a causal discovery method, but it is still useful. We want to see whether the process looks stable and whether the variables have obvious drift or breaks that would violate stationarity assumptions.
The series fluctuates around a stable range rather than trending away. That is the behavior we want for a first Granger and VAR-LiNGAM teaching example.
Correlation And Autocorrelation Checks
Time-series discovery should look at both cross-variable association and persistence over time. The next cell saves a contemporaneous correlation matrix and simple lag-one autocorrelation values.
```python
correlation_matrix = time_series_df[VARIABLES].corr()
autocorrelation_table = pd.DataFrame(
    {
        "variable": VARIABLES,
        "lag_1_autocorrelation": [time_series_df[var].autocorr(lag=1) for var in VARIABLES],
        "lag_2_autocorrelation": [time_series_df[var].autocorr(lag=2) for var in VARIABLES],
    }
)
correlation_matrix.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_contemporaneous_correlation.csv")
autocorrelation_table.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_autocorrelation.csv", index=False)

fig, axes = plt.subplots(1, 2, figsize=(14, 5))
sns.heatmap(correlation_matrix, cmap="vlag", center=0, annot=True, fmt=".2f", ax=axes[0])
axes[0].set_title("Same-Time Correlation")
sns.barplot(data=autocorrelation_table, x="variable", y="lag_1_autocorrelation", color="#64748b", ax=axes[1])
axes[1].set_title("Lag-One Autocorrelation")
axes[1].set_xlabel("Variable")
axes[1].set_ylabel("Autocorrelation")
axes[1].tick_params(axis="x", rotation=30)
plt.tight_layout()
fig.savefig(FIGURE_DIR / f"{NOTEBOOK_PREFIX}_correlation_and_autocorrelation.png", dpi=160, bbox_inches="tight")
plt.show()
display(autocorrelation_table.round(3))
```
```
     variable  lag_1_autocorrelation  lag_2_autocorrelation
0        need                  0.337                  0.130
1      intent                  0.304                  0.098
2       match                  0.323                  0.106
3  engagement                  0.452                  0.173
4     renewal                  0.580                  0.314
5     support                  0.393                  0.142
```
The autocorrelations are clearly nonzero. That is the main warning sign against treating rows as independent observations in an ordinary tabular graph discovery algorithm.
Draw The True Temporal Graph
This figure shows the causal target in a compact temporal layout. Arrows from t-1 or t-2 point into current-time variables. Arrows within the t column are same-time structural effects.
The graph makes the target explicit: Granger methods should recover the dark lagged cross-variable arrows, while VAR-LiNGAM should also recover the amber same-time arrows if the linear non-Gaussian assumptions are helpful enough.
Standardize The Series
Regularized regression and VAR fitting are easier to compare when variables are on similar scales. We standardize the six series but keep the original time column for plotting and alignment.
The means are approximately zero and the standard deviations are approximately one. That keeps the coefficient thresholds used later from being dominated by variable scale.
Build A Lagged Design Matrix
A time-series model can be rewritten as a supervised-learning table. For each current time t, we create columns for each variable at t-1 and t-2, then keep the current variables as targets.
This is the core representation behind Granger-style models.
The first usable row starts at time 2 because lag-two features require two previous observations. In real work, this lag construction is where you make choices about maximum lag, seasonal lags, calendar effects, and leakage prevention.
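The lag construction described above can be sketched with pandas `shift`. The helper name and column-name scheme below are ours; the point is that each usable row pairs current targets with complete lagged features.

```python
import numpy as np
import pandas as pd

def build_lagged_table(frame, variables, max_lag=2):
    """Turn a multivariate series into a supervised table:
    targets at time t, features at t-1 .. t-max_lag."""
    pieces = {f"{var}_t": frame[var] for var in variables}
    for lag in range(1, max_lag + 1):
        for var in variables:
            # shift(lag) moves values forward so row t holds the value from t-lag.
            pieces[f"{var}_t-{lag}"] = frame[var].shift(lag)
    # The first max_lag rows lack complete lagged features and are dropped.
    return pd.DataFrame(pieces).dropna().reset_index(drop=True)

rng = np.random.default_rng(1)
demo = pd.DataFrame(rng.normal(size=(6, 2)), columns=["match", "engagement"])
lagged = build_lagged_table(demo, ["match", "engagement"], max_lag=2)
print(lagged.columns.tolist())
print(len(lagged))  # 6 original rows minus 2 lag rows leaves 4 usable rows
```

Leakage prevention lives entirely in this step: as long as features come only from strictly earlier timestamps, the downstream regression cannot peek at the future.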
Helper Functions For Edge Tables And Metrics
The next helpers keep the rest of the notebook readable. They convert coefficient matrices into edge tables, compare learned edges to truth, and save compact metrics.
Reminder: matrices use row = target and column = source.
The metric helpers deliberately evaluate exact lag labels. For example, match(t-1) -> engagement(t) is different from match(t-2) -> engagement(t). In time-series discovery, lag placement is part of the causal claim.
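Helpers of this kind only take a few lines; the sketch below uses our own names but follows the same conventions (row = target, exact lag labels in every edge tuple).

```python
import numpy as np

def matrix_to_edges(matrix, variables, lag, threshold=0.05):
    """Set of (source, target, lag) for off-diagonal coefficients
    exceeding the threshold. Row = target, column = source."""
    edges = set()
    for target in range(matrix.shape[0]):
        for source in range(matrix.shape[1]):
            if source != target and abs(matrix[target, source]) > threshold:
                edges.add((variables[source], variables[target], lag))
    return edges

def precision_recall_f1(found, truth):
    """Compare learned and true edge sets; lag labels must match exactly."""
    tp = len(found & truth)
    precision = tp / len(found) if found else 0.0
    recall = tp / len(truth) if truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1

variables = ["match", "engagement"]
estimated = np.array([[0.30, 0.02],
                      [0.40, 0.25]])  # toy lag-one coefficients, row = target
truth = {("match", "engagement", 1)}
found = matrix_to_edges(estimated, variables, lag=1)
print(found)
print(precision_recall_f1(found, truth))
```

Note that the diagonal (self-memory) is excluded here, matching the notebook's choice to score cross-variable edges separately from auto-regression.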
Pairwise Granger Test
We start with the simplest causal-learn time-series tool: a two-variable Granger test. This asks whether past match predicts current engagement, and whether past engagement predicts current match.
This is only pairwise. It does not adjust for the full multivariate system, so it is useful as a teaching step, not as the final graph.
The pairwise test correctly finds strong evidence from past match to current engagement. The reverse direction is not significant here, which matches the main synthetic design for this pair.
Multivariate Granger-Lasso
Now we use Granger.granger_lasso, which fits one regularized regression per target variable using all variables at all selected lags as predictors. This is closer to a discovery workflow because each candidate edge competes against the other lagged variables.
The output is a coefficient matrix with shape (number_of_targets, number_of_variables * max_lag). We convert it into one matrix per lag so the edge convention is easier to read.
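The conversion from the wide coefficient matrix to per-lag matrices can be sketched as below, assuming the columns are stacked lag block by lag block; the helper name and toy numbers are ours.

```python
import numpy as np

def split_lag_blocks(coef, n_variables, max_lag):
    """Split a (n_targets, n_variables * max_lag) coefficient matrix into one
    (n, n) matrix per lag, assuming columns are grouped lag block by lag block."""
    blocks = {}
    for lag in range(1, max_lag + 1):
        start = (lag - 1) * n_variables
        blocks[lag] = coef[:, start:start + n_variables]
    return blocks

# Toy matrix: 2 variables, 2 lags -> columns [v0@t-1, v1@t-1, v0@t-2, v1@t-2].
coef = np.array([[0.3, 0.0, 0.1, 0.0],
                 [0.4, 0.2, 0.0, -0.1]])
blocks = split_lag_blocks(coef, n_variables=2, max_lag=2)
print(blocks[1])  # lag-one matrix, row = target, column = source
print(blocks[2])  # lag-two matrix
```

If a library ever changes its column layout, this is the single function that would need updating, which is a good reason to isolate the reshaping logic.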
At this threshold, Granger-Lasso recovers the intended cross-lag structure cleanly. The coefficient signs also line up with the data-generating process: support burden has a negative effect on renewal, and engagement has a negative effect on later support burden in this synthetic design.
Visualize Granger-Lasso Lag Matrices
A matrix view is often easier than a long edge table. The next figure shows the estimated lag-one and lag-two coefficients. Rows are targets and columns are sources.
```python
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
for plot_index, lag in enumerate([1, 2]):
    matrix = granger_coef_array[lag]
    sns.heatmap(
        pd.DataFrame(matrix, index=VARIABLES, columns=VARIABLES),
        cmap="vlag",
        center=0,
        vmin=-0.40,
        vmax=0.40,
        annot=True,
        fmt=".2f",
        ax=axes[plot_index],
    )
    axes[plot_index].set_title(f"Granger-Lasso Coefficients: Lag {lag}")
    axes[plot_index].set_xlabel("Source at lag")
    axes[plot_index].set_ylabel("Target at time t")
plt.tight_layout()
fig.savefig(FIGURE_DIR / f"{NOTEBOOK_PREFIX}_granger_lasso_lag_heatmaps.png", dpi=160, bbox_inches="tight")
plt.show()
```
The strongest off-diagonal cells correspond to the true lagged edges. The diagonal cells show self-memory, which is expected in a time series and should usually be reported separately from cross-variable causal edges.
Threshold Sensitivity For Granger-Lasso
Regularized coefficients need a practical threshold before they become a graph. Instead of choosing one threshold silently, we scan several values and report precision and recall against the synthetic truth.
The sensitivity table shows why the chosen threshold is reasonable for this teaching dataset. Lower thresholds admit extra weak edges; higher thresholds can eventually drop weaker true lag-two effects.
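The threshold scan itself is a short loop. The sketch below runs it on a toy coefficient matrix with a known truth set; function and variable names are ours.

```python
import numpy as np
import pandas as pd

def scan_thresholds(matrix, variables, truth, lag, thresholds):
    """Precision and recall of the thresholded edge set at each candidate threshold."""
    rows = []
    for threshold in thresholds:
        found = {
            (variables[s], variables[t], lag)
            for t in range(matrix.shape[0])
            for s in range(matrix.shape[1])
            if s != t and abs(matrix[t, s]) > threshold
        }
        tp = len(found & truth)
        rows.append({
            "threshold": threshold,
            "precision": tp / len(found) if found else 0.0,
            "recall": tp / len(truth),
        })
    return pd.DataFrame(rows)

variables = ["match", "engagement", "renewal"]
estimated = np.array([[0.00, 0.03, 0.00],
                      [0.35, 0.00, 0.00],   # true edge: match -> engagement
                      [0.00, 0.20, 0.00]])  # true edge: engagement -> renewal
truth = {("match", "engagement", 1), ("engagement", "renewal", 1)}
table = scan_thresholds(estimated, variables, truth, lag=1, thresholds=[0.01, 0.10, 0.30])
print(table)
```

At the loosest threshold the weak spurious coefficient (0.03) enters the graph and precision drops; at the strictest threshold the weaker true edge (0.20) is lost and recall drops. The middle of the scan is the usable region.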
Plot Threshold Sensitivity
The next plot turns the threshold table into a quick diagnostic. A good threshold region should not be a single fragile point.
The plot makes the threshold tradeoff visible: the useful region is where recall remains high without admitting many false positive edges.
VAR-LiNGAM
Granger-Lasso is designed for lagged predictive effects. VAR-LiNGAM adds a same-time LiNGAM step on the residuals, so it can estimate both instantaneous and lagged coefficients under linear non-Gaussian assumptions.
We fit two lags because that matches the synthetic data-generating process.
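The first stage of VAR-LiNGAM is an ordinary least-squares VAR; the LiNGAM stage then orients same-time effects from the residuals. The sketch below implements only that first stage with numpy, under our own names, to show where the residuals come from; it is not the lingam library's implementation.

```python
import numpy as np

def fit_var(series, lags=2):
    """Least-squares VAR: regress each current value on all variables at lags 1..lags.

    Returns one (n, n) coefficient matrix per lag (row = target, column = source)
    and the residuals that a LiNGAM stage would consume for same-time orientation."""
    n_steps, n_vars = series.shape
    targets = series[lags:]
    # Columns: [all vars at t-1 | all vars at t-2 | ...].
    design = np.hstack([series[lags - k: n_steps - k] for k in range(1, lags + 1)])
    beta, *_ = np.linalg.lstsq(design, targets, rcond=None)
    # beta stacks per-lag blocks of shape (n_vars, n_vars); transpose each so row = target.
    coef_per_lag = [beta[(k - 1) * n_vars: k * n_vars].T for k in range(1, lags + 1)]
    residuals = targets - design @ beta
    return coef_per_lag, residuals

# Toy system: variable 0 drives variable 1 with a one-step lag.
rng = np.random.default_rng(7)
n = 2000
series = np.zeros((n, 2))
for t in range(1, n):
    series[t, 0] = 0.5 * series[t - 1, 0] + rng.laplace(scale=0.5)
    series[t, 1] = 0.4 * series[t - 1, 1] + 0.3 * series[t - 1, 0] + rng.laplace(scale=0.5)
coef, residuals = fit_var(series, lags=1)
print(np.round(coef[0], 2))  # lag-one matrix; should be close to [[0.5, 0], [0.3, 0.4]]
```

Any same-time structure left in `residuals` is exactly what the LiNGAM stage tries to orient using non-Gaussianity, which is why Laplace noise matters for this notebook's simulation.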
VAR-LiNGAM recovers both lagged effects and same-time effects in this controlled setting. The estimated causal order is specifically the same-time LiNGAM order among residual shocks, not a full time-unrolled causal order.
Visualize VAR-LiNGAM Matrices
The same-time matrix is labeled lag 0. Lag 1 and lag 2 then show delayed effects. This separation is the main conceptual advantage of VAR-LiNGAM over a plain lagged regression graph.
The heatmaps show why matrix indexing matters. The same-time effects appear in the first matrix, while the Granger-style delayed effects appear in the later matrices.
Compare Granger-Lasso And VAR-LiNGAM
The two methods answer overlapping but not identical questions. Granger-Lasso is focused on lagged predictive effects. VAR-LiNGAM estimates a richer structural VAR that includes same-time effects.
The comparison is not a winner-takes-all contest. The right estimator depends on whether the analysis needs only lagged predictive structure or also same-time structural direction.
Plot Method Comparison
A compact bar plot makes the difference in target edge sets visible. Because the methods are evaluated against different targets, the labels remind us what each score means.
Both methods perform well because the data were generated from assumptions friendly to these estimators. In real data, this plot should be paired with stability checks and domain review rather than read as proof.
Seed Stability For Granger-Lasso
A single synthetic run can look cleaner than reality. The next cell resimulates the system with several random seeds, refits Granger-Lasso, and checks how many true lagged cross-variable edges are recovered.
The seed check asks whether the result is a stable property of the data-generating process or a lucky draw. In this controlled setup the lagged graph is stable across repeated simulations.
Plot Seed Stability
The next plot shows precision, recall, and F1 across simulation seeds. Flat lines near one are reassuring in this synthetic setting.
The plot is intentionally simple: a stable method should not require a particular random simulation to look good. On real data, an analogous check might resample blocks of time or run rolling-origin splits.
Sample-Size Sensitivity
Time-series discovery can be data hungry because every candidate lag multiplies the number of possible predictors. We rerun Granger-Lasso with increasing sample sizes to show how recovery changes as the available history grows.
The smaller histories can miss weaker effects or admit extra edges. As the sample grows, the graph estimate becomes more reliable in this controlled example.
Plot Sample-Size Sensitivity
This plot makes the data requirement visible. A causal graph that changes dramatically with modest changes in history length should be reported cautiously.
The curve gives a practical lesson: lagged discovery needs enough time points after lag construction. If a project has many variables and few time steps, the graph can be underidentified or unstable.
Naive I.I.D. PC As A Cautionary Example
The next cell intentionally does something risky: it runs PC directly on the same-time rows as if they were independent tabular samples. This is included to show why time-series structure should not be ignored.
The result is not treated as a valid time-series causal graph.
```python
naive_pc_result = pc(
    scaled_df[VARIABLES].to_numpy(),
    alpha=0.01,
    indep_test="fisherz",
    stable=True,
    show_progress=False,
    node_names=VARIABLES,
)
naive_pc_edges = pd.DataFrame({"edge": [str(edge) for edge in naive_pc_result.G.get_graph_edges()]})
naive_pc_edges.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_naive_iid_pc_edges.csv", index=False)
display(naive_pc_edges)
```
```
                      edge
0         intent --> match
1     engagement --> match
2   engagement --- renewal
3   engagement --- support
4      renewal --- support
```
The naive PC graph mixes contemporaneous association, lagged persistence, and indirect effects into one same-time graph. This is the main lesson: time ordering is not optional metadata; it is part of the causal problem.
Runtime Sketch
Runtime grows with the number of variables, candidate lags, and repeated model fits. This quick sketch fits Granger-Lasso on variable prefixes so we can see the local scaling pattern.
This small benchmark is not a universal runtime claim, but it helps set expectations. More variables and more lags mean more candidate predictors for every target variable.
Plot Runtime Sketch
The line plot is a quick sanity check for the local workload. In a real benchmark, you would repeat each point and include more graph sizes.
The runtime remains small here because the tutorial graph is tiny. The same workflow becomes more expensive when the variable count, maximum lag, or resampling plan grows.
When To Use A Dedicated Time-Series Discovery Library
causal-learn gives us Granger tools and VAR-LiNGAM, which are useful for small and medium teaching workflows. For larger time-series causal discovery, especially conditional-independence search over many lags, a specialized library may be more ergonomic.
The next table summarizes the practical decision boundary.
```python
method_guidance = pd.DataFrame(
    [
        {
            "need": "Quick pairwise lag test",
            "reasonable_tool": "causal-learn Granger.granger_test_2d",
            "main_warning": "Pairwise tests omit other variables and can be confounded.",
        },
        {
            "need": "Small multivariate lagged graph",
            "reasonable_tool": "causal-learn Granger.granger_lasso",
            "main_warning": "Coefficient threshold and lag length drive the final graph.",
        },
        {
            "need": "Same-time plus lagged linear non-Gaussian effects",
            "reasonable_tool": "VAR-LiNGAM",
            "main_warning": "Requires linear non-Gaussian assumptions and careful residual checks.",
        },
        {
            "need": "Large lagged conditional-independence workflow",
            "reasonable_tool": "Dedicated time-series discovery package such as Tigramite",
            "main_warning": "Choose tests, lags, and stationarity checks explicitly.",
        },
        {
            "need": "Cross-sectional DAG from autocorrelated rows",
            "reasonable_tool": "Usually avoid; first decide the temporal graph target",
            "main_warning": "Naive i.i.d. methods can orient or connect the wrong variables.",
        },
    ]
)
method_guidance.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_time_series_tool_guidance.csv", index=False)
display(method_guidance)
```
```
need | reasonable_tool | main_warning
Quick pairwise lag test | causal-learn Granger.granger_test_2d | Pairwise tests omit other variables and can be confounded.
Small multivariate lagged graph | causal-learn Granger.granger_lasso | Coefficient threshold and lag length drive the final graph.
Same-time plus lagged linear non-Gaussian effects | VAR-LiNGAM | Requires linear non-Gaussian assumptions and careful residual checks.
Large lagged conditional-independence workflow | Dedicated time-series discovery package such as Tigramite | Choose tests, lags, and stationarity checks explicitly.
Cross-sectional DAG from autocorrelated rows | Usually avoid; first decide the temporal graph target | Naive i.i.d. methods can orient or connect the wrong variables.
```
This guidance is deliberately conservative. Time-series discovery is strongest when the question is phrased in temporal terms before any algorithm is run.
Reporting Checklist
A time-series causal discovery report should say exactly what temporal target was estimated. The next table is a reusable checklist for documenting that choice.
```python
reporting_checklist = pd.DataFrame(
    [
        {
            "item": "Sampling unit and cadence",
            "what_to_report": "What one row represents and whether the spacing is regular.",
            "why_it_matters": "Lagged effects only have meaning relative to the measurement interval.",
        },
        {
            "item": "Maximum lag",
            "what_to_report": "The largest lag tested and why it was chosen.",
            "why_it_matters": "Too few lags miss delayed effects; too many lags increase false positives.",
        },
        {
            "item": "Same-time edge policy",
            "what_to_report": "Whether same-time effects are estimated, forbidden, or left unresolved.",
            "why_it_matters": "Timestamp order alone cannot orient variables measured at the same time.",
        },
        {
            "item": "Stationarity checks",
            "what_to_report": "Drift, breaks, seasonality, and distribution changes considered before discovery.",
            "why_it_matters": "Nonstationarity can create spurious lagged dependence.",
        },
        {
            "item": "Threshold or significance rule",
            "what_to_report": "Coefficient threshold, alpha level, or selection rule used to produce edges.",
            "why_it_matters": "The graph is often sensitive to this conversion step.",
        },
        {
            "item": "Stability analysis",
            "what_to_report": "Resampling, rolling windows, or seed checks used to assess graph robustness.",
            "why_it_matters": "Unstable edges should be treated as hypotheses, not findings.",
        },
        {
            "item": "Known omitted variables",
            "what_to_report": "Important unobserved drivers or measurement gaps.",
            "why_it_matters": "Hidden common causes can invalidate Granger and VAR-LiNGAM causal claims.",
        },
    ]
)
reporting_checklist.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_reporting_checklist.csv", index=False)
display(reporting_checklist)
```
```
item | what_to_report | why_it_matters
Sampling unit and cadence | What one row represents and whether the spacing is regular. | Lagged effects only have meaning relative to the measurement interval.
Maximum lag | The largest lag tested and why it was chosen. | Too few lags miss delayed effects; too many lags increase false positives.
Same-time edge policy | Whether same-time effects are estimated, forbidden, or left unresolved. | Timestamp order alone cannot orient variables measured at the same time.
Stationarity checks | Drift, breaks, seasonality, and distribution changes considered before discovery. | Nonstationarity can create spurious lagged dependence.
Threshold or significance rule | Coefficient threshold, alpha level, or selection rule used to produce edges. | The graph is often sensitive to this conversion step.
Stability analysis | Resampling, rolling windows, or seed checks used to assess graph robustness. | Unstable edges should be treated as hypotheses, not findings.
Known omitted variables | Important unobserved drivers or measurement gaps. | Hidden common causes can invalidate Granger and VAR-LiNGAM causal claims.
```
This checklist turns the notebook into a reusable analysis template. The estimator output is only one part of the causal claim; the temporal assumptions and reporting choices are equally important.
Artifact Manifest
The final cell lists the datasets, tables, and figures saved by this notebook. This makes it easy to find outputs later without scanning the whole folder.
The manifest completes the notebook. The core lesson is that time-series causal discovery is about causal timing first and algorithms second: define the lagged target, respect temporal dependence, then use discovery methods as structured hypothesis generators.