causal-learn Tutorial 00: Environment And Library Tour
This notebook starts the causal-learn tutorial series. The goal is not to learn one algorithm yet. The goal is to get oriented: what the library does, what assumptions causal discovery methods rely on, how graph outputs should be read, and how to run a small smoke test that proves the environment is ready.
causal-learn is mainly a causal discovery library. It helps estimate candidate graph structure from observational or interventional data. That makes it different from DoWhy and EconML: those libraries are usually used after the causal question or graph structure has been specified. Discovery is earlier and more assumption-sensitive, so the tutorial series will repeatedly emphasize diagnostics, stability, and cautious language.
Learning Goals
By the end of this notebook, you should be able to:
Verify that causal-learn is installed and importable from the current environment.
Understand the main algorithm families exposed by the library.
Explain the difference between discovery, identification, and effect estimation.
Recognize common graph outputs such as DAGs, CPDAGs, PAGs, skeletons, and partially oriented edges.
Generate a small synthetic dataset with a known causal graph.
Run a first PC algorithm smoke test and compare the learned graph to the known graph.
Save tables and figures in the same output style as the other tutorial folders.
How This Tutorial Fits The Series
The rest of the causal-learn notebooks will go deep into specific algorithm families. Notebook 00 gives the shared vocabulary and environment checks those notebooks will reuse.
A useful way to think about the series is:
Early notebooks teach graph objects, synthetic data, and independence tests.
Final notebooks focus on benchmarking, stability, an end-to-end case study, and reporting limitations.
The important mindset: discovered graphs are usually candidate structures, not automatic causal truth.
Setup
This cell imports core packages, configures plotting, creates output folders, and checks whether causal-learn is importable. The package is installed under the Python module name causallearn, even though the package name is written as causal-learn.
If the causal-learn importable flag printed by the setup cell is True, the environment can run the live examples in this notebook. If it is False, the tables and explanatory sections still make sense, but the algorithm cells should be rerun after installing causal-learn.
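The actual setup cell is not shown above; the import check it describes can be sketched like this (the variable name CAUSAL_LEARN_AVAILABLE is an assumption about the notebook's naming):

```python
# Minimal sketch of the import check. The pip package is "causal-learn",
# but the importable Python module is named "causallearn".
import importlib.util

CAUSAL_LEARN_AVAILABLE = importlib.util.find_spec("causallearn") is not None
print(f"causal-learn importable: {CAUSAL_LEARN_AVAILABLE}")
```

Using find_spec avoids raising an exception when the package is missing, so the notebook can keep running its explanatory cells either way.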
Package Version Snapshot
Causal discovery results can be sensitive to software versions, especially when graph classes, scoring functions, and independence tests change. This cell records the core package versions used by the tutorial run.
The package snapshot is a small reproducibility habit. When a discovered graph changes after an environment update, this table helps separate methodological changes from software changes.
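The snapshot cell itself is not shown; a minimal sketch using the standard library's importlib.metadata could look like this (the exact package list is an assumption about what the tutorial environment uses):

```python
# Hedged sketch of a package version snapshot. Missing packages are
# recorded rather than raising, so the table is always produced.
from importlib.metadata import version, PackageNotFoundError

import pandas as pd

packages = ["numpy", "pandas", "scipy", "matplotlib", "causal-learn"]
rows = []
for name in packages:
    try:
        rows.append({"package": name, "version": version(name)})
    except PackageNotFoundError:
        rows.append({"package": name, "version": "not installed"})

version_snapshot = pd.DataFrame(rows)
print(version_snapshot)
```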
Discovery Versus Estimation
Before using causal-learn, it is important to separate three related tasks: discovering graph structure, identifying an estimand, and estimating an effect. causal-learn mainly helps with the first task.
```python
workflow_comparison = pd.DataFrame(
    [
        {
            "task": "Causal discovery",
            "typical_question": "Which variables may be directly connected in a causal graph?",
            "typical_output": "Candidate DAG, CPDAG, PAG, or partially oriented graph.",
            "common_tools": "causal-learn, Tetrad, Tigramite, gCastle.",
        },
        {
            "task": "Identification",
            "typical_question": "Given a graph and assumptions, what estimand identifies the causal effect?",
            "typical_output": "Backdoor, frontdoor, IV, mediation, or other estimand.",
            "common_tools": "DoWhy, graphical criteria, domain reasoning.",
        },
        {
            "task": "Effect estimation",
            "typical_question": "How large is the effect, and how does it vary across units?",
            "typical_output": "ATE, CATE, intervals, policy value, sensitivity checks.",
            "common_tools": "EconML, DoWhy estimators, statsmodels, sklearn-style models.",
        },
    ]
)
workflow_comparison.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_discovery_vs_estimation.csv", index=False)
display(workflow_comparison)
```
This distinction keeps expectations healthy. Discovery can suggest candidate structure, but a discovered graph still needs domain review before it becomes the basis for effect estimation or decisions.
Library Capability Map
causal-learn includes several families of methods. This table gives a high-level map so later notebooks have a shared reference point.
```python
capability_map = pd.DataFrame(
    [
        {
            "family": "Constraint-based discovery",
            "examples": "PC, FCI, CD-NOD",
            "core_idea": "Use conditional independence tests to remove and orient edges.",
            "best_when": "The conditional independence test is well matched to the data and sample size is adequate.",
        },
        {
            "family": "Score-based discovery",
            "examples": "GES, exact search",
            "core_idea": "Search over graph structures using a decomposable score such as BIC or BDeu.",
            "best_when": "A scoring assumption is credible and the variable set is not too large for the search strategy.",
        },
        {
            "family": "Functional causal models",
            "examples": "LiNGAM, ANM, PNL",
            "core_idea": "Use functional or noise assumptions to identify causal direction beyond Markov equivalence.",
            "best_when": "Linearity, non-Gaussianity, additive-noise, or post-nonlinear assumptions are scientifically plausible.",
        },
        {
            "family": "Permutation-based discovery",
            "examples": "GRaSP, BOSS",
            "core_idea": "Search over variable orderings or permutations that imply graph structures.",
            "best_when": "Ordering-based search is computationally feasible and useful for the graph size.",
        },
        {
            "family": "Hidden causal representation",
            "examples": "GIN",
            "core_idea": "Use constraints designed for latent causal structure and hidden variables.",
            "best_when": "The problem is explicitly about hidden structure rather than only observed-variable DAGs.",
        },
        {
            "family": "Time-series discovery",
            "examples": "Granger-style tools, VAR-LiNGAM",
            "core_idea": "Use lagged temporal structure to separate past causes from future outcomes.",
            "best_when": "Variables are measured repeatedly and temporal ordering is meaningful.",
        },
    ]
)
capability_map.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_capability_map.csv", index=False)
display(capability_map)
```
The method family should be chosen based on assumptions, not popularity. For example, FCI is more appropriate than PC when latent confounding is plausible, while LiNGAM needs stronger functional assumptions than PC.
Import Capability Check
This cell tests whether the main modules used in the series can be imported. It does not prove every method will work for every dataset, but it catches missing optional components early.
The available modules form the practical menu for the rest of the tutorial. Some algorithms may still require extra assumptions, careful preprocessing, or smaller datasets even when the import succeeds.
Core Assumptions To Track
Causal discovery methods are assumption-heavy. The next table collects assumptions that will appear repeatedly across the series.
```python
assumption_table = pd.DataFrame(
    [
        {
            "assumption": "Causal Markov condition",
            "plain_language": "A variable is independent of its non-effects after conditioning on its direct causes.",
            "why_it_matters": "It links graph separation to statistical independence patterns.",
        },
        {
            "assumption": "Faithfulness",
            "plain_language": "Observed independencies come from the graph structure, not exact parameter cancellation.",
            "why_it_matters": "Constraint-based methods can miss or add edges when faithfulness fails.",
        },
        {
            "assumption": "Causal sufficiency",
            "plain_language": "There are no unobserved common causes among the measured variables.",
            "why_it_matters": "PC relies on this more strongly than FCI-style methods.",
        },
        {
            "assumption": "Independent and identically distributed rows",
            "plain_language": "Rows are sampled from the same stable distribution without temporal dependence.",
            "why_it_matters": "Standard tabular tests can fail when data are time dependent or nonstationary.",
        },
        {
            "assumption": "Correct test or score choice",
            "plain_language": "The independence test or score matches the data type and distribution well enough.",
            "why_it_matters": "A poor test can produce a poor graph even when causal assumptions are reasonable.",
        },
        {
            "assumption": "No major measurement leakage",
            "plain_language": "Variables are measured at the intended time and do not include future information.",
            "why_it_matters": "Leaky variables can create graph structure that looks predictive but is causally invalid.",
        },
    ]
)
assumption_table.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_core_assumptions.csv", index=False)
display(assumption_table)
```
A graph output is only as credible as the assumptions behind it. Later notebooks will make these assumptions concrete by changing data-generating processes and watching graph recovery change.
Graph Vocabulary
Discovery outputs are not always fully directed DAGs. This vocabulary table will make later outputs easier to read.
```python
graph_vocabulary = pd.DataFrame(
    [
        ("DAG", "Directed acyclic graph", "A graph with directed edges and no directed cycles."),
        ("Skeleton", "Adjacency pattern", "The undirected edge structure before directions are considered."),
        ("V-structure", "Collider pattern", "A pattern like A -> C <- B where A and B are not adjacent."),
        ("CPDAG", "Completed partially directed acyclic graph", "Represents a Markov equivalence class of DAGs."),
        ("PAG", "Partial ancestral graph", "Represents possible ancestral relations when hidden confounding may exist."),
        ("Circle endpoint", "Unresolved edge mark", "Used in PAGs when orientation is not fully determined."),
        ("Markov equivalence", "Same independence model", "Different DAGs can imply the same observed conditional independencies."),
        ("SHD", "Structural Hamming distance", "A graph-difference count used in benchmarks."),
    ],
    columns=["term", "short_name", "meaning"],
)
graph_vocabulary.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_graph_vocabulary.csv", index=False)
display(graph_vocabulary)
```
The key idea is that uncertainty in edge direction is normal. A partially oriented graph is often the honest output, not a failure.
Teaching DAG For The Smoke Test
The first executable example uses a small known DAG with six variables. We keep the graph simple enough to inspect by eye but rich enough to include a collider and downstream pathways.
The graph says that need and intent both affect match, match drives engagement, and engagement affects later renewal and support outcomes. This is a teaching graph, not a claim about a real product system.
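The notebook's graph-construction cell is not shown above; the six edges it describes can be sketched in plain Python, together with a Kahn-style check that the edge list really is acyclic (the name true_edges is illustrative, and the notebook itself appears to use a graph object with an edges() method):

```python
# The teaching DAG as a plain edge list: need and intent feed match,
# match drives engagement, and engagement affects renewal and support.
# intent also has a direct edge to renewal.
true_edges = [
    ("need", "match"), ("intent", "match"),
    ("match", "engagement"),
    ("engagement", "renewal"), ("engagement", "support"),
    ("intent", "renewal"),
]


def is_acyclic(edges):
    """Return True if the directed edge list has no directed cycle (Kahn's algorithm)."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    for _, target in edges:
        indegree[target] += 1
    frontier = [n for n in nodes if indegree[n] == 0]
    seen = 0
    while frontier:
        node = frontier.pop()
        seen += 1
        for source, target in edges:
            if source == node:
                indegree[target] -= 1
                if indegree[target] == 0:
                    frontier.append(target)
    return seen == len(nodes)


print(is_acyclic(true_edges))  # a DAG must pass this check
```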
Draw The True DAG
A fixed layout makes it easier to compare the true graph and the learned graph later. The arrows in this figure represent the data-generating structure used by the simulation.
```python
# The causal-learn tutorial uses the same DAG visual language as the DoWhy
# tutorial: wide white canvas, rounded pastel boxes, bold labels, and dark
# annotation arrows. Drawing arrows manually keeps arrowheads clear of boxes.
node_positions = {
    "need": (0.10, 0.76),
    "intent": (0.10, 0.24),
    "match": (0.34, 0.52),
    "engagement": (0.66, 0.52),
    "renewal": (0.90, 0.72),
    "support": (0.90, 0.30),
}
node_labels = {
    "need": "Need\nscore",
    "intent": "Intent\nsignal",
    "match": "Match\nquality",
    "engagement": "Engagement",
    "renewal": "Renewal\nvalue",
    "support": "Support\nload",
}
node_colors = {
    "need": "#eef2ff",
    "intent": "#eef2ff",
    "match": "#e0f2fe",
    "engagement": "#e0f2fe",
    "renewal": "#dcfce7",
    "support": "#dcfce7",
}
edge_radii = {
    ("need", "match"): -0.04,
    ("intent", "match"): 0.04,
    ("match", "engagement"): 0.00,
    ("engagement", "renewal"): -0.04,
    ("engagement", "support"): 0.04,
    ("intent", "renewal"): 0.18,
}


def draw_teaching_style_graph(edge_table, title, path, edge_radii=None):
    """Draw a causal graph using the shared tutorial DAG style."""
    edge_radii = edge_radii or {}
    fig, ax = plt.subplots(figsize=(12, 6))
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.set_axis_off()
    for row in edge_table.itertuples(index=False):
        source = row.source
        target = row.target
        edge_type = getattr(row, "edge_type", "directed")
        directed = edge_type == "directed"
        ax.annotate(
            "",
            xy=node_positions[target],
            xytext=node_positions[source],
            arrowprops=dict(
                arrowstyle="-|>" if directed else "-",
                color="#334155",
                linewidth=1.5,
                mutation_scale=18,
                shrinkA=34,
                shrinkB=46,
                linestyle="-" if directed else "--",
                connectionstyle=f"arc3,rad={edge_radii.get((source, target), 0.0)}",
            ),
            zorder=1,
        )
    for node, (x, y) in node_positions.items():
        ax.text(
            x,
            y,
            node_labels[node],
            ha="center",
            va="center",
            fontsize=10.5,
            fontweight="bold",
            bbox=dict(
                boxstyle="round,pad=0.45",
                facecolor=node_colors[node],
                edgecolor="#334155",
                linewidth=1.2,
            ),
            zorder=2,
        )
    ax.set_title(title, pad=18)
    fig.savefig(path, dpi=160, bbox_inches="tight")
    plt.show()


true_graph_edge_table = pd.DataFrame(
    [
        {"source": source, "target": target, "edge_type": "directed"}
        for source, target in true_graph.edges()
    ]
)
true_dag_path = FIGURE_DIR / f"{NOTEBOOK_PREFIX}_true_teaching_dag.png"
draw_teaching_style_graph(
    true_graph_edge_table, "True Teaching DAG", true_dag_path, edge_radii=edge_radii
)
```
This figure is the benchmark. The PC smoke test below will try to recover the graph from simulated samples, but it will only see the data matrix, not this diagram.
Generate Synthetic Data From The DAG
This cell simulates continuous variables from linear structural equations with Gaussian noise. That choice is deliberate: it matches the assumptions of the Fisher-Z conditional independence test used by the PC smoke test.
The simulated data are intentionally friendly to PC: continuous, linear, acyclic, causally sufficient, and sampled from one stable distribution. Later notebooks will relax these conditions.
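The simulation cell itself is not shown above; a minimal self-contained sketch of the described linear-Gaussian structural equations could look like this (all coefficients and the sample size are illustrative choices, not the notebook's exact values):

```python
# Hedged sketch of the linear-Gaussian simulation. Each variable is a
# linear function of its graph parents plus independent Gaussian noise.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 2000
need = rng.normal(size=n)
intent = rng.normal(size=n)
match = 0.8 * need + 0.8 * intent + 0.5 * rng.normal(size=n)   # collider
engagement = 0.9 * match + 0.5 * rng.normal(size=n)
renewal = 0.6 * engagement + 0.5 * intent + 0.5 * rng.normal(size=n)
support = 0.7 * engagement + 0.5 * rng.normal(size=n)

teaching_data = pd.DataFrame({
    "need": need, "intent": intent, "match": match,
    "engagement": engagement, "renewal": renewal, "support": support,
})
print(teaching_data.shape)
```

Because need and intent are sampled independently, their marginal correlation should be close to zero, which is exactly the pattern the independence-intuition check later in the notebook relies on.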
Data Field Summary
Even synthetic data should have a small data dictionary. This table records what each variable represents in the teaching example.
```python
field_summary = pd.DataFrame(
    [
        ("need", "Root cause", "Baseline user need for help or guidance."),
        ("intent", "Root cause", "Early intent or motivation signal."),
        ("match", "Intermediate variable", "How well available content or options match the user's needs."),
        ("engagement", "Intermediate variable", "Observed engagement generated from match quality."),
        ("renewal", "Downstream variable", "Later value or renewal proxy affected by intent and engagement."),
        ("support", "Downstream variable", "Support burden affected by engagement patterns."),
    ],
    columns=["field", "graph_role", "description"],
)
field_summary.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_field_summary.csv", index=False)
display(field_summary)
```
The field summary also reminds us that causal discovery algorithms do not know semantic roles. They only see statistical patterns. Human review is still required.
Basic Data Checks
Before discovery, inspect shape, missingness, and simple distribution summaries. Conditional-independence tests can behave badly when the data contain missing values, extreme outliers, or incompatible data types.
The dataset is complete and numeric, which is exactly what the first smoke test expects. Real datasets usually need more careful preprocessing before discovery.
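The checks described above are a few one-liners in pandas; this sketch recreates a tiny stand-in frame so it is self-contained (the notebook would run the same calls on its own teaching_data):

```python
# Minimal pre-discovery data checks: shape, missingness, dtypes, and
# distribution summaries. The frame here is an illustrative stand-in.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
teaching_data = pd.DataFrame(rng.normal(size=(500, 3)), columns=["a", "b", "c"])

print(teaching_data.shape)                 # rows x columns
print(teaching_data.isna().sum())          # missing values per column
print(teaching_data.dtypes)                # all columns should be numeric
print(teaching_data.describe().loc[["mean", "std", "min", "max"]])
```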
Correlation Heatmap
Marginal correlation is not causal discovery, but it is still a useful first check. It shows which variables are strongly associated before conditioning on anything else.
The heatmap shows associations along causal pathways, but it cannot distinguish direct causes from indirect paths. PC will use conditional independence, not just marginal correlation.
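The values behind the heatmap can be computed directly with DataFrame.corr; this self-contained sketch uses a tiny simulated pair in place of the notebook's full dataset:

```python
# Pearson correlation matrix: what the heatmap visualizes. A strong
# marginal correlation can reflect a direct edge OR an indirect path.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
x = rng.normal(size=1000)
df = pd.DataFrame({"need": x, "match": 0.8 * x + 0.5 * rng.normal(size=1000)})

corr = df.corr()  # Pearson by default
print(corr.round(2))
```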
Prepare The Data Matrix
Most causal-learn algorithms expect a NumPy array with rows as samples and columns as variables. We standardize the columns for a clean smoke test and keep the column order explicit.
The order in node_names matters because graph outputs refer to column positions. Passing node names into algorithms makes outputs much easier to read.
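The preparation step described above amounts to a few lines; this is a hedged sketch with a small stand-in frame (column names and scales are illustrative):

```python
# Build a standardized NumPy matrix with an explicit column order.
# node_names must match the column order of data_matrix exactly.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
teaching_data = pd.DataFrame(
    rng.normal(size=(100, 3)) * [1.0, 5.0, 0.1],  # deliberately mixed scales
    columns=["need", "intent", "match"],
)

node_names = list(teaching_data.columns)
data_matrix = teaching_data.to_numpy(dtype=float)
data_matrix = (data_matrix - data_matrix.mean(axis=0)) / data_matrix.std(axis=0)
print(data_matrix.mean(axis=0).round(6))  # ~0 per column after standardizing
print(data_matrix.std(axis=0).round(6))   # ~1 per column after standardizing
```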
PC Algorithm Smoke Test
The PC algorithm is a classic constraint-based discovery method. It starts from a dense graph, removes edges based on conditional independence tests, and then applies orientation rules.
Here we use the Fisher-Z test because the data are continuous and generated from a linear Gaussian setup. This is a friendly first test, not a claim that Fisher-Z is always appropriate.
```python
if not CAUSAL_LEARN_AVAILABLE:
    raise ImportError(
        "causal-learn is not available. Install causal-learn before running the PC smoke test."
    )

from causallearn.search.ConstraintBased.PC import pc

pc_result = pc(
    data_matrix,
    alpha=0.005,
    indep_test="fisherz",
    stable=True,
    show_progress=False,
    node_names=node_names,
)
pc_edges = [str(edge) for edge in pc_result.G.get_graph_edges()]
pc_edge_table = pd.DataFrame({"learned_edge": pc_edges})
pc_edge_table.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_pc_smoke_test_edges.csv", index=False)
display(pc_edge_table)
```
```
   learned_edge
0  need --- match
1  intent --> renewal
2  match --- engagement
3  engagement --> renewal
4  engagement --- support
```
The learned edges are the first proof that the environment is functional. The output may include directed edges and partially oriented or undirected edges, depending on what can be inferred from the data and assumptions.
Inspect The Raw Graph Matrix
causal-learn graph objects store endpoint information in a matrix. The exact endpoint coding is library-specific, so the safest habit is to pair the matrix with readable edge strings.
The matrix is useful for programmatic evaluation, while the edge strings are easier for humans. Later notebooks will build more formal graph-conversion helpers.
Convert The Learned Graph To A Simple Edge Table
For this first notebook, we convert the matrix into a simplified edge table with three edge types: directed, undirected, and partially marked. This is enough for a smoke-test comparison.
```python
def simple_edge_table_from_matrix(graph_matrix, names):
    rows = []
    matrix = np.asarray(graph_matrix)
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            a = matrix[i, j]
            b = matrix[j, i]
            if a == 0 and b == 0:
                continue
            if a == -1 and b == 1:
                rows.append({
                    "source": names[i], "target": names[j],
                    "edge_type": "directed",
                    "readable_edge": f"{names[i]} -> {names[j]}",
                })
            elif a == 1 and b == -1:
                rows.append({
                    "source": names[j], "target": names[i],
                    "edge_type": "directed",
                    "readable_edge": f"{names[j]} -> {names[i]}",
                })
            elif a == -1 and b == -1:
                rows.append({
                    "source": names[i], "target": names[j],
                    "edge_type": "undirected",
                    "readable_edge": f"{names[i]} -- {names[j]}",
                })
            else:
                rows.append({
                    "source": names[i], "target": names[j],
                    "edge_type": "partially_marked",
                    "readable_edge": f"{names[i]} ({a},{b}) {names[j]}",
                })
    return pd.DataFrame(rows)


learned_edge_table = simple_edge_table_from_matrix(pc_result.G.graph, node_names)
learned_edge_table.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_pc_simplified_edge_table.csv", index=False)
display(learned_edge_table)
```
```
   source      target      edge_type   readable_edge
0  need        match       undirected  need -- match
1  intent      renewal     directed    intent -> renewal
2  match       engagement  undirected  match -- engagement
3  engagement  renewal     directed    engagement -> renewal
4  engagement  support     undirected  engagement -- support
```
The simplified table makes the graph output easier to compare to the known DAG. It also illustrates why graph-reading utilities are useful: raw graph objects are powerful, but not always report-ready.
Draw The Learned PC Graph
This plot uses the same layout as the true DAG so the differences are easy to spot. Directed edges are shown with arrows; undirected or unresolved edges are shown as dashed lines.
The learned graph should be close to the true graph because the data were designed for this method. Later notebooks will show less friendly conditions where recovery becomes weaker.
Simple Graph Recovery Metrics
Because this is synthetic data, we know the true edges. The following metrics compare the learned skeleton and directed edges to the known graph. These are teaching metrics, not the only way to evaluate discovery quality.
Skeleton recovery is usually easier than orientation recovery. That is not a weakness of the metric; it reflects the fact that some directions are not identifiable from observational conditional independencies alone.
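Skeleton precision and recall can be computed with plain set operations; this sketch uses the true teaching edges and the five edges the PC smoke test reported above (the metric code is illustrative, not the notebook's exact cell):

```python
# Skeleton metrics ignore direction: each edge is reduced to an unordered
# pair, then compared against the true adjacency set.
true_edges = {
    ("need", "match"), ("intent", "match"), ("match", "engagement"),
    ("engagement", "renewal"), ("engagement", "support"), ("intent", "renewal"),
}
learned = {
    ("need", "match", "undirected"),
    ("intent", "renewal", "directed"),
    ("match", "engagement", "undirected"),
    ("engagement", "renewal", "directed"),
    ("engagement", "support", "undirected"),
}

true_skeleton = {frozenset(edge) for edge in true_edges}
learned_skeleton = {frozenset((s, t)) for s, t, _ in learned}
true_positives = len(true_skeleton & learned_skeleton)
skeleton_precision = true_positives / len(learned_skeleton)
skeleton_recall = true_positives / len(true_skeleton)
print(skeleton_precision, skeleton_recall)  # 1.0 and 5/6 for these edge sets
```

Here every learned adjacency is real (precision 1.0), but the intent-match adjacency was missed (recall 5/6), which is a typical smoke-test pattern.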
Alpha Sensitivity
Constraint-based methods depend on the significance level used for conditional independence tests. The next cell reruns PC across several alpha values and records how the graph changes.
Even in a friendly setting, alpha is a modeling choice. A responsible discovery workflow reports sensitivity rather than presenting one graph as if it were inevitable.
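The sweep described above can be sketched as a simple loop over alpha values; this version guards the causal-learn import and uses a tiny three-variable chain as stand-in data, so it is not the notebook's exact cell:

```python
# Hedged sketch of an alpha sweep: rerun PC at several significance
# levels and record how many edges each learned graph contains.
import numpy as np
import pandas as pd

try:
    from causallearn.search.ConstraintBased.PC import pc
    HAVE_CAUSAL_LEARN = True
except ImportError:
    HAVE_CAUSAL_LEARN = False

rows = []
if HAVE_CAUSAL_LEARN:
    rng = np.random.default_rng(0)
    x = rng.normal(size=500)
    y = 0.8 * x + 0.5 * rng.normal(size=500)
    z = 0.8 * y + 0.5 * rng.normal(size=500)
    data = np.column_stack([x, y, z])
    for alpha in [0.001, 0.01, 0.05]:
        result = pc(data, alpha=alpha, indep_test="fisherz",
                    stable=True, show_progress=False)
        rows.append({"alpha": alpha,
                     "edge_count": len(result.G.get_graph_edges())})

alpha_sensitivity = pd.DataFrame(rows)
print(alpha_sensitivity)
```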
Plot Alpha Sensitivity
The next figure summarizes how precision, recall, and edge count change with alpha. This is a compact way to spot brittle graph recovery.
In this small example the graph is fairly stable, which is reassuring. In real data, a graph that changes dramatically across reasonable alpha values should be reported as unstable.
A Tiny Independence-Test Intuition Check
PC is built from conditional independence tests. This cell gives a simple intuition check by comparing marginal and conditional relationships in the teaching data.
```python
def residualize(target, controls):
    controls = np.asarray(controls)
    if controls.ndim == 1:
        controls = controls.reshape(-1, 1)
    controls = np.column_stack([np.ones(len(controls)), controls])
    beta = np.linalg.lstsq(controls, target, rcond=None)[0]
    return target - controls @ beta


need_values = teaching_data["need"].to_numpy()
intent_values = teaching_data["intent"].to_numpy()
match_values = teaching_data["match"].to_numpy()

marginal_corr = stats.pearsonr(need_values, intent_values).statistic
conditional_corr_given_match = stats.pearsonr(
    residualize(need_values, match_values),
    residualize(intent_values, match_values),
).statistic
conditional_corr_given_match_engagement = stats.pearsonr(
    residualize(need_values, teaching_data[["match", "engagement"]].to_numpy()),
    residualize(intent_values, teaching_data[["match", "engagement"]].to_numpy()),
).statistic

independence_demo = pd.DataFrame(
    [
        {
            "relationship": "corr(need, intent)",
            "value": marginal_corr,
            "why_it_matters": "Root variables are marginally close to independent in the data-generating process.",
        },
        {
            "relationship": "corr(need, intent | match)",
            "value": conditional_corr_given_match,
            "why_it_matters": "Conditioning on a collider can create dependence.",
        },
        {
            "relationship": "corr(need, intent | match, engagement)",
            "value": conditional_corr_given_match_engagement,
            "why_it_matters": "Conditioning on descendants of a collider can also change dependence patterns.",
        },
    ]
)
independence_demo.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_independence_intuition.csv", index=False)
display(independence_demo)
```
```
   relationship                             value    why_it_matters
0  corr(need, intent)                       0.0102   Root variables are marginally close to independent.
1  corr(need, intent | match)              -0.6398   Conditioning on a collider can create dependence.
2  corr(need, intent | match, engagement)  -0.6394   Conditioning on descendants of a collider can also change dependence.
```
This small table shows why causal discovery is more subtle than correlation screening. Conditioning choices can create or remove associations depending on the graph structure.
Practical Startup Checklist
Before starting any causal discovery analysis, run a short checklist. This notebook creates a reusable version for the rest of the series.
```python
startup_checklist = pd.DataFrame(
    [
        ("Data timing", "Confirm every variable is measured at the intended time and does not include future leakage."),
        ("Variable roles", "Decide whether variables are observed causes, outcomes, mediators, selection variables, or environment indices."),
        ("Data type", "Choose methods and tests that match continuous, discrete, mixed, or time-series data."),
        ("Hidden confounding risk", "Decide whether causal sufficiency is plausible; if not, consider FCI-style methods."),
        ("Sample size", "Check whether conditional independence tests or score searches are realistic for the number of variables."),
        ("Domain constraints", "Collect required or forbidden edges before looking at results when possible."),
        ("Sensitivity", "Plan to vary alpha, scores, bootstrap samples, or algorithm families."),
        ("Reporting", "State that the output is a candidate graph unless stronger validation is available."),
    ],
    columns=["check", "why_it_matters"],
)
startup_checklist.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_startup_checklist.csv", index=False)
display(startup_checklist)
```
The checklist is intentionally conservative. Causal discovery is powerful, but it can also produce very confident-looking graphs from weak assumptions.
Output Inventory
This cell lists the artifacts created by notebook 00. The list is useful when the notebook is used as a setup check for the rest of the tutorial series.
The output inventory confirms that the notebook saved both teaching figures and reusable tables. This is the same artifact pattern used across the other tutorial folders.
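An inventory cell of this kind is a short directory listing; this sketch uses a temporary directory so it is self-contained, whereas the notebook would point it at its own TABLE_DIR and FIGURE_DIR:

```python
# Sketch of an output inventory: list every artifact a notebook run saved.
from pathlib import Path
import tempfile

out_dir = Path(tempfile.mkdtemp())
(out_dir / "00_example_table.csv").write_text("a,b\n1,2\n")
(out_dir / "00_example_figure.png").write_bytes(b"")

inventory = sorted(p.name for p in out_dir.iterdir())
print(inventory)  # ['00_example_figure.png', '00_example_table.csv']
```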
Final Takeaways
Notebook 00 should leave you with four core ideas:
causal-learn is for learning candidate graph structure from data.
Method choice is assumption choice: PC, FCI, GES, LiNGAM, ANM, and other methods answer different discovery problems.
Partially oriented graphs are normal because observational data often cannot identify every direction.
Every discovered graph needs sensitivity checks and domain review before it supports effect estimation or decisions.
The next notebook goes deeper into graph objects: DAGs, CPDAGs, PAGs, edge marks, and graph evaluation metrics.