causal-learn Tutorial 12: Permutation-Based Methods: GRaSP And BOSS
Permutation-based causal discovery searches over variable orderings. Once an order is proposed, each variable can only choose parents that appear earlier in that order. This turns the graph search problem into an order-search problem: find a good ordering, then infer a sparse DAG consistent with that ordering.
This notebook introduces two causal-learn permutation-based methods:
BOSS: Best Order Score Search.
GRaSP: Greedy Relaxations of the Sparsest Permutation.
Both methods use local scores and order mutations to search for a graph. They are useful bridges between exact score search and scalable greedy graph search: more structured than arbitrary edge edits, but much cheaper than enumerating all DAGs.
Expected runtime: usually under 1 minute on this teaching dataset. The stability loops run multiple random starts, but each run is small and fast.
Learning Goals
By the end of this notebook, you should be able to:
Explain why variable orderings are useful search objects for DAG discovery.
Understand how a fixed order restricts possible parent sets.
Run BOSS and GRaSP from causal-learn.
Compare permutation-based output with GES and exact search on a small graph.
Check random-start stability for order-based methods.
Study how GRaSP depth and sample size affect graph recovery.
Report permutation-based results without overstating what the search guarantees.
Notebook Flow
We start with order-search theory, then set up the environment and load the same linear Gaussian teaching dataset used by the GES and exact-search notebooks. After that we show what a fixed order implies, run BOSS and GRaSP, compare them with GES and exact search, and stress-test stability across seeds, GRaSP depths, and sample sizes.
The notebook uses the linear Gaussian dataset because the permutation methods here use BIC-style local scores. That keeps the score assumptions aligned with the data-generating mechanism.
Permutation-Based Discovery Theory
A DAG always has at least one topological order: an ordering of variables where every parent appears before its children. If we know the correct order, the graph search becomes much easier. For each variable, we only need to decide which earlier variables are its parents.
For example, suppose the order is:
need, intent, match, engagement, renewal, support
Then match may choose parents from need and intent, but it cannot choose renewal as a parent because renewal appears later. The order itself forbids backward edges.
Permutation-based algorithms exploit this fact. They search over orders and use local scores to decide which parents are useful within each order.
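The order-to-parent restriction is easy to make concrete. A minimal sketch, using the variable names from the example above (no causal-learn needed):

```python
# Under a fixed order, each variable may only pick parents from earlier positions.
order = ["need", "intent", "match", "engagement", "renewal", "support"]
allowed_parents = {var: order[:i] for i, var in enumerate(order)}

print(allowed_parents["match"])               # ['need', 'intent']
print("renewal" in allowed_parents["match"])  # False: backward edges are forbidden
```

Every DAG consistent with this order must draw each node's parents from its `allowed_parents` list, which is exactly what turns graph search into order search.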
From Orders To Parent Sets
For a fixed order, the best parent set for each node can be scored locally. With a decomposable score, the total graph score is:
\[
S(G) = \sum_i S_i(\mathrm{Pa}(X_i))
\]
The local score for X_i only depends on the parent set of X_i. Under an order, the candidate parent set must be a subset of variables that appear before X_i.
This is the key computational move. Instead of exploring arbitrary directed graphs, the algorithm asks: “if this order were true, which earlier variables would each node choose as parents?” Then it mutates the order to see if a better score is possible.
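A self-contained sketch of this move, on simulated data rather than the tutorial dataset: the score formula mirrors the BIC-from-covariance wrapper defined later in Setup, and the variables `x`, `z`, `y` are illustrative.

```python
import itertools

import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
z = rng.normal(size=n)            # independent noise variable
y = 0.8 * x + rng.normal(size=n)  # the true parent of y is x
data = np.column_stack([x, z, y])
names = ["x", "z", "y"]
cov = np.cov(data.T, ddof=0)


def local_bic(i, parents, lam=0.5):
    """BIC-style local score of variable i given a list of parent indices."""
    sigma = cov[i, i]
    if parents:
        yX = cov[np.ix_([i], parents)]
        XX = cov[np.ix_(parents, parents)]
        sigma = cov[i, i] - (yX @ np.linalg.solve(XX, yX.T)).item()
    return -0.5 * n * (1 + np.log(max(sigma, 1e-12))) - lam * (len(parents) + 1) * np.log(n)


# Suppose the order is x, z, y. Then y may choose parents only among x and z,
# and the penalized score should select the sparse, correct set {x}.
candidates = [0, 1]
best = max(
    (s for r in range(len(candidates) + 1) for s in itertools.combinations(candidates, r)),
    key=lambda s: local_bic(2, list(s)),
)
print([names[j] for j in best])
```

The penalty term is what keeps the spurious candidate `z` out: the tiny in-sample gain from adding it does not cover the extra parameter cost.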
Sparsest Permutation Intuition
The sparsest permutation idea says that the right causal order should allow a sparse graph that explains the conditional dependence structure. If the order is wrong, the algorithm may need extra edges or awkward orientations to explain the same data.
This does not mean the sparsest graph is always the true graph. Finite samples, weak effects, hidden variables, and score mismatch can all distort sparsity. But the idea is powerful: a good order should make parent selection simple.
GRaSP searches for better permutations by relaxing and mutating orders. BOSS searches by moving variables through an order when doing so improves the score.
BOSS Versus GRaSP
BOSS and GRaSP are both order-based, but they move through order space differently.
BOSS repeatedly considers moving variables to better positions in the order. After an order is updated, parent sets are recomputed from local scores. It is a greedy best-order search.
GRaSP starts with an order and uses covered-edge style relaxations, called tucks in the implementation, to improve the sparsest permutation. The depth argument controls how deep the relaxation search can go. Larger depth can escape some shallow local traps, but it can also cost more time.
Both methods can depend on random starts or random traversal choices. That is why stability across seeds is part of this notebook.
Relationship To GES And Exact Search
GES greedily edits an equivalence-class graph by adding and removing edges. Exact search optimizes the score globally over a small allowed space. BOSS and GRaSP take a third route: they search over permutations and derive parent sets consistent with those permutations.
All three families can use decomposable scores, so they can agree on friendly small problems. They can also disagree because their search paths and constraints differ.
For a small teaching graph, exact search is a useful benchmark. For larger graphs, permutation methods may be more practical than exact search while still giving a structured search over possible causal orderings.
What Permutation Methods Can And Cannot Claim
Permutation methods can find high-scoring DAGs and useful causal orders under score assumptions. They are especially helpful when order structure is easier to search than arbitrary graph structure.
They cannot prove causal truth from score optimization alone. They still depend on observed variables, score choice, sample size, model class, and absence of major hidden confounding. A high-scoring order is evidence under a model, not a final causal proof.
A careful report should include the score function, penalty, random seeds or restarts, depth settings for GRaSP, runtime, stability, and comparison to other methods.
Setup
This cell imports the scientific Python stack, BOSS, GRaSP, GES, exact BIC search, and plotting utilities. It also applies a compatibility wrapper for the BIC score used by recent NumPy versions. The wrapper preserves the same scoring formula while safely converting one-element matrix results into Python scalars.
from pathlib import Path
from importlib.metadata import PackageNotFoundError, version
import contextlib
import io
import itertools
import os
import random
import time
import warnings

# Resolve paths before importing matplotlib so the cache directory stays writable.
CWD = Path.cwd()
if CWD.name == "causal_learn" and (CWD / "outputs").exists():
    NOTEBOOK_DIR = CWD
else:
    NOTEBOOK_DIR = (CWD / "notebooks" / "tutorials" / "causal_learn").resolve()
OUTPUT_DIR = NOTEBOOK_DIR / "outputs"
DATASET_DIR = OUTPUT_DIR / "datasets"
TABLE_DIR = OUTPUT_DIR / "tables"
FIGURE_DIR = OUTPUT_DIR / "figures"
REPORT_DIR = OUTPUT_DIR / "reports"
for directory in [OUTPUT_DIR, DATASET_DIR, TABLE_DIR, FIGURE_DIR, REPORT_DIR, OUTPUT_DIR / "matplotlib_cache"]:
    directory.mkdir(parents=True, exist_ok=True)
os.environ.setdefault("MPLCONFIGDIR", str(OUTPUT_DIR / "matplotlib_cache"))
warnings.filterwarnings("ignore", category=FutureWarning)
warnings.filterwarnings("ignore", message="Using 'local_score_BIC_from_cov' instead for efficiency")

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display
from matplotlib.patches import FancyArrowPatch, FancyBboxPatch

import causallearn.search.PermutationBased.BOSS as boss_module
import causallearn.search.PermutationBased.GRaSP as grasp_module
import causallearn.search.ScoreBased.GES as ges_module
from causallearn.search.PermutationBased.BOSS import boss
from causallearn.search.PermutationBased.GRaSP import grasp
from causallearn.search.ScoreBased.GES import ges
from causallearn.search.ScoreBased.ExactSearch import bic_exact_search

NOTEBOOK_PREFIX = "12"
sns.set_theme(style="whitegrid", context="notebook")
plt.rcParams["figure.dpi"] = 120
plt.rcParams["savefig.facecolor"] = "white"


def local_score_BIC_from_cov(Data, i, PAi, parameters=None):
    """Compatibility wrapper for causal-learn BIC-from-cov scoring under recent NumPy scalar behavior."""
    cov, n = Data
    lambda_value = 0.5 if parameters is None else parameters.get("lambda_value", 0.5)
    sigma = float(cov[i, i])
    if len(PAi) > 0:
        yX = cov[np.ix_([i], PAi)]
        XX = cov[np.ix_(PAi, PAi)]
        try:
            XX_inv = np.linalg.inv(XX)
        except np.linalg.LinAlgError:
            XX_inv = np.linalg.pinv(XX)
        sigma = float(cov[i, i] - (yX @ XX_inv @ yX.T).item())
    sigma = max(sigma, 1e-12)
    return -0.5 * n * (1 + np.log(sigma)) - lambda_value * (len(PAi) + 1) * np.log(n)


# Patch the modules that build LocalScoreClass objects during this notebook session.
boss_module.local_score_BIC_from_cov = local_score_BIC_from_cov
grasp_module.local_score_BIC_from_cov = local_score_BIC_from_cov
ges_module.local_score_BIC_from_cov = local_score_BIC_from_cov

packages = ["causal-learn", "numpy", "pandas", "matplotlib", "seaborn"]
version_rows = []
for package in packages:
    try:
        package_version = version(package)
    except PackageNotFoundError:
        package_version = "not installed"
    version_rows.append({"package": package, "version": package_version})
package_versions = pd.DataFrame(version_rows)
package_versions.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_package_versions.csv", index=False)
display(package_versions)
package       version
causal-learn  0.1.4.5
numpy         2.4.4
pandas        3.0.2
matplotlib    3.10.9
seaborn       0.13.2
The setup confirms the environment and patches the local score function before any permutation method is run. BOSS, GRaSP, and GES will all use the same safe BIC wrapper in this notebook.
Load The Linear Gaussian Teaching Data
We use the linear Gaussian dataset from the synthetic data tutorial because it matches the BIC score assumptions well. The known true graph lets us measure recovery rather than judging graphs by appearance alone.
The data has six observed variables. That is small enough to compare BOSS, GRaSP, GES, and exact search directly without turning the notebook into a runtime exercise.
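The loading step can be sketched as follows. The file path is illustrative (adjust it to wherever the synthetic-data tutorial wrote its CSV), and the fallback simulator is a stand-in linear Gaussian SEM over the same six variables, not the tutorial's exact generating graph or coefficients.

```python
from pathlib import Path

import numpy as np
import pandas as pd

# Illustrative path: point this at the CSV saved by the synthetic-data tutorial.
data_path = Path("outputs/datasets/linear_gaussian_data.csv")

if data_path.exists():
    data_df = pd.read_csv(data_path)
else:
    # Fallback stand-in SEM. Edges and weights here are illustrative only.
    rng = np.random.default_rng(7)
    n = 1500
    need = rng.normal(size=n)
    intent = rng.normal(size=n)
    match = 0.9 * need + 0.7 * intent + rng.normal(size=n)
    engagement = 0.8 * match + rng.normal(size=n)
    renewal = 0.9 * engagement + rng.normal(size=n)
    support = 0.6 * engagement + rng.normal(size=n)
    data_df = pd.DataFrame(
        {"need": need, "intent": intent, "match": match,
         "engagement": engagement, "renewal": renewal, "support": support}
    )

print(data_df.shape)  # six observed variables, one row per unit
```

Either branch yields a `data_df` with the six observed variables the rest of the notebook expects.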
Helper Functions
These helpers convert causal-learn graph objects into tidy edge tables, compute recovery metrics, score fixed orders, and draw graphs in the same visual style used throughout the tutorial.
def parse_causallearn_edge(edge):
    """Convert a causal-learn edge object into source, endpoint pattern, and target strings."""
    parts = str(edge).strip().split()
    if len(parts) != 3:
        return {"source": str(edge), "edge_type": "unknown", "target": "unknown"}
    return {"source": parts[0], "edge_type": parts[1], "target": parts[2]}


def graph_to_edge_table(graph, label):
    """Return a tidy edge table from a causal-learn graph object."""
    rows = [parse_causallearn_edge(edge) for edge in graph.get_graph_edges()]
    edge_df = pd.DataFrame(rows, columns=["source", "edge_type", "target"])
    if edge_df.empty:
        edge_df = pd.DataFrame(columns=["source", "edge_type", "target"])
    edge_df.insert(0, "run", label)
    return edge_df


def adjacency_to_edge_table(adjacency, variables, label):
    """Convert a parent-row, child-column adjacency matrix into a tidy directed edge table."""
    rows = []
    for parent_idx, child_idx in zip(*np.where(adjacency > 0)):
        rows.append(
            {
                "run": label,
                "source": variables[int(parent_idx)],
                "edge_type": "-->",
                "target": variables[int(child_idx)],
            }
        )
    return pd.DataFrame(rows, columns=["run", "source", "edge_type", "target"])


def directed_pairs(edge_df):
    """Extract definite directed pairs from an edge table."""
    pairs = set()
    for row in edge_df.itertuples(index=False):
        if row.edge_type == "-->":
            pairs.add((row.source, row.target))
        elif row.edge_type == "<--":
            pairs.add((row.target, row.source))
    return pairs


def skeleton_pairs(edge_df):
    """Extract adjacencies while ignoring direction."""
    pairs = set()
    for row in edge_df.itertuples(index=False):
        if row.target != "unknown":
            pairs.add(frozenset([row.source, row.target]))
    return pairs


def summarize_against_truth(edge_df, truth_df, label):
    """Compute graph-recovery metrics against a directed truth table."""
    true_directed = set(zip(truth_df["source"], truth_df["target"]))
    true_skeleton = {frozenset(edge) for edge in true_directed}
    learned_directed = directed_pairs(edge_df)
    learned_skeleton = skeleton_pairs(edge_df)
    correct_directed = learned_directed & true_directed
    reversed_true = {(src, dst) for src, dst in true_directed if (dst, src) in learned_directed}
    missing_skeleton = true_skeleton - learned_skeleton
    extra_skeleton = learned_skeleton - true_skeleton
    unresolved_true = 0
    for src, dst in true_directed:
        pair = frozenset([src, dst])
        if pair in learned_skeleton and (src, dst) not in learned_directed and (dst, src) not in learned_directed:
            unresolved_true += 1
    directed_count = len(learned_directed)
    return pd.DataFrame(
        [
            {
                "run": label,
                "learned_edges_total": len(edge_df),
                "definite_directed_edges": directed_count,
                "true_edges": len(true_directed),
                "correct_directed_edges": len(correct_directed),
                "directed_precision": len(correct_directed) / directed_count if directed_count else np.nan,
                "directed_recall": len(correct_directed) / len(true_directed) if true_directed else np.nan,
                "reversed_true_edges": len(reversed_true),
                "unresolved_true_adjacencies": unresolved_true,
                "missing_true_adjacencies": len(missing_skeleton),
                "extra_adjacencies": len(extra_skeleton),
            }
        ]
    )


def run_boss(data_df, label, seed=1, lambda_value=0.5):
    """Run BOSS with a controlled random seed."""
    random.seed(seed)
    np.random.seed(seed)
    start = time.perf_counter()
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        graph = boss(
            data_df.to_numpy(),
            score_func="local_score_BIC_from_cov",
            parameters={"lambda_value": lambda_value},
            verbose=False,
            node_names=list(data_df.columns),
        )
    elapsed = time.perf_counter() - start
    edge_table = graph_to_edge_table(graph, label=label)
    return graph, edge_table, elapsed


def run_grasp(data_df, label, seed=1, depth=4, lambda_value=0.5):
    """Run GRaSP with a controlled random seed and chosen depth."""
    random.seed(seed)
    np.random.seed(seed)
    start = time.perf_counter()
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        graph = grasp(
            data_df.to_numpy(),
            score_func="local_score_BIC_from_cov",
            depth=depth,
            parameters={"lambda_value": lambda_value},
            verbose=False,
            node_names=list(data_df.columns),
        )
    elapsed = time.perf_counter() - start
    edge_table = graph_to_edge_table(graph, label=label)
    return graph, edge_table, elapsed


def run_ges_bic(data_df, label, lambda_value=0.5):
    """Run BIC-GES quietly for comparison."""
    start = time.perf_counter()
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        record = ges(
            data_df.to_numpy(),
            score_func="local_score_BIC",
            node_names=list(data_df.columns),
            lambda_value=lambda_value,
        )
    elapsed = time.perf_counter() - start
    edge_table = graph_to_edge_table(record["G"], label=label)
    return record, edge_table, elapsed


def run_exact_bic(data_df, label):
    """Run exact BIC search with A* for the small teaching graph."""
    start = time.perf_counter()
    adjacency, stats = bic_exact_search(data_df.to_numpy(), search_method="astar", verbose=False)
    elapsed = time.perf_counter() - start
    edge_table = adjacency_to_edge_table(adjacency, list(data_df.columns), label=label)
    return adjacency, stats, edge_table, elapsed


def best_parent_sets_for_order(data_df, order, lambda_value=0.5, max_parents=None):
    """Exhaustively choose the best parent set for each node given a fixed order."""
    data = data_df.to_numpy()
    cov = np.cov(data.T, ddof=0)
    n = data.shape[0]
    index = {name: i for i, name in enumerate(data_df.columns)}
    rows = []
    edge_rows = []
    total_score = 0.0
    for position, child in enumerate(order):
        child_idx = index[child]
        candidate_parents = list(order[:position])
        parent_limit = len(candidate_parents) if max_parents is None else min(max_parents, len(candidate_parents))
        best_score = -np.inf
        best_parents = []
        for size in range(parent_limit + 1):
            for parent_names in itertools.combinations(candidate_parents, size):
                parent_indices = [index[name] for name in parent_names]
                score = local_score_BIC_from_cov((cov, n), child_idx, parent_indices, {"lambda_value": lambda_value})
                if score > best_score:
                    best_score = score
                    best_parents = list(parent_names)
        total_score += best_score
        rows.append(
            {
                "order_label": " -> ".join(order),
                "child": child,
                "position": position + 1,
                "allowed_parent_count": len(candidate_parents),
                "selected_parents": ", ".join(best_parents) if best_parents else "none",
                "local_score": best_score,
            }
        )
        for parent in best_parents:
            edge_rows.append({"run": "fixed_order", "source": parent, "edge_type": "-->", "target": child})
    parent_table = pd.DataFrame(rows)
    edge_table = pd.DataFrame(edge_rows, columns=["run", "source", "edge_type", "target"])
    return parent_table, edge_table, total_score


GRAPH_POSITIONS = {
    "need": (0.11, 0.72),
    "intent": (0.11, 0.28),
    "match": (0.39, 0.50),
    "engagement": (0.62, 0.50),
    "renewal": (0.89, 0.72),
    "support": (0.89, 0.28),
}
NODE_LABELS = {name: name.title() for name in GRAPH_POSITIONS}
NODE_COLORS = {
    "need": "#e0f2fe",
    "intent": "#dbeafe",
    "match": "#ecfccb",
    "engagement": "#fef3c7",
    "renewal": "#fee2e2",
    "support": "#f3e8ff",
}


def trim_edge_to_box(start, end, box_w=0.145, box_h=0.095, gap=0.012):
    """Return edge endpoints that stop just outside source and target boxes."""
    x0, y0 = start
    x1, y1 = end
    dx = x1 - x0
    dy = y1 - y0
    length = float(np.hypot(dx, dy))
    if length == 0:
        return start, end
    effective_w = box_w + 0.04
    effective_h = box_h + 0.04
    x_limit = (effective_w / 2) / abs(dx) if dx else np.inf
    y_limit = (effective_h / 2) / abs(dy) if dy else np.inf
    t = min(x_limit, y_limit) + gap / length
    return (x0 + dx * t, y0 + dy * t), (x1 - dx * t, y1 - dy * t)


def draw_box_graph(edge_df, title, path, note=None):
    """Draw a DAG-style graph with rounded boxes and visible arrowheads."""
    fig, ax = plt.subplots(figsize=(12, 6.2))
    ax.set_axis_off()
    ax.set_xlim(-0.02, 1.02)
    ax.set_ylim(0.04, 0.96)
    box_w, box_h = 0.145, 0.095
    for row in edge_df.itertuples(index=False):
        if row.source not in GRAPH_POSITIONS or row.target not in GRAPH_POSITIONS:
            continue
        raw_start = GRAPH_POSITIONS[row.source]
        raw_end = GRAPH_POSITIONS[row.target]
        if row.edge_type == "<--":
            raw_start, raw_end = raw_end, raw_start
        start, end = trim_edge_to_box(raw_start, raw_end, box_w=box_w, box_h=box_h)
        if row.edge_type in {"-->", "<--"}:
            arrowstyle = "-|>"
            mutation_scale = 18
            linewidth = 1.8
            color = "#334155"
        else:
            arrowstyle = "-"
            mutation_scale = 1
            linewidth = 1.5
            color = "#64748b"
        arrow = FancyArrowPatch(
            start,
            end,
            arrowstyle=arrowstyle,
            mutation_scale=mutation_scale,
            linewidth=linewidth,
            color=color,
            connectionstyle="arc3,rad=0.035",
            zorder=2,
        )
        ax.add_patch(arrow)
    for node, (x, y) in GRAPH_POSITIONS.items():
        rect = FancyBboxPatch(
            (x - box_w / 2, y - box_h / 2),
            box_w,
            box_h,
            boxstyle="round,pad=0.018",
            facecolor=NODE_COLORS[node],
            edgecolor="#1f2937",
            linewidth=1.1,
            zorder=5,
        )
        ax.add_patch(rect)
        ax.text(x, y, NODE_LABELS[node], ha="center", va="center", fontsize=10.5, fontweight="bold", zorder=6)
    if note:
        ax.text(0.50, 0.08, note, ha="center", va="center", fontsize=10, color="#475569")
    ax.set_title(title, pad=18, fontsize=14, fontweight="bold")
    fig.savefig(path, dpi=160, bbox_inches="tight")
    plt.show()


def truth_as_edge_table(truth_df, label="truth"):
    """Convert a truth table into a plotting-ready edge table."""
    return truth_df.assign(run=label, edge_type="-->")[["run", "source", "edge_type", "target"]]
The helper set includes a small fixed-order scorer. That scorer is not a replacement for BOSS or GRaSP; it is a teaching device for seeing how a proposed order restricts parent choices.
Draw The True DAG
Before running order-based methods, we draw the known data-generating DAG. This is the target used in the recovery metrics.
true_plot_edges = truth_as_edge_table(true_edges)
draw_box_graph(
    true_plot_edges,
    title="True DAG For The Permutation-Based Tutorial",
    path=FIGURE_DIR / f"{NOTEBOOK_PREFIX}_true_dag.png",
    note="The teaching data is generated from this six-edge linear Gaussian DAG.",
)
The true graph has two root variables, a mediator chain, and two downstream outcomes. A good order should place causes before effects so parent selection can recover these edges.
Fixed-Order Parent Selection Demo
This cell shows why order matters. We compare a plausible causal order with a deliberately reversed order. For each order, the code finds the best local BIC parent set for every node using only variables that appear earlier in the order.
The causal order should score better and recover the direct structure more naturally. The reversed order may still fit dependencies, but it must explain them with backward or awkward parent choices.
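The same contrast shows up in a self-contained miniature. With a collider x -> z <- y, the causal order admits a sparse parent choice, while an anti-causal order must spend extra parameters to explain the same dependencies, and the penalized score notices. This toy uses simulated data, not the tutorial dataset:

```python
import itertools

import numpy as np

rng = np.random.default_rng(3)
n = 3000
x = rng.normal(size=n)
y = rng.normal(size=n)
z = 0.8 * x + 0.8 * y + rng.normal(size=n)  # collider: x -> z <- y
data = np.column_stack([x, y, z])
cov = np.cov(data.T, ddof=0)
idx = {"x": 0, "y": 1, "z": 2}


def local_bic(i, parents, lam=0.5):
    """BIC-style local score of variable i given a list of parent indices."""
    sigma = cov[i, i]
    if parents:
        yX = cov[np.ix_([i], parents)]
        XX = cov[np.ix_(parents, parents)]
        sigma = cov[i, i] - (yX @ np.linalg.solve(XX, yX.T)).item()
    return -0.5 * n * (1 + np.log(max(sigma, 1e-12))) - lam * (len(parents) + 1) * np.log(n)


def order_score(order):
    """Total score under an order: each node takes its best earlier-variable parent set."""
    total = 0.0
    for pos, child in enumerate(order):
        earlier = [idx[v] for v in order[:pos]]
        total += max(
            local_bic(idx[child], list(s))
            for r in range(len(earlier) + 1)
            for s in itertools.combinations(earlier, r)
        )
    return total


# The causal order supports 2 edges; the anti-causal one needs 3 for equal fit.
print(order_score(["x", "y", "z"]) > order_score(["z", "x", "y"]))  # True
```

Both orders reach the same likelihood, so the gap comes entirely from the penalty on the extra edge: exactly the sparsest-permutation intuition.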
Run BOSS And GRaSP
Now we run the actual permutation-based algorithms. The baseline uses the same BIC penalty as the GES comparison later and fixes the random seed so the notebook is reproducible.
On this friendly six-variable dataset, both methods should recover the same core graph. That is reassuring, but the stability sections below are still necessary because order-search algorithms can depend on random starts and search settings.
Draw The BOSS Graph
The BOSS graph is drawn first because BOSS is a clean baseline for best-order score search in this notebook.
draw_box_graph(
    boss_edges,
    title="BOSS Learned Graph",
    path=FIGURE_DIR / f"{NOTEBOOK_PREFIX}_boss_graph.png",
    note="BOSS searches over variable orders and derives parent sets consistent with the selected order.",
)
The graph layout matches the true DAG layout so differences are easy to see. For this baseline seed and dataset, the learned graph should be visually aligned with the true structure.
Compare With GES And Exact Search
Permutation-based search is one way to optimize graph structure. To anchor the result, we compare BOSS and GRaSP with GES and exact BIC search on the same data.
Agreement across BOSS, GRaSP, GES, and exact search is a strong teaching result for this small aligned dataset. In real data, disagreement is often more informative than agreement because it exposes assumption and search sensitivity.
Plot Method Comparison
The table above is compact, but a plot makes recovery differences easier to scan. We focus on directed correctness, unresolved true adjacencies, and extra adjacencies.
The plot should show that all methods perform well in this controlled setting. The later sensitivity sections are where differences become more visible.
Random-Seed Stability
BOSS and GRaSP use random order traversal or random starting choices. This cell repeats each method across several seeds and records whether the recovered graph changes.
The stability summary is the practical guardrail. If a method returns several distinct graphs across seeds, the report should mention that sensitivity rather than presenting one seed as definitive.
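One simple way to build that summary is to hash each run's edge set into a signature and count distinct graphs. A toy sketch with hand-written edge lists standing in for real BOSS or GRaSP output:

```python
# Toy seed-stability summary: one edge list per seed, hashed into a signature.
runs = {
    1: [("need", "match"), ("intent", "match")],
    2: [("need", "match"), ("intent", "match")],
    3: [("need", "match"), ("match", "intent")],  # one flipped edge in seed 3
}
signatures = {seed: frozenset(edges) for seed, edges in runs.items()}
distinct_graphs = len(set(signatures.values()))
print(distinct_graphs)  # 2: seeds 1 and 2 agree, seed 3 differs
```

Frozensets make the signature order-insensitive within a graph while still distinguishing a flipped edge, which is usually the sensitivity worth reporting.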
Plot Seed Stability
This plot shows recovery variation across random seeds. For a stable method and an easy dataset, the points should cluster tightly.
fig, ax = plt.subplots(figsize=(9, 5))
sns.stripplot(
    data=seed_stability_metrics,
    x="method",
    y="correct_directed_edges",
    color="#334155",
    size=7,
    alpha=0.75,
    ax=ax,
)
ax.set_title("Random-Seed Stability For BOSS And GRaSP")
ax.set_xlabel("Method")
ax.set_ylabel("Correct Directed Edges")
ax.set_ylim(-0.2, len(true_edges) + 0.5)
plt.tight_layout()
fig.savefig(FIGURE_DIR / f"{NOTEBOOK_PREFIX}_seed_stability.png", dpi=160, bbox_inches="tight")
plt.show()
Seed sensitivity is not automatically bad, but it changes how results should be communicated. Stable relationships across many starts are more credible than relationships that appear in one lucky order.
GRaSP Depth Sensitivity
GRaSP has a depth parameter controlling how deeply it searches relaxation moves. Larger depth can help escape shallow local structures, but it can also increase runtime. This cell compares several depths across a few seeds.
Depth sensitivity shows that a search setting can change the result even when the score and data are fixed. That is exactly why algorithm settings belong in the report.
Plot GRaSP Depth Sensitivity
This plot summarizes how correct directed recovery changes with depth across seeds.
For this dataset, deeper search should generally be more stable. On larger datasets, depth can trade off accuracy and runtime, so it is worth tuning deliberately rather than leaving it unexplained.
Sample-Size Sensitivity
Order-based methods still rely on local scores estimated from data. With fewer rows, local scores are noisier, and the preferred order or parent sets can change. This cell runs BOSS and GRaSP on increasing sample sizes.
Sample-size sensitivity is about evidence quality, not just runtime. A graph that appears only at one sample size deserves more caution than a graph that stabilizes as the sample grows.
Plot Sample-Size Sensitivity
This plot tracks correct directed recovery as the number of rows increases.
The plot should be read as a stability diagnostic. It does not prove population recovery, but it shows whether the method becomes more reliable as score estimates get more data.
Runtime And Scaling Sketch
Permutation methods scale with variables, possible parent choices, and score evaluations. This quick sketch runs prefixes of the variable list to show how runtime changes with variable count on the same dataset.
The absolute times are tiny here because the graph is tiny. The purpose is to show the workflow for tracking runtime as variable count grows, not to claim that all real permutation searches will be this fast.
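The growth is easier to see by counting score evaluations than by timing seconds. Under one fixed order, exhaustive parent selection scores every subset of each node's predecessors, so the work roughly doubles with each added variable; this is precisely why BOSS and GRaSP use greedy moves instead of full enumeration:

```python
def fixed_order_score_evaluations(k):
    """Score evaluations for exhaustive parent selection under one k-variable order."""
    # The node at position i has i predecessors, hence 2**i candidate parent sets.
    return sum(2 ** i for i in range(k))

print([fixed_order_score_evaluations(k) for k in (3, 5, 7, 10)])  # [7, 31, 127, 1023]
```

The total is 2**k - 1 for k variables, and that is for a single order; searching over orders multiplies this further unless the search is greedy or caches local scores.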
Plot Runtime Scaling Sketch
This plot shows elapsed time by variable count for the tiny prefix runs.
fig, ax = plt.subplots(figsize=(8.5, 5))
sns.lineplot(data=scaling_table, x="variable_count", y="elapsed_seconds", hue="method", marker="o", ax=ax)
ax.set_title("Runtime Sketch For Permutation-Based Methods")
ax.set_xlabel("Number Of Variables")
ax.set_ylabel("Elapsed Seconds")
ax.legend(title="Method")
plt.tight_layout()
fig.savefig(FIGURE_DIR / f"{NOTEBOOK_PREFIX}_runtime_scaling_sketch.png", dpi=160, bbox_inches="tight")
plt.show()
For larger datasets, runtime should be measured and reported directly. Search depth, score type, parent-set complexity, and number of restarts can all change compute cost.
Reporting Checklist
Permutation-based results should be reported as score-based search outputs with order-search settings. This checklist records the details that make BOSS and GRaSP analyses easier to audit.
reporting_checklist = pd.DataFrame(
    [
        {
            "item": "State the score function",
            "what_to_report": "BIC, BDeu, CV score, or another local score plus penalty settings.",
            "why_it_matters": "The selected order and graph optimize this score, not a score-free causal criterion.",
        },
        {
            "item": "Report random seeds or restarts",
            "what_to_report": "Seeds used, number of starts, and whether graphs changed across starts.",
            "why_it_matters": "Permutation search can depend on random traversal or initial order.",
        },
        {
            "item": "Document GRaSP depth",
            "what_to_report": "Depth setting and sensitivity results when GRaSP is used.",
            "why_it_matters": "Depth changes the relaxation search and can change recovery.",
        },
        {
            "item": "Compare with another method",
            "what_to_report": "GES, PC, exact search on a small subproblem, or domain-constrained alternatives.",
            "why_it_matters": "Agreement across search families is more reassuring than one graph alone.",
        },
        {
            "item": "Track runtime and graph size",
            "what_to_report": "Elapsed time, variable count, learned edge count, and any parent limits.",
            "why_it_matters": "Order search can become expensive as graph size and score complexity grow.",
        },
        {
            "item": "Separate optimization from causality",
            "what_to_report": "What was optimized and which causal assumptions are still required.",
            "why_it_matters": "A high-scoring permutation is not causal proof by itself.",
        },
    ]
)
reporting_checklist.to_csv(TABLE_DIR / f"{NOTEBOOK_PREFIX}_permutation_reporting_checklist.csv", index=False)
display(reporting_checklist)
The checklist is deliberately practical. It keeps the final graph connected to the score, search settings, stability checks, and assumptions that produced it.
Artifact Manifest
The final cell lists every dataset, table, and figure generated by this notebook. This makes it easy to review outputs without searching the folder manually.
This notebook now gives a complete permutation-based discovery workflow: order-search theory, fixed-order parent selection, BOSS and GRaSP runs, comparison with GES and exact search, seed stability, GRaSP depth sensitivity, sample-size sensitivity, runtime sketching, and reporting guidance.
The next tutorial can move to time-series causal discovery, where lag structure provides another source of direction information.