Math Foundations — IsalGraph

1. Interpreter State

Definition 1 (Interpreter State)

An IsalGraph interpreter state is a triple $\mathcal{S} = (G,\, \mathcal{L},\, \pi)$ where:

$G = (V_G, E_G)$ is a finite, simple graph with nodes identified by contiguous integers $\{0, 1, \ldots, |V_G|-1\}$.
$\mathcal{L}$ is an array-backed circular doubly-linked list (CDLL) with nodes carrying graph-node indices as integer payloads. $\mathrm{val}_\mathcal{L}(\ell)$ denotes the payload of CDLL node $\ell$, and $\mathrm{next}(\ell)$, $\mathrm{prev}(\ell)$ denote successor and predecessor in circular order.
$\pi = (\pi_1, \pi_2)$ is a pair of pointers (CDLL node indices), where $\pi_1$ is the primary pointer and $\pi_2$ is the secondary pointer.

Critical invariant: The CDLL index space and the graph node index space are distinct. Use $\mathrm{val}_\mathcal{L}(\pi_i)$ to obtain the graph node at pointer $\pi_i$.

Initial State

The initial interpreter state $\mathcal{S}_0$ has: $G$ containing exactly one node (node 0) with no edges; $\mathcal{L}$ containing exactly one node with payload 0; and both $\pi_1$ and $\pi_2$ pointing to this single CDLL node.

2. Instruction Set and Semantics

The instruction alphabet is:

\Sigma = \{\texttt{N},\, \texttt{n},\, \texttt{P},\, \texttt{p},\, \texttt{V},\, \texttt{v},\, \texttt{C},\, \texttt{c},\, \texttt{W}\}

Instruction	Category	Formal Effect
N	Primary move (fwd)	$\pi_1 \leftarrow \mathrm{next}_\mathcal{L}(\pi_1)$
P	Primary move (bwd)	$\pi_1 \leftarrow \mathrm{prev}_\mathcal{L}(\pi_1)$
n	Secondary move (fwd)	$\pi_2 \leftarrow \mathrm{next}_\mathcal{L}(\pi_2)$
p	Secondary move (bwd)	$\pi_2 \leftarrow \mathrm{prev}_\mathcal{L}(\pi_2)$
V	Node via primary	Add node $u$; add edge $(\mathrm{val}_\mathcal{L}(\pi_1),\, u)$; insert $u$ into $\mathcal{L}$ after $\pi_1$
v	Node via secondary	Add node $u$; add edge $(\mathrm{val}_\mathcal{L}(\pi_2),\, u)$; insert $u$ into $\mathcal{L}$ after $\pi_2$
C	Edge (pri → sec)	Add edge $(\mathrm{val}_\mathcal{L}(\pi_1),\, \mathrm{val}_\mathcal{L}(\pi_2))$
c	Edge (sec → pri)	Add edge $(\mathrm{val}_\mathcal{L}(\pi_2),\, \mathrm{val}_\mathcal{L}(\pi_1))$
W	No-op	State unchanged

Remark

Every string $w \in \Sigma^*$ decodes to a valid finite simple graph. No instruction produces undefined or inconsistent state. The pointer does not advance after a $\texttt{V}$ or $\texttt{v}$ instruction.

3. String-to-Graph Algorithm (S2G)

Algorithm 1: S2G(w, directed)

Input: String $w \in \Sigma^*$; Boolean $\mathit{directed}$

Output: Graph $G$ such that $\mathrm{S2G}(w) = G$

1. Initialize $\mathcal{S}_0 = (G_0, \mathcal{L}_0, (\pi_1, \pi_2))$ // node 0, both ptrs on it

2. for each instruction $\sigma_i$ in $w$:

Execute $\sigma_i$ according to Table 1, updating $\mathcal{S}$

3. return $G$

Example

$\mathrm{S2G}(\texttt{VvNV},\, \mathit{false})$ produces a graph with nodes $\{0,1,2,3\}$ and edges $\{(0,1),\, (0,2),\, (2,3)\}$.

4. Graph-to-String Algorithm (G2S)

Pair Generation and Cost

Definition 2 (Displacement Pairs)

For integer $M > 0$, let:

\mathcal{P}(M) = \bigl\{(a,b) \;\big|\; a, b \in \{-M, \ldots, M\}\bigr\}

sorted in increasing order of the tuple $(|a|+|b|,\; |a|,\; a,\; b)$ lexicographically.

The cost of pair $(a,b)$ is $|a| + |b|$, which equals the number of pointer-move instructions ($\texttt{N}/\texttt{P}/\texttt{n}/\texttt{p}$) emitted.

Algorithm 2: G2S(G, v₀)

Input: Connected graph $G=(V,E)$; starting node $v_0 \in V$

Output: String $w \in \Sigma^*$ with $\mathrm{S2G}(w) \cong G$

1. Initialize output state with $v_0$; set $n_\mathrm{left} = |V|-1$, $e_\mathrm{left} = |E|$

2. while $n_\mathrm{left} > 0$ or $e_\mathrm{left} > 0$:

for each $(a,b) \in \mathcal{P}(M)$ in sorted order:

Compute tentative pointer positions after $|a|$ + $|b|$ moves

Try (in priority): $\texttt{V}$, $\texttt{v}$, $\texttt{C}$, $\texttt{c}$

if any operation succeeds: emit moves + instruction, break

3. return concatenation of emitted instructions

String Length Decomposition

For a graph $G$ with $N$ nodes and $M$ edges, any IsalGraph string has length:

|w| \;=\; \underbrace{(N-1)}_{\text{one }\texttt{V}/\texttt{v}\text{ per non-root}} \;+\; \underbrace{M - (N-1)}_{\text{one }\texttt{C}/\texttt{c}\text{ per extra edge}} \;+\; \underbrace{\sum_k (|a_k| + |b_k|)}_{\text{pointer moves}}

The first two terms are fixed by $G$; minimizing $|w|$ reduces to minimizing total pointer travel.

5. Canonical Strings

5.1 Exhaustive Canonical String

Definition 3 (Exhaustive Canonical String)

For a finite, simple, connected graph $G = (V, E)$ with $N = |V|$ nodes, let $\mathcal{W}^+(G, v) \subseteq \Sigma^*$ denote the set of all IsalGraph strings producible by $\mathrm{G2S}(G, v)$ when neighbor choice is made exhaustively over all valid options.

The exhaustive canonical string of $G$ is:

w^{+}_G \;=\; \operatorname*{lex\text{-}min}_{w \,\in\, \mathcal{W}^+(G)} w

where:

\mathcal{W}^+(G) \;=\; \operatorname*{arg\,min}_{w \,\in\, \bigcup_{v \in V} \mathcal{W}^+(G,v)} |w|

5.2 Structural Triplet

Definition 4 (Structural Triplet)

For graph $G = (V, E)$ and node $v \in V$, let $N_k(v)$ denote the set of nodes at graph-theoretic distance exactly $k$ from $v$. The structural triplet of $v$ is:

\tau(v) \;=\; \bigl(|N_1(v)|,\; |N_2(v)|,\; |N_3(v)|\bigr) \;\in\; \mathbb{N}^3

Complexity: Per-node triplet is $O(N + E)$ via BFS truncated at distance 3. Full collection is $O(N(N+E))$ per graph.

5.3 Pruned Canonical String

Definition 5 (Pruned Canonical String)

For graph $G = (V, E)$, let $\mathcal{W}^*(G, v)$ denote all IsalGraph strings producible by $\mathrm{G2S}(G, v)$ when uninserted neighbor choice is made over the pruned candidate set:

\mathcal{C}_{\mathrm{pruned}} \;=\; \bigl\{\, c \in \mathcal{C} \;\big|\; \tau(c) = \max_{c' \in \mathcal{C}}\,\tau(c') \,\bigr\}

The pruned canonical string is:

w^*_G \;=\; \operatorname*{lex\text{-}min}_{w \,\in\, \mathcal{W}^*(G)} w

where:

\mathcal{W}^*(G) \;=\; \operatorname*{arg\,min}_{w \,\in\, \bigcup_{v \in V} \mathcal{W}^*(G,v)} |w|

Remark

Pruned and exhaustive canonical strings may differ in length but both are complete invariants. On vertex-transitive graphs (where all nodes share the same triplet), pruning is vacuous and the algorithms agree exactly.

6. Completeness Theorem

Theorem 1 (Complete Graph Invariant)

Let $G = (V_G, E_G)$ and $H = (V_H, E_H)$ be finite, simple, connected graphs. Then:

w^*_G = w^*_H \quad \iff \quad G \cong H

Proof

($\Rightarrow$) Equal strings imply isomorphic graphs:

$\mathrm{S2G}$ is a deterministic function of $w$. If $w^*_G = w^*_H =: w$, then $G' := \mathrm{S2G}(w)$ is uniquely defined. By round-trip correctness, every execution of $\mathrm{G2S}(G, v_0)$ yields a string that $\mathrm{S2G}$ decodes back to a graph isomorphic to $G$. Therefore: $G \cong G' \cong H$.

($\Leftarrow$) Isomorphic graphs yield equal strings:

Suppose $\phi\colon V_G \xrightarrow{\;\sim\;} V_H$ is a graph isomorphism.

Step 1 (Triplet invariance): Graph-theoretic distances are preserved under isomorphism: $\mathrm{dist}_G(u, v) = \mathrm{dist}_H(\phi(u), \phi(v))$. Therefore $|N_k^G(v)| = |N_k^H(\phi(v))|$ for all $v$ and $k$, giving:

\tau_G(v) = \tau_H(\phi(v)) \quad \forall v \in V_G

This preserves the pruned candidate set: $\phi(\mathcal{C}_{\mathrm{pruned}}^{G}) = \mathcal{C}_{\mathrm{pruned}}^{H}$.

Step 2 (Execution-path bijection): Any pruned execution path $\Pi$ on $G$ starting at $v_0$ maps bijectively to a pruned execution path $\phi(\Pi)$ on $H$ starting at $\phi(v_0)$. By Step 1, both emit identical strings:

w(\Pi, G) = w(\phi(\Pi), H)

Step 3 (String-set equality): Applying the bijection over all starting nodes:

\bigcup_{v \in V_G} \mathcal{W}^*(G, v) = \bigcup_{v \in V_H} \mathcal{W}^*(H, v)

Therefore $\mathcal{W}^*(G) = \mathcal{W}^*(H)$ and:

w^*_G = \operatorname*{lex\text{-}min}_{w \in \mathcal{W}^*(G)} w = \operatorname*{lex\text{-}min}_{w \in \mathcal{W}^*(H)} w = w^*_H

■

Graph G

↔

Graph H

Step 1/5: Show graphs

7. Graph Distance Metric

Definition 6 (Levenshtein Distance)

For two strings $w_1, w_2 \in \Sigma^*$, their Levenshtein distance is:

d_{\mathrm{Lev}}(w_1, w_2) \;=\; \min\bigl\{ k \;\big|\; w_1 \xrightarrow{\text{$k$ edits}} w_2 \bigr\}

where a single edit is a character insertion, deletion, or substitution. Computable in $O(|w_1| \cdot |w_2|)$ via standard dynamic programming.

Definition 7 (Graph Edit Distance)

The graph edit distance $\mathrm{GED}(G, H)$ is the minimum number of elementary edit operations (node insertion, node deletion, edge insertion, edge deletion) needed to transform $G$ into a graph isomorphic to $H$. Computing GED is $\mathsf{NP}$-hard.

Corollary 1 (IsalGraph Distance is a Metric)

The IsalGraph graph distance:

d_{\mathrm{IsalGraph}}(G, H) = d_{\mathrm{Lev}}(w^*_G, w^*_H)

is a metric on the space of isomorphism classes of finite, simple, connected graphs.

Proof

Non-negativity and symmetry: Inherited from Levenshtein distance.
Identity of indiscernibles: $d_{\mathrm{IsalGraph}}(G, H) = 0 \iff w^*_G = w^*_H \iff G \cong H$ (by Theorem 1 and $d_{\mathrm{Lev}}(w_1, w_2) = 0 \iff w_1 = w_2$).
Triangle inequality: For any graphs $G, H, K$: $d_{\mathrm{IsalGraph}}(G, K) = d_{\mathrm{Lev}}(w^*_G, w^*_K) \leq d_{\mathrm{Lev}}(w^*_G, w^*_H) + d_{\mathrm{Lev}}(w^*_H, w^*_K) = d_{\mathrm{IsalGraph}}(G, H) + d_{\mathrm{IsalGraph}}(H, K)$

■

8. Locality Property

Empirical Observation

Let $G, H$ be finite, simple, connected graphs with $k = \mathrm{GED}(G,H)$. Empirically:

Monotonicity: $d_{\mathrm{IsalGraph}}(G, H)$ is a non-decreasing function of $k$ (adding more edit operations produces strings further apart).
Strong correlation: Over broad graph families, Pearson and Spearman correlations between $d_{\mathrm{IsalGraph}}$ and GED are high.

Distinguishing Feature

Unlike Hamming distance on adjacency matrices (where a single node insertion changes $O(N)$ entries), the IsalGraph distance reflects the instruction-level cost of re-encoding: naturally bounded by changed edges plus pointer-movement costs.

Explore these concepts interactively:

Playground → Graph Space Explorer →

Mathematical Foundations