Skip to main content

Mathematical Foundations

Complete formal treatment of the IsalGraph framework: interpreter state, instruction semantics, encoding and decoding algorithms, canonical strings, and the graph distance metric. Based on López-Rubio & Pascual-González (2026).

1. Interpreter State

Definition 1 (Interpreter State)

An IsalGraph interpreter state is a triple \(\mathcal{S} = (G,\, \mathcal{L},\, \pi)\) where:

  • \(G = (V_G, E_G)\) is a finite, simple graph with nodes identified by contiguous integers \(\{0, 1, \ldots, |V_G|-1\}\).
  • \(\mathcal{L}\) is an array-backed circular doubly-linked list (CDLL) with nodes carrying graph-node indices as integer payloads. \(\mathrm{val}_\mathcal{L}(\ell)\) denotes the payload of CDLL node \(\ell\), and \(\mathrm{next}(\ell)\), \(\mathrm{prev}(\ell)\) denote successor and predecessor in circular order.
  • \(\pi = (\pi_1, \pi_2)\) is a pair of pointers (CDLL node indices), where \(\pi_1\) is the primary pointer and \(\pi_2\) is the secondary pointer.

Critical invariant: The CDLL index space and the graph node index space are distinct. Use \(\mathrm{val}_\mathcal{L}(\pi_i)\) to obtain the graph node at pointer \(\pi_i\).

Initial State

The initial interpreter state \(\mathcal{S}_0\) has: \(G\) containing exactly one node (node 0) with no edges; \(\mathcal{L}\) containing exactly one node with payload 0; and both \(\pi_1\) and \(\pi_2\) pointing to this single CDLL node.

2. Instruction Set and Semantics

The instruction alphabet is:

$$\Sigma = \{\texttt{N},\, \texttt{n},\, \texttt{P},\, \texttt{p},\, \texttt{V},\, \texttt{v},\, \texttt{C},\, \texttt{c},\, \texttt{W}\}$$
Instruction Category Formal Effect
N Primary move (fwd) \(\pi_1 \leftarrow \mathrm{next}_\mathcal{L}(\pi_1)\)
P Primary move (bwd) \(\pi_1 \leftarrow \mathrm{prev}_\mathcal{L}(\pi_1)\)
n Secondary move (fwd) \(\pi_2 \leftarrow \mathrm{next}_\mathcal{L}(\pi_2)\)
p Secondary move (bwd) \(\pi_2 \leftarrow \mathrm{prev}_\mathcal{L}(\pi_2)\)
V Node via primary Add node \(u\); add edge \((\mathrm{val}_\mathcal{L}(\pi_1),\, u)\); insert \(u\) into \(\mathcal{L}\) after \(\pi_1\)
v Node via secondary Add node \(u\); add edge \((\mathrm{val}_\mathcal{L}(\pi_2),\, u)\); insert \(u\) into \(\mathcal{L}\) after \(\pi_2\)
C Edge (pri → sec) Add edge \((\mathrm{val}_\mathcal{L}(\pi_1),\, \mathrm{val}_\mathcal{L}(\pi_2))\)
c Edge (sec → pri) Add edge \((\mathrm{val}_\mathcal{L}(\pi_2),\, \mathrm{val}_\mathcal{L}(\pi_1))\)
W No-op State unchanged
Remark

Every string \(w \in \Sigma^*\) decodes to a valid finite simple graph. No instruction produces undefined or inconsistent state. The pointer does not advance after a \(\texttt{V}\) or \(\texttt{v}\) instruction.

3. String-to-Graph Algorithm (S2G)

Algorithm 1: S2G(w, directed)
Input: String \(w \in \Sigma^*\); Boolean \(\mathit{directed}\)
Output: Graph \(G\) such that \(\mathrm{S2G}(w) = G\)
 
1. Initialize \(\mathcal{S}_0 = (G_0, \mathcal{L}_0, (\pi_1, \pi_2))\) // node 0, both ptrs on it
2. for each instruction \(\sigma_i\) in \(w\):
Execute \(\sigma_i\) according to Table 1, updating \(\mathcal{S}\)
3. return \(G\)
Example

\(\mathrm{S2G}(\texttt{VvNV},\, \mathit{false})\) produces a graph with nodes \(\{0,1,2,3\}\) and edges \(\{(0,1),\, (0,2),\, (2,3)\}\).

Interactive: Step Through S2G

Enter any instruction string and step through the interpreter state at each instruction. The graph builds up in real time with pointer positions shown.

4. Graph-to-String Algorithm (G2S)

Pair Generation and Cost

Definition 2 (Displacement Pairs)

For integer \(M > 0\), let:

$$\mathcal{P}(M) = \bigl\{(a,b) \;\big|\; a, b \in \{-M, \ldots, M\}\bigr\}$$

sorted in increasing order of the tuple \((|a|+|b|,\; |a|,\; a,\; b)\) lexicographically.

The cost of pair \((a,b)\) is \(|a| + |b|\), which equals the number of pointer-move instructions (\(\texttt{N}/\texttt{P}/\texttt{n}/\texttt{p}\)) emitted.

Algorithm 2: G2S(G, v₀)
Input: Connected graph \(G=(V,E)\); starting node \(v_0 \in V\)
Output: String \(w \in \Sigma^*\) with \(\mathrm{S2G}(w) \cong G\)
 
1. Initialize output state with \(v_0\); set \(n_\mathrm{left} = |V|-1\), \(e_\mathrm{left} = |E|\)
2. while \(n_\mathrm{left} > 0\) or \(e_\mathrm{left} > 0\):
for each \((a,b) \in \mathcal{P}(M)\) in sorted order:
Compute tentative pointer positions after \(|a|\) + \(|b|\) moves
Try (in priority): \(\texttt{V}\), \(\texttt{v}\), \(\texttt{C}\), \(\texttt{c}\)
if any operation succeeds: emit moves + instruction, break
3. return concatenation of emitted instructions
String Length Decomposition

For a graph \(G\) with \(N\) nodes and \(M\) edges, any IsalGraph string has length:

$$|w| \;=\; \underbrace{(N-1)}_{\text{one }\texttt{V}/\texttt{v}\text{ per non-root}} \;+\; \underbrace{M - (N-1)}_{\text{one }\texttt{C}/\texttt{c}\text{ per extra edge}} \;+\; \underbrace{\sum_k (|a_k| + |b_k|)}_{\text{pointer moves}}$$

The first two terms are fixed by \(G\); minimizing \(|w|\) reduces to minimizing total pointer travel.

5. Canonical Strings

5.1 Exhaustive Canonical String

Definition 3 (Exhaustive Canonical String)

For a finite, simple, connected graph \(G = (V, E)\) with \(N = |V|\) nodes, let \(\mathcal{W}^+(G, v) \subseteq \Sigma^*\) denote the set of all IsalGraph strings producible by \(\mathrm{G2S}(G, v)\) when neighbor choice is made exhaustively over all valid options.

The exhaustive canonical string of \(G\) is:

$$w^{+}_G \;=\; \operatorname*{lex\text{-}min}_{w \,\in\, \mathcal{W}^+(G)} w$$

where:

$$\mathcal{W}^+(G) \;=\; \operatorname*{arg\,min}_{w \,\in\, \bigcup_{v \in V} \mathcal{W}^+(G,v)} |w|$$

5.2 Structural Triplet

Definition 4 (Structural Triplet)

For graph \(G = (V, E)\) and node \(v \in V\), let \(N_k(v)\) denote the set of nodes at graph-theoretic distance exactly \(k\) from \(v\). The structural triplet of \(v\) is:

$$\tau(v) \;=\; \bigl(|N_1(v)|,\; |N_2(v)|,\; |N_3(v)|\bigr) \;\in\; \mathbb{N}^3$$

Complexity: Per-node triplet is \(O(N + E)\) via BFS truncated at distance 3. Full collection is \(O(N(N+E))\) per graph.

5.3 Pruned Canonical String

Definition 5 (Pruned Canonical String)

For graph \(G = (V, E)\), let \(\mathcal{W}^*(G, v)\) denote all IsalGraph strings producible by \(\mathrm{G2S}(G, v)\) when uninserted neighbor choice is made over the pruned candidate set:

$$\mathcal{C}_{\mathrm{pruned}} \;=\; \bigl\{\, c \in \mathcal{C} \;\big|\; \tau(c) = \max_{c' \in \mathcal{C}}\,\tau(c') \,\bigr\}$$

The pruned canonical string is:

$$w^*_G \;=\; \operatorname*{lex\text{-}min}_{w \,\in\, \mathcal{W}^*(G)} w$$

where:

$$\mathcal{W}^*(G) \;=\; \operatorname*{arg\,min}_{w \,\in\, \bigcup_{v \in V} \mathcal{W}^*(G,v)} |w|$$
Remark

Pruned and exhaustive canonical strings may differ in length but both are complete invariants. On vertex-transitive graphs (where all nodes share the same triplet), pruning is vacuous and the algorithms agree exactly.

6. Completeness Theorem

Theorem 1 (Complete Graph Invariant)

Let \(G = (V_G, E_G)\) and \(H = (V_H, E_H)\) be finite, simple, connected graphs. Then:

$$w^*_G = w^*_H \quad \iff \quad G \cong H$$
Proof
(\(\Rightarrow\)) Equal strings imply isomorphic graphs:

\(\mathrm{S2G}\) is a deterministic function of \(w\). If \(w^*_G = w^*_H =: w\), then \(G' := \mathrm{S2G}(w)\) is uniquely defined. By round-trip correctness, every execution of \(\mathrm{G2S}(G, v_0)\) yields a string that \(\mathrm{S2G}\) decodes back to a graph isomorphic to \(G\). Therefore: \(G \cong G' \cong H\).

(\(\Leftarrow\)) Isomorphic graphs yield equal strings:

Suppose \(\phi\colon V_G \xrightarrow{\;\sim\;} V_H\) is a graph isomorphism.

Step 1 (Triplet invariance): Graph-theoretic distances are preserved under isomorphism: \(\mathrm{dist}_G(u, v) = \mathrm{dist}_H(\phi(u), \phi(v))\). Therefore \(|N_k^G(v)| = |N_k^H(\phi(v))|\) for all \(v\) and \(k\), giving:

$$\tau_G(v) = \tau_H(\phi(v)) \quad \forall v \in V_G$$

This preserves the pruned candidate set: \(\phi(\mathcal{C}_{\mathrm{pruned}}^{G}) = \mathcal{C}_{\mathrm{pruned}}^{H}\).

Step 2 (Execution-path bijection): Any pruned execution path \(\Pi\) on \(G\) starting at \(v_0\) maps bijectively to a pruned execution path \(\phi(\Pi)\) on \(H\) starting at \(\phi(v_0)\). By Step 1, both emit identical strings:

$$w(\Pi, G) = w(\phi(\Pi), H)$$

Step 3 (String-set equality): Applying the bijection over all starting nodes:

$$\bigcup_{v \in V_G} \mathcal{W}^*(G, v) = \bigcup_{v \in V_H} \mathcal{W}^*(H, v)$$

Therefore \(\mathcal{W}^*(G) = \mathcal{W}^*(H)\) and:

$$w^*_G = \operatorname*{lex\text{-}min}_{w \in \mathcal{W}^*(G)} w = \operatorname*{lex\text{-}min}_{w \in \mathcal{W}^*(H)} w = w^*_H$$

Interactive: See the Proof in Action

Choose a pair of isomorphic graphs and step through the proof: see the isomorphism \(\phi\), encode both graphs, and verify that their canonical strings are identical.

Graph G
Graph H
Step 1/5: Show graphs

7. Graph Distance Metric

Definition 6 (Levenshtein Distance)

For two strings \(w_1, w_2 \in \Sigma^*\), their Levenshtein distance is:

$$d_{\mathrm{Lev}}(w_1, w_2) \;=\; \min\bigl\{ k \;\big|\; w_1 \xrightarrow{\text{$k$ edits}} w_2 \bigr\}$$

where a single edit is a character insertion, deletion, or substitution. Computable in \(O(|w_1| \cdot |w_2|)\) via standard dynamic programming.

Definition 7 (Graph Edit Distance)

The graph edit distance \(\mathrm{GED}(G, H)\) is the minimum number of elementary edit operations (node insertion, node deletion, edge insertion, edge deletion) needed to transform \(G\) into a graph isomorphic to \(H\). Computing GED is \(\mathsf{NP}\)-hard.

Corollary 1 (IsalGraph Distance is a Metric)

The IsalGraph graph distance:

$$d_{\mathrm{IsalGraph}}(G, H) = d_{\mathrm{Lev}}(w^*_G, w^*_H)$$

is a metric on the space of isomorphism classes of finite, simple, connected graphs.

Proof
  • Non-negativity and symmetry: Inherited from Levenshtein distance.
  • Identity of indiscernibles: \(d_{\mathrm{IsalGraph}}(G, H) = 0 \iff w^*_G = w^*_H \iff G \cong H\) (by Theorem 1 and \(d_{\mathrm{Lev}}(w_1, w_2) = 0 \iff w_1 = w_2\)).
  • Triangle inequality: For any graphs \(G, H, K\):
    $$d_{\mathrm{IsalGraph}}(G, K) = d_{\mathrm{Lev}}(w^*_G, w^*_K) \leq d_{\mathrm{Lev}}(w^*_G, w^*_H) + d_{\mathrm{Lev}}(w^*_H, w^*_K) = d_{\mathrm{IsalGraph}}(G, H) + d_{\mathrm{IsalGraph}}(H, K)$$

8. Locality Property

Empirical Observation

Let \(G, H\) be finite, simple, connected graphs with \(k = \mathrm{GED}(G,H)\). Empirically:

  1. Monotonicity: \(d_{\mathrm{IsalGraph}}(G, H)\) is a non-decreasing function of \(k\) (adding more edit operations produces strings further apart).
  2. Strong correlation: Over broad graph families, Pearson and Spearman correlations between \(d_{\mathrm{IsalGraph}}\) and GED are high.
Distinguishing Feature

Unlike Hamming distance on adjacency matrices (where a single node insertion changes \(O(N)\) entries), the IsalGraph distance reflects the instruction-level cost of re-encoding: naturally bounded by changed edges plus pointer-movement costs.

Explore these concepts interactively:

Playground → Graph Space Explorer →