Mathematical Foundations
Complete formal treatment of the IsalGraph framework: interpreter state, instruction semantics, encoding and decoding algorithms, canonical strings, and the graph distance metric. Based on López-Rubio & Pascual-González (2026).
1. Interpreter State
An IsalGraph interpreter state is a triple \(\mathcal{S} = (G,\, \mathcal{L},\, \pi)\) where:
- \(G = (V_G, E_G)\) is a finite, simple graph with nodes identified by contiguous integers \(\{0, 1, \ldots, |V_G|-1\}\).
- \(\mathcal{L}\) is an array-backed circular doubly-linked list (CDLL) with nodes carrying graph-node indices as integer payloads. \(\mathrm{val}_\mathcal{L}(\ell)\) denotes the payload of CDLL node \(\ell\), and \(\mathrm{next}(\ell)\), \(\mathrm{prev}(\ell)\) denote successor and predecessor in circular order.
- \(\pi = (\pi_1, \pi_2)\) is a pair of pointers (CDLL node indices), where \(\pi_1\) is the primary pointer and \(\pi_2\) is the secondary pointer.
Critical invariant: The CDLL index space and the graph node index space are distinct. Use \(\mathrm{val}_\mathcal{L}(\pi_i)\) to obtain the graph node at pointer \(\pi_i\).
The initial interpreter state \(\mathcal{S}_0\) has: \(G\) containing exactly one node (node 0) with no edges; \(\mathcal{L}\) containing exactly one node with payload 0; and both \(\pi_1\) and \(\pi_2\) pointing to this single CDLL node.
2. Instruction Set and Semantics
The instruction alphabet is:
| Instruction | Category | Formal Effect |
|---|---|---|
| N | Primary move (fwd) | \(\pi_1 \leftarrow \mathrm{next}_\mathcal{L}(\pi_1)\) |
| P | Primary move (bwd) | \(\pi_1 \leftarrow \mathrm{prev}_\mathcal{L}(\pi_1)\) |
| n | Secondary move (fwd) | \(\pi_2 \leftarrow \mathrm{next}_\mathcal{L}(\pi_2)\) |
| p | Secondary move (bwd) | \(\pi_2 \leftarrow \mathrm{prev}_\mathcal{L}(\pi_2)\) |
| V | Node via primary | Add node \(u\); add edge \((\mathrm{val}_\mathcal{L}(\pi_1),\, u)\); insert \(u\) into \(\mathcal{L}\) after \(\pi_1\) |
| v | Node via secondary | Add node \(u\); add edge \((\mathrm{val}_\mathcal{L}(\pi_2),\, u)\); insert \(u\) into \(\mathcal{L}\) after \(\pi_2\) |
| C | Edge (pri → sec) | Add edge \((\mathrm{val}_\mathcal{L}(\pi_1),\, \mathrm{val}_\mathcal{L}(\pi_2))\) |
| c | Edge (sec → pri) | Add edge \((\mathrm{val}_\mathcal{L}(\pi_2),\, \mathrm{val}_\mathcal{L}(\pi_1))\) |
| W | No-op | State unchanged |
Every string \(w \in \Sigma^*\) decodes to a valid finite simple graph. No instruction produces undefined or inconsistent state. The pointer does not advance after a \(\texttt{V}\) or \(\texttt{v}\) instruction.
3. String-to-Graph Algorithm (S2G)
\(\mathrm{S2G}(\texttt{VvNV},\, \mathit{false})\) produces a graph with nodes \(\{0,1,2,3\}\) and edges \(\{(0,1),\, (0,2),\, (2,3)\}\).
4. Graph-to-String Algorithm (G2S)
Pair Generation and Cost
For integer \(M > 0\), let:
sorted in increasing order of the tuple \((|a|+|b|,\; |a|,\; a,\; b)\) lexicographically.
The cost of pair \((a,b)\) is \(|a| + |b|\), which equals the number of pointer-move instructions (\(\texttt{N}/\texttt{P}/\texttt{n}/\texttt{p}\)) emitted.
For a graph \(G\) with \(N\) nodes and \(M\) edges, any IsalGraph string has length:
The first two terms are fixed by \(G\); minimizing \(|w|\) reduces to minimizing total pointer travel.
5. Canonical Strings
5.1 Exhaustive Canonical String
For a finite, simple, connected graph \(G = (V, E)\) with \(N = |V|\) nodes, let \(\mathcal{W}^+(G, v) \subseteq \Sigma^*\) denote the set of all IsalGraph strings producible by \(\mathrm{G2S}(G, v)\) when neighbor choice is made exhaustively over all valid options.
The exhaustive canonical string of \(G\) is:
where:
5.2 Structural Triplet
For graph \(G = (V, E)\) and node \(v \in V\), let \(N_k(v)\) denote the set of nodes at graph-theoretic distance exactly \(k\) from \(v\). The structural triplet of \(v\) is:
Complexity: Per-node triplet is \(O(N + E)\) via BFS truncated at distance 3. Full collection is \(O(N(N+E))\) per graph.
5.3 Pruned Canonical String
For graph \(G = (V, E)\), let \(\mathcal{W}^*(G, v)\) denote all IsalGraph strings producible by \(\mathrm{G2S}(G, v)\) when uninserted neighbor choice is made over the pruned candidate set:
The pruned canonical string is:
where:
Pruned and exhaustive canonical strings may differ in length but both are complete invariants. On vertex-transitive graphs (where all nodes share the same triplet), pruning is vacuous and the algorithms agree exactly.
6. Completeness Theorem
Let \(G = (V_G, E_G)\) and \(H = (V_H, E_H)\) be finite, simple, connected graphs. Then:
\(\mathrm{S2G}\) is a deterministic function of \(w\). If \(w^*_G = w^*_H =: w\), then \(G' := \mathrm{S2G}(w)\) is uniquely defined. By round-trip correctness, every execution of \(\mathrm{G2S}(G, v_0)\) yields a string that \(\mathrm{S2G}\) decodes back to a graph isomorphic to \(G\). Therefore: \(G \cong G' \cong H\).
Suppose \(\phi\colon V_G \xrightarrow{\;\sim\;} V_H\) is a graph isomorphism.
Step 1 (Triplet invariance): Graph-theoretic distances are preserved under isomorphism: \(\mathrm{dist}_G(u, v) = \mathrm{dist}_H(\phi(u), \phi(v))\). Therefore \(|N_k^G(v)| = |N_k^H(\phi(v))|\) for all \(v\) and \(k\), giving:
This preserves the pruned candidate set: \(\phi(\mathcal{C}_{\mathrm{pruned}}^{G}) = \mathcal{C}_{\mathrm{pruned}}^{H}\).
Step 2 (Execution-path bijection): Any pruned execution path \(\Pi\) on \(G\) starting at \(v_0\) maps bijectively to a pruned execution path \(\phi(\Pi)\) on \(H\) starting at \(\phi(v_0)\). By Step 1, both emit identical strings:
Step 3 (String-set equality): Applying the bijection over all starting nodes:
Therefore \(\mathcal{W}^*(G) = \mathcal{W}^*(H)\) and:
7. Graph Distance Metric
For two strings \(w_1, w_2 \in \Sigma^*\), their Levenshtein distance is:
where a single edit is a character insertion, deletion, or substitution. Computable in \(O(|w_1| \cdot |w_2|)\) via standard dynamic programming.
The graph edit distance \(\mathrm{GED}(G, H)\) is the minimum number of elementary edit operations (node insertion, node deletion, edge insertion, edge deletion) needed to transform \(G\) into a graph isomorphic to \(H\). Computing GED is \(\mathsf{NP}\)-hard.
The IsalGraph graph distance:
is a metric on the space of isomorphism classes of finite, simple, connected graphs.
- Non-negativity and symmetry: Inherited from Levenshtein distance.
- Identity of indiscernibles: \(d_{\mathrm{IsalGraph}}(G, H) = 0 \iff w^*_G = w^*_H \iff G \cong H\) (by Theorem 1 and \(d_{\mathrm{Lev}}(w_1, w_2) = 0 \iff w_1 = w_2\)).
-
Triangle inequality: For any graphs \(G, H, K\):
$$d_{\mathrm{IsalGraph}}(G, K) = d_{\mathrm{Lev}}(w^*_G, w^*_K) \leq d_{\mathrm{Lev}}(w^*_G, w^*_H) + d_{\mathrm{Lev}}(w^*_H, w^*_K) = d_{\mathrm{IsalGraph}}(G, H) + d_{\mathrm{IsalGraph}}(H, K)$$
8. Locality Property
Let \(G, H\) be finite, simple, connected graphs with \(k = \mathrm{GED}(G,H)\). Empirically:
- Monotonicity: \(d_{\mathrm{IsalGraph}}(G, H)\) is a non-decreasing function of \(k\) (adding more edit operations produces strings further apart).
- Strong correlation: Over broad graph families, Pearson and Spearman correlations between \(d_{\mathrm{IsalGraph}}\) and GED are high.
Unlike Hamming distance on adjacency matrices (where a single node insertion changes \(O(N)\) entries), the IsalGraph distance reflects the instruction-level cost of re-encoding: naturally bounded by changed edges plus pointer-movement costs.
Explore these concepts interactively: