Represent graph structure as instruction strings for Transformer models
Project description
IsalGraph (Instruction Set and Language for Graphs) is a Python library that represents the structure of any finite simple graph as a compact string over a nine-character instruction alphabet. The encoding is executed by a small virtual machine comprising a sparse graph, a circular doubly-linked list (CDLL) of graph-node references, and two traversal pointers.
Lopez-Rubio, E. and Pascual-Gonzalez, M. (2025). Representation of Graphs by Sequences of Instructions. Preprint submitted to Pattern Recognition. [arXiv:2512.10429]
Key Properties
| Property | Description |
|---|---|
| Universal validity | Every string over the alphabet {N, n, P, p, V, v, C, c, W} decodes to a valid finite simple graph. No invalid states exist. |
| Reversibility | The greedy GraphToString algorithm encodes any connected graph into a string w such that S2G(w) is isomorphic to the input. |
| Canonical completeness | The canonical string w* is a complete graph invariant: w*_G = w*_H if and only if G and H are isomorphic. Formally proved via structural-triplet pruning. |
| Metric locality | The Levenshtein distance between IsalGraph strings correlates strongly with graph edit distance (GED), enabling fast structural comparison. |
Installation
The core library has zero dependencies (Python standard library only).
pip install isalgraph
With optional adapters for graph libraries:
pip install isalgraph[networkx] # NetworkX adapter
pip install isalgraph[igraph] # igraph adapter
pip install isalgraph[pyg] # PyTorch Geometric adapter
pip install isalgraph[all] # Everything
Requires Python >= 3.10.
Quick Start
Encode a graph as a string
from isalgraph import SparseGraph, GraphToString
# Build a triangle: 0--1--2--0
g = SparseGraph(max_nodes=3, directed_graph=False)
g.add_node(); g.add_node(); g.add_node()
g.add_edge(0, 1); g.add_edge(1, 2); g.add_edge(2, 0)
gts = GraphToString(g)
string, trace = gts.run(initial_node=0)
print(string) # e.g. "VVPnC"
Decode a string back to a graph
from isalgraph import StringToGraph
stg = StringToGraph("VVPnC", directed_graph=False)
graph, trace = stg.run()
print(graph.node_count()) # 3
print(graph.logical_edge_count()) # 3
Compute canonical strings (complete graph invariant)
from isalgraph import GreedyMinG2S, ExhaustiveG2S
# Greedy-min: fast polynomial encoding (default)
algo = GreedyMinG2S()
w_greedy = algo.encode(g)
# Exhaustive: true canonical string (complete invariant)
algo = ExhaustiveG2S()
w_canonical = algo.encode(g)
Use with NetworkX
import networkx as nx
from isalgraph.adapters.networkx_adapter import NetworkXAdapter
adapter = NetworkXAdapter()
# NetworkX graph -> IsalGraph string
G = nx.petersen_graph()
sg = adapter.from_external(G, directed=False)
string = GreedyMinG2S().encode(sg)
# IsalGraph string -> NetworkX graph
G_recovered = adapter.from_isalgraph_string(string, directed=False)
assert nx.is_isomorphic(G, G_recovered)
Use with PyTorch Geometric
from torch_geometric.data import Data
from isalgraph.adapters.pyg_adapter import PyGAdapter
adapter = PyGAdapter()
# PyG Data -> IsalGraph string -> PyG Data
data = Data(edge_index=edge_index, num_nodes=n)
sg = adapter.from_external(data, directed=False)
string = GreedyMinG2S().encode(sg)
data_recovered = adapter.from_isalgraph_string(string, directed=False)
The Instruction Set
The IsalGraph virtual machine maintains three components: a graph G, a circular doubly-linked list L, and two pointers (primary and secondary). Nine instructions manipulate this state:
| Instruction | Type | Effect |
|---|---|---|
N / P |
Primary move | Move primary pointer forward / backward in the CDLL |
n / p |
Secondary move | Move secondary pointer forward / backward in the CDLL |
V |
Node via primary | Add new node u to G; add edge from primary's graph node to u; insert u into CDLL after primary |
v |
Node via secondary | Same as V but via secondary pointer |
C |
Edge (primary -> secondary) | Add edge between graph nodes at current pointer positions |
c |
Edge (secondary -> primary) | Reverse of C (differs from C only for directed graphs) |
W |
No-op | State unchanged |
Every string over this alphabet produces a valid graph -- there are no invalid states.
Encoding Algorithms
Three encoding strategies are provided, spanning the speed-quality trade-off:
| Algorithm | Class | Complexity | Description |
|---|---|---|---|
| Greedy-rnd(v_0) | GreedySingleG2S |
O(T_greedy) | Single greedy run from one starting node. Fastest. |
| Greedy-Min | GreedyMinG2S |
O(N * T_greedy) | Greedy from all starting nodes, returns lexmin shortest. Default. |
| Canonical (Pruned) | PrunedExhaustiveG2S |
Exponential (pruned) | Backtracking search with structural-triplet pruning. Complete invariant. |
| Canonical (Exhaustive) | ExhaustiveG2S |
Exponential | Full backtracking search. Complete invariant. |
Empirical Results
Evaluated on five real-world graph benchmark datasets (IAM Letter LOW/MED/HIGH, LINUX, AIDS) with exact GED ground truth.
Message Length Compactness
IsalGraph encodings require between 53% and 74% of the bits needed by the GED standard construction model.
OLS regression slopes beta < 1 across all methods and datasets confirm systematic compression. Canonical (Pruned): beta = 0.537, R^2 = 0.947; Greedy-Min: beta = 0.538, R^2 = 0.945. IsalGraph produces shorter bit representations for 99.6% of graphs.
Metric Locality
Local changes in graph structure produce bounded changes in the instruction string. The Levenshtein distance between canonical strings approximates GED.
Single edge edit (GED = 1) produces small, localised changes in the IsalGraph canonical string (Levenshtein distance = 1), illustrating metric locality.
Shortest paths between two graphs under GED (top) and Levenshtein distance (bottom). Both distances equal 5 but traverse entirely different sequences of intermediate graphs.
Architecture
isalgraph/
core/ # Zero external dependencies (stdlib only)
cdll.py # Array-backed circular doubly-linked list
sparse_graph.py # Adjacency-set graph representation
string_to_graph.py # S2G: string -> graph decoder
graph_to_string.py # G2S: graph -> string encoder
canonical.py # Exhaustive canonical string search
canonical_pruned.py # Structural-triplet pruned canonical search
algorithms/ # G2S algorithm implementations
adapters/ # Optional bridges to graph libraries
networkx_adapter.py
igraph_adapter.py
pyg_adapter.py
types.py # Type aliases
errors.py # Exception hierarchy
The core has zero external dependencies. Adapters import their respective libraries independently.
Development
git clone https://github.com/MarioPasc/IsalGraph.git
cd IsalGraph
pip install -e ".[dev]"
# Run tests
python -m pytest tests/unit/ -v # Unit tests
python -m pytest tests/integration/ -v # Integration tests
python -m pytest tests/property/ -v # Property-based tests
python -m pytest tests/ -v --cov=isalgraph # Full suite with coverage
# Lint and type check
python -m ruff check --fix src/ tests/
python -m ruff format src/ tests/
python -m mypy src/isalgraph/
Interactive Documentation
The project website includes:
- Step-by-step algorithm visualisations for S2G and G2S
- An interactive playground for encoding and decoding graphs
- A graph explorer for navigating IsalGraph string neighbourhoods
Citation
If you use IsalGraph in your research, please cite:
@article{lopezrubio2025isalgraph,
title = {Representation of Graphs by Sequences of Instructions},
author = {L{\'o}pez-Rubio, Ezequiel and Pascual-Gonz{\'a}lez, Mario},
journal = {Preprint submitted to Pattern Recognition},
year = {2025},
note = {arXiv:2512.10429}
}
License
MIT License. See LICENSE for details.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file isalgraph-0.1.0.tar.gz.
File metadata
- Download URL: isalgraph-0.1.0.tar.gz
- Upload date:
- Size: 5.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
fa6d9506e528182957c489a704f41ec2c648a29dece7a69ddaee8f784ff7e0d3
|
|
| MD5 |
497c1f750a332e7080f7145a42729323
|
|
| BLAKE2b-256 |
2763a2eb527d40caccde1645639a247089a5c624625a155feb476c25faaea3a3
|
File details
Details for the file isalgraph-0.1.0-py3-none-any.whl.
File metadata
- Download URL: isalgraph-0.1.0-py3-none-any.whl
- Upload date:
- Size: 46.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f5e5eeae8594bc5ad22348f80537360815edb649b808b366d2d71c200b7d5404
|
|
| MD5 |
277c65b419b4bf73cea1b7beeda713ad
|
|
| BLAKE2b-256 |
6af05640dc5f386a43f9fcc045838c370b32d82e2ec0634a46ca87dc0e097fa5
|