The generalized graph input line entry system (GGILES) package produces sequential, string representations of graphs in a NetworkX format via depth-first graph traversal, and reproduces NetworkX graphs from the string representation. GGILES is flexible and customizable, accommodating a large range of graph types and formats.

These details have not been verified by PyPI

Project links

Project description

Generalized Graph Input Line Entry System (GGILES) by

Overview

Visual Overview

Features

GGILES features graph sequentialization and subsequent graph reconstruction for:

Graphs with node types and (optionally) edge types.
Directed and undirected graphs.
Graphs with node and edge attributes as float or string

GGILES also features:

Default edge types: Default edges are not included in the string sequentialization for token efficiency and inferred in graph reconstruction.
Attribute serialization with or without attribute name (keyword attribute).
Optional or enforced attributes.
Generating unique sequences: Graph invariant calculation for sequence canonicalization.
A tokenizer for the sequence based on a regular expression.

All code examples below can also be found and run in the example notebook.

Overview
Features
Table of contents
Installation
Quickstart: Generating a string representation from a graph
Including the node and edge attributes in the sequence
- Optional attributes
Unique sequences for graphs
Configuring node- and edge types
Tokenization
References
Copyright and license
Contact

Installation

Install GGILES from pip:

pip install ggiles

or from GitHub:

pip install git+https://github.com/process-intelligence-research/Generalized-graph-line-entry-system

Alternatively, get the latest updates by cloning the repo and installing the editable version of the package with:

git clone https://github.com/process-intelligence-research/Generalized-graph-line-entry-system
cd Generalized-graph-line-entry-system
pip install .

Quickstart: Generating a string representation from a graph

Here is an example graph.

import networkx as nx
import matplotlib.pyplot as plt

G = nx.DiGraph()
G.add_node(0, type="a", id="1")
G.add_node(1, type="c", id="2")
G.add_node(2, type="b", id="3")
G.add_node(3, type="d", id="4")
G.add_node(4, type="b", id="5")
G.add_node(5, type="c", id="6")
G.add_node(6, type="a", id="7")
G.add_node(7, type="e", id="8")
G.add_node(8, type="a", id="9")
G.add_edge(0, 1, type="a-c", flt=0.1)
G.add_edge(1, 2, type="c-b", flt=1.2)
G.add_edge(2, 3, type="b-d", flt=2.3)
G.add_edge(3, 4, type="d-b", flt=3.4)
G.add_edge(4, 1, type="b-c", flt=4.1)
G.add_edge(3, 5, type="d-c", flt=3.5)
G.add_edge(6, 4, type="a-b", flt=4.6)
G.add_edge(4, 7, type="b-e", flt=5.7)

# Plot the graph
pos = nx.kamada_kawai_layout(G)  
edge_labels = nx.get_edge_attributes(G, "type")  
node_labels = nx.get_node_attributes(G, "type") 

plt.figure(figsize=(10, 8))
nx.draw(G, pos, with_labels=True, labels=node_labels, node_color='cyan', node_size=500, font_size=10, font_weight='bold')
nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, font_color='green', font_weight='bold')

plt.title("Example graph")

plt.show()

We want to create a string representation of this graph, where the type attribute of nodes and edges is used as a token.

from ggiles.graph_invariant import GraphInvariantProcessor, MorganAlgorithm, AlphabeticNodeOrdering
from ggiles import make_converter

invariant = GraphInvariantProcessor(layers=[MorganAlgorithm(), AlphabeticNodeOrdering("type")])
graph_converter = make_converter(invariant=invariant)
sequence = graph_converter.graph_to_sequence(G)
print(sequence)

Result

v(a)e(a-b)v(b)<1[e(b-e)v(e)]e(b-c)v(c)<&|v(a)e(a-c)&|e(c-b)v(b)e(b-d)v(d)e(d-b)1e(d-c)v(c)|nv(a)

Here, v(.) denotes a node (vertex), e(.) an edge, [.] a branch, <&|.&.| a converging branch, 1.<1 a cycle, and |n a new subgraph.

Can we reconstruct the graph (with the information encoded in the sequence)?

reconstructed_graph = graph_converter.sequence_to_graph(sequence)
print(nx.is_isomorphic(G, reconstructed_graph, 
        node_match=lambda x, y: x["type"] == y["type"], 
        edge_match=lambda x, y: x["type"] == y["type"])
     )

Using directed graphs works similarly by changing a flag in the converter constructor

graph_converter = make_converter(invariant=invariant, is_directed=False)

Including the node and edge attributes in the sequence

Attributes can be assigned to both nodes and edges. Attributes can be assinged without attribute key, or keyword-attributes with attribute key. To specify which attributes should be included in the sequence representation, add the attribute names to node_attributes, node_kwarg_attributes, edge_attributes, edge_kwarg_attributes. Any attribute in the graph can either be of type string (to represent categorical values) or of type float/int (to represent numerical values).

from ggiles.attribute_encoding import AttributeConfig

invariant = GraphInvariantProcessor(layers=[MorganAlgorithm(), AlphabeticNodeOrdering("type")])
attr_config = AttributeConfig(
    node_attributes=["id"],
    edge_kwarg_attributes=["flt"]
    )

graph_converter = make_converter(invariant=invariant, attribute_config=attr_config)
sequence = graph_converter.graph_to_sequence(G)
print(sequence)

Result: v(a){"7"}e(a-b){"flt":4.6}v(b){"5"}...

Note how attributes are included after the respective token in {.}. If multiple attributes are included, they are separated by ,.

Optional attributes

If specified, attributes are required in nodes/edges. If attributes should be treated as optional, set the treat_attributes_as_optional in converter construction to True.

from ggiles import AttributeConfig

invariant = GraphInvariantProcessor(layers=[MorganAlgorithm(), AlphabeticNodeOrdering("type")])
attr_config = AttributeConfig(
    node_attributes=["id"],
    )

graph_converter = make_converter(invariant=invariant, 
                                  attribute_config=attr_config, 
                                  treat_attributes_as_optional=True)

G_opt = G.copy()

del G_opt.nodes[6]["id"]
sequence = graph_converter.graph_to_sequence(G_opt)
print(sequence)

Result: v(a)e(a-b)v(b){"5"}...

Notice how the node "b" (corresponding to the graph node with id 6) has no attribute. Not enabling treat_attributes_as_optional in this case raises an exception.

Keyword attributes are decoded back into the graph automatically. Attributes without keywords need to be assigned a keyword before being reassigned to a new graph. By default, the attribute keywords are assigned in the order specified in node_attributes and edge_attributes in AttributeConfig.

Unique sequences for graphs

GGILES can produce unique graph sequences. For this, a function for calculating a graph invariant needs to be specified. The graph invariant assigns unique ranks to the nodes of a graph, so that branching decisions can be made with respect to these ranks. The invariant is calculated by the GraphInvariantProcessor class. There are many ways to assign ranks to nodes, and some of them cannot always produce unique ranks to all graphs. An effective choice for the calculation of invariants can depend on the properties of the graphs in a specific use-case. Therefore, the GraphInvariantProcessor can have several GraphInvariantLayers. Each layer calculates node ranks. If a layer does not produce unique ranks and nodes with tied ranks remain, the ranks of lower level layers are used to break the ties.

Lower ranks are chosen first as start nodes of branches, new subgraphs, or converging branches. This convention promotes nodes with lower ranks to lie on the main sequence, and nodes with higher ranks to be pushed to branches and converging branches.

The following layers are inbuilt

MorganAlgorithm: Ranks nodes using the Morgan Algorithm.
CategoricalAttributePriorization: Ranks nodes by attribute value priority.
DescendantSizePrioritization: Ranks nodes by descendant subgraph size.
AlphabeticNodeOrdering: Ranks nodes by alphabetical attribute order.
CustomAttributePrioritization: Ranks nodes using a custom attribute function.
DescendantRankSumOrdering: Ranks nodes by the sum of ranks in their descendant subgraphs.
DisregardEqualDescendants: Assigns 'random' ranks to nodes with isomorphic descendant graphs.
FastDisregardEqualDescendants: Faster version for branched, non-connected directed graphs.

In the example below, the ranks are calculated with the morgan algorithm. If tied nodes remain, the ties are broken by ordering nodes alphabetically on the value of the "type" attribute.

from ggiles.graph_invariant import GraphInvariantProcessor, MorganAlgorithm, AlphabeticNodeOrdering

invariant = GraphInvariantProcessor(layers=[MorganAlgorithm(), AlphabeticNodeOrdering("type")])
graph_converter = make_converter(invariant=invariant)

Graph invariants are complex and may be dependent on the specific structure of the graphs in question. Therefore, it makes sense to validate the graph invariant calculation against a set of example graphs to ensure that the ranking is consistent and unique across different graph structures. For this, use the processor's validate_against_example_graphs method:

example_graphs: list[nx.Graph] = [] # Add example graphs here
print(invariant.validate_against_example_graphs(example_graphs))

Configuring node- and edge types

Changing the token type attribute

By default, GGILES uses the "type" attribte in nodes and edges for the selection of the strings for the tokens. This can be changed in a custom GraphTokenConfig:

from ggiles import GraphTokenConfig

token_config = GraphTokenConfig(
    node_type_attribute="id"
)

graph_converter = make_converter(invariant=invariant, graph_token_config=token_config)
sequence = graph_converter.graph_to_sequence(G)
print(sequence)

Result: v(7)e(a-b)v(5)...

Optional type for edges

For graphs without explicit edge types, the edge type token does not have to be included in the sequence representation. In this case, set the edge_type_attribute in GraphTokenConfig to None:

from ggiles import GraphTokenConfig

token_config = GraphTokenConfig(
    edge_type_attribute=None
)

graph_converter = make_converter(invariant=invariant, graph_token_config=token_config)
sequence = graph_converter.graph_to_sequence(G)
print(sequence)

Result: v(a)v(b)<1[v(e)]v(c)<&|v(a)&|v(b)v(d)1v(c)|nv(a)

Notice how edge tokens are not included anymore.

Default type for edges

It is also possible to omit the edge tokens of a single type. This is useful if a certain edge type occurs very frequently and can be treated as a default. This can be achieved by setting the default_edge_type in GraphTokenConfig:

from ggiles import GraphTokenConfig

invariant = GraphInvariantProcessor(layers=[MorganAlgorithm(), AlphabeticNodeOrdering("type")])
token_config = GraphTokenConfig(
    default_edge_type="a-b"
)
graph_converter = make_converter(invariant=invariant, graph_token_config=token_config)
sequence = graph_converter.graph_to_sequence(G)
print(sequence)

Result: v(a)v(b)<1[e(b-e)v(e)]e(b-c)v(c)...

Now, all edges with type a-b are omitted and inferred when parsing back to a graph.

Tokenization

The graph string representations can be tokenized with the GGILES package with the default tokenization scheme. The GGILESTokenizer produces a Regex expression that identifies node tokens, edge tokens, string attributes and digits, and syntax markers. Make sure to apply the same settings to the tokenizer as for the converter.

from ggiles import GGILESTokenizer

invariant = GraphInvariantProcessor(layers=[MorganAlgorithm(), AlphabeticNodeOrdering("type")])
graph_converter = make_converter(invariant=invariant)
sequence = graph_converter.graph_to_sequence(G)


tokenizer = GGILESTokenizer()
tokens = tokenizer.tokenize(sequence)
print(tokens[0:2])

Result: ["v(a)", "e(a-b)"]

References

Vogel, G., Hirtreiter, E., Schulze Balhorn, L., & Schweidtmann, A. M. (2023). SFILES 2.0: An extended text-based flowsheet representation. Optimization and Engineering, 24(4), 2911–2933. https://doi.org/10.1007/s11081-023-09798-9
d’Anterroches, L. (2006). Process flow sheet generation and design through a group contribution approach [Dissertation]. Technical University of Denmark.
Weininger, D. (1988). SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. Journal of Chemical Information and Computer Sciences, 28(1), 31–36. https://doi.org/10.1021/ci00057a005

Copyright and license

GGILES is published under MIT license (see license file)

Contact

📧 Contact

🌐 PI research

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

1.1.0

May 19, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ggiles-1.1.0.tar.gz (319.5 kB view details)

Uploaded May 19, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

ggiles-1.1.0-py3-none-any.whl (46.5 kB view details)

Uploaded May 19, 2026 Python 3

File details

Details for the file ggiles-1.1.0.tar.gz.

File metadata

Download URL: ggiles-1.1.0.tar.gz
Upload date: May 19, 2026
Size: 319.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for ggiles-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`f3cbf8bccb72f6ff57a951903e0841c9e7a4037cfd309776935de0e14599ed03`
MD5	`86cd54257098b0d10cfe8fc5eb3832fe`
BLAKE2b-256	`a64aa2f44b1a5e86cdbee8c859833a804cc53e73cceb2399c3600bf77cd31995`

See more details on using hashes here.

File details

Details for the file ggiles-1.1.0-py3-none-any.whl.

File metadata

Download URL: ggiles-1.1.0-py3-none-any.whl
Upload date: May 19, 2026
Size: 46.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for ggiles-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9681d274b229a3ed12813004fd7673cdb90fdeea7685f761686df97d871431c7`
MD5	`6aff2d4a45bc863187aa90d46a2adbcd`
BLAKE2b-256	`7ff5485d14c7118a98021e27b11fa5ce08d621fb3da344532b3b2ca841609aa9`

See more details on using hashes here.

GGILES 1.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Generalized Graph Input Line Entry System (GGILES) by

Overview

Features

Table of contents

Installation

Quickstart: Generating a string representation from a graph

Including the node and edge attributes in the sequence

Optional attributes

Unique sequences for graphs

Configuring node- and edge types

Changing the token type attribute

Optional type for edges

Default type for edges

Tokenization

References

Copyright and license

Contact

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes