COGILES - Colored Graph Input Line Entry System: a SMILES-inspired string notation for colored graphs

COGILES — Colored Graph Input Line Entry System

What is COGILES?

COGILES is a compact, human-readable string notation for defining colored graphs. It is strongly inspired by SMILES (Simplified Molecular Input Line Entry System) used in chemistry for representing molecular structures, but adapted for arbitrary colored graphs rather than molecules.

A COGILES string encodes both the structure (nodes and edges) and the visual attributes (node colors) of a graph in a single line of text.

Origin

COGILES was originally developed as part of the visual_graph_datasets library (introduced in version 0.13.0). The goal of this project is to extract it into a standalone library that can be imported and used independently by other programs.

Syntax

Node Types

The set of node types is fixed and opinionated — like SMILES, COGILES defines a specific alphabet and users take it or leave it.

Following the SMILES convention, common colors use a single uppercase letter. Colors whose first letter collides with another use a two-letter code (uppercase + lowercase), similar to how SMILES distinguishes elements like B (boron) from Br (bromine).

Letter  Color    RGB
R       Red      (1.0, 0.0, 0.0)
G       Green    (0.0, 0.5, 0.0)
B       Blue     (0.0, 0.0, 1.0)
Y       Yellow   (1.0, 1.0, 0.0)
C       Cyan     (0.0, 1.0, 1.0)
M       Magenta  (1.0, 0.0, 1.0)
W       White    (1.0, 1.0, 1.0)
K       Black    (0.0, 0.0, 0.0)
O       Orange   (1.0, 0.5, 0.0)
P       Purple   (0.5, 0.0, 0.5)
T       Teal     (0.0, 0.5, 0.5)
L       Lime     (0.5, 1.0, 0.0)
I       Indigo   (0.3, 0.0, 0.5)
Bl      Black    (0.0, 0.0, 0.0)
Br      Brown    (0.6, 0.3, 0.0)
Gr      Gray     (0.5, 0.5, 0.5)
Pi      Pink     (1.0, 0.4, 0.7)
Ol      Olive    (0.5, 0.5, 0.0)

Note: K and Bl are both valid for Black (K follows the CMYK convention).
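
As an illustration, the table above can be written down as a plain Python mapping. This is only a sketch: the names NODE_COLORS and color_of are hypothetical and are not taken from the library, which may store its node types differently.

```python
# Hypothetical mapping from COGILES node codes to RGB tuples,
# transcribed from the node type table (not the library's own data structure).
NODE_COLORS = {
    "R": (1.0, 0.0, 0.0),   # Red
    "G": (0.0, 0.5, 0.0),   # Green
    "B": (0.0, 0.0, 1.0),   # Blue
    "Y": (1.0, 1.0, 0.0),   # Yellow
    "C": (0.0, 1.0, 1.0),   # Cyan
    "M": (1.0, 0.0, 1.0),   # Magenta
    "W": (1.0, 1.0, 1.0),   # White
    "K": (0.0, 0.0, 0.0),   # Black (CMYK convention)
    "O": (1.0, 0.5, 0.0),   # Orange
    "P": (0.5, 0.0, 0.5),   # Purple
    "T": (0.0, 0.5, 0.5),   # Teal
    "L": (0.5, 1.0, 0.0),   # Lime
    "I": (0.3, 0.0, 0.5),   # Indigo
    "Bl": (0.0, 0.0, 0.0),  # Black (alias of K)
    "Br": (0.6, 0.3, 0.0),  # Brown
    "Gr": (0.5, 0.5, 0.5),  # Gray
    "Pi": (1.0, 0.4, 0.7),  # Pink
    "Ol": (0.5, 0.5, 0.0),  # Olive
}


def color_of(code):
    """Return the RGB tuple for a node code; raises KeyError for unknown codes."""
    return NODE_COLORS[code]


print(color_of("Br"))                   # (0.6, 0.3, 0.0)
print(color_of("K") == color_of("Bl"))  # True
```

The mapping has 18 codes but only 17 distinct RGB values, because K and Bl both resolve to black.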

Sequential Connection

Nodes placed next to each other in the string are automatically connected by an edge.

  • RRRGGG — a chain of 6 nodes: 3 red followed by 3 green

Branches

Parentheses () create branches. The first node inside a branch connects to the node immediately before the opening parenthesis. Inside a branch, normal sequential rules apply.

  • RRR(BB)RRRR — a main chain of 7 red nodes with a side branch of 2 blue nodes attached to the 3rd red node
  • Y(G)(G)(G) — a star: yellow center with 3 green leaves
  • BB(GG)CC — 2 blue nodes, then a branch of 2 green nodes and a continuation of 2 cyan nodes
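
The sequential and branch rules can be sketched with a small stack: an opening parenthesis saves the current attachment point, and a closing parenthesis restores it. The function below is a standard-library illustration under that assumption, not the library's parser; parse_branches is a hypothetical name, and the sketch ignores anchors and breaks.

```python
import re

# Node codes from the node type table; two-letter codes are tried first.
TOKEN = re.compile(r"Bl|Br|Gr|Pi|Ol|[RGBYKCMWOPTLI]|[()]")


def parse_branches(text):
    """Parse a string that uses only nodes and parentheses.

    Returns (nodes, edges): a list of node codes and a list of
    (earlier_index, later_index) pairs.
    """
    nodes, edges = [], []
    stack = []    # attachment points saved at each "("
    prev = None   # index of the node the next node connects to
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        token = match.group()
        if token == "(":
            stack.append(prev)
        elif token == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            prev = stack.pop()  # continue from the node before the branch
        else:
            nodes.append(token)
            index = len(nodes) - 1
            if prev is not None:
                edges.append((prev, index))
            prev = index
        pos = match.end()
    if stack:
        raise ValueError("unclosed '('")
    return nodes, edges


print(parse_branches("Y(G)(G)(G)"))
# (['Y', 'G', 'G', 'G'], [(0, 1), (0, 2), (0, 3)])
```

For RRR(BB)RRRR the same function yields 9 nodes and 8 edges, with both the first blue node and the fourth red node attached to node 2 (the 3rd red node).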

Anchors

Anchors are written as -N (a dash followed by an integer) and placed directly after a node. Nodes that share an anchor number are joined by an edge, which makes cycles and other non-tree structures possible.

  • R-1RRRRR-1 — a cycle of 6 red nodes (first and last connected via anchor 1)
  • R-1-2GGG-2BBB-1 — first node has two anchors, creating connections to distant nodes

Rules:

  • A single anchor with a given number has no effect; at least two nodes must share the same anchor number.
  • When an anchor number appears more than twice, all subsequent occurrences connect to the node where the anchor first appeared.
  • A single node can have multiple anchors (e.g., R-1-2).
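
One way to read these rules is to remember the node where each anchor number first appears and to add an edge from that node at every later occurrence. The sketch below follows that reading using only the standard library. The name parse_anchors is hypothetical, branches and breaks are left out, and duplicate edges are collapsed on the assumption that the graph is a simple graph.

```python
import re

TOKEN = re.compile(r"(?P<node>Bl|Br|Gr|Pi|Ol|[RGBYKCMWOPTLI])|-(?P<anchor>\d+)")


def parse_anchors(text):
    """Parse a string that uses only nodes and anchors.

    Returns (nodes, edges) where edges is a set of (low, high) index pairs.
    """
    nodes = []
    edges = set()
    first_seen = {}  # anchor number -> index of the node where it first appeared
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.group("node"):
            nodes.append(match.group("node"))
            index = len(nodes) - 1
            if index > 0:
                edges.add((index - 1, index))  # default sequential connection
        else:
            if not nodes:
                raise ValueError("anchor must follow a node")
            number = int(match.group("anchor"))
            current = len(nodes) - 1
            if number not in first_seen:
                first_seen[number] = current  # a lone anchor adds no edge
            elif first_seen[number] != current:
                edges.add((first_seen[number], current))
        pos = match.end()
    return nodes, edges


nodes, edges = parse_anchors("R-1RRRRR-1")
print(len(nodes), len(edges))  # 6 6
```

With R-1-2GGG-2BBB-1 the first node gains edges to node 3 and node 6 in addition to the chain edges.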

Breaks (Disconnected Components)

A period . between two nodes prevents the default sequential connection, allowing the definition of multiple disconnected graphs in one string.

  • R-1RR-1.RR(GG)R — two separate graph components: a triangle and a branching structure
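
The effect of a break can be illustrated by resetting the attachment point whenever a period is read and then counting connected components. This is a simplified standard-library sketch that handles only nodes and periods; parse_breaks and count_components are hypothetical helper names, not part of the library.

```python
import re

TOKEN = re.compile(r"Bl|Br|Gr|Pi|Ol|[RGBYKCMWOPTLI]|\.")


def parse_breaks(text):
    """Parse a string that uses only nodes and periods; returns (nodes, edges)."""
    nodes, edges = [], []
    prev = None
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        token = match.group()
        if token == ".":
            prev = None  # the next node starts a new component
        else:
            nodes.append(token)
            index = len(nodes) - 1
            if prev is not None:
                edges.append((prev, index))
            prev = index
        pos = match.end()
    return nodes, edges


def count_components(node_count, edges):
    """Count connected components with a small union-find."""
    parent = list(range(node_count))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(i) for i in range(node_count)})


nodes, edges = parse_breaks("RRR.GG")
print(edges)                                # [(0, 1), (1, 2), (3, 4)]
print(count_components(len(nodes), edges))  # 2
```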

Grammar (PEG)

The COGILES syntax is formally defined as a PEG grammar (parsed with the parsimonious library):

graph           = (branch / node / anchor / break)*

branch_node     = (break_node / node) branch+
anchor_node     = (branch_node / break_node / node) anchor+
break_node      = break node

branch          = lpar graph rpar
lpar            = "("
rpar            = ")"

break           = "."
anchor          = ~r"-[\d]+"
node            = ~r"(Bl|Br|Gr|Pi|Ol|[RGBYKCMWOPTLI])"
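
To make the terminal rules concrete, the sketch below rewrites the five terminals (node, anchor, lpar, rpar, break) as one regular expression for the standard-library re module and classifies each token. It deliberately does not use parsimonious and builds no parse tree; tokenize and the uppercase token kinds are hypothetical names chosen for this example.

```python
import re

# One alternative per terminal of the grammar; two-letter node codes come
# first so that "Br" is read as Brown rather than Blue followed by "r".
TERMINALS = re.compile(
    r"(?P<NODE>Bl|Br|Gr|Pi|Ol|[RGBYKCMWOPTLI])"
    r"|(?P<ANCHOR>-\d+)"
    r"|(?P<LPAR>\()"
    r"|(?P<RPAR>\))"
    r"|(?P<BREAK>\.)"
)


def tokenize(text):
    """Split a COGILES string into (kind, text, position) triples."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TERMINALS.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        tokens.append((match.lastgroup, match.group(), pos))
        pos = match.end()
    return tokens


for token in tokenize("BrB-12(Ol).K"):
    print(token)
# ('NODE', 'Br', 0)
# ('NODE', 'B', 2)
# ('ANCHOR', '-12', 3)
# ('LPAR', '(', 6)
# ('NODE', 'Ol', 7)
# ('RPAR', ')', 9)
# ('BREAK', '.', 10)
# ('NODE', 'K', 11)
```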

Graph Representation

Parsing a COGILES string produces a NetworkX graph (nx.Graph). Each node has a color attribute storing its RGB tuple. Edges are unweighted — no edge attributes are stored.

import cogiles

G = cogiles.parse("R-1RRRRR-1")
# G is a nx.Graph with 6 nodes
# G.nodes[0]["color"] == (1.0, 0.0, 0.0)
# G has 6 edges forming a cycle

Encoding converts a NetworkX graph back into a canonical COGILES string:

s = cogiles.encode(G)
# s == "R-1RRRRR-1"  (deterministic canonical form)

Planned Library Scope

The standalone cogiles library should provide:

  1. Parsing: cogiles.parse(string) -> nx.Graph decodes a COGILES string into a NetworkX graph.
  2. Encoding: cogiles.encode(graph) -> string encodes a NetworkX graph back into a canonical COGILES string (deterministic, starting from the lowest-index node).
  3. Node types: a fixed, opinionated mapping between letters, color names, and RGB tuples (to be finalized).
  4. Validation: a custom CogilesParseError exception with descriptive messages and position info for malformed strings.
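
As a rough idea of what position-aware validation could look like, the sketch below defines an exception that records the offending index and a checker that reports unknown characters and unbalanced parentheses. Only the name CogilesParseError comes from this document. The constructor signature, the text and position attributes, the ValueError base class, and the validate helper are all assumptions made for illustration.

```python
import re


class CogilesParseError(ValueError):
    """Parse error that remembers where the problem occurred (sketch only)."""

    def __init__(self, message, text, position):
        pointer = " " * position + "^"
        super().__init__(f"{message} at position {position}\n  {text}\n  {pointer}")
        self.text = text
        self.position = position


TOKEN = re.compile(r"Bl|Br|Gr|Pi|Ol|[RGBYKCMWOPTLI]|-\d+|[().]")


def validate(text):
    """Raise CogilesParseError on unknown characters or unbalanced parentheses."""
    open_positions = []  # positions of "(" that are still unclosed
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise CogilesParseError(f"unexpected character {text[pos]!r}", text, pos)
        if match.group() == "(":
            open_positions.append(pos)
        elif match.group() == ")":
            if not open_positions:
                raise CogilesParseError("unmatched ')'", text, pos)
            open_positions.pop()
        pos = match.end()
    if open_positions:
        raise CogilesParseError("unclosed '('", text, open_positions[-1])


validate("R-1RR-1.RR(GG)R")  # passes silently

try:
    validate("RRxG")
except CogilesParseError as error:
    print(error.position)  # 2
```

Storing the position as an attribute, rather than only in the message, lets callers highlight the offending character in their own tooling.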

Design Decisions

Decision              Choice
Package name          cogiles (import cogiles)
Graph representation  NetworkX (nx.Graph)
Parser library        parsimonious (PEG grammar)
Node types            Fixed set of 17 colors written with 18 codes (13 single-letter, 5 two-letter, where Bl is an alias for K), not extensible
Edge attributes       Not included; edges are unweighted
Error handling        Custom CogilesParseError with position info
Testing               pytest + nox
Packaging             pyproject.toml only (PEP 621)

Dependencies

  • networkx — for graph representation
  • parsimonious — for PEG grammar parsing

Dev Dependencies

  • pytest — test framework
  • nox — test runner / task automation
