Graph generation and storage library with update tracking
This library builds a graph of entities and relations incrementally and stores every change in append-only logs. It also supports optional community tagging of entities.
Architecture
The library follows a modular architecture with the following components:
Core Components
- GraphBuilder
  - Main class for building and managing the graph.
  - Supports incremental entity and relation updates.
  - Maintains append-only logs for all changes.
  - Integrates with configurable indexers for efficient lookups.
- GraphCompactor
  - Processes the append-only logs to create compacted representations.
  - Merges entity updates based on timestamps.
  - Builds adjacency lists for efficient graph traversal.
  - Supports sharding for large datasets.
- Indexers
  - Abstract interface for entity indexing.
  - Implementations:
    - SQLiteIndexer: Persistent storage using SQLite
    - MemoryIndexer: In-memory storage with JSON serialization
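To illustrate the idea of an abstract indexing interface with an in-memory and a SQLite-backed implementation, here is a minimal standalone sketch. It does not use the library's own classes: the names `SketchIndexer`, `DictIndexer`, `SqliteSketchIndexer`, and the `put`/`get`/`to_json` methods are assumptions made for this example, and the real `SQLiteIndexer` and `MemoryIndexer` may expose a different API.

```python
import json
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class SketchIndexer(ABC):
    """Hypothetical interface: maps an entity ID to its latest properties."""

    @abstractmethod
    def put(self, entity_id: int, properties: dict) -> None: ...

    @abstractmethod
    def get(self, entity_id: int) -> Optional[dict]: ...


class DictIndexer(SketchIndexer):
    """In-memory variant that can be serialized to JSON."""

    def __init__(self) -> None:
        self._data: dict = {}

    def put(self, entity_id: int, properties: dict) -> None:
        self._data[entity_id] = dict(properties)

    def get(self, entity_id: int) -> Optional[dict]:
        return self._data.get(entity_id)

    def to_json(self) -> str:
        # JSON object keys must be strings, so integer IDs are converted.
        return json.dumps({str(k): v for k, v in self._data.items()})


class SqliteSketchIndexer(SketchIndexer):
    """Persistent variant; ':memory:' keeps this example free of files."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS entities "
            "(id INTEGER PRIMARY KEY, props TEXT NOT NULL)"
        )

    def put(self, entity_id: int, properties: dict) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO entities (id, props) VALUES (?, ?)",
            (entity_id, json.dumps(properties)),
        )
        self._conn.commit()

    def get(self, entity_id: int) -> Optional[dict]:
        row = self._conn.execute(
            "SELECT props FROM entities WHERE id = ?", (entity_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None
```

Because both variants share one interface, calling code can switch between them without other changes, which is the point of making the indexer configurable.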
Architecture Diagram
graph TD
subgraph "Graph Builder"
GB[GraphBuilder] --> |add_entity| EL[Entity Log]
GB --> |add_relation| RL[Relation Log]
GB --> |index| IDX[Indexer]
end
subgraph "Graph Compactor"
GC[GraphCompactor] --> |read| EL
GC --> |read| RL
GC --> |write| ES[Entity Shards]
GC --> |write| AL[Adjacency Lists]
end
subgraph "Storage"
EL --> |append-only| LOGS[Logs]
RL --> |append-only| LOGS
IDX --> |persist| DB[(SQLite DB)]
ES --> |sharded| STORAGE[Storage]
AL --> |compacted| STORAGE
end
classDef primary fill:#E6D5AC,stroke:#D4B483,stroke-width:2px,color:#000
classDef secondary fill:#D4E6AC,stroke:#B4D483,stroke-width:2px,color:#000
classDef storage fill:#ACD4E6,stroke:#83B4D4,stroke-width:2px,color:#000
class GB,GC primary
class EL,RL,ES,AL secondary
class LOGS,DB,STORAGE storage
Storage Structure
output_dir/
├── entities/ # Compacted entity shards
├── relations/ # Relation data
├── logs/ # Append-only update logs
│ ├── entity_updates.jsonl
│ └── relation_updates.jsonl
├── adjacency/ # Compacted adjacency lists
└── index.db # SQLite index (if using SQLiteIndexer)
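The `.jsonl` files under `logs/` suggest one JSON record per line. The sketch below shows how such an append-only log could be written and read back using only the standard library. The record fields (`entity_id`, `properties`, `timestamp`) and the helper names `append_update` and `read_updates` are assumptions for illustration; the library's actual record schema is not documented here.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def append_update(log_path: str, record: dict) -> dict:
    """Append one record as a single JSON line, stamped with the UTC time."""
    stamped = {**record, "timestamp": datetime.now(timezone.utc).isoformat()}
    # Mode "a" only ever adds to the end of the file; earlier lines stay intact.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(stamped) + "\n")
    return stamped


def read_updates(log_path: str) -> list:
    """Read every record back in the order it was written."""
    with open(log_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Usage: two updates to the same (hypothetical) entity.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "entity_updates.jsonl")
append_update(log_path, {"entity_id": 1, "properties": {"name": "Alice"}})
append_update(log_path, {"entity_id": 1, "properties": {"name": "Alice B."}})
updates = read_updates(log_path)
```

Both records survive in the log; deciding which value wins is left to a later compaction step.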
Data Model
- Entities: Nodes in the graph with properties
- Relations: Directed edges between entities with properties
- Updates: Timestamped changes to entities and relations
- Shards: Partitioned storage for efficient processing
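As a rough illustration of this data model, the sketch below represents updates as plain dataclasses and partitions entity updates into shards. The class names, field names, and the modulo-based `shard_for` rule are all assumptions made for this example; the library may model and shard its data differently.

```python
from dataclasses import dataclass


@dataclass
class EntityUpdate:
    """A timestamped change to one node's properties (hypothetical shape)."""
    entity_id: int
    properties: dict
    timestamp: str  # ISO 8601, UTC


@dataclass
class RelationUpdate:
    """A timestamped directed edge from source_id to target_id (hypothetical shape)."""
    relation_id: int
    source_id: int
    target_id: int
    properties: dict
    timestamp: str  # ISO 8601, UTC


def shard_for(entity_id: int, num_shards: int) -> int:
    """Assumed sharding rule: integer ID modulo the shard count."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    return entity_id % num_shards


def group_by_shard(updates: list, num_shards: int) -> dict:
    """Group entity updates so each shard can be processed independently."""
    shards: dict = {}
    for update in updates:
        shards.setdefault(shard_for(update.entity_id, num_shards), []).append(update)
    return shards


ts = "2024-01-01T00:00:00+00:00"
entity_updates = [
    EntityUpdate(1, {"name": "Alice"}, ts),
    EntityUpdate(2, {"name": "Bob"}, ts),
    EntityUpdate(5, {"name": "Carol"}, ts),
]
shards = group_by_shard(entity_updates, num_shards=4)
```

With four shards, entities 1 and 5 land in shard 1 and entity 2 lands in shard 2, so all updates for a given entity always end up in the same partition.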
Usage
Basic Usage
from graph_builder.storage_manager import GraphBuilder
from graph_builder.config import GraphBuilderConfig
config = GraphBuilderConfig(output_dir="graph_output")
graph = GraphBuilder(config)
# Ingest entities with optional community
graph.add_entity(1, {"name": "Alice", "type": "Person", "community_id": "team_alpha"})
graph.add_entity(2, {"name": "Bob", "type": "Person", "community_id": "team_alpha"})
# Create a relation
graph.add_relation(100, 1, 2, {"type": "FRIENDS_WITH"})
graph.finalize()
Compaction
from graph_builder import GraphCompactor
# Compact the graph data
compactor = GraphCompactor(base_dir="graph_output")
compactor.compact_entities() # Merge entity updates
compactor.build_adjacency() # Build adjacency lists
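Conceptually, the two compaction steps can be pictured with the standalone sketch below. It is not the implementation behind `compact_entities()` or `build_adjacency()`: the record fields and the property-wise "later timestamp wins" merge rule are assumptions, and the timestamps are assumed to be ISO 8601 UTC strings in a uniform format so that they sort correctly as text.

```python
from collections import defaultdict


def merge_entity_updates(updates: list) -> dict:
    """Fold updates into one property dict per entity, oldest first,
    so that values from later timestamps overwrite earlier ones."""
    merged: dict = {}
    for update in sorted(updates, key=lambda u: u["timestamp"]):
        merged.setdefault(update["entity_id"], {}).update(update["properties"])
    return merged


def build_adjacency_lists(relations: list) -> dict:
    """Map each source entity to the list of targets it points at."""
    adjacency = defaultdict(list)
    for relation in relations:
        adjacency[relation["source_id"]].append(relation["target_id"])
    return dict(adjacency)


# Updates arrive out of order here on purpose.
entity_log = [
    {"entity_id": 1, "timestamp": "2024-01-02T00:00:00+00:00",
     "properties": {"name": "Alice B."}},
    {"entity_id": 1, "timestamp": "2024-01-01T00:00:00+00:00",
     "properties": {"name": "Alice", "type": "Person"}},
]
relation_log = [
    {"source_id": 1, "target_id": 2},
    {"source_id": 1, "target_id": 3},
]

entities = merge_entity_updates(entity_log)
adjacency = build_adjacency_lists(relation_log)
```

In this example the merged entity keeps `type` from the older update and takes `name` from the newer one, and entity 1 ends up with the adjacency list `[2, 3]`.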
Features
- Incremental Updates: Support for timestamped updates to entities and relations
- Efficient Storage: Append-only logs with periodic compaction
- Flexible Indexing: Choose between SQLite or in-memory indexing
- Sharding: Support for large datasets through sharding
- Timestamp Tracking: All changes are tracked with UTC timestamps
Installation
pip install graph-builder
Requirements
- Python 3.12+
- SQLite3 (for SQLiteIndexer)
License
MIT License