Graph generation and storage library with update tracking

This library provides fast graph traversal and lookup over file-based storage that is both sharded and indexed. It also supports community lookup.

Features

  • Efficient reading of graph data from JSONL files
  • Support for multiple indexing strategies (SQLite and Memory)
  • Entity caching for improved performance
  • Adjacency list-based neighbor lookup
  • Property-based entity search
  • Community lookup
  • Configurable cache size

Architecture

The library is organized into the following components:

Core Components

  • GraphReader: Main class for reading and querying graph data
  • GraphReaderConfig: Configuration class for customizing reader behavior

Indexers

The library supports multiple indexing strategies through a plugin architecture:

  • BaseIndexer: Abstract base class for indexers
  • SQLiteIndexer: SQLite-based indexing for persistent storage
  • MemoryIndexer: In-memory indexing for faster access
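
The plugin arrangement above can be pictured as an abstract base class with interchangeable backends. The following is a minimal sketch and not the library's actual code: the class names `SketchIndexer`, `SketchMemoryIndexer`, and `SketchSQLiteIndexer`, the `add` and `lookup` methods, and the idea of mapping an entity ID to a shard file and byte offset are all assumptions made for illustration.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional, Tuple


class SketchIndexer(ABC):
    """Hypothetical interface: maps an entity ID to a (shard file, byte offset) pair."""

    @abstractmethod
    def add(self, entity_id: int, shard: str, byte_offset: int) -> None: ...

    @abstractmethod
    def lookup(self, entity_id: int) -> Optional[Tuple[str, int]]: ...


class SketchMemoryIndexer(SketchIndexer):
    """Keeps the index in a dict: fast, but rebuilt on every start."""

    def __init__(self) -> None:
        self._index = {}

    def add(self, entity_id, shard, byte_offset):
        self._index[entity_id] = (shard, byte_offset)

    def lookup(self, entity_id):
        return self._index.get(entity_id)


class SketchSQLiteIndexer(SketchIndexer):
    """Keeps the index in SQLite, so it can persist when db_path is a file."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS entity_index "
            "(entity_id INTEGER PRIMARY KEY, shard TEXT, byte_offset INTEGER)"
        )

    def add(self, entity_id, shard, byte_offset):
        self._conn.execute(
            "INSERT OR REPLACE INTO entity_index VALUES (?, ?, ?)",
            (entity_id, shard, byte_offset),
        )
        self._conn.commit()

    def lookup(self, entity_id):
        row = self._conn.execute(
            "SELECT shard, byte_offset FROM entity_index WHERE entity_id = ?",
            (entity_id,),
        ).fetchone()
        return (row[0], row[1]) if row else None
```

Because both backends expose the same two methods, calling code can switch between them without other changes, which is the point of the plugin design.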

Data Structure

The library expects data to be organized in the following directory structure:

base_dir/
├── entities/
│   └── shard_*.jsonl
├── relations/
│   └── shard_*.jsonl
└── adjacency/
    └── adjacency.jsonl
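
Before pointing a reader at a directory, it can help to confirm that the layout above is present. The helper below is a small sketch using only the standard library; the function name `missing_layout_parts` is hypothetical and is not part of the library.

```python
from pathlib import Path


def missing_layout_parts(base_dir: str) -> list:
    """Return the expected paths that are absent under base_dir (empty list if complete)."""
    root = Path(base_dir)
    missing = []
    # Entities and relations are sharded: expect at least one shard_*.jsonl in each.
    for sub in ("entities", "relations"):
        directory = root / sub
        if not directory.is_dir() or not any(directory.glob("shard_*.jsonl")):
            missing.append(f"{sub}/shard_*.jsonl")
    # The adjacency list lives in a single file.
    if not (root / "adjacency" / "adjacency.jsonl").is_file():
        missing.append("adjacency/adjacency.jsonl")
    return missing
```

An empty return value means all three parts of the layout were found.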

Architecture Diagram

graph TD
    GR[GraphReader]
    GC[GraphReaderConfig]
    BI[BaseIndexer]
    SI[SQLiteIndexer]
    MI[MemoryIndexer]
    EF[Entity Files]
    RF[Relation Files]
    AF[Adjacency File]
    DB[(SQLite DB)]

    GR --> GC
    GR --> BI
    SI --> BI
    MI --> BI
    GR --> EF
    GR --> RF
    GR --> AF
    SI --> DB

    style GR fill:#e6f3ff,stroke:#000000,stroke-width:2px,color:#000000
    style GC fill:#e6f3ff,stroke:#000000,stroke-width:2px,color:#000000
    style BI fill:#fff2e6,stroke:#000000,stroke-width:2px,color:#000000
    style SI fill:#fff2e6,stroke:#000000,stroke-width:2px,color:#000000
    style MI fill:#fff2e6,stroke:#000000,stroke-width:2px,color:#000000
    style EF fill:#f0fff0,stroke:#000000,stroke-width:2px,color:#000000
    style RF fill:#f0fff0,stroke:#000000,stroke-width:2px,color:#000000
    style AF fill:#f0fff0,stroke:#000000,stroke-width:2px,color:#000000
    style DB fill:#f0fff0,stroke:#000000,stroke-width:2px,color:#000000

Installation

pip install beanone-graph

Usage

from graph_reader import GraphReader, GraphReaderConfig

config = GraphReaderConfig(base_dir="graph_output")
reader = GraphReader(config)

# Get an entity
entity = reader.get_entity(1)
print("Entity:", entity)

# Get neighbors
neighbors = reader.get_neighbors(1)
print("Neighbors:", neighbors)

# Search
matches = reader.search_by_property("name", "Alice")
print("Matches:", matches)

# Get entity's community
community = reader.get_entity_community(1)
print("Community:", community)

# Get members of a community
members = reader.get_community_members("team_alpha")
print("Members:", members)

Search Queries

The library supports search queries that combine comparison operators, logical operators, and grouping. The examples below are based on a real graph structure:

Basic Property Search

# Simple equality
results = reader.search_by_property("name", "effect")

# Case-insensitive search
results = reader.search_by_property("name", "nuclear", case_sensitive=False)

# Numeric comparison
results = reader.search_by_property("community", 155, operator="==")

Complex Queries

# Multiple conditions with AND
query = "name:effect AND community:155"
results = reader.search(query)

# Multiple conditions with OR
query = "name:effect OR name:nuclear"
results = reader.search(query)

# Combining AND and OR
query = "community:155 AND (levels.0:>=3 OR levels.1:>=17)"
results = reader.search(query)

# Array value search
query = "keywords:protein OR keywords:formula"
results = reader.search(query)

# Multiple properties
query = "community:155 AND keywords:protein AND levels.0:>=3"
results = reader.search(query)
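
To make the AND, OR, and parenthesis semantics concrete, here is a minimal sketch of how such a query string could be evaluated against one entity. It is not the library's parser. The names `evaluate` and `leaf` are hypothetical, AND is assumed to bind more tightly than OR (the documentation does not state the precedence), and conditions that contain spaces, such as `type:@['user', 'admin']`, are not handled by this simple tokenizer.

```python
import re
from typing import Callable

# A token is a parenthesis or any run of characters without whitespace or parentheses.
_TOKEN = re.compile(r"\(|\)|[^\s()]+")


def evaluate(query: str, leaf: Callable[[str], bool]) -> bool:
    """Evaluate query, calling leaf(condition) for each field condition."""
    tokens = _TOKEN.findall(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def parse_or() -> bool:
        value = parse_and()
        while peek() == "OR":
            take()
            rhs = parse_and()  # always parse the right side so the position stays in step
            value = value or rhs
        return value

    def parse_and() -> bool:
        value = parse_atom()
        while peek() == "AND":
            take()
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom() -> bool:
        tok = take()
        if tok == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        if tok is None or tok in (")", "AND", "OR"):
            raise ValueError(f"unexpected token: {tok!r}")
        return leaf(tok)

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token: {tokens[pos]!r}")
    return result
```

The `leaf` callback decides whether a single condition such as `name:effect` holds for the entity, so the boolean structure and the per-field comparison stay separate.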

Search Operators

The following operators are supported:

  • : - Equals (default)
    • For arrays: checks if the value is in the array
    • Example: keywords:protein matches if "protein" is in the keywords array
  • :@ - Array membership
    • Checks if the property value is in the specified array
    • Example: type:@['user', 'admin'] matches if type is either 'user' or 'admin'
  • == - Equals (explicit)
  • != - Not equals
  • > - Greater than
  • >= - Greater than or equal
  • < - Less than
  • <= - Less than or equal
  • AND - Logical AND
  • OR - Logical OR
  • ( and ) - Grouping
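
The operator list can be illustrated with a sketch that checks one condition against one entity held as a plain dict. This is an illustration under stated assumptions, not the library's implementation: the function names `resolve`, `coerce`, and `matches` are hypothetical, comparison operators are assumed to follow the colon as in `levels.0:>=3`, and a missing property is assumed to fail every comparison.

```python
import ast
import operator
import re

_COMPARE = {
    "==": operator.eq, "!=": operator.ne,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}
# Field path, then ":", then an optional operator ("@" or a comparison), then the value.
_CONDITION = re.compile(r"^([\w.]+):(@|==|!=|>=|<=|>|<)?(.*)$")


def resolve(entity: dict, path: str):
    """Follow dot notation: 'levels.0' means entity['levels'][0]. Returns None if absent."""
    current = entity
    for part in path.split("."):
        if isinstance(current, (list, tuple)):
            if not part.isdigit() or int(part) >= len(current):
                return None
            current = current[int(part)]
        elif isinstance(current, dict):
            current = current.get(part)
        else:
            return None
    return current


def coerce(text: str):
    """Turn '3' into 3 and '1.5' into 1.5; leave any other text as a string."""
    if re.fullmatch(r"-?\d+", text):
        return int(text)
    if re.fullmatch(r"-?\d+\.\d+", text):
        return float(text)
    return text


def matches(entity: dict, condition: str) -> bool:
    m = _CONDITION.match(condition)
    if m is None:
        raise ValueError(f"cannot parse condition: {condition!r}")
    path, op, raw = m.groups()
    actual = resolve(entity, path)
    if op == "@":
        # field:@['a', 'b'] -> the property value must be one of the listed values
        return actual in ast.literal_eval(raw)
    expected = coerce(raw)
    if op is None:
        # Plain ":" -> equality, or membership when the property is an array
        if isinstance(actual, list):
            return expected in actual
        return actual == expected
    if actual is None:
        return False  # assumption: a missing property fails every comparison
    try:
        return _COMPARE[op](actual, expected)
    except TypeError:
        return False  # e.g. comparing a string with a number
```

Note how `:` and `:@` mirror each other: the first asks whether a value is in the entity's array, the second whether the entity's value is in the array given in the query.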

Search Examples by Use Case

Entity Search

# Find entities by community and level
query = "community:155 AND levels.0:>=3"

# Find entities by keywords (array contains)
query = "keywords:protein AND keywords:formula"

# Find entities by name pattern
query = "name:*nuclear*"

# Find entities with multiple keywords
query = "keywords:protein AND keywords:home AND keywords:formula"

# Find entities by type (array membership)
query = "type:@['user', 'admin']"

Level-based Search

# Find entities with high level 0 values
query = "levels.0:>=50"

# Find entities with specific level combinations
query = "levels.0:>=3 AND levels.1:>=17"

# Find entities with no higher level connections
query = "levels.3:0 AND levels.4:0"

Community Search

# Find entities in specific communities
query = "community:155 OR community:245"

# Find entities by community and keywords
query = "community:155 AND keywords:protein"

# Find entities by community and level distribution
query = "community:155 AND levels.0:>=3 AND levels.1:>=17"

Best Practices

  1. Use parentheses to group complex conditions
  2. Combine related conditions with AND
  3. Use OR for alternative values
  4. Use numeric operators for ranges
  5. Consider case sensitivity for text searches
  6. For array properties:
    • Use : to check if a value exists in an array
    • Use :@ to check if a property value is in a specified array
    • Combine multiple array conditions with AND to find entities with all specified values
  7. Use wildcards (*) for pattern matching in text fields
  8. Use dot notation for nested properties (e.g., levels.0)
  9. Combine community and level information for precise filtering
  10. Use keyword arrays for topic-based search
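
Points 5 and 7 interact: a wildcard pattern can still miss a value because of letter case. The sketch below shows one way wildcard matching with an optional case-insensitive mode could behave, using the standard library's `fnmatch` module. The function name `text_matches` is hypothetical, and the library may implement wildcards differently; note that `fnmatchcase` also treats `?` and `[...]` as special characters.

```python
from fnmatch import fnmatchcase


def text_matches(value: str, pattern: str, case_sensitive: bool = True) -> bool:
    """Match value against a pattern in which '*' stands for any run of characters."""
    if not case_sensitive:
        # Lower-case both sides so that only the letters, not their case, are compared.
        value, pattern = value.lower(), pattern.lower()
    return fnmatchcase(value, pattern)
```

With this sketch, the pattern `*nuclear*` matches "thermonuclear effect" but matches "Nuclear" only when `case_sensitive=False`.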

Configuration

The GraphReaderConfig class supports the following parameters:

  • base_dir: Base directory containing the graph data
  • indexer_type: Type of indexer to use ("sqlite" or "memory")
  • cache_size: Maximum number of entities to cache in memory
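
To show what a bounded entity cache implies, here is a minimal sketch of a cache that holds at most `cache_size` entities. The eviction policy is an assumption: the documentation does not say how the library chooses which entity to drop, so least-recently-used is used here for illustration. The class name `LRUEntityCache` and the `load` callback are hypothetical.

```python
from collections import OrderedDict
from typing import Callable, Hashable


class LRUEntityCache:
    """Hypothetical bounded cache: keeps at most cache_size entities, evicting the least recently used."""

    def __init__(self, cache_size: int, load: Callable[[Hashable], dict]) -> None:
        self._cache_size = cache_size
        self._load = load  # called on a cache miss, e.g. to read the entity from disk
        self._entries = OrderedDict()

    def get(self, entity_id):
        if entity_id in self._entries:
            self._entries.move_to_end(entity_id)  # mark as most recently used
            return self._entries[entity_id]
        entity = self._load(entity_id)
        self._entries[entity_id] = entity
        if len(self._entries) > self._cache_size:
            self._entries.popitem(last=False)  # drop the least recently used entry
        return entity
```

A larger `cache_size` trades memory for fewer reads from storage; a smaller one does the opposite.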

License

This project is licensed under the MIT License - see the LICENSE file for details.
