GraphRAG-style Python SDK for GibRAM - Graph in-Buffer Retrieval & Associative Memory

GibRAM Python SDK v0.1.0

GraphRAG-style knowledge graph indexing with automatic entity extraction, relationship detection, and community discovery.

Installation

From a checkout of the repository, install the SDK in editable mode:

cd sdk/python
pip install -e .

Quick Start

from gibram import GibRAMIndexer

# Initialize indexer with OpenAI
indexer = GibRAMIndexer(
    session_id="my-project",
    llm_api_key="sk-...",  # or set the OPENAI_API_KEY environment variable
)

# Index documents (automatic chunking, extraction, embedding)
stats = indexer.index_documents([
    "Einstein was born in 1879 in Ulm, Germany.",
    "He developed the theory of relativity in 1905.",
    "Einstein received the Nobel Prize in Physics in 1921.",
])

print(f"Indexed {stats.entities_extracted} entities in {stats.indexing_time_seconds:.2f}s")

# Query knowledge graph
result = indexer.query("Einstein's achievements", top_k=5)

for entity in result.entities:
    print(f"{entity.title} ({entity.type}): {entity.description}")

Configuration

Environment Variables

export OPENAI_API_KEY="sk-..."
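
If you want to fail early when no key is available, the lookup order described above (explicit argument first, then OPENAI_API_KEY) can be checked before the indexer is constructed. This is a minimal sketch; `resolve_api_key` is a hypothetical helper, not part of the SDK.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(explicit: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return the explicit key if given, else OPENAI_API_KEY from the environment.

    Hypothetical helper for illustration; the SDK does its own lookup.
    """
    env = os.environ if env is None else env
    key = explicit or env.get("OPENAI_API_KEY", "")
    if not key.strip():
        raise RuntimeError("No API key: pass llm_api_key or set OPENAI_API_KEY")
    return key
```

The result can then be passed as `llm_api_key=resolve_api_key()`.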

Initialization Parameters

indexer = GibRAMIndexer(
    # Required
    session_id="unique-project-id",
    
    # Server connection
    host="localhost",
    port=6161,
    
    # LLM configuration
    llm_provider="openai",           # Only OpenAI supported in v0.1.0
    llm_api_key="sk-...",            # Falls back to OPENAI_API_KEY if omitted
    llm_model="gpt-4o",              # GPT-4o recommended
    
    # Embedding configuration
    embedding_provider="openai",
    embedding_model="text-embedding-3-small",
    embedding_dimensions=1536,       # Must match server config
    
    # Chunking configuration
    chunk_size=512,                  # Tokens per chunk
    chunk_overlap=50,                # Overlap between chunks
    
    # Community detection
    auto_detect_communities=True,    # Auto-run after indexing
    community_resolution=1.0,        # Leiden algorithm resolution
)
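
To see how chunk_size and chunk_overlap interact, the sketch below splits a token list into overlapping windows. It is an illustration only: it assumes the tokens are already available as a list, and `chunk_tokens` is a hypothetical function. The SDK's own tokenizer and chunker are not documented here and may behave differently.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int = 512,
                 chunk_overlap: int = 50) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks
```

With the defaults, each window advances by 462 tokens, so consecutive chunks share 50 tokens of context.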

API Reference

GibRAMIndexer

Main class for indexing and querying.

index_documents(documents, batch_size=10, show_progress=True) -> IndexStats

Index documents into the knowledge graph.

Arguments:

  • documents: List of strings or dicts {"id": ..., "text": ..., "metadata": ...}
  • batch_size: Batch size for LLM/API calls (default: 10)
  • show_progress: Show progress bar (default: True)

Returns: IndexStats with counts and timing

Pipeline:

  1. Chunk documents → TextUnits
  2. Extract entities & relationships (LLM)
  3. Generate embeddings
  4. Store in graph
  5. Link entities to text units
  6. Detect communities (if enabled)
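
The batch_size argument groups the LLM/API calls made during extraction and embedding. As a rough picture of what batching means, the sketch below slices a sequence into groups of at most batch_size items. `batched` is a hypothetical helper, and how the SDK schedules its calls internally is an assumption.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int = 10) -> Iterator[List[T]]:
    """Yield consecutive groups of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])
```

A smaller batch_size means more, smaller requests; a larger one means fewer requests with more text in each.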

Example:

stats = indexer.index_documents([
    {"id": "doc1", "text": "...", "metadata": {"source": "wiki"}},
    {"id": "doc2", "text": "..."},
])
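
Because documents may be plain strings or dicts, it can help to normalize mixed input into one shape before indexing. The sketch below does that in user code. `normalize_documents` and the `doc-N` fallback ID are hypothetical; the SDK may assign IDs differently.

```python
from typing import Any, Dict, List, Union

Document = Union[str, Dict[str, Any]]


def normalize_documents(documents: List[Document]) -> List[Dict[str, Any]]:
    """Convert strings and dicts into {"id", "text", "metadata"} records."""
    normalized: List[Dict[str, Any]] = []
    for position, doc in enumerate(documents):
        if isinstance(doc, str):
            doc = {"text": doc}
        elif not isinstance(doc, dict):
            raise TypeError(f"document {position} must be a str or dict")

        text = doc.get("text")
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"document {position} has no text")

        normalized.append({
            "id": doc.get("id", f"doc-{position}"),  # hypothetical fallback ID
            "text": text,
            "metadata": dict(doc.get("metadata") or {}),
        })
    return normalized
```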

query(query, mode="local", top_k=10, include_entities=True, include_text_units=True, include_communities=False) -> QueryResult

Query the knowledge graph.

Arguments:

  • query: Natural language query
  • mode: Query mode (v0.1.0 only supports "local")
  • top_k: Number of results (default: 10)
  • include_entities: Include entity results (default: True)
  • include_text_units: Include text unit results (default: True)
  • include_communities: Include community results (default: False)

Returns: QueryResult with scored results

Example:

result = indexer.query("machine learning applications", top_k=5)

for entity in result.entities:
    print(f"{entity.title}: {entity.score:.3f}")

for text_unit in result.text_units:
    print(f"{text_unit.content[:100]}... (score: {text_unit.score:.3f})")
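
A common next step is to assemble the retrieved text units into a context string for a downstream prompt. The sketch below uses a stand-in dataclass with only the two fields shown in the example above (`content` and `score`); `TextUnitStub` and `build_context` are hypothetical names, not SDK types.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextUnitStub:
    """Stand-in with the two fields used above; not the SDK's ScoredTextUnit."""
    content: str
    score: float


def build_context(text_units: List[TextUnitStub], max_chars: int = 2000,
                  min_score: float = 0.0) -> str:
    """Join the best-scoring text units until the character budget is used."""
    parts: List[str] = []
    used = 0
    for unit in sorted(text_units, key=lambda u: u.score, reverse=True):
        if unit.score < min_score:
            break  # sorted by score, so every remaining unit is below the cutoff
        snippet = unit.content.strip()
        if used + len(snippet) > max_chars:
            continue  # too long for the remaining budget; a shorter one may fit
        parts.append(snippet)
        used += len(snippet)
    return "\n\n".join(parts)
```

With the real SDK, `result.text_units` could be passed in directly as long as its items expose `content` and `score`.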

get_stats() -> IndexStats

Return the current indexing statistics.

close()

Close the connection to the server.

Types

IndexStats

@dataclass
class IndexStats:
    documents_indexed: int = 0
    text_units_created: int = 0
    entities_extracted: int = 0
    relationships_extracted: int = 0
    communities_detected: int = 0
    indexing_time_seconds: float = 0.0
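
The raw counts are often easier to read as ratios. The sketch below derives a few from an IndexStats-shaped object. The dataclass is repeated locally so the example runs on its own, and `summarize` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class IndexStats:  # local copy of the fields listed above, for a runnable example
    documents_indexed: int = 0
    text_units_created: int = 0
    entities_extracted: int = 0
    relationships_extracted: int = 0
    communities_detected: int = 0
    indexing_time_seconds: float = 0.0


def summarize(stats: IndexStats) -> Dict[str, float]:
    """Derive per-document and throughput ratios, guarding against division by zero."""
    docs = stats.documents_indexed
    entities = stats.entities_extracted
    seconds = stats.indexing_time_seconds
    return {
        "entities_per_document": entities / docs if docs else 0.0,
        "relationships_per_entity": stats.relationships_extracted / entities if entities else 0.0,
        "documents_per_second": docs / seconds if seconds > 0 else 0.0,
    }
```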

QueryResult

@dataclass
class QueryResult:
    entities: List[ScoredEntity]
    text_units: List[ScoredTextUnit]
    communities: List[ScoredCommunity]
    execution_time_ms: float

ScoredEntity

@dataclass
class ScoredEntity:
    id: int
    title: str
    type: str
    description: str
    score: float  # Similarity score
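
Scored entities can be post-processed in plain Python. The sketch below drops low-scoring entities and groups the remaining titles by type. The dataclass is repeated locally so the example is self-contained, and `group_by_type` is a hypothetical helper.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ScoredEntity:  # local copy of the fields listed above
    id: int
    title: str
    type: str
    description: str
    score: float


def group_by_type(entities: List[ScoredEntity],
                  min_score: float = 0.0) -> Dict[str, List[str]]:
    """Group entity titles by type, best score first, skipping low scores."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for entity in sorted(entities, key=lambda e: e.score, reverse=True):
        if entity.score >= min_score:
            groups[entity.type].append(entity.title)
    return dict(groups)
```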

Exceptions

All exceptions inherit from GibRAMError:

  • ConnectionError: Server connection failed
  • TimeoutError: Operation timed out
  • ProtocolError: Protocol encoding/decoding error
  • ServerError: Server returned error
  • NotFoundError: Resource not found
  • ValidationError: Input validation failed
  • ExtractionError: LLM extraction failed
  • EmbeddingError: Embedding generation failed
  • ConfigurationError: Invalid configuration
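
Note that ConnectionError and TimeoutError are also names of Python built-in exceptions, so importing the SDK classes by name into a module shadows the built-ins there. The sketch below shows one way to retry transient failures while letting other errors propagate. It uses local stand-in classes so it runs on its own; the `Sdk*` names and `with_retries` are hypothetical, and the module path for importing the real exceptions is not documented here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


# Stand-ins for the hierarchy described above; use the SDK's classes in real code.
class GibRAMError(Exception):
    pass


class SdkConnectionError(GibRAMError):  # stands in for the SDK's ConnectionError
    pass


class SdkTimeoutError(GibRAMError):  # stands in for the SDK's TimeoutError
    pass


class ValidationError(GibRAMError):
    pass


TRANSIENT = (SdkConnectionError, SdkTimeoutError)


def with_retries(operation: Callable[[], T], attempts: int = 3,
                 base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying transient errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TRANSIENT:
            if attempt == attempts:
                raise  # out of attempts: surface the last transient error
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Errors such as ValidationError are not retried, since repeating the same invalid input would fail again.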

Advanced Usage

Custom Extractors

Implement BaseExtractor for custom entity/relationship extraction:

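from typing import List, Tuple

from gibram import GibRAMIndexer
from gibram.extractors import BaseExtractor
from gibram.types import ExtractedEntity, ExtractedRelationship

class MyExtractor(BaseExtractor):
    # typing.List/Tuple keep the annotations valid on Python 3.8
    def extract(self, text: str) -> Tuple[List[ExtractedEntity], List[ExtractedRelationship]]:
        # Replace these placeholders with your own extraction logic
        entities: List[ExtractedEntity] = []
        relationships: List[ExtractedRelationship] = []
        return entities, relationships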

indexer = GibRAMIndexer(
    session_id="custom",
    extractor=MyExtractor(),
    embedder=...,  # Still need embedder
)
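
As an idea of what the body of a custom extract method might do, the sketch below is a deliberately naive rule-based extractor: capitalized word runs become entities, and names that share a sentence become relationships. `EntityStub`, `RelationshipStub` and their fields are hypothetical stand-ins, because the fields of ExtractedEntity and ExtractedRelationship are not listed in this document.

```python
import re
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class EntityStub:  # hypothetical stand-in, not the SDK's ExtractedEntity
    title: str
    type: str


@dataclass(frozen=True)
class RelationshipStub:  # hypothetical stand-in, not the SDK's ExtractedRelationship
    source: str
    target: str
    description: str


# One or more capitalized words in a row, e.g. "Ulm" or "Nobel Prize".
_NAME = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)*\b")


def extract(text: str) -> Tuple[List[EntityStub], List[RelationshipStub]]:
    """Naive extraction: capitalized runs are entities, sentence co-occurrence links them."""
    entities: Dict[str, EntityStub] = {}
    relationships: List[RelationshipStub] = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # dict.fromkeys removes duplicates while keeping first-seen order
        names = list(dict.fromkeys(_NAME.findall(sentence)))
        for name in names:
            entities.setdefault(name, EntityStub(title=name, type="UNKNOWN"))
        for source, target in combinations(names, 2):
            relationships.append(
                RelationshipStub(source, target, "mentioned in the same sentence"))
    return list(entities.values()), relationships
```

This heuristic also picks up sentence-initial words such as "He", so it is only useful as a placeholder while wiring up the extractor interface.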

Custom Embedders

Implement BaseEmbedder for custom embeddings:

from typing import List

from gibram import GibRAMIndexer
from gibram.embedders import BaseEmbedder

class MyEmbedder(BaseEmbedder):
    def embed(self, texts: List[str]) -> List[List[float]]:
        # Placeholder: one vector per input text, sized to embedding_dimensions (1536 here)
        return [[0.0] * 1536 for _ in texts]

    def embed_single(self, text: str) -> List[float]:
        return self.embed([text])[0]

indexer = GibRAMIndexer(
    session_id="custom",
    embedder=MyEmbedder(),
)
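
For offline tests it can be handy to have an embedder that needs no API key. The sketch below is a deterministic feature-hashing embedder built on the standard library only. `HashingEmbedder` is a hypothetical stand-in exposing the same two methods as the example above; in real use it would subclass BaseEmbedder, and its vectors carry no semantic meaning.

```python
import hashlib
import math
from typing import List


class HashingEmbedder:
    """Hash each whitespace token into one of `dimensions` buckets, then L2-normalize."""

    def __init__(self, dimensions: int = 1536):
        if dimensions < 1:
            raise ValueError("dimensions must be at least 1")
        self.dimensions = dimensions

    def embed(self, texts: List[str]) -> List[List[float]]:
        return [self.embed_single(text) for text in texts]

    def embed_single(self, text: str) -> List[float]:
        vector = [0.0] * self.dimensions
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "big") % self.dimensions
            sign = 1.0 if digest[4] % 2 == 0 else -1.0  # signed hashing
            vector[index] += sign
        norm = math.sqrt(sum(v * v for v in vector))
        # Leave an all-zero vector (e.g. for empty text) as it is
        return [v / norm for v in vector] if norm else vector
```

Whatever embedder is used, the vector length has to match the embedding_dimensions setting and the server configuration.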

Context Manager

Use a context manager for automatic cleanup:

with GibRAMIndexer(session_id="project") as indexer:
    stats = indexer.index_documents(documents)  # documents: list of strings or dicts
    result = indexer.query("some query")
# The connection is closed automatically when the block exits
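
The cleanup guarantee comes from Python's context manager protocol. The sketch below shows the general pattern with a stand-in class. `ManagedClient` is hypothetical, and how GibRAMIndexer implements the protocol internally is an assumption.

```python
class ManagedClient:
    """Stand-in object that closes itself when a with-block exits."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "ManagedClient":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.close()   # runs on normal exit and when an exception is raised
        return False   # do not swallow the exception
```

Because `__exit__` runs even when the block raises, the connection is released on error paths too, which is the main advantage over calling close() by hand.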

Requirements

  • Python 3.8+
  • GibRAM server running (Docker recommended)
  • OpenAI API key (for extraction & embeddings)

Server Setup

Start the GibRAM server with Docker. EMBEDDING_DIM should match the embedding_dimensions value passed to GibRAMIndexer:

docker run -d \
  --name gibram-server \
  -p 6161:6161 \
  -e EMBEDDING_DIM=1536 \
  gibram:latest
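
Before constructing an indexer, you can check that something is listening on the server port. The sketch below uses only the standard library and tests plain TCP reachability. It does not speak the GibRAM protocol, and `server_reachable` is a hypothetical helper.

```python
import socket


def server_reachable(host: str = "localhost", port: int = 6161,
                     timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unresolvable host, ...
        return False
```

A True result only means the port accepts connections; a ConnectionError from the SDK can still occur if another service is bound to that port.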

License

MIT

Version

v0.1.0 - Initial release with OpenAI extraction & embeddings
