Lightweight knowledge graph ingestion and enrichment pipeline

LiteGraf

Lightweight knowledge graph ingestion and query pipeline. Insert text or documents, extract entities and relationships with an LLM, store them in a graph database, and query with natural language.

from pipeline.litegraf import LiteGraf

kg = LiteGraf()
kg.insert("TP53 is associated with multiple cancers including breast and lung cancer.")
result = kg.query("What cancers are associated with TP53?")
print(result.answer)

Features

Single entry point — LiteGraf dataclass with sensible defaults, override only what you need
Pluggable backends — Neo4j, Memgraph, Ollama, Cloudflare Workers AI, AWS Bedrock
Sync and async — insert() / ainsert(), query() / aquery()
Content deduplication — hash-based, idempotent inserts
LLM response caching — disk-based, avoids redundant API calls
Rate limiting — async concurrency limiter for LLM providers
PDF and document ingestion — via MarkItDown + PyMuPDF
Benchmarking suite — compare extraction quality across LLM providers
Enrichment pipeline — entity resolution, ontology integration, evidence scoring
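
To make the content deduplication bullet concrete, here is a minimal sketch of hash-based, idempotent inserts. It is not LiteGraf's implementation: the `DedupStore` class, its `insert` method, and the whitespace normalization step are all assumptions chosen for illustration.

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory store that skips content it has already seen."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.documents: list[str] = []

    @staticmethod
    def content_hash(text: str) -> str:
        # Assumption: collapse whitespace so trivially different copies hash the same.
        normalized = " ".join(text.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def insert(self, text: str) -> bool:
        """Store text and return True, or return False if it is a duplicate."""
        digest = self.content_hash(text)
        if digest in self._seen:
            return False
        self._seen.add(digest)
        self.documents.append(text)
        return True


store = DedupStore()
first = store.insert("TP53 suppresses tumor growth.")
second = store.insert("TP53  suppresses tumor growth.")  # extra space, same content
```

Because the hash is computed from the content alone, repeating an insert leaves the store unchanged, which is what makes the operation idempotent.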

Install

Requires Python 3.11+.

pip install litegraf

With optional backends:

pip install "litegraf[neo4j]"       # Neo4j graph store
pip install "litegraf[bedrock]"     # AWS Bedrock LLM
pip install "litegraf[all]"         # Everything (quotes keep shells like zsh from expanding the brackets)

Or from source with uv:

git clone https://github.com/graffold/litegraf.git
cd litegraf
uv sync --all-extras

Quick Start

Default setup (Ollama + Neo4j)

Start Ollama and Neo4j locally, then:

from pipeline.litegraf import LiteGraf

kg = LiteGraf()  # connects to localhost defaults

# Insert text
kg.insert("BRCA1 interacts with RAD51 in DNA repair pathways.")

# Insert a PDF (read as bytes; the context manager closes the file)
with open("paper.pdf", "rb") as f:
    kg.insert(f.read())

# Query
result = kg.query("What proteins interact with BRCA1?")
print(result.answer)
print(result.context)  # retrieved graph context

Cloudflare Workers AI (free tier)

kg = LiteGraf(
    llm="cloudflare",
    llm_model="@cf/meta/llama-3.1-8b-instruct-fp8",
)

Memgraph backend

kg = LiteGraf(
    graph_store="memgraph",
    graph_uri="bolt://localhost:7687",
    graph_user="",
    graph_password="",
)

Async usage

import asyncio
from pipeline.litegraf import LiteGraf

async def main():
    kg = LiteGraf()
    await kg.ainsert("TP53 suppresses tumor growth.")
    result = await kg.aquery("What does TP53 do?")
    print(result.answer)

asyncio.run(main())
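
The rate limiting feature is described as an async concurrency limiter. One common way to build such a limiter is with `asyncio.Semaphore`, sketched below. This is an illustration only and may differ from what LiteGraf does internally: `run_limited`, `fake_llm_call`, and the `stats` dictionary are hypothetical names, and the sleep stands in for a real provider request.

```python
import asyncio

# Tracks how many fake calls are in flight, to show the limit is respected.
stats = {"active": 0, "peak": 0}


async def run_limited(coros, max_concurrent: int = 4):
    """Await all coroutines, allowing at most max_concurrent to run at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(coro):
        async with semaphore:
            return await coro

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(c) for c in coros))


async def fake_llm_call(i: int) -> int:
    # Stand-in for a provider request; sleeps briefly instead of calling an API.
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.01)
    stats["active"] -= 1
    return i * 2


results = asyncio.run(
    run_limited([fake_llm_call(i) for i in range(10)], max_concurrent=3)
)
```

The same wrapper could, in principle, be applied to a batch of `kg.ainsert(...)` coroutines, though whether that is needed on top of the built-in limiter is not stated here.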

Query modes

# Full pipeline: retrieve context → LLM synthesis
result = kg.query("What cancers involve TP53?")

# Context only (bring your own LLM prompt)
result = kg.query("TP53", mode="only_context")
for chunk in result.context:
    print(chunk.text, chunk.score)
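
For the context-only mode, a sketch of turning retrieved chunks into your own prompt might look like the following. The `Chunk` dataclass is a hypothetical stand-in for the items in `result.context` (only the `text` and `score` attributes come from the example above), `build_prompt` is an illustrative helper, and the sketch assumes that a higher score means a more relevant chunk.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for an item of result.context."""
    text: str
    score: float


def build_prompt(question: str, chunks: list[Chunk], top_k: int = 3) -> str:
    """Format the top_k highest-scoring chunks into a grounded prompt."""
    # Assumption: higher score means more relevant.
    ranked = sorted(chunks, key=lambda c: c.score, reverse=True)[:top_k]
    context = "\n".join(f"[{i}] {c.text}" for i, c in enumerate(ranked, start=1))
    return (
        "Answer the question using only the numbered context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


chunks = [
    Chunk("TP53 is associated with breast cancer.", 0.91),
    Chunk("BRCA1 interacts with RAD51.", 0.40),
    Chunk("TP53 is associated with lung cancer.", 0.87),
]
prompt = build_prompt("What cancers involve TP53?", chunks, top_k=2)
```

The resulting string can be sent to any LLM client of your choice.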

Configuration

All parameters can be set via the LiteGraf constructor:

| Parameter | Default | Description |
| --- | --- | --- |
| `graph_store` | `"neo4j"` | Graph backend: `"neo4j"`, `"memgraph"`, or a graph store instance |
| `graph_uri` | `"bolt://localhost:7687"` | Bolt connection URI |
| `graph_user` | `"neo4j"` | Graph database username |
| `graph_password` | `""` | Graph database password |
| `llm` | `"ollama"` | LLM provider: `"ollama"`, `"cloudflare"`, or `"bedrock"` |
| `llm_model` | `"llama3"` | Model name or ID |
| `embedding` | `"local"` | Embedding provider (local sentence-transformers) |
| `chunk_token_size` | `512` | Tokens per chunk |
| `enable_cache` | `True` | Cache LLM responses to disk |
| `enable_dedup` | `True` | Skip duplicate content on insert |
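
To give a feel for what `chunk_token_size` controls, here is a rough sketch of fixed-size chunking. It is an approximation under stated assumptions: tokens are taken to be whitespace-separated words, chunks do not overlap, and `chunk_tokens` is a hypothetical helper. LiteGraf's actual tokenizer and chunking strategy are not described here and may differ.

```python
def chunk_tokens(text: str, chunk_token_size: int = 512) -> list[str]:
    """Split text into chunks of at most chunk_token_size whitespace tokens."""
    if chunk_token_size <= 0:
        raise ValueError("chunk_token_size must be positive")
    # Assumption: one whitespace-separated word counts as one token.
    tokens = text.split()
    return [
        " ".join(tokens[start:start + chunk_token_size])
        for start in range(0, len(tokens), chunk_token_size)
    ]


sample = " ".join(f"w{i}" for i in range(1200))
pieces = chunk_tokens(sample, chunk_token_size=512)
sizes = [len(piece.split()) for piece in pieces]
```

With 1200 tokens and the default size, the sketch yields two full chunks and one shorter final chunk.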

Benchmarks

Compare extraction quality across LLM providers on biomedical datasets:

python -m pipeline.benchmarks

Results are published to docs/ for GitHub Pages viewing.

Development

uv sync --all-extras --group dev
uv run pytest
uv run ruff check src/

License

AGPL-3.0

Version 0.1.0, released May 6, 2026.

Download files

Source Distribution

litegraf-0.1.0.tar.gz (410.8 kB view details)

Uploaded May 6, 2026 Source

Built Distribution

litegraf-0.1.0-py3-none-any.whl (471.6 kB view details)

Uploaded May 6, 2026 Python 3

File details

Details for the file litegraf-0.1.0.tar.gz.

File metadata

Download URL: litegraf-0.1.0.tar.gz
Upload date: May 6, 2026
Size: 410.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for litegraf-0.1.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `f729d200726449dda19a7472b73c89218f373be43e23facd9c6bd9a07a2e7bae` |
| MD5 | `1717940169d00e79015c01479f619d5c` |
| BLAKE2b-256 | `bcb18b1beedc5aa565d00a8bca35e8f7ac00931015e4d235ecf0877c4fe65295` |
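
A downloaded file can be checked against its published SHA256 digest with the standard library alone. The sketch below is a generic illustration rather than part of LiteGraf: `sha256_of` and `matches_published` are hypothetical helper names, and the demo hashes a small temporary file instead of the real archive.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, block_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in blocks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published value, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a temporary file; for a real check, pass the downloaded file
# and the SHA256 value listed for it.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.bin"
    demo.write_bytes(b"abc")
    ok = matches_published(demo, hashlib.sha256(b"abc").hexdigest())
    bad = matches_published(demo, "0" * 64)
```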

File details

Details for the file litegraf-0.1.0-py3-none-any.whl.

File metadata

Download URL: litegraf-0.1.0-py3-none-any.whl
Upload date: May 6, 2026
Size: 471.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for litegraf-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `2413732667ac4ae6f40431a434ec09d79dd17ef85617770cb8c8f705dc161b8e` |
| MD5 | `07c105369c4e69b1eccd844cfe27698a` |
| BLAKE2b-256 | `1c0ebd53aa6e97228fb366aa00c5693d29f725cd9159aea4477ced212f5fb0dc` |
