
Graph-backed knowledge extraction and agent memory for LLM systems


GraWiki

GraWiki is an early-stage open source Python project for graph-backed knowledge extraction, retrieval, and agent memory.

It combines two closely related ideas:

  1. GraphRAG-style ingestion and retrieval over extracted entities and relationships.
  2. Andrej Karpathy's "LLM Wiki" style memory, where prior agent work is stored as durable graph state instead of only transient prompt context.

The project uses an LLM to turn text into structured graph data, persists that data in a graph database, and then reuses the same graph for search, memory recall, similarity inspection, and duplicate-entity cleanup.

What GraWiki does

GraWiki currently focuses on two main workflows.

  1. Document-to-graph ingestion. Source documents are read, chunked, embedded, processed by an LLM extractor, and persisted as document nodes, chunk nodes, entity nodes, and typed relationships.
  2. Graph-backed agent memory. Agent outputs can be stored as dedicated __memory__ nodes, linked back into the graph, and later recalled together with connected context.
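The node types involved in these two workflows can be sketched as plain data classes. This is a simplified, hypothetical shape for illustration only; GraWiki's actual schema lives in src/grawiki/graph/models.py and is richer than this:

```python
from dataclasses import dataclass, field

# Hypothetical, simplified node shapes -- not GraWiki's real schema.

@dataclass
class Document:
    doc_id: str
    title: str

@dataclass
class Chunk:
    chunk_id: str
    doc_id: str          # link back to the owning document
    text: str
    embedding: list[float] = field(default_factory=list)

@dataclass
class Entity:
    name: str
    entity_type: str     # e.g. "Person", "Technology"

@dataclass
class Memory:
    """An agent output stored as a __memory__ node, linked into the graph."""
    memory_id: str
    content: str
    linked_entities: list[str] = field(default_factory=list)

# A chunk mentioning an entity, and a memory linked back to that entity:
chunk = Chunk("c1", "d1", "FalkorDB is a graph database.")
entity = Entity("FalkorDB", "Technology")
memory = Memory("m1", "User prefers FalkorDBLite for local runs.",
                linked_entities=[entity.name])
```

The key idea is that agent memories are first-class graph nodes with edges to extracted entities, so recall can traverse from a memory into its surrounding context.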

Current capabilities include:

  • reading source documents and splitting them into chunks,
  • extracting entities and relationships from chunk text,
  • persisting documents, chunks, entities, relationships, and memories in a graph database,
  • retrieving graph-backed context with vector and full-text search,
  • expanding graph context around matched entities and memories,
  • inspecting duplicate candidates through semantic-key collision checks and pluggable similarity matchers,
  • merging duplicate entities through the facade-level deduplication workflow.
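The semantic-key collision check from the list above can be sketched independently of GraWiki's actual matcher interface. The normalization rule and function names here are hypothetical, chosen only to show the pattern:

```python
from collections import defaultdict

def semantic_key(name: str, entity_type: str) -> str:
    """Hypothetical normalization: lowercase and collapse whitespace."""
    return f"{entity_type.lower()}::{' '.join(name.lower().split())}"

def duplicate_candidates(entities):
    """Group entities whose semantic keys collide; collisions are
    duplicate candidates that a similarity matcher would then inspect."""
    buckets = defaultdict(list)
    for name, etype in entities:
        buckets[semantic_key(name, etype)].append(name)
    return {k: v for k, v in buckets.items() if len(v) > 1}

entities = [
    ("FalkorDB", "Technology"),
    ("falkordb ", "Technology"),   # same key after normalization
    ("Filip Wojcik", "Person"),
]
print(duplicate_candidates(entities))
# {'technology::falkordb': ['FalkorDB', 'falkordb ']}
```

A pluggable matcher would refine these candidate buckets (e.g. with embedding similarity) before the facade-level merge runs.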

Installation

Install the base package:

pip install grawiki

Install the local file-backed FalkorDBLite backend:

pip install 'grawiki[falkordblite]'

Install the full FalkorDB server backend:

pip install 'grawiki[falkordb]'

Useful optional extras:

  • grawiki[notebooks] for the maintained notebooks.
  • grawiki[viz] for networkx and matplotlib graph visualization.
  • grawiki[docs] for local MkDocs builds.
  • grawiki[all] for the full optional dependency set.

Package layout

The public repository is organized into four top-level areas.

  • src/grawiki/: main application package.
  • tests/: pytest coverage for the facade, retrieval layer, graph models, extraction, query generation, and FalkorDB adapter.
  • docs/: public MkDocs documentation, including narrative pages and generated API reference pages under docs/api/.
  • notebooks/: maintained tutorial notebooks plus sample input data.

Main package structure

  • grawiki.core: shared source-data types and the embedding protocol.
  • grawiki.doc_processing: document loading and chunking.
  • grawiki.graph: graph schema, extraction, and prompts.
  • grawiki.db: backend-agnostic database interfaces plus the FalkorDB implementation.
  • grawiki.retrieval: query-time retrieval strategies.
  • grawiki.similarity: duplicate-candidate inspection, similarity matchers, and deduplication helpers.
  • grawiki.rag: the GraphRAG facade that ties ingestion, retrieval, memory, and deduplication together.
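The embedding protocol mentioned for grawiki.core is not spelled out here; a typical shape for such a backend-agnostic protocol, purely as an illustration, would be:

```python
from typing import Protocol

class Embedder(Protocol):
    """Illustrative protocol only; GraWiki's actual interface may differ."""
    def embed(self, texts: list[str]) -> list[list[float]]: ...

class HashEmbedder:
    """Toy deterministic-per-run embedder, useful in tests -- not semantic."""
    def __init__(self, dim: int = 8):
        self.dim = dim

    def embed(self, texts: list[str]) -> list[list[float]]:
        # Derive pseudo-features from hashes, scaled into [0, 1).
        return [[(hash((t, i)) % 1000) / 1000 for i in range(self.dim)]
                for t in texts]

def make_vectors(embedder: Embedder, texts: list[str]) -> list[list[float]]:
    return embedder.embed(texts)

vecs = make_vectors(HashEmbedder(), ["hello", "world"])
```

Structural typing via Protocol lets any embedding provider (local model, API client, test stub) plug in without inheriting from a base class.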

Core entrypoints

  • grawiki.GraphRAG: the main public facade.
  • src/grawiki/rag/graph_rag.py: end-to-end ingestion, search, recall, memory, and deduplication flows.
  • src/grawiki/graph/models.py: the canonical graph schema.
  • src/grawiki/db/base.py: the backend contract.
  • src/grawiki/similarity/: the duplicate-inspection and merge-support surface.

Runtime flow

At a high level, GraWiki works like this:

  1. A source document is loaded and split into chunks.
  2. Documents and chunks are embedded and persisted.
  3. Each chunk is sent to an LLM extractor to produce nodes and relationships.
  4. Extracted entities can optionally be resolved against existing persisted entities during ingest.
  5. The resulting graph is stored and becomes available for retrieval, memory linking, and later deduplication.
  6. Queries are handled through configured retrievers that combine text search, vector search, and graph-context expansion.
  7. Memory recall searches __memory__ nodes first, then expands linked graph context around them.
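Steps 6 and 7 above combine scoring with graph-context expansion. A minimal standalone sketch of that retrieval shape, using a toy in-memory graph rather than GraWiki's retrievers or a real database, looks like this:

```python
import math

# Toy graph: entity -> one-hop neighbors, plus per-entity text and vector.
graph = {
    "FalkorDB": {"neighbors": {"Redis", "GraphRAG"},
                 "text": "FalkorDB is a graph database built on Redis.",
                 "vec": [1.0, 0.0]},
    "Redis":    {"neighbors": {"FalkorDB"},
                 "text": "Redis is an in-memory data store.",
                 "vec": [0.8, 0.2]},
    "GraphRAG": {"neighbors": {"FalkorDB"},
                 "text": "GraphRAG combines retrieval with graph context.",
                 "vec": [0.1, 0.9]},
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def retrieve(query_text, query_vec, k=1):
    """Score by text match plus vector similarity, then expand one hop."""
    scored = []
    for name, node in graph.items():
        text_hit = 1.0 if query_text.lower() in node["text"].lower() else 0.0
        scored.append((text_hit + cosine(query_vec, node["vec"]), name))
    top = [name for _, name in sorted(scored, reverse=True)[:k]]
    # Graph-context expansion: pull in one-hop neighbors of each match.
    context = set(top)
    for name in top:
        context |= graph[name]["neighbors"]
    return top, context

top, context = retrieve("graph database", [1.0, 0.0])
```

The scoring formula and expansion depth are placeholders; the point is the two-phase shape: match first, then widen along graph edges.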

Documentation

Public documentation lives in docs/ and is built with MkDocs Material. It includes:

  • conceptual background,
  • flow documentation,
  • a project structure page,
  • generated API reference pages centered on GraphRAG.

Tutorial notebooks

The repository ships three tutorial notebooks under notebooks/:

  • 01_ingest_and_deduplicate.ipynb: step-by-step ingestion, entity inspection, duplicate finding, deduplication, and final querying.
  • 02_agent_memory_and_recall.ipynb: a pydantic_ai.Agent wired to GraphRAG.search(...), GraphRAG.remember(...), and GraphRAG.recall(...).
  • 03_visualize_graph.ipynb: a lightweight graph view built with optional networkx and matplotlib.

Run notebook 1 first. Notebook 2 reuses the same FalkorDB graph, and notebook 3 visualizes that populated graph.
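The remember/recall loop that notebook 2 builds around GraphRAG.remember(...) and GraphRAG.recall(...) can be approximated with an in-memory stand-in. The class and its storage layout below are hypothetical, not GraWiki's API:

```python
class ToyMemoryStore:
    """Stand-in for graph-backed agent memory: store outputs, then
    recall them with linked context. An illustration of the pattern only."""

    def __init__(self):
        self.memories = []           # list of (content, linked entity names)
        self.entity_facts = {}       # entity name -> known fact

    def remember(self, content, linked_entities=()):
        self.memories.append((content, set(linked_entities)))

    def recall(self, query):
        """Return memories matching the query, plus facts about entities
        linked to those memories (one-hop context expansion)."""
        hits = [(c, ents) for c, ents in self.memories
                if query.lower() in c.lower()]
        context = {e: self.entity_facts[e]
                   for _, ents in hits for e in ents
                   if e in self.entity_facts}
        return [c for c, _ in hits], context

store = ToyMemoryStore()
store.entity_facts["FalkorDB"] = "Graph database used as a backend."
store.remember("Chose FalkorDB for persistence.", linked_entities=["FalkorDB"])
memories, context = store.recall("falkordb")
```

In GraWiki the same loop runs against persisted __memory__ nodes, so recalled context survives across agent sessions instead of living in one process.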

The sample texts used by the notebooks live in notebooks/experimental_data/. They are Medium.com articles by Filip Wojcik, sourced from https://medium.com/@filip.igor.wojcik, and are fully accessible without a subscription.

Development

Install development tooling:

uv sync --group dev

Install development tooling with the FalkorDBLite notebook stack:

uv sync --group dev --extra falkordblite --extra notebooks --extra viz

Install development tooling with the Docker-backed FalkorDB stack:

uv sync --group dev --extra falkordb --extra notebooks --extra viz

Install the documentation toolchain:

uv sync --group dev --extra docs

Install git hooks:

uv run pre-commit install

Run all configured checks manually:

uv run pre-commit run --all-files

Run the test suite:

uv run pytest

Build the public documentation site locally:

uv run mkdocs build --strict



Download files

Download the file for your platform.

Source Distribution

grawiki-0.1.0.tar.gz (43.7 kB)


Built Distribution


grawiki-0.1.0-py3-none-any.whl (54.9 kB)


File details

Details for the file grawiki-0.1.0.tar.gz.

File metadata

  • Download URL: grawiki-0.1.0.tar.gz
  • Upload date:
  • Size: 43.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for grawiki-0.1.0.tar.gz
  • SHA256: dc6ec3b9ea38095e4f87645344a07723e22340191d91cedf532a878add757142
  • MD5: 3413c89b3e46b75fdececd2b3f640fe4
  • BLAKE2b-256: 1030c09a040abdd3d31fccad1e95352bb52ce3b326f7b0561bc4e6efa4766905


Provenance

The following attestation bundles were made for grawiki-0.1.0.tar.gz:

Publisher: publish.yml on maddataanalyst/grawiki

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file grawiki-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: grawiki-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 54.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for grawiki-0.1.0-py3-none-any.whl
  • SHA256: e953e6adf05efc65186180073e46323ba64b31302a2b07485d8a6cd3a972e4a9
  • MD5: ca6b4aac2b2e04674db34e7b2891ae3d
  • BLAKE2b-256: 69dcd260c0f9d9921817c53fe80fa841c6fde4cac878f7de739751246c33529b


Provenance

The following attestation bundles were made for grawiki-0.1.0-py3-none-any.whl:

Publisher: publish.yml on maddataanalyst/grawiki

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
