An Anytype-native LLM wiki combining Karpathy's pattern with Anytype's typed knowledge graph (Objects, Types, Relations).

Maintainers

scheufen


anytype-llm-wiki

Example Anytype graph

A local-first, typed "second brain" on Anytype — for humans and AI agents.

It takes Andrej Karpathy's LLM-wiki idea — let an LLM compile your sources into a curated, interlinked knowledge base you can then query — and builds it on Anytype's native Objects, Types, and Relations instead of flat Markdown files. Everything is exposed over the Model Context Protocol, so Claude Code, Cursor, any MCP client — or your own autonomous agents — can both read and write it. It runs entirely on your machine.

Why a typed graph instead of flat notes or plain RAG?

Typed Objects and bidirectional Relations — not files. Knowledge lands as Entity, Concept, and Source objects linked by real, traversable relations in a queryable database. Markdown wikis (Obsidian, Logseq) give you backlinks over text files; Anytype gives you a typed knowledge graph.
It detects contradictions. When newly ingested facts conflict with an already-linked entity, both positions are kept and cross-linked (wiki_contradictions) and flagged for review — never silently overwritten. Your knowledge base tells you when it disagrees with itself. Flat wikis and vector stores can't.
Cited synthesis, not just search. wiki_query returns a prose answer drawn only from your wiki, citing the exact Objects it used — and can file the answer back so the wiki gets a little better every time it's used.
Local-first. Anytype + Ollama (embeddings & extraction) + Qdrant (vectors), all on localhost. Nothing leaves your machine by default. See Security & data flow.
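
As a rough illustration of the points above, the sketch below models typed objects with bidirectional relations and a symmetric contradiction link in plain Python. It is not the project's implementation: `WikiObject`, `link`, `flag_contradiction`, and `asymmetric_relations` are hypothetical names, and the real objects live in Anytype rather than in a dict.

```python
from dataclasses import dataclass, field


@dataclass
class WikiObject:
    """Hypothetical in-memory stand-in for a typed Anytype Object."""
    name: str
    type_key: str  # e.g. "entity", "concept", "source"
    facts: list[str] = field(default_factory=list)
    relations: set[str] = field(default_factory=set)       # names of linked objects
    contradictions: set[str] = field(default_factory=set)  # names of conflicting peers


def link(graph: dict[str, WikiObject], a: str, b: str) -> None:
    """Create a bidirectional relation between two objects."""
    graph[a].relations.add(b)
    graph[b].relations.add(a)


def flag_contradiction(graph: dict[str, WikiObject], a: str, b: str) -> None:
    """Cross-link two conflicting objects for review; no fact is overwritten."""
    graph[a].contradictions.add(b)
    graph[b].contradictions.add(a)


def asymmetric_relations(graph: dict[str, WikiObject]) -> list[tuple[str, str]]:
    """Return (source, target) pairs whose reverse link is missing."""
    return sorted(
        (obj.name, other)
        for obj in graph.values()
        for other in obj.relations
        if obj.name not in graph[other].relations
    )
```

Because both positions stay in `facts` and only the cross-link is added, a reviewer can later decide which side to keep.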

Use cases

1. A research / knowledge wiki (for you)

Point it at sources — Wikipedia articles, papers, internal docs, your own notes — and it compiles them into typed, interlinked Entities and Concepts with provenance, deduping and merging as it goes. Then ask questions in plain language and get answers synthesized only from your wiki, each citing the Objects it drew from. It's Karpathy's LLM-wiki pattern on a real database: the graph is browsable in Anytype, and every fact traces back to a Source.

→ Path: wiki_bootstrap → wiki_ingest → wiki_query (walkthrough).

2. A secondary brain for an AI agent fleet

Give autonomous agents a persistent, typed memory that survives sessions and is shared across projects. Agents narrate what they learn (wiki_remember) — decisions, durable facts, and the relations between things — and read it back with citations (wiki_query) before starting new work. Consolidation makes repeated writes safe (it dedups, supersedes, and flags contradictions instead of overwriting), and a periodic wiki_lint surfaces contradictions and staleness for review. One brain, contradiction-aware, that compounds as the fleet works.

→ Path: register as an MCP server in your agent runtime, then wiki_remember / wiki_query.

This is exactly how we use it at Aldeia IT: as the shared long-term memory for our autonomous SDLC agent fleet.

3. A research buffer that cuts repeated web search

Researching a topic across many sessions means re-fetching the same facts from the web again and again. Ingest findings once and the wiki becomes a local, cited cache: future questions are answered from accumulated knowledge first, with a live web search reserved for genuine gaps — fewer tokens, faster answers, and a provenance trail.

→ Concrete example: a Capoeira genealogy research project uses it as exactly this kind of buffer — caching lineage and history research so repeated LLM web-searches are avoided.
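
The buffer pattern described above can be sketched as a cache-first lookup. This is a minimal illustration under assumptions: `query_wiki`, `search_web`, and `ingest` are hypothetical callables standing in for wiki_query, a web search, and wiki_ingest, and a "gap" is assumed to mean that the wiki lookup returns `None`.

```python
from typing import Callable, Optional


def answer_with_buffer(
    question: str,
    query_wiki: Callable[[str], Optional[str]],
    search_web: Callable[[str], str],
    ingest: Callable[[str], None],
) -> tuple[str, str]:
    """Return (answer, origin), where origin is "wiki" or "web"."""
    cached = query_wiki(question)
    if cached is not None:
        return cached, "wiki"      # answered from accumulated knowledge
    finding = search_web(question)  # genuine gap: go to the web once
    ingest(finding)                 # store it so later sessions hit the wiki
    return finding, "web"
```

The second time the same question is asked, the web search is skipped because the first finding was ingested.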

How it works

By default, everything runs locally with no off-machine egress. An MCP client calls the anytype-llm-wiki server, which orchestrates three local backends: Anytype (the typed knowledge graph), Ollama (extraction / reasoning LLM + embeddings), and Qdrant (vectors).

System architecture overview

Questions are answered only from your wiki, with citations — and the Q&A can be filed back so future questions retrieve from it. The wiki gets more useful the more you use it:

The compounding loop

📊 Full visual guide: the write pipeline, the typed object model, and the self-auditing health check.

Objects carry their knowledge in properties (wiki_facts, wiki_definition, …), not in the object body — so an ingested object shows an empty body in the Anytype client by design; the content is fully indexed and retrievable.

Quick start

Prerequisites

Anytype desktop (REST API on port 31012)
Ollama with an embedding model: ollama pull bge-m3 (extraction also uses a small local generation model, e.g. ollama pull qwen2.5:7b)
Qdrant: docker run -p 6333:6333 qdrant/qdrant

Install

Install from source with uv (PyPI publishing is on the roadmap):

git clone https://github.com/Aldeia-IT/anytype-llm-wiki.git
cd anytype-llm-wiki
uv sync

Run any command with uv run anytype-llm-wiki …. Running it with no subcommand starts the MCP server over stdio.

Configure

Create a .env (only ANYTYPE_API_KEY is required):

ANYTYPE_API_KEY=your-anytype-api-key     # Anytype → Settings → API
# Optional (defaults shown):
ANYTYPE_API_URL=http://127.0.0.1:31012
QDRANT_URL=http://127.0.0.1:6333
OLLAMA_URL=http://127.0.0.1:11434
EMBED_MODEL=bge-m3
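
For readers who want to mirror these settings in their own scripts, here is a minimal sketch of resolving them with the defaults listed above. `Settings` and `load_settings` are hypothetical names, not the package's API, and the sketch reads variables from the process environment rather than parsing the .env file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Defaults copied from the .env example above.
DEFAULTS = {
    "ANYTYPE_API_URL": "http://127.0.0.1:31012",
    "QDRANT_URL": "http://127.0.0.1:6333",
    "OLLAMA_URL": "http://127.0.0.1:11434",
    "EMBED_MODEL": "bge-m3",
}


@dataclass(frozen=True)
class Settings:
    anytype_api_key: str
    anytype_api_url: str
    qdrant_url: str
    ollama_url: str
    embed_model: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Resolve settings; only ANYTYPE_API_KEY has no default."""
    env = os.environ if env is None else env
    key = env.get("ANYTYPE_API_KEY", "").strip()
    if not key:
        raise ValueError("ANYTYPE_API_KEY is required")
    # An unset or empty variable falls back to its default.
    values = {name: env.get(name) or default for name, default in DEFAULTS.items()}
    return Settings(
        anytype_api_key=key,
        anytype_api_url=values["ANYTYPE_API_URL"],
        qdrant_url=values["QDRANT_URL"],
        ollama_url=values["OLLAMA_URL"],
        embed_model=values["EMBED_MODEL"],
    )
```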

Verify & provision

uv run anytype-llm-wiki doctor                          # read-only preflight (Anytype, Qdrant, Ollama)
uv run anytype-llm-wiki wiki-bootstrap --space-id <id>  # idempotently create the typed wiki schema

wiki-bootstrap is safe to re-run — it reconciles the space to the expected schema without creating duplicates. Re-run it after an upgrade that changes the schema (the CHANGELOG flags those).
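
If you want a quick reachability check of your own before running doctor, the sketch below tests only whether each default port accepts a TCP connection. It is an assumption-laden stand-in, not what doctor does internally: `host_port`, `is_listening`, and `preflight` are hypothetical helpers, and an open port does not prove the service behind it is healthy.

```python
import socket
from urllib.parse import urlsplit

# Default endpoints from the Configure section.
SERVICES = {
    "anytype": "http://127.0.0.1:31012",
    "qdrant": "http://127.0.0.1:6333",
    "ollama": "http://127.0.0.1:11434",
}


def host_port(url: str) -> tuple[str, int]:
    """Extract (host, port) from a URL, falling back to the scheme's default port."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname or "127.0.0.1", port


def is_listening(url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    try:
        with socket.create_connection(host_port(url), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(services: dict[str, str] = SERVICES) -> dict[str, bool]:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(url) for name, url in services.items()}
```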

Register as an MCP server

Claude Code:

claude mcp add anytype-llm-wiki -e ANYTYPE_API_KEY=your-key \
  -- uv run --directory /path/to/anytype-llm-wiki anytype-llm-wiki

Claude Desktop / Cursor / other clients: add this entry to your MCP config (in Claude Desktop and Cursor it goes under the top-level "mcpServers" key):

{
  "anytype-llm-wiki": {
    "command": "uv",
    "args": ["run", "--directory", "/path/to/anytype-llm-wiki", "anytype-llm-wiki"],
    "env": { "ANYTYPE_API_KEY": "your-key" }
  }
}

Try it in 5 minutes

Build a research wiki and query it (from an empty space):

# 1. Provision the typed schema.
uv run anytype-llm-wiki wiki-bootstrap --space-id <id>

# 2. Compile a source into typed, interlinked Objects (auto-reindexes).
uv run anytype-llm-wiki wiki-ingest --space-id <id> \
  --source https://en.wikipedia.org/wiki/Retrieval-augmented_generation

# 3. Ask a question — answered only from your wiki, with citations.
#    --file-back stores the Q&A so it can be retrieved by FUTURE queries.
uv run anytype-llm-wiki wiki-query --space-id <id> \
  --question "What is retrieval-augmented generation?" --file-back

Give an agent memory — once registered over MCP, your agent can:

wiki_remember(space_id, "Qdrant 1.12 added native multi-tenancy via payload partitioning.", subject_hint="Qdrant")
wiki_query(space_id, "What do we know about Qdrant multi-tenancy?")

The MCP tools

Tool	What it does
`semantic_search`	Search the vault by meaning. `query`, `space_id?`, `types?`, `limit?`
`reindex_anytype`	Trigger an incremental reindex. `space_id?`
`wiki_bootstrap`	Provision the typed wiki schema in a space. `space_id`, `domain_tags?`
`wiki_ingest`	Compile a source (URL or file) into curated, interlinked Objects with provenance; auto-reindex. `source`, `space_id`, `domain_hint?`
`wiki_remember`	Consolidate an agent's natural-language narration into typed Objects (LLM merge/dedup/conflict-flag). Fleet-safe queue-submit: concurrent writers never block or lose writes (no read-after-write). `space_id`, `knowledge`, `subject_hint?`, `kind?`, `relations?`, `domain_tags?`, `source?`
`wiki_query`	Query the wiki for a synthesized, source-cited answer (tiered retrieval + local synthesis); optionally file the answer back. `question`, `space_id`, `file_back?`
`wiki_lint`	Read-only structural health check (contradictions, orphans, staleness, asymmetric relations, …), ranked by severity. `space_id`, `severity_threshold?`, `include_duplicates?`
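
In the table above, a trailing `?` marks an optional parameter. The sketch below encodes that convention so a caller can check arguments before sending a tool call. The parameter names come from the table; `TOOL_PARAMS` and `check_call` are hypothetical helpers, not part of the server.

```python
# tool name -> (required parameters, optional parameters), transcribed from the table
TOOL_PARAMS: dict[str, tuple[set[str], set[str]]] = {
    "semantic_search": ({"query"}, {"space_id", "types", "limit"}),
    "reindex_anytype": (set(), {"space_id"}),
    "wiki_bootstrap": ({"space_id"}, {"domain_tags"}),
    "wiki_ingest": ({"source", "space_id"}, {"domain_hint"}),
    "wiki_remember": (
        {"space_id", "knowledge"},
        {"subject_hint", "kind", "relations", "domain_tags", "source"},
    ),
    "wiki_query": ({"question", "space_id"}, {"file_back"}),
    "wiki_lint": ({"space_id"}, {"severity_threshold", "include_duplicates"}),
}


def check_call(tool: str, args: dict) -> list[str]:
    """Return a list of problems with a tool call; an empty list means it looks valid."""
    if tool not in TOOL_PARAMS:
        return [f"unknown tool: {tool}"]
    required, optional = TOOL_PARAMS[tool]
    given = set(args)
    problems = [f"missing required parameter: {p}" for p in sorted(required - given)]
    problems += [f"unknown parameter: {p}" for p in sorted(given - required - optional)]
    return problems
```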

Extraction and synthesis run on local Ollama by default (WIKI_EXTRACT_MODEL, default qwen2.5:7b); pointing WIKI_EXTRACT_ENDPOINT at a hosted API moves that processing off-machine behind a one-time consent gate — see Security & data flow.

Key behaviors worth knowing

Contradiction detection is automatic, but scoped. At ingest, when an updated entity's new facts conflict with an already-linked peer, both are cross-linked via wiki_contradictions and left for review (wiki_lint flags them High). Today detection is entity-only and bounded to linked entities (wiki_concept scope deferred) — an entity that contradicts something it isn't linked to won't surface a finding yet. Don't over-trust a clean contradiction column.
Cited synthesis + a compounding loop. wiki_query answers only from retrieved Objects and cites them. A clean answer that meets the file-back gate (≥ 3 cited sources and ≥ 100 words, or file_back=True) is stored as a typed Query Object; after the next reindex it becomes retrievable itself — so the wiki improves with use. (Filed answers surface only after that reindex — see known limitations.)
Safe repeated writes (wiki_remember). Reworded duplicates merge, genuinely new facts append, superseding facts replace (the prior text is recorded in the WikiLog and recoverable), contradictions are flagged not overwritten, and re-asserting the same knowledge converges to a no-op.
Fleet-safe concurrent writes (no read-after-write guarantee). Independent agents on separate PIDs/terminals can wiki_remember the same space at once: each durably queues its subjects (a lock-free append to the work-log) and whichever process holds the per-space lock drains them, so nobody blocks and nobody's learnings are dropped. A submit may return before its subjects are applied, so a wiki_query immediately afterward may not see them yet (the wiki is for the next agent, not the submitter's own next line). This works on a single host only; see known limitations. The wiki-drain CLI forces a synchronous drain when you need one.
Tiered retrieval. Below WIKI_INDEX_THRESHOLD (default 200) Objects, wiki_query reads the whole wiki directly (exhaustive and fast); above it, it uses vector search plus 1-hop neighborhood expansion.
Incremental, schedulable indexing. Only changed objects are re-embedded. For continuous indexing, run reindex_anytype on a schedule (cron/launchd — a sample plist ships in the repo). For high agent write-rates, set WIKI_AUTO_REINDEX=false and batch a scheduled reindex, since reindex cost scales with total space size.
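
Two of the thresholds in the list above are simple enough to state as code. The sketch below is one reading of them, with assumptions spelled out: the answer must be clean in every case, `file_back=True` bypasses only the size gate, and an object count exactly at the threshold uses Tier 2. `should_file_back` and `retrieval_tier` are hypothetical names.

```python
def should_file_back(
    clean: bool,
    cited_sources: int,
    word_count: int,
    file_back: bool = False,
) -> bool:
    """Decide whether an answer is stored as a Query Object."""
    if not clean:
        return False  # assumption: only clean answers are ever filed
    return file_back or (cited_sources >= 3 and word_count >= 100)


def retrieval_tier(object_count: int, threshold: int = 200) -> int:
    """Tier 1 reads the whole wiki; Tier 2 uses vector search plus 1-hop expansion."""
    return 1 if object_count < threshold else 2
```

With the default threshold, a wiki of 199 Objects is read exhaustively and one of 200 switches to vector search.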

Performance

Benchmarked on a Mac Mini (Apple Silicon):

Operation	Time
Single search query	0.22s
Index 50 chunks	0.73s
Full reindex (500 chunks)	~7s

Configuration

ANYTYPE_API_KEY is the only required variable; sensible defaults cover the rest.

Variable	Default	Description
`ANYTYPE_API_URL`	`http://127.0.0.1:31012`	Anytype REST API endpoint
`QDRANT_URL`	`http://127.0.0.1:6333`	Qdrant endpoint
`OLLAMA_URL`	`http://127.0.0.1:11434`	Ollama endpoint
`EMBED_MODEL` / `EMBED_DIMS`	`bge-m3` / `1024`	Embedding model and its vector dimensions (must match)
`WIKI_EXTRACT_MODEL`	`qwen2.5:7b`	Local model for extraction / synthesis / consolidation
`WIKI_ALIAS_ADJUDICATION`	`off`	⚠️ EXPERIMENTAL — enable at your own risk. LLM alias-merge in entity resolution (Step 3). Off by default. Only runs on a vetted model; enabling it on an unvetted model makes the MCP server refuse to start (loud `[CONFIG ERROR]`). See the warning below.
`WIKI_ALIAS_VETTED_MODELS`	(empty)	Comma-separated extra extraction-model prefixes trusted for alias adjudication, unioned with the built-in `qwen3.5-mlx`. Adding your model here is the override (there is no force flag).
`WIKI_EXTRACT_ENDPOINT`	(unset → local Ollama)	Hosted LLM endpoint for extraction (off-machine; consent-gated)
`WIKI_INDEX_THRESHOLD`	`200`	Object count at which `wiki_query` flips Tier 1 → Tier 2
`WIKI_AUTO_REINDEX`	`true`	Auto-reindex after each write (set `false` to batch via a scheduled reindex)
`WIKI_LOCK_DIR` / `WIKI_WORKLOG_DIR`	`~/.local/share/anytype-llm-wiki/{locks,worklog}`	Host-local lock + durable subject work-log. A same-host agent fleet writing one shared vault must share both (see known limitations §10); the work-log holds narrated content transiently, so treat it as sensitive (see data flow)

⚠️ WIKI_ALIAS_ADJUDICATION is experimental: leave it off unless you accept the risk.

What it does: on a write, when exact- and fuzzy-title matching don't find an existing object, it asks a local LLM whether the new entity is the same real-world entity as a lexically similar existing one (an alias, abbreviation, or rename) and, if so, merges into it instead of creating a duplicate, automatically catching dupes like k8s → Kubernetes.

The risk: the judgment is destructive and irreversible-ish (the new object is never created), and even a vetted model over-merges distinct entities on real, messy data (observed ~7–10% on a real graph, e.g. merging a person into the eponymous project, a testnet into its mainnet, or a collection into one of its members). It is deliberately conservative and gated behind this off-by-default flag plus a vetted-model startup check, but it can still corrupt your graph.

For curation we recommend the non-destructive path instead: wiki_lint --include-duplicates, which only surfaces potential_duplicate suggestions for a human to review and merge.

Additional WIKI_SYNTH_* and WIKI_LINT_* tuning knobs exist with sensible defaults — you won't normally need them.

Architecture

Anytype client — reads/writes objects via the REST API; handles pagination and auth.
Chunker — splits markdown by headings, falls back to paragraphs; each chunk carries object/space/type/heading metadata.
Embedder / Indexer — Ollama /api/embed; incremental by last_modified_date, re-embedding only changed objects and cleaning up vectors for deleted ones.
Wiki pipeline — LLM extraction → entity/concept resolution → typed Objects with bidirectional Relations → contradiction detection → cited synthesis.
MCP server — FastMCP over stdio, exposing the seven tools above.
doctor — read-only preflight (Anytype, Qdrant, Ollama, embedding model).
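
The chunking strategy in the list above (split by headings, fall back to paragraphs, attach metadata) can be sketched as follows. This is an illustration only: `Chunk` and `chunk_markdown` are hypothetical names, the field names are assumptions, and the sketch does not special-case `#` lines inside code fences.

```python
import re
from dataclasses import dataclass

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


@dataclass
class Chunk:
    object_id: str
    space_id: str
    type_key: str
    heading: str  # empty when the text has no heading
    text: str


def chunk_markdown(markdown: str, object_id: str, space_id: str, type_key: str) -> list[Chunk]:
    """Split by headings; if there are none, split by blank-line-separated paragraphs."""
    sections: list[tuple[str, list[str]]] = [("", [])]
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            sections.append((match.group(2).strip(), []))
        else:
            sections[-1][1].append(line)

    if len(sections) > 1:  # at least one heading was found
        pieces = [(heading, "\n".join(lines).strip()) for heading, lines in sections]
    else:                  # fallback: one chunk per paragraph
        pieces = [("", para.strip()) for para in re.split(r"\n\s*\n", markdown)]

    return [
        Chunk(object_id, space_id, type_key, heading, text)
        for heading, text in pieces
        if text
    ]
```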

📊 Architecture — Visual Guide — diagrams of the components, the write/read pipelines, the typed object model, and the health check.

For the internals — the write pipeline, how consolidation corrects reality, entity-resolution & duplicate handling, the concurrency model, and the no-drop subject work-log — see Architecture & internals.
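
To make the queue-submit idea concrete, here is a minimal sketch of an append-only work-log with a non-blocking drain lock. It is not the project's code: `submit` and `drain` are hypothetical functions, the JSON-lines format and the cursor stored in the lock file are simplifications chosen for the example, and `fcntl.flock` limits it to Unix-like hosts.

```python
import fcntl
import json
import os
from pathlib import Path
from typing import Callable


def submit(worklog: Path, subject: dict) -> None:
    """Append one subject as a JSON line; no lock is taken."""
    line = (json.dumps(subject) + "\n").encode("utf-8")
    fd = os.open(worklog, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        os.write(fd, line)  # single write in append mode
        os.fsync(fd)        # make the entry durable before returning
    finally:
        os.close(fd)


def drain(worklog: Path, lock_path: Path, apply: Callable[[dict], None]) -> int:
    """Apply queued subjects if the lock is free; return how many were applied."""
    lock_fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            fcntl.flock(lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return 0  # another process is draining; do not wait
        cursor = int(os.read(lock_fd, 32) or b"0")  # byte offset already applied
        if not worklog.exists():
            return 0
        with open(worklog, "rb") as fh:
            fh.seek(cursor)
            data = fh.read()
        end = data.rfind(b"\n") + 1  # ignore a trailing partial line
        applied = 0
        for raw in data[:end].splitlines():
            apply(json.loads(raw))
            applied += 1
        # Persist the new cursor only after every subject was applied.
        os.lseek(lock_fd, 0, os.SEEK_SET)
        os.ftruncate(lock_fd, 0)
        os.write(lock_fd, str(cursor + end).encode("ascii"))
        return applied
    finally:
        os.close(lock_fd)  # closing the descriptor releases the flock
```

If `apply` raises partway through, the cursor is not advanced and the same subjects are applied again on the next drain, which is tolerable only when re-applying a subject is a no-op.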

Supply-chain posture

Dependencies are pinned in two layers: uv.lock locks every direct and transitive dependency to an exact, content-hashed version (CI runs uv lock --check), and pyproject.toml declares compatible ranges with next-major upper bounds so a transitive resolution can't silently cross a major version. Release artifacts are built cache-free and signed with a SLSA build-provenance attestation; once wheels are published you'll be able to gh attestation verify them against this repo.

Roadmap

Hybrid search — semantic similarity + keyword + metadata filters
Relationship-aware retrieval — follow Anytype Relations to pull connected context
Contradiction detection beyond linked entities (semantic pre-filter) and across Concepts
Cross-space federation with access control
PyPI publishing
Webhook-based indexing when Anytype adds webhook support

See the GitHub Releases and CHANGELOG for what's shipped.

Comparison

	anytype-llm-wiki	Flat-file wiki (Obsidian / Logseq)	Plain vector RAG
Storage	Typed Anytype Objects + Relations	Markdown files + backlinks	Chunks in a vector DB
Knowledge model	Entities/Concepts in a queryable graph	Documents you organize by hand	Opaque chunks
Contradiction handling	Detected & cross-linked for review	None	None
Answers	Synthesized, with Object citations	You read & connect	Retrieved snippets
Agent read and write	Yes (MCP)	Manual	Read-mostly
Local-first	Yes (Ollama + Qdrant)	Yes	Varies

It also differs from API-access MCPs like anyproto/anytype-mcp (object CRUD, no semantic/vector search) and wethegreenpeople/anytype-mcp (ChromaDB, full re-embed on start): embedding-backed semantic retrieval plus the typed-wiki pipeline is the core differentiator.

Contributing

Maintained by Aldeia IT for our own use and published openly. We're not actively soliciting contributions right now and may be slow to respond to issues and PRs — but you're welcome to fork it. Security issues: please use private reporting, not a public issue. Dev setup and expectations are in CONTRIBUTING.md; please be kind (Code of Conduct).

License

MIT. See CONTRIBUTING.md for contribution licensing (inbound = outbound).

Trademarks

Anytype is a trademark of Any Association. This project is not affiliated with, sponsored by, or endorsed by Any Association or the Anytype project; the name is used solely to identify the platform this software integrates with.


Release history

This version

0.7.4

Jun 10, 2026

Download files

Source Distribution

anytype_llm_wiki-0.7.4.tar.gz (1.6 MB view details)

Uploaded Jun 10, 2026 Source

Built Distribution

anytype_llm_wiki-0.7.4-py3-none-any.whl (114.5 kB view details)

Uploaded Jun 10, 2026 Python 3

File details

Details for the file anytype_llm_wiki-0.7.4.tar.gz.

File metadata

Download URL: anytype_llm_wiki-0.7.4.tar.gz
Upload date: Jun 10, 2026
Size: 1.6 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.19

File hashes

Hashes for anytype_llm_wiki-0.7.4.tar.gz
Algorithm	Hash digest
SHA256	`2b6e7b3a479a55dcc448ab85f5d208fe214aa36ee1023d862f0ee3683d77769d`
MD5	`2622e5086e8959d7e3ce40cd00b31999`
BLAKE2b-256	`2af50e3a6e98217ceb7a81a719c6360c1f7e63c98584a0deb6bd58c022cd1725`

File details

Details for the file anytype_llm_wiki-0.7.4-py3-none-any.whl.

File metadata

Download URL: anytype_llm_wiki-0.7.4-py3-none-any.whl
Upload date: Jun 10, 2026
Size: 114.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.19

File hashes

Hashes for anytype_llm_wiki-0.7.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3dc95462af25683d8e54e4e428b4a98ed3f40b841d9d5cf6db4fb944be3a31e4`
MD5	`2f7c2d18233e72ee0802be10fdc8846b`
BLAKE2b-256	`3ac3ae145398389020fff6f6b22fa5cad43f5ee85c2c4fd8a9fe55b35f13cfca`
