LLM-powered wiki generator for any codebase

Project description

wikigen

LLM-powered wiki generator for any codebase — structured, interlinked Markdown notes that survive context window limits.

Inspired by Karpathy's LLM Wiki concept — wikigen is a general-purpose CLI tool that points at any project directory and generates a rich, interlinked Markdown wiki from your codebase.

Architecture

wikigen/
│
├── cli.py              ← Click entry point — routes all 4 commands
│   │
│   ├── config.py       ← WikigenConfig dataclasses, YAML load/save
│   │
│   ├── ingester.py     ← Full ingest pipeline orchestrator
│   │   ├── collector.py    walk · chunk · prioritise source files
│   │   ├── cache.py        SHA-256 hash store (.wikigen_cache.json)
│   │   ├── writer.py       Markdown output + [[wikilink]] conversion
│   │   ├── backends/       LLM abstraction layer
│   │   │   ├── Claude          Anthropic SDK
│   │   │   ├── OpenAI          openai SDK (or any compatible endpoint)
│   │   │   └── Ollama          local via httpx REST
│   │   └── prompts/        all system + user prompt builders
│   │
│   ├── updater.py      ← Incremental re-processing (changed files only)
│   │   └── (reuses collector, cache, backends, prompts)
│   │
│   └── linter.py       ← Broken [[WikiLinks]], orphan detection, CI exit code
│
tests/
└── test_wikigen.py     90 tests, zero LLM calls required

Data flow during wikigen ingest:

project/          collector.py         ingester.py           backends/
source files  →   walk + chunk    →    context summary   →   LLM call
                  SHA-256 hash         section plan           (parallel)
                                       page generation
                                            ↓
                                       writer.py  →  wiki/*.md
                                       cache.py   →  .wikigen_cache.json

Use with AI coding agents (Claude Code, Cursor, Copilot, etc.)

wikigen is designed to be invoked directly by coding agents that have shell access. No interactive prompts, no confirmations — every command is fully scriptable.

Auto-detection

wikigen auto-detects which instruction file to write based on the environment:

| Environment | Detection | Behaviour |
| --- | --- | --- |
| Claude Code | `CLAUDE_CODE=1` env var | Only `CLAUDE.md` is written by `init` |
| Cursor / Copilot / Windsurf | not detectable (IDE) | Use `--for` flag explicitly |

`--for` flag

If you want explicit control over which tool's instruction file gets written:

wikigen init --for claude      # CLAUDE.md only
wikigen init --for cursor      # .cursorrules only
wikigen init --for copilot     # .github/copilot-instructions.md only
wikigen init --for aider       # .aider.conf.yml only
wikigen init --for windsurf    # .windsurf/rules only
wikigen init                   # auto-detect, or all files if no env var found
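The selection rules above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not wikigen's actual code: the helper name `resolve_instruction_files` and the `INSTRUCTION_FILES` mapping are hypothetical, and the file names are copied from the `--for` examples.

```python
import os

# File names copied from the `--for` examples; the mapping itself is hypothetical.
INSTRUCTION_FILES = {
    "claude": "CLAUDE.md",
    "cursor": ".cursorrules",
    "copilot": ".github/copilot-instructions.md",
    "aider": ".aider.conf.yml",
    "windsurf": ".windsurf/rules",
}


def resolve_instruction_files(for_tool=None, environ=None):
    """Return the instruction files `init` would write (hypothetical helper)."""
    env = os.environ if environ is None else environ
    if for_tool == "all":
        return sorted(INSTRUCTION_FILES.values())
    if for_tool is not None:
        # An unknown tool name raises KeyError in this sketch.
        return [INSTRUCTION_FILES[for_tool]]
    if env.get("CLAUDE_CODE") == "1":
        # Documented auto-detection: Claude Code sets CLAUDE_CODE=1.
        return [INSTRUCTION_FILES["claude"]]
    # No explicit flag and nothing detected: write every file.
    return sorted(INSTRUCTION_FILES.values())


print(resolve_instruction_files("cursor", {}))  # ['.cursorrules']
```

Passing the environment in as a parameter keeps the detection logic testable without touching the real process environment.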

Claude Code

Claude Code typically has ANTHROPIC_API_KEY in its environment when it is configured with an API key. wikigen picks the variable up automatically, so no extra setup is needed in that case.

cd /path/to/project
pip install "wikigen-cli[claude]" -q
wikigen init       # auto-detects Claude Code, writes only CLAUDE.md
wikigen ingest     # uses ANTHROPIC_API_KEY directly, normal token usage

After that, run wikigen update after every significant change to keep the wiki in sync. The wiki then becomes persistent structured context the agent can read back in future sessions — surviving the context window limit that would otherwise force it to re-read the whole codebase each time.

Other agents

Cursor, Windsurf, and Copilot are IDE-based and cannot be auto-detected from a subprocess. Set the backend explicitly in wikigen.yaml or via the --backend flag:

wikigen --backend openai ingest   # OpenAI
wikigen --backend ollama ingest   # fully local, no keys

All commands exit with code 0 on success and non-zero on error, making them composable in agent tool-call loops and CI pipelines.
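Because success and failure are signalled only through the exit code, an agent or script can chain commands with a small wrapper. The following is a minimal sketch using Python's standard `subprocess` module. The helper names `run_step` and `run_pipeline` are hypothetical, and the example command list in the final comment assumes `wikigen` is on PATH.

```python
import subprocess
import sys


def run_step(command):
    """Run one command; return (succeeded, combined output)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode == 0, completed.stdout + completed.stderr


def run_pipeline(steps):
    """Run steps in order and stop at the first failure.

    Returns the index of the failing step, or None if every step exited with 0.
    """
    for index, command in enumerate(steps):
        succeeded, _output = run_step(command)
        if not succeeded:
            return index
    return None


# Demonstration with the current Python interpreter standing in for a CLI:
ok_step = [sys.executable, "-c", "raise SystemExit(0)"]
bad_step = [sys.executable, "-c", "raise SystemExit(1)"]
print(run_pipeline([ok_step, bad_step]))  # 1 (the second step failed)

# With wikigen installed, the same helper could run, for example:
#   run_pipeline([["wikigen", "update"], ["wikigen", "lint"]])
```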

Why wikigen?

Large codebases can exceed the context window of an LLM. wikigen solves this by:

1. Chunking your entire codebase into LLM-sized windows.
2. Using an LLM to synthesise structured wiki pages: not just summaries, but architecture notes, module docs, data-model refs, and more.
3. Writing interlinked Markdown so you can navigate your knowledge graph.
4. Tracking file hashes so only changed files are re-processed on wikigen update.

The resulting wiki lives next to your code, can be committed to git, and stays fresh as long as you run wikigen update after changes.

Installation

# Core (no backend pre-installed)
pip install wikigen-cli

# With Claude (Anthropic) support
pip install "wikigen-cli[claude]"

# With OpenAI support
pip install "wikigen-cli[openai]"

# Everything
pip install "wikigen-cli[all]"

Requires Python ≥ 3.11.

Use without PyPI (local / development)

git clone https://github.com/birangdev/WikiGen
cd WikiGen
pip install -e ".[claude]"   # registers the `wikigen` command in the active Python environment
wikigen --version             # works immediately

Quick start

cd my-project

# 1. Scaffold config + folder structure
wikigen init

# 2. Set your API key (skip if using Claude Code or Ollama)
export ANTHROPIC_API_KEY=sk-ant-...

# 3. Generate wiki
wikigen ingest

# 4. Browse your wiki
ls wiki/

Commands

`wikigen init`

Scaffolds wikigen.yaml and the project folder structure:

raw/          ← drop source documents here (never modified by wikigen)
wiki/         ← generated wiki pages
wiki/home.md  ← placeholder, replaced by `wikigen ingest`
wiki/log.md   ← append-only operations log

Also writes AI agent instruction files so your coding assistant knows how to navigate the wiki:

wikigen init               # auto-detect tool, or write all files
wikigen init --for claude  # CLAUDE.md only (Claude Code)
wikigen init --for cursor  # .cursorrules only
wikigen init --for copilot # .github/copilot-instructions.md only
wikigen init --for aider   # .aider.conf.yml only
wikigen init --for windsurf # .windsurf/rules only
wikigen init --for all     # every file regardless of environment

wikigen init --no-agent-files  # skip all instruction files, wikigen.yaml only

`wikigen ingest`

Reads the entire codebase and generates the wiki from scratch.

wikigen ingest                  # normal run
wikigen ingest --force          # regenerate even cached pages
wikigen ingest --dry-run        # preview what would be generated
wikigen ingest --concurrency 8  # parallel LLM requests

Pipeline:

1. Walk project tree → collect source files
2. Read priority files (CLAUDE.md, README, schema) → build project context summary
3. Ask LLM to plan wiki structure (sections → page titles)
4. Generate each page in parallel, injecting relevant source chunks as context
5. Write interlinked Markdown to wiki/
6. Store SHA-256 hashes in wiki/.wikigen_cache.json
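The parallel page-generation step, bounded by `--concurrency`, can be pictured with a thread pool. This is a sketch under assumptions rather than wikigen's implementation: the real tool may use a different concurrency model, and `generate_pages` and `fake_llm_page` are hypothetical names, with the latter standing in for a backend call.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_pages(titles, generate_page, concurrency=8):
    """Call generate_page(title) for each title with at most `concurrency` workers.

    pool.map preserves input order, so results line up with `titles`.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        bodies = list(pool.map(generate_page, titles))
    return dict(zip(titles, bodies))


def fake_llm_page(title):
    # Hypothetical stand-in for a real LLM backend call.
    return f"# {title}\n\nPlaceholder body.\n"


pages = generate_pages(["SystemOverview", "AuthModule"], fake_llm_page, concurrency=2)
print(list(pages))  # ['SystemOverview', 'AuthModule']
```

Threads suit this kind of workload because each call spends most of its time waiting on the network rather than using the CPU.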

`wikigen update`

Re-processes only files that changed since the last run.

wikigen update
wikigen update --dry-run

Detects:

- Changed files (hash mismatch) → re-generates affected wiki pages
- Deleted files → removes cache entries
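Change detection of this kind comes down to comparing current SHA-256 digests against the stored ones. The sketch below shows the idea. The function names and the flat `{path: digest}` cache layout are assumptions, since the actual structure of `.wikigen_cache.json` is not documented here.

```python
import hashlib
from pathlib import Path


def hash_file(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def diff_against_cache(current_hashes, cached_hashes):
    """Compare {path: digest} maps and return (changed, deleted).

    `changed` also includes files that have no cache entry yet.
    """
    changed = sorted(
        path for path, digest in current_hashes.items()
        if cached_hashes.get(path) != digest
    )
    deleted = sorted(path for path in cached_hashes if path not in current_hashes)
    return changed, deleted


cached = {"a.py": "111", "b.py": "222", "old.py": "333"}
current = {"a.py": "111", "b.py": "999", "new.py": "444"}
print(diff_against_cache(current, cached))  # (['b.py', 'new.py'], ['old.py'])
```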

`wikigen lint`

Validates all wiki pages for consistency.

wikigen lint          # report issues, exit 1 if any found
wikigen lint --fix    # auto-fix trivial issues (e.g. add missing front matter stubs)

Checks:

- [[WikiLinks]] that don't resolve to an existing page
- [text](path.md) links pointing to missing files
- Pages that are never linked from anywhere (orphans)
- Missing YAML front matter

log.md is exempt from all lint checks — it is append-only and has no front matter by design.
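The broken-link and orphan checks can be illustrated with a small in-memory sketch. It assumes pages are keyed by the same names used inside `[[...]]`. How wikigen maps link names to file names is not specified here, and `lint_pages` is a hypothetical helper.

```python
import re

# Capture the target of [[Target]], [[Target|label]] or [[Target#section]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def lint_pages(pages, index_page="Home", exempt=("log",)):
    """pages maps page name -> Markdown text. Returns (broken_links, orphans)."""
    broken = []
    linked = set()
    for name, text in pages.items():
        if name in exempt:
            continue  # the log page is skipped entirely
        for target in WIKILINK.findall(text):
            target = target.strip()
            if target in pages:
                linked.add(target)
            else:
                broken.append((name, target))
    orphans = sorted(
        name for name in pages
        if name not in linked and name != index_page and name not in exempt
    )
    return sorted(broken), orphans


pages = {
    "Home": "See [[AuthModule]] and [[Missing]].",
    "AuthModule": "Back to [[Home]].",
    "Lonely": "Nobody links here.",
    "log": "[[Nowhere]]",
}
print(lint_pages(pages))  # ([('Home', 'Missing')], ['Lonely'])
```

A CI wrapper would then exit non-zero whenever either list is non-empty, matching the documented exit code of `wikigen lint`.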

Useful in CI:

# .github/workflows/wiki.yml
- run: wikigen lint

Configuration reference (`wikigen.yaml`)

project_name: "my-project"

backend:
  name: "claude"           # claude | openai | ollama
  model: "claude-sonnet-4-20250514"
  api_key_env: "ANTHROPIC_API_KEY"
  # base_url: "http://localhost:11434"  # for Ollama or OpenAI-compatible endpoints
  max_tokens: 4096
  temperature: 0.2

ingestion:
  include_patterns: ["**/*"]
  exclude_patterns:
    - "**/.git/**"
    - "**/node_modules/**"
    - "**/__pycache__/**"
  max_file_size_kb: 256
  chunk_size_tokens: 6000
  chunk_overlap_tokens: 200

wiki:
  sections:
    - Overview
    - Architecture
    - Modules
    - Data Models
    - API Reference
    - Configuration
    - Development Guide
  index_page: "Home"
  link_style: "wikilink"   # wikilink ([[Page]]) or markdown ([Page](Page.md))
  front_matter: true
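
The interaction between `chunk_size_tokens` and `chunk_overlap_tokens` can be sketched as a sliding window. This is an illustrative Python sketch, not wikigen's chunker: the tokenizer wikigen uses is not documented here, so the example operates on a pre-tokenised list, and `chunk_tokens` is a hypothetical name.

```python
def chunk_tokens(tokens, chunk_size=6000, overlap=200):
    """Split a token list into windows of `chunk_size` sharing `overlap` tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


print(chunk_tokens(list(range(10)), chunk_size=4, overlap=1))
# [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

The overlap repeats the tail of one window at the head of the next, so a definition that straddles a boundary is still seen whole in at least one chunk when it is shorter than the overlap.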

Backends

Claude (Anthropic API)

pip install "wikigen-cli[claude]"
export ANTHROPIC_API_KEY=sk-ant-...

backend:
  name: claude
  model: claude-sonnet-4-20250514
  api_key_env: ANTHROPIC_API_KEY

OpenAI

pip install "wikigen-cli[openai]"
export OPENAI_API_KEY=sk-...

backend:
  name: openai
  model: gpt-4o
  api_key_env: OPENAI_API_KEY

Also works with any OpenAI-compatible API (Together, Groq, Azure, etc.) by setting base_url.

Ollama (local)

ollama pull llama3

backend:
  name: ollama
  model: llama3
  base_url: "http://localhost:11434"

No API key required. All processing stays on your machine.

Wiki structure

After wikigen ingest, your project looks like:

raw/                           ← drop source documents here (immutable)
wiki/
├── home.md                    ← index page with full ToC
├── log.md                     ← append-only operations log
├── .wikigen_cache.json        ← hash cache (commit this)
├── architecture/
│   ├── system-overview.md
│   ├── request-lifecycle.md
│   └── data-flow.md
├── modules/
│   ├── auth-module.md
│   └── payment-module.md
├── data-models/
│   ├── user-model.md
│   └── order-model.md
└── ...

Each page has YAML front matter:

---
title: RequestLifecycle
description: How HTTP requests flow through the system.
tags: [architecture, http, middleware]
related: [SystemOverview, AuthModule]
---

And uses [[WikiLinks]] for cross-references (compatible with Obsidian, Foam, Logseq, etc.).
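
The two `link_style` values differ only in how a cross-reference is written. Converting from the wikilink form to the Markdown form might look like the sketch below. It is not wikigen's `writer.py`: the function name is hypothetical, and the `[[Target|label]]` alias syntax is the convention used by Obsidian-style tools, which wikigen may or may not emit.

```python
import re

# Matches [[Target]] and [[Target|label]].
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def wikilinks_to_markdown(text):
    """Rewrite [[Page]] as [Page](Page.md), keeping an optional label."""
    def replace(match):
        target = match.group(1).strip()
        label = (match.group(2) or target).strip()
        return f"[{label}]({target}.md)"
    return WIKILINK.sub(replace, text)


print(wikilinks_to_markdown("See [[SystemOverview]] and [[AuthModule|auth]]."))
# See [SystemOverview](SystemOverview.md) and [auth](AuthModule.md).
```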

Global options

wikigen --project-dir /path/to/project  ingest
wikigen --wiki-dir /custom/wiki/path    ingest
wikigen --backend openai                ingest   # override config backend

Development

git clone https://github.com/birangdev/WikiGen
cd WikiGen
pip install -e ".[dev]"

# Run tests (no API key needed: the suite makes no LLM or network calls)
pytest

# Lint
ruff check wikigen/
mypy wikigen/

Roadmap

- wikigen serve: local web UI for browsing the wiki
- GitHub Actions integration template
- Embeddings-based chunk retrieval for better relevance
- Multi-modal support (diagrams via GPT-4V / Claude Vision)
- wikigen diff: show what changed between two wiki generations
- MkDocs / Docusaurus export

License

MIT © wikigen contributors

Project details

These details have been verified by PyPI

Maintainers

Arashk

Release history

- 0.1.4: May 26, 2026
- 0.1.3: May 26, 2026
- 0.1.2: May 26, 2026 (this version)
- 0.1.1: May 25, 2026
- 0.1.0: May 20, 2026

Download files

Source Distribution

wikigen_cli-0.1.2.tar.gz (40.6 kB), uploaded May 26, 2026, Source

Built Distribution

wikigen_cli-0.1.2-py3-none-any.whl (35.1 kB), uploaded May 26, 2026, Python 3

File details

Details for the file wikigen_cli-0.1.2.tar.gz.

File metadata

Download URL: wikigen_cli-0.1.2.tar.gz
Upload date: May 26, 2026
Size: 40.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for wikigen_cli-0.1.2.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `0d5e7a98532300a7b98cf66e17a46b6089310601c03a577fb90bc3e743b90ae8` |
| MD5 | `2a5bb3e8a2f391d3c953e4ff417230ca` |
| BLAKE2b-256 | `f3ee7fc444b3b1bc33501147dc0ab1e1a5e606a0b353e494001a86e623e51116` |

Provenance

The following attestation bundles were made for wikigen_cli-0.1.2.tar.gz:

Publisher: publish.yml on birangdev/WikiGen

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: wikigen_cli-0.1.2.tar.gz
- Subject digest: 0d5e7a98532300a7b98cf66e17a46b6089310601c03a577fb90bc3e743b90ae8
- Sigstore transparency entry: 1634857286
- Sigstore integration time: May 26, 2026
Source repository:
- Permalink: birangdev/WikiGen@6c0c282574eca61dc9f01590739e510088d2e34f
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/birangdev
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@6c0c282574eca61dc9f01590739e510088d2e34f
- Trigger Event: push

File details

Details for the file wikigen_cli-0.1.2-py3-none-any.whl.

File metadata

Download URL: wikigen_cli-0.1.2-py3-none-any.whl
Upload date: May 26, 2026
Size: 35.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for wikigen_cli-0.1.2-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `f55374f0d2a65bcb27118a95ad8c7bf65ffe8d31ccf76b251ba3e15dedd4e34b` |
| MD5 | `24fc190e72a218207eed49dabd2a28a8` |
| BLAKE2b-256 | `18d9d4f9c3a744e9dc732387c099687d534f7bd679f8caa42cae610708ac06cf` |

Provenance

The following attestation bundles were made for wikigen_cli-0.1.2-py3-none-any.whl:

Publisher: publish.yml on birangdev/WikiGen

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: wikigen_cli-0.1.2-py3-none-any.whl
- Subject digest: f55374f0d2a65bcb27118a95ad8c7bf65ffe8d31ccf76b251ba3e15dedd4e34b
- Sigstore transparency entry: 1634857437
- Sigstore integration time: May 26, 2026
Source repository:
- Permalink: birangdev/WikiGen@6c0c282574eca61dc9f01590739e510088d2e34f
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/birangdev
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@6c0c282574eca61dc9f01590739e510088d2e34f
- Trigger Event: push

wikigen-cli 0.1.2
