AI-powered quant research knowledge base & brainstorm agent


Quant_LLM_Wiki: A Karpathy-shaped wiki-first knowledge base for quant research

Features | Architecture | Quick Start | Agent Usage | Configuration | Tests | Contributing

Quant_LLM_Wiki turns WeChat articles, web pages, and research PDFs into an LLM-built Markdown knowledge base for quantitative investment research. It follows Andrej Karpathy's LLM-built KB method: a raw/ ingest layer, an LLM-compiled wiki/ of concept articles, and a schema/ that the LLM and tools both follow. Vector RAG is preserved as a fallback substrate, not the primary retrieval path. Three durable verbs — ingest, query, lint — drive everything. A built-in Rethink Layer scores novelty and quality of brainstormed ideas before output.

The goal is research inspiration and cross-document idea combination, not producing trade-ready strategies.

Features

  • Multi-source Ingestion — Ingest from single URLs, batch URL lists, or local HTML files; warns on re-ingesting previously rejected sources
  • LLM Enrichment — Automatically extract structured fields: idea blocks, transfer targets, combination hooks, failure modes, and more. Concurrent processing with configurable parallelism
  • Hybrid RAG Retrieval — Keyword + vector + RRF fusion retrieval across your knowledge base
  • Brainstorm Mode — Generate new strategy ideas by combining insights from multiple articles
  • Rethink Layer — Post-generation validation that checks idea novelty (via vector similarity) and scores quality (traceability, coherence, actionability)
  • Article Quality Control — Mark articles as rejected to remove from KB and prevent re-ingestion; review tool shows only enriched articles
  • Interactive Agent — LangGraph ReAct agent with 12 tools for full pipeline management, with real-time progress streaming
  • Provider-Agnostic — Works with any OpenAI-compatible LLM API (Zhipu GLM, DeepSeek, Moonshot, Qwen, OpenAI, Ollama, etc.)
  • Local-First — All data stored locally as Markdown files + ChromaDB vectors

Architecture

The system has three durable layers and three operational verbs. Vector RAG is preserved as supporting substrate, not the primary retrieval path.

Layout

raw/      — incoming source articles (one dir per article: article.md + source.json + images/)
wiki/     — LLM-built Markdown memory (the primary query surface)
            ├── INDEX.md          — auto-maintained table of contents
            ├── state.json        — content hashes, concept scores, retrieval hints
            ├── lint_report.json  — last health audit
            ├── concepts/<slug>.md
            ├── sources/<basename>.md
            ├── queries/<date>_<slug>_<mode>.md   — query → wiki feedback log
            └── maintenance_report.md             — last `kb lint --maintain` output
schema/   — rules the LLM and tools follow:
            concept-schema.md, source-schema.md, wiki-structure.md, operations.md
vector_store/  — ChromaDB substrate, used as fallback only

Articles live flat under raw/. The frontmatter status field (raw, reviewed, high_value, rejected) is the source of truth — there is no directory-as-status convention.
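
For illustration, a minimal article.md frontmatter could look like the following (the values are hypothetical, and real articles carry additional enrichment fields):

---
status: high_value        # raw | reviewed | high_value | rejected
content_type: strategy    # see Content Classification below
---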

Three operations

                                              ┌──> wiki/concepts/<slug>.md
                                              ├──> wiki/sources/<basename>.md
WeChat URL / Web URL / PDF / HTML             ├──> wiki/INDEX.md
        |                                     ├──> wiki/state.json
        v                                     │    (hashes, scores, freshness, retrieval hints)
  [kb ingest] ──> raw/<dir>/article.md + source.json
        |                                     ▲
        v                                     │
  [kb compile]  ── schema/-injected LLM ──────┘
  (auto after ingest)
        |
        v
  [kb embed]  ── ChromaDB substrate over raw/ + wiki/
  (auto after compile)
        |
        v
  [kb query]  ── wiki-first retrieval (INDEX → matched concepts → source summaries)
        |        RAG runs ONLY when wiki has no relevant concept or audit reports degradation
        |        (mode: ask | brainstorm; brainstorm runs Rethink Layer post-generation)
        |
        v
  ┌─ outputs/brainstorms/<date>_<slug>_<mode>.md
  └─ wiki/queries/<date>_<slug>_<mode>.md  ── append_query_log:
                                              cited concepts get importance bump
                                              + retrieval_hints append in state.json

  [kb lint]              ── schema-compliance audit (frontmatter, sections, source anchors)
  [kb lint --fix]        ── LLM auto-repair of schema-noncompliant concepts
  [kb lint --maintain]   ── gap analysis: unmapped source clusters, under-supported concepts,
                            stale concepts → suggested ingestion queries / new brainstorm prompts
                            (writes wiki/maintenance_report.md)
  [kb lint --maintain --apply]  ── apply query-derived state updates idempotently

Wiki-first retrieval (load-bearing invariant)

brainstorm_from_kb.retrieve_blocks gates on _should_use_wiki_memory(notes) and _wiki_is_healthy_for_query(kb_root). There is no command == "brainstorm" check — both ask and brainstorm pull kb_layer=wiki_concept blocks first (Chroma-filtered → state-score reranked → lexical fallback), then fill remaining slots with complementary article chunks excluding sources already cited by the surfaced concepts. Pure-vector retrieval is the fallback, not the default.
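
As a toy illustration of the slot-filling step described above (not the project's code; the dict keys are illustrative):

# Wiki concepts first, then complementary article chunks, excluding
# sources already cited by the surfaced concepts.
def fill_slots(concept_blocks, article_chunks, k=8):
    picked = list(concept_blocks[:k])
    cited = {src for block in picked for src in block["sources"]}
    for chunk in article_chunks:
        if len(picked) >= k:
            break
        if chunk["source"] not in cited:
            picked.append(chunk)
    return picked

concepts = [{"id": "momentum-decay", "sources": ["src_a", "src_b"]}]
chunks = [{"id": "chunk_1", "source": "src_a"},   # skipped: already cited by a concept
          {"id": "chunk_2", "source": "src_c"}]   # kept: complementary
print([b["id"] for b in fill_slots(concepts, chunks, k=3)])   # ['momentum-decay', 'chunk_2']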

Query → wiki feedback

Every kb query (unless --no-file-back) files a structured note into wiki/queries/<date>_<slug>_<mode>.md and bumps state.json:concepts.<slug>.importance + retrieval_hints for cited concepts. kb lint --maintain later distills these query logs into proposed concept-page improvements. This realizes Karpathy's "my own explorations and queries always 'add up' in the knowledge base."
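
An illustrative sketch of that state bump (append_query_log in wiki_maintain.py is the real implementation; key names other than concepts, importance, and retrieval_hints are assumptions):

import json
import pathlib

def bump_cited_concepts(wiki_dir, cited_slugs, hint):
    """Increment importance and record a retrieval hint for each cited concept."""
    state_path = pathlib.Path(wiki_dir) / "state.json"
    state = json.loads(state_path.read_text(encoding="utf-8"))
    for slug in cited_slugs:
        concept = state.setdefault("concepts", {}).setdefault(slug, {})
        concept["importance"] = concept.get("importance", 0) + 1
        concept.setdefault("retrieval_hints", []).append(hint)
    state_path.write_text(json.dumps(state, ensure_ascii=False, indent=2), encoding="utf-8")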

Schema is enforced, not advisory

schema/concept-schema.md and schema/source-schema.md define required frontmatter fields, valid enum values, and required section headers. wiki_lint checks these on every run (severity: warning), and kb lint --fix runs an LLM auto-repair pass via recompile_concept for schema-noncompliant concepts. The schema text is also injected into compile-time prompts so the LLM is told the source-anchor invariant.
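
For example, the source-anchor invariant means every bullet in a concept article ends with the basename of the raw article that supports it (the basename below is hypothetical):

- Cross-sectional momentum decays faster after crowded factor reversals [20240105_wechat_momentum]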

Rethink Layer

A post-generation validation layer that runs automatically in brainstorm mode:

  1. Idea Parsing — Extracts structured ideas from LLM output (EN/CN formats)
  2. Novelty Check — Embeds each idea and queries ChromaDB for similar existing articles (threshold: 0.75)
  3. Quality Scoring — Traceability (heuristic) + Coherence & Actionability (LLM-as-judge)
  4. Rethink Report — Appended to output with per-idea scores and reasoning
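
A minimal sketch of the novelty check in step 2, assuming a cosine-space ChromaDB collection and Chroma's default embedding function (the collection name and path are illustrative; the project embeds with the configured LLM_EMBEDDING_MODEL):

import chromadb

client = chromadb.PersistentClient(path="vector_store")
articles = client.get_or_create_collection("articles", metadata={"hnsw:space": "cosine"})

def is_novel(idea_text, threshold=0.75):
    """An idea is novel if no stored article is more similar than the threshold."""
    res = articles.query(query_texts=[idea_text], n_results=3)
    if not res["ids"][0]:
        return True                                    # empty store: everything is novel
    best_similarity = 1.0 - min(res["distances"][0])   # cosine distance -> similarity
    return best_similarity < threshold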

Agent Layer

The LangGraph ReAct agent provides 12 tools:

Tool — Description
ingest_article — Ingest from URL (auto: WeChat / web / PDF), batch URLs, HTML file, PDF file, PDF URL
enrich_articles — LLM-powered structured enrichment (concurrent, with limit support)
list_articles — List articles by status (raw / reviewed / high_value); all live flat under raw/
review_articles — Show enriched articles ready for review
set_article_status — Update article status field in frontmatter
embed_knowledge — Build/update ChromaDB vector index over raw/ + wiki/
query_knowledge_base — Wiki-first Q&A or brainstorm; both modes pull stable wiki concepts before vectors
compile_wiki — Compile/update wiki (incremental or rebuild); auto-runs lint
audit_wiki — Wiki health report: schema violations, stale concepts, unsupported claims, duplicates
list_concepts — List wiki concepts by status (stable / proposed / deprecated)
set_concept_status — Override: approve/deprecate/delete a concept (escape hatch)
read_wiki — Read INDEX.md / a concept article / a source summary
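
For orientation, here is a minimal self-contained sketch of wiring one such tool into a LangGraph ReAct agent (this is not the project's agent, which lives in quant_llm_wiki/agent/; the read_wiki tool below is a toy version):

import os
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

@tool
def read_wiki(relative_path: str) -> str:
    """Read a wiki page, e.g. 'INDEX.md' or 'concepts/momentum.md'."""
    with open(os.path.join("wiki", relative_path), encoding="utf-8") as fh:
        return fh.read()

llm = ChatOpenAI(model=os.environ.get("LLM_MODEL", "glm-4.7"),
                 base_url=os.environ["LLM_BASE_URL"],
                 api_key=os.environ["LLM_API_KEY"])
agent = create_react_agent(llm, tools=[read_wiki])
result = agent.invoke({"messages": [("user", "Summarize INDEX.md")]})
print(result["messages"][-1].content)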

File Structure

Quant_LLM_Wiki/
├── pyproject.toml                  # Package metadata + `qlw` console_script entry point
├── requirements.txt                # Python dependencies (kept for non-pip-install users)
├── llm_config.example.env          # Example LLM provider config
├── README.md
├── LICENSE
├── kb.py                           # Wiki-first KB CLI: ingest | query | lint | compile | embed
├── ingest_source.py                # Unified ingest dispatcher (WeChat / web / PDF / HTML)
├── _wechat.py                      # WeChat-specific extraction
├── _web_extract.py                 # Generic web extraction (trafilatura)
├── _pdf_extract.py                 # PDF extraction (pypdf)
├── _code_math.py                   # Code/math preservation utilities
├── wiki_schemas.py                 # ConceptArticle / SourceSummary dataclasses
├── wiki_seed.py                    # Seed taxonomy + bootstrap
├── wiki_state.py                   # Machine state manifest + scoring (freshness decay etc.)
├── wiki_compile.py                 # compile_wiki orchestrator (schema-injected, soft-error)
├── wiki_compile_llm.py             # assign_concepts + recompile_concept LLM wrappers
├── wiki_index.py                   # INDEX.md generator
├── wiki_lint.py                    # Schema enforcement + health checks + auto_fix
├── wiki_maintain.py                # append_query_log + run_maintenance (Steps 6 + 7)
├── quant_llm_wiki/                 # Restructured Python package (qlib-style)
│   ├── __init__.py
│   ├── cli.py                      # `qlw` dispatcher
│   ├── shared.py                   # Shared utilities, LLM HTTP client, paths, frontmatter
│   ├── ingest/
│   │   └── wechat.py               # WeChat-specific ingest
│   ├── enrich.py                   # LLM enrichment pipeline
│   ├── embed.py                    # ChromaDB substrate over raw/ + wiki/
│   ├── sync.py                     # Article status-based file sync
│   ├── query/
│   │   ├── brainstorm.py           # query (ask | brainstorm) — wiki-first retrieval
│   │   └── rethink.py              # Post-generation novelty + quality validation
│   └── agent/                      # LangGraph agent layer
│       ├── cli.py                  # Interactive ReAct agent CLI
│       ├── graph.py
│       ├── prompts.py
│       └── tools.py
├── raw/                            # Incoming source articles, flat (one dir per article)
├── wiki/                           # LLM-built Markdown memory
│   ├── INDEX.md                    # auto-maintained TOC
│   ├── state.json                  # content hashes, concept scores, retrieval hints
│   ├── lint_report.json            # last health audit
│   ├── maintenance_report.md       # last `kb lint --maintain` output
│   ├── concepts/                   # one .md per concept
│   ├── sources/                    # one .md per raw article (mechanically derived)
│   └── queries/                    # one .md per filed `kb query` (Step 7 feedback log)
├── schema/                         # Rules followed by LLM and tools
│   ├── concept-schema.md
│   ├── source-schema.md
│   ├── wiki-structure.md
│   └── operations.md
├── templates/                      # Article markdown templates (research-note / strategy-note)
├── tests/                          # unittest suite
│   ├── robustness/                 # Edge-case tests (Layer 1–4)
│   ├── test_kb_cli.py              # kb.py CLI dispatch
│   ├── test_query_wiki_first_ask.py
│   ├── test_wiki_lint_schema.py    # Schema enforcement + auto_fix
│   ├── test_wiki_maintain.py       # Query feedback + maintenance
│   └── test_*.py                   # Per-module coverage
└── docs/                           # Design specs and usage guides

Repo / package / command names. Repo: Quant_LLM_Wiki. Package: quant_llm_wiki. Console command: qlw (installed via pip install -e .). The wiki-first KB workflow (raw/, wiki/, schema/) remains driven by kb.py; the standalone scripts (enrichment, embedding, brainstorm, agent, sync, single-source ingest) are now subcommands of qlw.

Command Renaming (vs. previous versions)

The standalone scripts at the repo root have moved into quant_llm_wiki/ and are dispatched through a single qlw CLI:

qlw ingest --url X
qlw enrich --limit 10
qlw embed
qlw sync
qlw ask --query Q
qlw brainstorm --query Q
qlw agent

Install with pip install -e . to put qlw on PATH; otherwise use python -m quant_llm_wiki.cli <subcmd>. The kb.py wiki-first CLI is unchanged.

Quick Start

1. Install

The recommended way to install is via pipx, which gives you the qlw command globally without polluting your system Python and without requiring you to activate a venv:

# From PyPI (once published)
pipx install quant-llm-wiki

# Or directly from GitHub (always tracks main)
pipx install git+https://github.com/jackwu321/Quant_LLM_Wiki.git

After install, qlw is on your PATH from any shell. Upgrade later with pipx upgrade quant-llm-wiki.

Alternative: clone for development

If you want to hack on the code, clone and install in editable mode:

git clone https://github.com/jackwu321/Quant_LLM_Wiki.git
cd Quant_LLM_Wiki

python3 -m venv .venv
source .venv/bin/activate
pip install -e .

2. Configure LLM Provider

Copy the example config and fill in your API key:

cp llm_config.example.env .env
# Edit .env with your API key and provider settings

Or set environment variables directly:

export LLM_API_KEY="your-api-key"
export LLM_BASE_URL="https://open.bigmodel.cn/api/paas/v4"  # or any OpenAI-compatible endpoint
export LLM_MODEL="glm-4.7"  # or gpt-4, deepseek-chat, etc.

See llm_config.example.env for provider-specific examples (DeepSeek, Moonshot, Qwen, OpenAI, Ollama).

3. Ingest, Compile, Embed (one command)

# Single URL — ingest + auto-compile + auto-embed
python3 kb.py ingest --url "https://mp.weixin.qq.com/s/..."

# Skip the auto compile/embed
python3 kb.py ingest --url "..." --no-compile

# Local PDF
python3 kb.py ingest --pdf-file paper.pdf

# Saved WeChat HTML
python3 kb.py ingest --html-file saved.html

# Batch from a list (one URL per line)
python3 kb.py ingest --url-list urls.txt

Each URL has a hard 120 s ceiling; if it is hit, ingest prints TIMEOUT <url>: exceeded 120s and (in batch mode) continues with the next URL. Override via INGEST_URL_TIMEOUT=<seconds>. Note: a timed-out URL may leave a partial raw/<date>_*/ directory behind (same as ordinary FAILED cases).

Enrichment (qlw enrich) remains a separate step; run it before kb compile if your raw articles need LLM-derived metadata first:

qlw enrich                    # all raw articles (concurrent)
qlw enrich --limit 10         # first 10 only
qlw enrich --concurrency 5    # 5 parallel LLM requests

Each article enrichment has a hard 360 s ceiling; on hit, the article is recorded as failed: timeout: exceeded Ns and the batch continues. Override via LLM_ARTICLE_TIMEOUT=<seconds>. Start / done / TIMEOUT / [llm-retry] events are printed to stderr (separate from the per-completion [i/N] ... ok|failed lines on stdout) so you can see what's happening even when the LLM API is slow or backing off.

4. Query (wiki-first)

# Factual Q&A — wiki concepts first, RAG fallback only
python3 kb.py query --mode ask --query "What momentum factors are discussed?"

# Brainstorm new ideas (with Rethink Layer + query-feedback)
python3 kb.py query --mode brainstorm --query "Combine momentum and volatility timing for ETF rotation"

# Show retrieved context only (dry run)
python3 kb.py query --mode brainstorm --query "..." --dry-run

# Run a debug query without filing it back into wiki/queries/
python3 kb.py query --mode ask --query "..." --no-file-back

5. Lint + Maintain

# Schema + health audit
python3 kb.py lint

# LLM auto-repair of schema-noncompliant concepts
python3 kb.py lint --fix

# Gap analysis: unmapped sources, under-supported concepts, stale concepts
python3 kb.py lint --maintain

# Apply query-derived state updates (idempotent)
python3 kb.py lint --maintain --apply

Agent Usage

The interactive agent manages the full pipeline through natural language:

# Interactive mode
qlw agent

# Single command
qlw agent --query "ingest this article: https://mp.weixin.qq.com/s/..."
qlw agent --query "list all articles"
qlw agent --query "brainstorm: combine factor timing with risk parity"

Example Agent Workflow

You: ingest these articles: url1, url2, url3
Agent: Ingested 3/3 articles. Auto-compiled wiki and refreshed vector index.

You: enrich the first 3 raw articles
Agent: [1/3] ok  [2/3] ok  [3/3] ok — Enriched 3/3 articles.

You: review the new articles
Agent: [Shows enriched articles with content types and summaries]

You: set articles 1 and 3 as high_value, article 2 as rejected (low research value)
Agent: Updated 3 articles. Article 2 recorded as rejected (URL noted to prevent re-ingest).

You: ingest url2 again
Agent: WARNING — url2 was previously rejected: "<article title>" (reason: low research value).
       Use force=True to re-ingest.

You: brainstorm: how to combine momentum with volatility timing
Agent: [Wiki concepts surfaced first; complementary articles fill remaining slots]
       [LLM generates ideas; Rethink Layer scores novelty + quality]
       [Query filed back into wiki/queries/; cited concepts gain importance]

Configuration

LLM Provider

Quant_LLM_Wiki works with any OpenAI-compatible API. Configure via .env file (auto-loaded) or environment variables:

Variable — Default — Description
LLM_API_KEY — (none) — Your API key
LLM_BASE_URL — https://open.bigmodel.cn/api/paas/v4 — API base URL
LLM_MODEL — glm-4.7 — Chat model name
LLM_EMBEDDING_MODEL — embedding-3 — Embedding model name
LLM_CONNECT_TIMEOUT — 10 — Connection timeout (seconds)
LLM_READ_TIMEOUT — 120 — Read timeout (seconds)
LLM_MAX_RETRIES — 2 — Max retry attempts
LLM_CONCURRENCY — 3 — Max parallel LLM requests for enrichment

Legacy ZHIPU_* prefixed variables are also supported as fallbacks.
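
For reference, these settings map onto a plain OpenAI-compatible chat call like the following (a minimal example using requests, not the project's shared client in quant_llm_wiki/shared.py):

import os
import requests

def chat(prompt):
    """One chat completion against any OpenAI-compatible endpoint."""
    resp = requests.post(
        f"{os.environ['LLM_BASE_URL'].rstrip('/')}/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['LLM_API_KEY']}"},
        json={"model": os.environ.get("LLM_MODEL", "glm-4.7"),
              "messages": [{"role": "user", "content": prompt}]},
        timeout=(float(os.environ.get("LLM_CONNECT_TIMEOUT", "10")),
                 float(os.environ.get("LLM_READ_TIMEOUT", "120"))),
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]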

Content Classification

Each article is classified with exactly one content_type:

Type — Description
methodology — Research frameworks, models, factor logic
strategy — Trading logic with entry/exit rules and backtest
allocation — Portfolio construction, rotation, ETF allocation
risk_control — Risk management, drawdown control, volatility targeting
market_review — Market commentary, sector reviews

Article Status Lifecycle

All articles live flat under raw/. The frontmatter status field is the source of truth.

Status — Description
raw — Ingested, pending enrichment and review
reviewed — Human-reviewed; included in wiki compilation and vector index
high_value — High research value; included in wiki compilation and vector index
rejected — Low value; removed from KB, source URL recorded to prevent re-ingestion

Running Tests

Unit Tests

python3 -m unittest discover -s tests -p 'test_*.py' -v

Robustness Tests

The tests/robustness/ suite covers edge cases and failure modes across four layers:

File — What it tests
test_layer1_tool_robustness.py — Agent tools with malformed/missing inputs
test_layer2_workflow_integration.py — End-to-end pipeline with bad data
test_layer3_agent_routing.py — Agent routing under unexpected queries
test_layer4_llm_api_robustness.py — LLM API timeouts, retries, and failures

python3 -m unittest discover -s tests/robustness -p 'test_*.py' -v

Design Principles

  • Wiki-first, RAG-as-substrate — Both kb query --mode ask and --mode brainstorm retrieve stable wiki concepts before vectors. ChromaDB runs only as fallback when the wiki is empty/sparse or audit_wiki reports degradation.
  • Three durable verbs — kb ingest, kb query, kb lint, per Karpathy's prescription. compile and embed are internal operations auto-run by ingest.
  • Schema is enforced — schema/concept-schema.md and schema/source-schema.md define required frontmatter fields, valid enums, and required section headers. wiki_lint checks these on every run; kb lint --fix runs an LLM auto-repair pass.
  • Inspiration over execution — The knowledge base serves idea combination, not backtested trading signals.
  • Hybrid memory: Markdown + structured state — Markdown is the inspectable interface; wiki/state.json and ChromaDB metadata are the operational substrate (scoring, freshness decay, conflict tracking).
  • Per-claim provenance — Every bullet in a concept article ends with [<source_basename>]; un-anchored bullets fail lint and lower confidence.
  • Content-hash idempotency — kb compile reruns produce zero LLM calls when source hashes are unchanged (no mtime, no date guessing); see the sketch after this list.
  • Queries compound — Every kb query files into wiki/queries/ and bumps state.json scoring for cited concepts. kb lint --maintain distills the query log into proposed concept-page improvements.
  • Complementary retrieval — Wiki concepts surface first, then complementary article chunks fill remaining slots (excluding sources already cited by concepts).
  • Graceful degradation — Every component handles missing dependencies without crashing; audit_wiki errors push the wiki-first path to article-only fallback.
  • Self-healing vector store — Automatic SQLite integrity check before each ChromaDB operation; corrupted stores are cleaned up and rebuilt transparently.
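
A minimal illustration of the content-hash idempotency principle (the state layout is simplified and the "hashes" key is an assumption; the real bookkeeping lives in wiki_state.py):

import hashlib
import json
import pathlib

def needs_recompile(slug, source_text, state_path="wiki/state.json"):
    """Recompile a concept only when the hash of its source material has changed."""
    digest = hashlib.sha256(source_text.encode("utf-8")).hexdigest()
    state = json.loads(pathlib.Path(state_path).read_text(encoding="utf-8"))
    return state.get("hashes", {}).get(slug) != digest   # unchanged -> zero LLM calls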

Releasing (maintainers)

This repo publishes to PyPI automatically when a v*.*.* tag is pushed. The workflow is defined in .github/workflows/publish.yml and uses PyPI Trusted Publishing (OIDC) — no API token is stored in GitHub secrets.

One-time PyPI setup

Before the first release, configure a "pending publisher" on PyPI:

  1. Log in to https://pypi.org/manage/account/publishing/
  2. Add a pending publisher with:
    • PyPI Project Name: quant-llm-wiki
    • Owner: jackwu321
    • Repository name: Quant_LLM_Wiki
    • Workflow filename: publish.yml
    • Environment name: pypi
  3. In GitHub repo settings → Environments, create an environment named pypi (no secrets needed; OIDC handles auth).

Cutting a release

# 1. Bump version in pyproject.toml (e.g. 0.2.0 -> 0.2.1)
# 2. Commit
git commit -am "release: v0.2.1"
# 3. Tag and push
git tag v0.2.1
git push origin main --tags

The workflow will:

  1. Verify the tag matches project.version in pyproject.toml
  2. Build sdist + wheel
  3. Upload to PyPI via Trusted Publishing

Users then upgrade with pipx upgrade quant-llm-wiki.

Versioning. Follow SemVer: bump patch for fixes, minor for new features, major for breaking changes. The tag v0.2.1 must match version = "0.2.1" in pyproject.toml exactly, or the workflow aborts before publishing.

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Write tests for new functionality
  4. Ensure all tests pass (python3 -m unittest discover -s tests -p 'test_*.py')
  5. Commit your changes
  6. Open a Pull Request

License

This project is licensed under the MIT License — see the LICENSE file for details.

Disclaimer

Quant_LLM_Wiki is a research tool for generating investment strategy ideas. It does not produce trade-ready strategies or financial advice. All generated ideas require independent validation, backtesting, and risk assessment before any real-world application. Use at your own risk.
