Retrieval-only MCP server over the indexed React documentation, with a prebuilt index downloaded on first run.

cfunklabs-rag-react-docs

Backend for the RAG demo, built with LangChain, LangGraph, Anthropic Claude, and ChromaDB. It also ships a retrieval-only MCP server, published to PyPI as cfunklabs-rag-react-docs, that serves grounding context from the indexed React documentation (see MCP server).

Prerequisites

Setup

All commands should be run from the backend directory.

1. Install dependencies

uv sync

2. Configure environment variables

Copy the sample env file and add your Anthropic API key:

cp .env.sample .env

Open .env and set your key:

ANTHROPIC_API_KEY=your_api_key_here

3. Initialize the vector database

Create the ChromaDB collection used to store document embeddings:

uv run src/utils/init_db.py

This creates a persistent ChromaDB store under rag_datastore/ and a collection whose name is read from tool.rag_db.rag_doc_collection_name in pyproject.toml. The rag_datastore/ directory is gitignored.

4. Fetch the React docs dataset

The corpus is a local mirror of the React documentation markdown files. Download it with:

uv run src/utils/fetch_react_docs.py

This script:

  1. Fetches the index at react.dev/llms.txt
  2. Follows every linked https://react.dev/*.md URL
  3. Saves each page under docs/ at the repository root, using the index heading structure as directories and each file's frontmatter title as the filename

Example output path:

docs/API Reference/React/Components/Built-in React Components.md

The docs/ directory is gitignored — run the script locally after cloning and re-run it anytime to refresh the dataset.
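
The index-to-path mapping described above can be sketched as follows. This is a minimal illustration, not the actual fetch_react_docs.py: it assumes the index uses Markdown headings and Markdown links, and the sample index text, the sample URL, and the helper names parse_index and output_path are hypothetical. Fetching the pages and reading the frontmatter title are left out; the title is passed in directly.

```python
import re
from pathlib import PurePosixPath

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
MD_LINK = re.compile(r"\((https://react\.dev/[^)\s]+\.md)\)")

# Hypothetical index excerpt; the real llms.txt layout may differ.
SAMPLE_INDEX = """\
## API Reference
### React
#### Components
- [Built-in React Components](https://react.dev/reference/react/components.md)
"""


def parse_index(index_text: str) -> list[tuple[list[str], str]]:
    """Return (heading_path, url) pairs for every react.dev .md link."""
    stack: list[tuple[int, str]] = []  # (heading level, heading text)
    results: list[tuple[list[str], str]] = []
    for line in index_text.splitlines():
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            # Drop headings at the same or a deeper level before pushing.
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, match.group(2)))
            continue
        for url in MD_LINK.findall(line):
            results.append(([text for _, text in stack], url))
    return results


def output_path(heading_path: list[str], title: str) -> PurePosixPath:
    """Build docs/<headings...>/<title>.md; title would come from frontmatter."""
    return PurePosixPath("docs", *heading_path, f"{title}.md")
```

With the sample index, the single link resolves to the heading path API Reference / React / Components, and combining it with the title "Built-in React Components" yields the example output path shown earlier.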

Running

Startup check

Run main.py to verify your environment is configured correctly. It prints installed package versions and performs a live health check against the Claude API:

uv run main.py

A passing run looks like:

Versions:
  - langchain_core_version: x.x.x
  - langgraph_version: x.x.x
  - langchain_anthropic_version: x.x.x

Checking LLM service health... PASSED

If the health check fails, confirm that ANTHROPIC_API_KEY is set correctly in your .env file.

Ingesting documents

Chunk every Markdown file under docs/, embed the chunks, and upsert them into the vector store:

uv run main.py

To process a single file (useful while iterating on chunking), pass --md_file_path. Add --evaluate_chunking to write LLM-as-judge quality reports to evals/results, or --print_chunks to dump each chunk to stdout.

Querying

Ask a question against the ingested docs. The query pipeline is a LangGraph graph (retrieve -> generate) that embeds your question with the same model used at ingestion, retrieves the most similar chunks from ChromaDB, and has Claude answer using only that context. The answer streams token-by-token and is followed by the cited sources:

uv run query.py "How does memo work?"

Options:

  • --k N — number of chunks to retrieve (default top_k in pyproject.toml).
  • --no-stream — wait for the full answer instead of streaming tokens.
  • --show-scores — show the retrieval distance for each cited source.
  • -v, --verbose — show diagnostic output (the preflight datastore and LLM health checks). Hidden by default.
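
For readers who want to see how such a command line might be declared, here is a minimal argparse sketch of the options listed above. It is not the actual query.py: the function name build_parser is hypothetical, and the default of 5 for --k is a placeholder, since the real default comes from top_k in pyproject.toml.

```python
import argparse


def build_parser(default_k: int = 5) -> argparse.ArgumentParser:
    """Declare the query CLI options; default_k stands in for top_k."""
    parser = argparse.ArgumentParser(
        prog="query.py",
        description="Ask a question against the ingested docs.",
    )
    parser.add_argument("question", help="full natural-language question")
    parser.add_argument("--k", type=int, default=default_k, metavar="N",
                        help="number of chunks to retrieve")
    parser.add_argument("--no-stream", action="store_true",
                        help="wait for the full answer instead of streaming")
    parser.add_argument("--show-scores", action="store_true",
                        help="show the retrieval distance for each source")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="show diagnostic output")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```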

Retrieval and generation settings live under [tool.rag_query] in pyproject.toml (top_k and generation_model). Run uv run main.py first — querying requires a populated collection.

MCP server

In addition to the CLI, the retrieval pipeline is exposed as an MCP server over stdio, so MCP clients (Cursor, Claude Desktop, etc.) can pull grounding context directly. The server ships prescriptive metadata — a server-level instructions block plus a richly documented tool — so a consuming LLM knows when to reach for it (any React 19.2 API, hook, component, or pattern question) instead of relying on its own possibly-stale knowledge. It exposes a single retrieval-only tool:

  • search_react_docs(question, k?) — embeds the question with the same model used at ingestion, retrieves the most similar chunks from ChromaDB, and returns each chunk's source label, content, and retrieval distance. The client LLM generates the answer from those chunks, so no Anthropic key is needed to run the server.
    • question should be a full natural-language question, not bare keywords.
    • k defaults to RAG_TOP_K (5); use ~3 for a specific API lookup and ~8-10 for broad topics.
    • distance is squared L2 over normalized embeddings, so lower is more similar. For this corpus, < ~1.0 is relevant and > ~1.5 usually means off-topic / not covered.
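
To make the distance scale concrete: for unit-length vectors, squared L2 distance equals 2 - 2 * cosine similarity, so it ranges from 0 (identical direction) to 4 (opposite direction). The sketch below checks that identity with plain Python and applies the thresholds quoted above. The helper names and the "borderline" label for the 1.0 to 1.5 band are assumptions for illustration, not part of the server.

```python
import math


def normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def squared_l2(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def relevance_label(distance: float) -> str:
    """Apply the rough thresholds quoted for this corpus."""
    if distance < 1.0:
        return "relevant"
    if distance > 1.5:
        return "likely off-topic"
    return "borderline"  # assumed label for the band in between


if __name__ == "__main__":
    a, b = normalize([3.0, 4.0]), normalize([4.0, 3.0])
    d = squared_l2(a, b)
    # For unit vectors the two quantities agree up to rounding.
    print(d, 2 - 2 * cosine_similarity(a, b), relevance_label(d))
```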

For end users (published package)

The server is published to PyPI as cfunklabs-rag-react-docs. End users don't clone the repo or run the ingestion pipeline — the prebuilt index (~34 MB) is downloaded from a GitHub Release and cached on first run. Just register it with your MCP client:

{
  "mcpServers": {
    "rag-react-docs": {
      "command": "uvx",
      "args": ["cfunklabs-rag-react-docs"]
    }
  }
}

The first launch needs network access to fetch the index; subsequent runs read from the local cache (platformdirs cache dir) and work offline. Optional environment overrides: RAG_TOP_K, RAG_COLLECTION_NAME, RAG_INDEX_URL, and RAG_DATASTORE_DIR.
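
A minimal sketch of how these overrides could be read, assuming plain environment lookups: only the variable names and the RAG_TOP_K default of 5 come from this document. The function name load_overrides and the use of None to mean "fall back to the built-in default" are assumptions for illustration.

```python
import os
from collections.abc import Mapping


def load_overrides(env: Mapping[str, str] = os.environ) -> dict:
    """Collect optional overrides; None means use the built-in default."""
    return {
        "top_k": int(env.get("RAG_TOP_K", "5")),
        "collection_name": env.get("RAG_COLLECTION_NAME"),
        "index_url": env.get("RAG_INDEX_URL"),
        "datastore_dir": env.get("RAG_DATASTORE_DIR"),
    }


if __name__ == "__main__":
    print(load_overrides())
```

In an MCP client configuration, such variables are typically supplied through the client's own mechanism for passing environment variables to the server process.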

Local development

Run the dev server from the backend directory (so pyproject.toml resolves) against the locally-built rag_datastore:

uv run mcp_server.py

When run standalone this way, the server prints a short startup banner to stderr and then blocks silently by design: the stdio transport reserves stdout for the JSON-RPC protocol, so the server waits for a client to connect rather than logging. Running it directly is mainly a smoke test; press Ctrl+C to stop.

Run uv run main.py first — the dev server needs a populated collection. For interactive testing, launch the MCP Inspector with uv run mcp dev mcp_server.py.

Releasing

The published package (cfunklabs-rag-react-docs) contains only the retrieval + MCP server (the import package rag_react_docs under src/). Ingestion/query tooling and src/utils/* are dev-only and excluded from the wheel.

A release involves two independently-versioned artifacts:

  • The PyPI package, published automatically by CI when you push a v* git tag. The .github/workflows/publish.yml workflow builds the wheel/sdist and uploads them to PyPI using trusted publishing (OIDC) — no API tokens or secrets are involved.
  • The prebuilt index, uploaded manually to a GitHub Release, and only when the corpus changes. Its version is INDEX_VERSION in src/rag_react_docs/config.py, independent of the package version.

A v* tag triggers the PyPI publish, while an index-* tag only names the GitHub Release that hosts the index download. Neither triggers the other, because the publish workflow runs only on tags matching v*.
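
The separation between the two tag families can be illustrated with a small pattern check. This is only a sketch using Python's fnmatch; GitHub Actions applies its own glob rules in the workflow, which differ in some details, and the function name release_action and its return labels are hypothetical.

```python
from fnmatch import fnmatchcase


def release_action(tag: str) -> str:
    """Classify a git tag by the release artifact it relates to."""
    if fnmatchcase(tag, "v*"):
        return "publish-package"  # would match the workflow's tag filter
    if fnmatchcase(tag, "index-*"):
        return "index-release"  # manual GitHub Release, no CI publish
    return "none"


if __name__ == "__main__":
    for tag in ("v0.1.3", "index-19-2-v1", "docs-update"):
        print(tag, "->", release_action(tag))
```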

Release checklist

Step 1 - Increment the package version. Bump the version in both pyproject.toml (version) and src/rag_react_docs/__init__.py (__version__), keeping them in sync. PyPI rejects re-uploads of an existing version, so the version must change with every release.

Step 2 - Build and upload the index (only if the docs, chunking, or embeddings changed). Most code-only releases skip this step. If the index content changed, bump REACT_VERSION and/or INDEX_REVISION in src/rag_react_docs/config.py first, then:

uv run main.py                     # repopulate rag_datastore (needs docs fetched + ANTHROPIC_API_KEY)
uv run scripts/build_index_archive.py
gh release create index-19-2-v1 dist/rag-index-19-2-v1.tar.gz dist/rag-index-19-2-v1.tar.gz.sha256

Upload the index before publishing the package release, so the new package's INDEX_URL resolves for end users on first run.
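
The .sha256 file uploaded alongside the archive makes it possible to compare checksums after a download. The sketch below shows one way to do that, assuming the file uses the common "<hex digest>  <filename>" layout; whether and how the published client verifies the archive is not stated here, and the helper names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_checksum(data: bytes, checksum_file_text: str) -> bool:
    """Compare data against the first token of a .sha256 file."""
    parts = checksum_file_text.split()
    if not parts:
        return False
    return sha256_hex(data) == parts[0].lower()


if __name__ == "__main__":
    payload = b"example archive bytes"  # stand-in for the real archive
    checksum_text = f"{sha256_hex(payload)}  rag-index-19-2-v1.tar.gz\n"
    print(matches_checksum(payload, checksum_text))
```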

The index version follows the pattern index-<react-version>-v<incremental> (e.g. index-19-2-v1), composed from REACT_VERSION and INDEX_REVISION. Bump REACT_VERSION when re-fetching the docs for a new React release, and bump INDEX_REVISION when re-chunking or changing the embedding model within the same React version. Either bump changes the release tag/asset name and the client cache path, so clients pull a fresh, compatible index instead of reusing a stale cache.
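
The naming scheme can be sketched in a few lines. How config.py actually stores REACT_VERSION is not shown here, so the sketch assumes a string such as "19.2" (dots are converted to hyphens, and an already hyphenated "19-2" passes through unchanged). The function names are hypothetical; the tag and archive names follow the examples in the release commands above.

```python
def index_tag(react_version: str, index_revision: int) -> str:
    """Compose the GitHub Release tag, e.g. index-19-2-v1."""
    return f"index-{react_version.replace('.', '-')}-v{index_revision}"


def archive_name(react_version: str, index_revision: int) -> str:
    """Compose the archive file name, e.g. rag-index-19-2-v1.tar.gz."""
    return f"rag-{index_tag(react_version, index_revision)}.tar.gz"


if __name__ == "__main__":
    print(index_tag("19.2", 1))
    print(archive_name("19.2", 2))
```

Because both values feed the name, bumping either one produces a new tag and a new asset name, which is what lets clients tell a fresh index from a cached one.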

Step 3 - (Optional) Local build sanity check. This verifies the wheel/sdist build; it is not the publish mechanism (CI builds too). Artifacts land in the gitignored dist/.

uv build

Step 4 - Publish by tagging. Commit the version bump, then tag and push. The v* tag triggers CI, which builds and publishes to PyPI automatically:

git commit -am "Release v0.1.3"
git tag -a v0.1.3 -m "Release v0.1.3"
git push origin main --tags

After the workflow finishes, verify the new version at pypi.org/project/cfunklabs-rag-react-docs, and (if you re-released the index) that the index asset URL returns 200.

Manual publishing (uv publish) is not the standard path: trusted publishing is configured for CI only, so a local upload would require a separate API token and bypass the pinned pypi environment. Prefer the tag-driven flow above.

Download files

Source Distribution

cfunklabs_rag_react_docs-0.1.4.tar.gz (11.8 kB view details)

Uploaded Source

Built Distribution

cfunklabs_rag_react_docs-0.1.4-py3-none-any.whl (14.5 kB view details)

Uploaded Python 3

File details

Details for the file cfunklabs_rag_react_docs-0.1.4.tar.gz.

File metadata

  • Download URL: cfunklabs_rag_react_docs-0.1.4.tar.gz
  • Upload date:
  • Size: 11.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cfunklabs_rag_react_docs-0.1.4.tar.gz
Algorithm Hash digest
SHA256 525382bf7511b2e316c09d4ac5c820f31c8289f2d6029407e64228776c1303e4
MD5 23ab348478b56bffb9df899b5db51420
BLAKE2b-256 5dc3652e9ce69a316327b24e611d0027088186999f6562187ad58e70779128b0

Provenance

The following attestation bundles were made for cfunklabs_rag_react_docs-0.1.4.tar.gz:

Publisher: publish.yml on cfunklabs/rag-react-docs

File details

Details for the file cfunklabs_rag_react_docs-0.1.4-py3-none-any.whl.

File hashes

Hashes for cfunklabs_rag_react_docs-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 815e1a4e3593d0f552f3f61f25d298c4b7d5f6f48e96826176f0629e83d3f3a7
MD5 f26c53ae819e807b87065065b177e816
BLAKE2b-256 3646f037d3454e7dd76ff1bdd9f292832d2e8f98e2f9446c3d542eac52b5100b

Provenance

The following attestation bundles were made for cfunklabs_rag_react_docs-0.1.4-py3-none-any.whl:

Publisher: publish.yml on cfunklabs/rag-react-docs
