Convert any docs site into llms.md for AI agents
🪸 Prumo
Convert any documentation site into an llms.md file: structured context ready for AI coding agents.
The Problem
AI models hallucinate APIs from new or obscure libraries because they were never trained on that documentation. When you ask a coding agent to use a recent SDK, it invents function names, parameters, and behaviors that do not exist.
Prumo solves this by turning live documentation into a compact, structured llms.md file that you can drop into any agent's context window.
How It Works
URL or GitHub repo
        │
        ▼
┌──────────────┐     ┌──────────────┐     ┌────────────┐
│   Crawler    │────▶│   Exporter   │────▶│  llms.md   │
│(Static/GitHub│     │  (Gemini or  │     │ (Markdown) │
│ /Playwright) │     │   Claude)    │     │            │
└──────────────┘     └──────────────┘     └────────────┘
Crawler operates in three modes:
- Default: navigates static HTML, follows internal links, strips navigation noise.
- GitHub mode (--github): reads .md/.mdx files directly from a repository via the GitHub API. Bypasses JavaScript-rendered sites (Docusaurus, VitePress, Next.js).
- JS mode (--js): renders pages with a headless browser (Playwright) for JavaScript-heavy sites with no useful GitHub markdown source.
Exporter sends the cleaned content to an LLM (Gemini or Claude), which generates an llms.md file containing the documentation as Markdown.
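To make "follows internal links" concrete, here is a minimal sketch of how a crawler might decide which links on a page stay inside the documentation site. It is not Prumo's actual implementation: the function name `internal_links` and the same-host rule are assumptions chosen for illustration.

```python
from urllib.parse import urldefrag, urljoin, urlparse


def internal_links(page_url: str, hrefs: list[str]) -> list[str]:
    """Return absolute, de-duplicated links that stay on the page's host.

    Hypothetical helper: assumes "internal" means "same host as the page".
    """
    host = urlparse(page_url).netloc
    seen: set[str] = set()
    result: list[str] = []
    for href in hrefs:
        # Resolve relative links against the page and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar schemes
        if parsed.netloc != host:
            continue  # external link
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result
```

A real crawler would typically also restrict links to the path prefix of the root URL and respect the page cap set by --max-pages.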
Installation
Prumo is a CLI tool. The recommended way to install it is with pipx, which installs it in an isolated environment and makes it globally available in your terminal:
pipx install prumo
Don't have pipx? Install it first:
# macOS
brew install pipx && pipx ensurepath
# Ubuntu / Debian
sudo apt install pipx && pipx ensurepath
# Windows
scoop install pipx
Alternative, using pip inside a virtual environment:
pip install prumo
Alternative, using uv:
uv tool install prumo
Quick Start
1. Configure credentials
prumo init
The wizard will ask for your Gemini or Claude API key, and optionally a GitHub token for --github mode.
2. Fetch documentation
# Standard mode: static HTML
prumo fetch https://docs.example.com
# GitHub mode: reads .md/.mdx directly from the repository
prumo fetch https://github.com/some/repo --github
# JavaScript mode: renders with Playwright
prumo fetch https://docs.stellar.org --js --max-pages 30
# Remap GitHub blob links to the published documentation URLs
prumo fetch https://github.com/stellar/stellar-docs \
--github \
--docs-base-url https://developers.stellar.org/docs
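The remapping done by --docs-base-url can be pictured with a short sketch. The README does not specify the exact rule Prumo applies, so everything below is an assumption: the helper name `remap_blob_url`, the `docs_root` folder that gets stripped, and the removal of the .md/.mdx extension are all hypothetical, and the example path is made up.

```python
import posixpath
from urllib.parse import urlparse


def remap_blob_url(blob_url: str, docs_base_url: str, docs_root: str = "docs") -> str:
    """Map a GitHub blob URL to a published-docs URL (hypothetical rule)."""
    parts = urlparse(blob_url).path.strip("/").split("/")
    # Expected shape: <owner>/<repo>/blob/<branch>/<path...>
    # Assumes the branch name contains no slashes.
    if len(parts) < 5 or parts[2] != "blob":
        return blob_url  # not a blob link; leave it untouched
    rel_parts = parts[4:]
    if rel_parts and rel_parts[0] == docs_root:
        rel_parts = rel_parts[1:]  # drop the assumed docs folder
    rel_path = "/".join(rel_parts)
    stem, ext = posixpath.splitext(rel_path)
    if ext.lower() in (".md", ".mdx"):
        rel_path = stem  # published pages usually have no source extension
    return docs_base_url.rstrip("/") + "/" + rel_path
```

Under these assumptions, a blob link to `docs/build/intro.mdx` on the `main` branch would be rewritten to `<docs-base-url>/build/intro`.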
3. Optional JS setup (Playwright)
pip install "prumo[js]"
playwright install chromium
When --js is used, Prumo prints this warning:
Warning: JavaScript rendering mode enabled.
This launches a headless browser that may execute untrusted code.
Use only with trusted sites.
The result is written to ./output/llms.md by default.
Current release version: 0.1.2.
Output Format
Prumo generates an llms.md file that follows the llmstxt.org standard:
# FastAPI
> Modern, fast web framework for building APIs with Python.
## Getting Started
- [Installation](https://fastapi.tiangolo.com/tutorial/): How to install and create the first endpoint.
- [First Steps](https://fastapi.tiangolo.com/tutorial/first-steps/): Basic structure of a FastAPI application.
## Request Handling
- [Path Parameters](https://fastapi.tiangolo.com/tutorial/path-params/): Dynamic URL parameters with automatic type validation.
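Because the output is plain Markdown with a regular shape, it is easy to post-process. The sketch below groups the link entries by section; it assumes the file looks like the sample above (`## ` headings followed by `- [title](url): description` lines) and is not part of Prumo itself. The names `LINK_RE` and `parse_llms_md` are illustrative.

```python
import re

# Matches lines such as: - [Title](https://example.com/page): Short description.
LINK_RE = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\):\s*(?P<desc>.*)$")


def parse_llms_md(text: str) -> dict[str, list[dict[str, str]]]:
    """Group link entries by the '## ' section heading they appear under."""
    sections: dict[str, list[dict[str, str]]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(match.groupdict())
    return sections
```

Lines that do not match the link pattern (the title, the blockquote summary, free-form prose) are ignored, so the parser degrades gracefully on richer files.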
CLI Reference
prumo init
Interactive wizard that creates a local .env file with your credentials.
Options:
--force, -f Overwrite an existing .env without prompting
prumo fetch <url>
Crawls a documentation site and generates llms.md.
| Option | Default | Description |
|---|---|---|
| url | required | Root URL of the docs site or GitHub repository |
| --output, -o | ./output | Output directory |
| --provider, -p | gemini | LLM provider: gemini or claude |
| --api-key, -k | env var | LLM provider API key |
| --max-pages, -m | 50 | Maximum pages or files to crawl |
| --github | false | Use the GitHub API to read .md/.mdx files directly |
| --github-token | env var | GitHub Personal Access Token |
| --js | false | Use Playwright to render JS-heavy docs |
| --docs-base-url, -d | none | Remap GitHub blob links to the published docs URL |
Credential resolution order
For each secret, Prumo tries in this order and stops at the first match:
--api-key / --github-token flag → .env file → shell environment variable → error
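The resolution order above can be sketched in a few lines. This is an illustration rather than Prumo's code: the helper names `read_dotenv` and `resolve_secret` and the variable name `GEMINI_API_KEY` in the usage note are assumptions, since the README does not list the variable names Prumo reads.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_dotenv(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values: dict[str, str] = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_secret(
    name: str,
    flag_value: Optional[str],
    dotenv: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Return the first non-empty value: flag, then .env, then environment."""
    for candidate in (flag_value, dotenv.get(name), environ.get(name)):
        if candidate:
            return candidate
    raise RuntimeError(f"{name} not found in flag, .env file, or environment")
```

For example, `resolve_secret("GEMINI_API_KEY", None, read_dotenv(Path(".env")))` would return the value from .env if present and otherwise fall back to the shell environment.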
Limitations
| Limitation | Details |
|---|---|
| JS-rendered sites | Standard mode may return empty pages on JS-heavy docs. Prefer --github first; use --js when no markdown source is available. |
| Playwright mode costs | --js is slower and uses more CPU/RAM because it launches a headless browser for rendering. |
| Large documentation | Crawling is capped at --max-pages to avoid bloated API calls. Increase it if the generated file feels incomplete. |
| Output quality | Depends on the LLM provider and the structure of the source documentation. Gemini 2.5 Flash is the default and works well for most cases. |
| GitHub API rate limits | Authenticated requests are limited to 5,000 per hour. A large repository with --max-pages 200 can consume several hundred requests. |
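Before a large --github crawl, it can help to check how much of the hourly GitHub quota is left. The sketch below queries GitHub's public `/rate_limit` endpoint and compares the remaining budget with a planned request count. It is independent of Prumo: the function names and the `reserve` margin are illustrative, and the number of requests Prumo issues per file is not documented here, so `planned_requests` is your own estimate.

```python
import json
import urllib.request
from typing import Optional


def fetch_core_rate_limit(token: Optional[str] = None) -> dict:
    """Return the 'core' rate-limit record from the GitHub REST API."""
    request = urllib.request.Request(
        "https://api.github.com/rate_limit",
        headers={"Accept": "application/vnd.github+json"},
    )
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["resources"]["core"]


def fits_budget(core: dict, planned_requests: int, reserve: int = 100) -> bool:
    """True if the planned requests fit while keeping `reserve` calls spare."""
    return core["remaining"] - reserve >= planned_requests
```

A typical check would be `fits_budget(fetch_core_rate_limit(token), planned_requests=300)`; if it returns False, lower --max-pages or wait until the time given in the record's `reset` field.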
Development
git clone https://github.com/Dione-b/prumo.git
cd prumo
uv sync
cp .env.example .env # fill in your keys
uv run ruff check .
uv run mypy prumo/
uv run pytest tests/ -v
Contributing
Contributions are welcome. Before opening a pull request:
- Keep changes focused on a single concern.
- Add or update tests for any behavior changes.
- Make sure ruff, mypy, and pytest all pass locally.
- Describe the motivation and user-facing impact in the PR description.
If you find a bug or want to propose a feature, open an issue first.
License
MIT. See LICENSE for details.