AST-aware llms.txt generator for Python, JavaScript/TypeScript, and Go codebases
llmstxt-gen
AST-aware llms.txt generator for Python, JavaScript/TypeScript, and Go codebases.
What problem this solves
LLM coding agents work best when they have an accurate, up-to-date map of the code they are working on. The llms.txt standard exists to give them exactly that: a single Markdown file at the root of a project that lists the public surface area and points at deeper documentation.
Most existing generators build that file by scraping a project's published docs site. Scrapers go stale the moment your code changes, they bring along marketing prose the agent does not need, and they cannot describe code that has not been documented yet. The result is an llms.txt that confidently lists deprecated APIs.
llmstxt-gen takes a different approach. It reads your Python, JavaScript/TypeScript, or Go source code directly, parses it with tree-sitter into an Abstract Syntax Tree, and extracts the things an agent actually needs: function signatures, type hints, docstrings, class hierarchies, and exported symbols. The result is a token-efficient, always-current Markdown file you can regenerate from a pre-commit hook or a CI job.
No scraping. No cloud calls. No framework lock-in.
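To make the extraction step above concrete, here is a minimal sketch of pulling signatures and docstrings out of Python source. It is not llmstxt-gen's implementation: it swaps tree-sitter for the standard-library `ast` module so that it runs without extra dependencies, and `describe_functions` is a hypothetical name. It handles only plain positional parameters; `*args`, keyword-only parameters, and classes are left out for brevity.

```python
import ast
from typing import Optional


def _format_arg(arg: ast.arg, default: Optional[ast.expr]) -> str:
    """Render one parameter as it would appear in a signature."""
    text = arg.arg
    if arg.annotation is not None:
        text += f": {ast.unparse(arg.annotation)}"
    if default is not None:
        # PEP 8 spacing: "x: int = 0" when annotated, "x=0" otherwise.
        sep = " = " if arg.annotation is not None else "="
        text += f"{sep}{ast.unparse(default)}"
    return text


def describe_functions(source: str) -> list[tuple[str, str]]:
    """Return (signature, docstring) pairs for top-level public functions.

    Assumption: a leading underscore marks a function as private.
    """
    tree = ast.parse(source)
    entries = []
    for node in tree.body:
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if node.name.startswith("_"):
            continue
        params = node.args.args
        # Defaults align with the *last* parameters, so pad on the left.
        defaults = [None] * (len(params) - len(node.args.defaults)) + list(node.args.defaults)
        rendered = ", ".join(_format_arg(a, d) for a, d in zip(params, defaults))
        returns = f" -> {ast.unparse(node.returns)}" if node.returns is not None else ""
        signature = f"{node.name}({rendered}){returns}"
        entries.append((signature, ast.get_docstring(node) or ""))
    return entries
```

Because the signature is rebuilt from the parsed tree rather than copied from the source text, formatting differences in the original file do not leak into the output.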
Installation
pip install llmstxt-gen
Requires Python 3.11 or newer. The PyPI distribution name and the installed CLI command are both llmstxt-gen; the Python import name is llmstxt_gen, since hyphens are not valid in module names.
Quick start
From the root of any Python, JavaScript/TypeScript, or Go project:
llmstxt-gen generate
You will get two files in the project root:
- llms.txt: a compact summary suitable for inclusion in an agent's initial context
- llms-full.txt: the full detailed reference
To preview without writing files:
llmstxt-gen generate --dry-run
To get a quick read on what would be included:
llmstxt-gen stats
Example output
A small Python module like:
"""Tiny calculator module."""
def add(a: int, b: int = 0) -> int:
    """Return the sum of a and b."""
    return a + b
produces this entry in llms-full.txt:
## src/calc.py
Tiny calculator module.
### Functions
#### `add(a: int, b: int = 0) -> int`
Return the sum of a and b.
and a one-line entry in llms.txt:
calc: Tiny calculator module.
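The mapping from extracted symbols to the two entries shown above can be sketched as follows. The function names `render_full_entry` and `render_summary_entry` are hypothetical, and the exact blank-line layout of the real output is an assumption; the sketch only mirrors the lines visible in the example.

```python
from pathlib import PurePosixPath


def render_full_entry(path: str, module_doc: str,
                      functions: list[tuple[str, str]]) -> str:
    """Build an llms-full.txt style section for one module.

    `functions` holds (signature, docstring) pairs.
    """
    lines = [f"## {path}", module_doc, "### Functions"]
    for signature, doc in functions:
        lines.append(f"#### `{signature}`")
        if doc:
            lines.append(doc)
    return "\n".join(lines)


def render_summary_entry(path: str, module_doc: str) -> str:
    """Build the one-line llms.txt style entry: module name, then its docstring."""
    return f"{PurePosixPath(path).stem}: {module_doc}"
```

The summary entry drops signatures entirely, which is what keeps the summary file far smaller than the full reference.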
Configuration
All options live in your pyproject.toml under [tool.llmstxt_gen]. Every key is optional.
| Option | Type | Default | Description |
|---|---|---|---|
| name | string | directory name | Project name shown in the heading |
| description | string | "" | Short tagline shown as a blockquote |
| version | string | "" | Project version |
| include | list of strings | [] (all) | Paths to scan, relative to the repo root |
| exclude | list of strings | [] | Additional patterns to skip, beyond .gitignore |
| extensions | list of strings | [".py", ".js", ".jsx", ".ts", ".tsx", ".go"] | File extensions to consider |
| output_dir | string | "." | Where to write the output files |
| output_summary | string | "llms.txt" | Filename for the summary file |
| output_full | string | "llms-full.txt" | Filename for the full reference |
| include_private | bool | false | Include private or non-exported symbols |
| max_tokens_summary | int | 8000 | Token budget for llms.txt |
| max_tokens_full | int | 32000 | Token budget for llms-full.txt |
| languages | list of strings | ["python", "typescript", "go"] | Parsers to activate |
Example:
[tool.llmstxt_gen]
include = ["src/"]
exclude = ["src/internal/"]
include_private = false
max_tokens_summary = 6000
CI integration
Pre-commit hook
repos:
  - repo: local
    hooks:
      - id: llmstxt-gen
        name: llmstxt-gen
        entry: llmstxt-gen generate
        language: system
        pass_filenames: false
        always_run: true
GitHub Actions
name: Update llms.txt
on:
  push:
    branches: [main]
jobs:
  update:
    runs-on: ubuntu-latest
    permissions:
      contents: write
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install llmstxt-gen
      - run: llmstxt-gen generate
      - uses: stefanzweifel/git-auto-commit-action@v5
        with:
          commit_message: "chore: refresh llms.txt"
More integrations live in docs/ci-integration.md.
How it compares to scraper-based approaches
Scrapers like llmstxt.org generators crawl a published documentation site and concatenate the rendered HTML. They work without source access, which is their main advantage. The drawbacks are real:
- They cannot describe undocumented code, so newer modules are invisible.
- They drift the moment your code lands faster than your docs site rebuilds.
- They include navigation chrome, marketing copy, and rendered examples that bloat the agent's context window.
- They cannot reliably recover type information, since rendered HTML is lossy.
llmstxt-gen reads the source. It will always reflect what is actually in the repository, and it produces output that maps one-to-one with the symbols an agent will end up calling.
Contributing
See CONTRIBUTING.md. Bug reports and pull requests are welcome.
License
MIT. See LICENSE.
Roadmap (not yet implemented)
- Rust port for large monorepos
- Parser support for Ruby and Java
- Optional semantic pruning via a local model
- A hosted GitHub App for zero-config setup