CLI tool to generate LLM context from documentation.
docs2llm
A command-line tool to extract documentation from local directories and GitHub repositories, formatting it for use as context with Large Language Models (LLMs).
Purpose
docs2llm helps you capture documentation from codebases to use as context for AI assistants and large language models. It searches for documentation files (Markdown, RST, and plain text), processes them, and writes a single consolidated file that can serve as reference material for LLMs.
Features
- Extract documentation from local directories or GitHub repositories
- Automatically identify and process common documentation files
- Prioritize README files and important documentation
- Support for multiple file formats (Markdown, RST, TXT)
- Format output for optimal LLM context
- Control scan depth to manage output size
- Clone specific branches from Git repositories
- Detailed logging with configurable verbosity
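The discovery features above (supported formats, README priority, scan depth) can be illustrated with a minimal sketch. This is not docs2llm's actual implementation; the function names `find_doc_files` and `build_context`, the `=== path ===` header format, and the exact sort order are assumptions made for illustration.

```python
from pathlib import Path

# Formats named in the feature list: Markdown, RST, TXT.
DOC_SUFFIXES = {".md", ".rst", ".txt"}


def find_doc_files(root, max_depth=3):
    """Return documentation files under root, README files first (hypothetical helper)."""
    root = Path(root)
    found = []
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix.lower() not in DOC_SUFFIXES:
            continue
        # Depth 0 means the file sits directly in root.
        depth = len(path.relative_to(root).parts) - 1
        if depth <= max_depth:
            found.append(path)
    # README files sort first, then shallower files, then alphabetically by path.
    return sorted(
        found,
        key=lambda p: (
            not p.name.lower().startswith("readme"),
            len(p.relative_to(root).parts),
            p.relative_to(root).as_posix(),
        ),
    )


def build_context(root, max_depth=3):
    """Concatenate the discovered files into one string with a header per file."""
    root = Path(root)
    sections = []
    for path in find_doc_files(root, max_depth):
        rel = path.relative_to(root).as_posix()
        text = path.read_text(encoding="utf-8", errors="replace")
        sections.append(f"=== {rel} ===\n{text.strip()}\n")
    return "\n".join(sections)
```

Lowering `max_depth` shrinks the result because files nested deeper than the limit are skipped, which is the trade-off the scan depth feature exposes.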
Installation
# Install from PyPI
pip install docs2llm
Usage
Command Line Interface
# Extract docs from a local directory
docs2llm /path/to/project --output context.txt
# Extract docs from a GitHub repository
docs2llm --git owner/repo --output context.txt
# Specify a branch
docs2llm --git owner/repo --branch develop
# Control scan depth
docs2llm /path/to/project --max-depth 2
# Enable verbose logging
docs2llm /path/to/project -v
# Write logs to a file
docs2llm /path/to/project --log-file extraction.log
Options
- PATH: Local directory containing documentation files
- --git: GitHub repository URL or owner/repo format
- --output: Output file name (default: llm_context.txt)
- --max-depth: Maximum directory depth to search (default: 3)
- --branch: Specific branch to clone (only used with --git)
- --verbose, -v: Enable verbose logging
- --log-file: Log to this file in addition to console
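To illustrate how the two accepted `--git` forms and the `--branch` option could map onto a `git clone` call, here is a small sketch. The helpers `to_clone_url` and `clone_command` are hypothetical, and the shallow clone (`--depth 1`) is an assumption; docs2llm may fetch repositories differently.

```python
def to_clone_url(repo):
    """Expand an owner/repo shorthand to an HTTPS GitHub URL; pass full URLs through."""
    if "://" in repo or repo.startswith("git@"):
        return repo
    owner, sep, name = repo.partition("/")
    if not (owner and sep and name) or "/" in name:
        raise ValueError(f"expected a URL or owner/repo, got {repo!r}")
    return f"https://github.com/{owner}/{name}.git"


def clone_command(repo, dest, branch=None):
    """Build a shallow `git clone` argument list, optionally pinned to one branch."""
    cmd = ["git", "clone", "--depth", "1"]
    if branch:
        cmd += ["--branch", branch]
    return cmd + [to_clone_url(repo), dest]
```

The returned list can be passed to `subprocess.run(..., check=True)`. Building the command as a list rather than a string avoids shell quoting problems with unusual branch or directory names.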
Python API
from docs2llm import extract_documentation

# Extract from local directory
success = extract_documentation(
    local_path="/path/to/project",
    output_file="context.txt",
    max_depth=3,
    verbose=True
)

# Extract from GitHub repository
success = extract_documentation(
    git_repo="owner/repo",
    output_file="context.txt",
    branch="main",
    verbose=True
)
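Once the output file exists, it can be read back and placed in front of a question. The sketch below shows one way to do that; `build_prompt`, the `<documentation>` wrapper, and the character cap are illustrative assumptions and not part of docs2llm.

```python
from pathlib import Path


def build_prompt(context_file, question, max_chars=20000):
    """Combine the generated context file and a question into one prompt string."""
    context = Path(context_file).read_text(encoding="utf-8")
    if len(context) > max_chars:
        # Crude character cap; a real setup would count tokens for its model.
        context = context[:max_chars] + "\n[context truncated]"
    return (
        "Use the documentation below to answer the question.\n\n"
        f"<documentation>\n{context}\n</documentation>\n\n"
        f"Question: {question}"
    )
```

For example, `build_prompt("context.txt", "How do I install this project?")` returns a string that can be sent to whichever LLM client you use.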