Generates LLM context by scraping and summarizing documentation for Python libraries listed in a requirements.txt file.

Project description

LLM-Min: Generate Compact Docs for LLMs

Problem: Large Language Models (LLMs) work best with focused, concise context. Feeding them entire documentation websites is inefficient and often counterproductive.

Solution: llm-min-generator automatically crawls Python library documentation and uses Google Gemini to generate compact, structured summaries (llm-min.txt) optimized for LLM consumption. It also saves the full crawled text (llm-full.txt) for reference.

Stop wasting tokens! Give your LLMs the focused context they need.

Key Features

Automated Crawling: Finds and scrapes official Python package docs.
LLM-Powered Summarization: Creates concise, structured summaries in the PCS (Packed Code Syntax) format via Google Gemini.
Flexible Input: Process packages from requirements.txt, folders, or direct input.
Easy Integration: Use via CLI or the Python LLMMinClient.
Organized Output: Saves results neatly per package (output_dir/package_name/).
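
To illustrate the requirements.txt input path, here is a minimal sketch of how package names could be pulled out of a requirements file. The function package_names is hypothetical and is not the tool's actual parser; it handles only simple name-plus-specifier lines and skips option lines such as -r or -e. URL-style requirements are out of scope for this sketch.

```python
import re


def package_names(requirements_text):
    """Extract bare package names from requirements.txt content (hypothetical helper)."""
    names = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and options like -r / -e
            continue
        # A name is letters, digits, dots, underscores, or hyphens; version
        # specifiers and extras ("[...]") end the match.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names
```

Each returned name would then correspond to one package_name/ folder in the output directory.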

Quick Start

1. Installation:

Using pip (Recommended for users):

pip install llm-min

For Development/Contribution (Using uv):

# Clone (if you haven't already)
# git clone <repository_url>
# cd llm-min-generator

# Install dependencies (using uv)
python -m venv .venv
source .venv/bin/activate # or .venv\Scripts\activate on Windows
uv pip install -r requirements.txt
uv pip install -e .

# Install browser binaries for crawling
playwright install

# Optional: Install pre-commit hooks for development
# uv pip install pre-commit
# pre-commit install

2. Configure API Key:

Recommended: Copy .env.example to .env and add your GEMINI_API_KEY. The application will automatically load it.
Alternatively: You can provide the key directly using the --gemini-api-key CLI flag or pass it as the api_key parameter when initializing LLMMinClient in Python.
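
The lookup order described above (explicit key first, then the environment) can be sketched with the standard library alone. The helper resolve_gemini_api_key is hypothetical and only illustrates the precedence; it does not read .env files, which the application is said to load on its own.

```python
import os


def resolve_gemini_api_key(explicit_key=None, env=None):
    """Return the explicit key if given, else GEMINI_API_KEY from the environment."""
    env = os.environ if env is None else env
    key = explicit_key or env.get("GEMINI_API_KEY")
    if not key:
        raise ValueError("No Gemini API key: set GEMINI_API_KEY or pass one explicitly")
    return key
```

The result can then be passed along as LLMMinClient(api_key=...), or to the CLI through --gemini-api-key.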

3. Generate Docs (CLI Example):

Process packages from a requirements file and save to my_llm_docs:

llm-min-generator -f path/to/your/requirements.txt -o my_llm_docs

Use -pkg "requests\ntyper" for direct package input.
Use -d /path/to/project to find requirements.txt in a folder.
See llm-min-generator --help for more options (crawl depth, chunk size, etc.).
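
If you want to drive the CLI from a script, the flags above can be assembled into an argument list. This is a minimal sketch: build_command is a hypothetical helper, and it covers only the -f, -d, and -o flags shown in this section.

```python
import shlex


def build_command(requirements=None, folder=None, output_dir=None):
    """Assemble an llm-min-generator argument list (hypothetical helper)."""
    argv = ["llm-min-generator"]
    if requirements:
        argv += ["-f", str(requirements)]  # explicit requirements file
    if folder:
        argv += ["-d", str(folder)]  # folder containing requirements.txt
    if output_dir:
        argv += ["-o", str(output_dir)]  # where results are written
    return argv


cmd = build_command(requirements="path/to/your/requirements.txt", output_dir="my_llm_docs")
print(shlex.join(cmd))  # shell-quoted form of the command
```

Passing the list to subprocess.run(cmd, check=True) would execute it, assuming llm-min-generator is on your PATH.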

4. Generate Docs (Python Client Example):

from llm_min.client import LLMMinClient

# Assumes GEMINI_API_KEY is in .env or environment
try:
    client = LLMMinClient()

    # Example: Compact existing text content
    long_text = "Your very long documentation text here..."
    subject = "My Custom Library"
    compacted_text = client.compact(content=long_text, subject=subject)

    print(f"--- Compacted {subject} ---")
    print(compacted_text)

    # You can also use client.process_package("package_name")
    # or client.process_requirements("path/to/requirements.txt")
    # See client documentation for details.

except (ValueError, FileNotFoundError) as e:
    print(f"Error initializing client (API Key or PCS Guide missing?): {e}")
except Exception as e:
    print(f"An error occurred: {e}")

Output

For each package, you'll get:

output_dir/
└── package_name/
    ├── llm-full.txt  # Raw crawled content
    └── llm-min.txt   # Compacted PCS content for LLMs
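
Given that layout, the compacted files can be collected for use as LLM context with a few lines of pathlib. This is a sketch under the assumption that the directory structure is exactly as shown; load_min_docs is a hypothetical helper, not part of the package.

```python
from pathlib import Path


def load_min_docs(output_dir):
    """Map each package directory name to the text of its llm-min.txt."""
    docs = {}
    # One subdirectory per package, each holding an llm-min.txt file.
    for min_file in sorted(Path(output_dir).glob("*/llm-min.txt")):
        docs[min_file.parent.name] = min_file.read_text(encoding="utf-8")
    return docs
```

For example, load_min_docs("my_llm_docs") would return a dictionary keyed by package name after the CLI run from the Quick Start.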

What is PCS (Packed Code Syntax)?

PCS is a highly condensed, machine-centric format designed for representing code structure and essential metadata with maximum information density. It uses single-character codes, minimal delimiters, and positional context to create a compact, single-string representation optimized for LLM context windows.

Think of it as a "minified" version of code documentation, focusing purely on the structural elements and relationships an LLM needs to understand an API or library, discarding natural language explanations. The full specification can be found in docs/pcs-guide.md.

Contributing

Contributions are welcome! See CONTRIBUTING.md if it is available. Useful areas to work on include documentation discovery, compaction, LLM support, and tests.

License

MIT License.

Project details

Release history

0.3.1 (Jun 18, 2025)
0.3.0 (Jun 5, 2025)
0.2.4 (Jun 1, 2025)
0.2.3 (May 16, 2025)
0.2.1 (May 16, 2025)
0.2.0 (May 15, 2025)
0.1.5 (May 11, 2025)
0.1.4 (May 11, 2025)
0.1.3 (Apr 30, 2025), this version
0.1.2 (Apr 30, 2025)
0.1.1 (Apr 29, 2025)
0.1.0 (Apr 29, 2025)

Download files

Source Distribution

llm_min-0.1.3.tar.gz (39.4 kB)

Uploaded Apr 30, 2025 Source

Built Distribution

llm_min-0.1.3-py3-none-any.whl (29.2 kB)

Uploaded Apr 30, 2025 Python 3

File details

Details for the file llm_min-0.1.3.tar.gz.

File metadata

Download URL: llm_min-0.1.3.tar.gz
Upload date: Apr 30, 2025
Size: 39.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for llm_min-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`58372af033c38807d3ee0e99931f16f3b9a7ffeb211bb403e5ba8369db4dad14`
MD5	`dc8a0b98d9504297550ad66f0dcd3d93`
BLAKE2b-256	`98ebb4d3a5eca82324285bc96bd1da89872c1998671e1da211f36e9f44525c8d`
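
A downloaded archive can be checked against the SHA256 digest listed above with hashlib. This is a minimal sketch; sha256_of and verify are hypothetical helper names, and the local file path is whatever you saved the download as.

```python
import hashlib

# SHA256 digest published above for llm_min-0.1.3.tar.gz
EXPECTED_SHA256 = "58372af033c38807d3ee0e99931f16f3b9a7ffeb211bb403e5ba8369db4dad14"


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True only when the file's digest matches the expected value."""
    return sha256_of(path) == expected
```

Calling verify("llm_min-0.1.3.tar.gz") should return True for an intact download. pip can enforce the same check at install time through its --require-hashes mode.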

Provenance

The following attestation bundles were made for llm_min-0.1.3.tar.gz:

Publisher: publish.yml on marv1nnnnn/llm-min.txt

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_min-0.1.3.tar.gz
- Subject digest: 58372af033c38807d3ee0e99931f16f3b9a7ffeb211bb403e5ba8369db4dad14
- Sigstore transparency entry: 204506475
- Sigstore integration time: Apr 30, 2025
Source repository:
- Permalink: marv1nnnnn/llm-min.txt@ca9eae796f22ad77810f201dcacecc71cf584a45
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/marv1nnnnn
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@ca9eae796f22ad77810f201dcacecc71cf584a45
- Trigger Event: push

File details

Details for the file llm_min-0.1.3-py3-none-any.whl.

File metadata

Download URL: llm_min-0.1.3-py3-none-any.whl
Upload date: Apr 30, 2025
Size: 29.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for llm_min-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`49d5f0ee845444891b036f0aa7ce094456200194515fd8966759a90f860cbe5f`
MD5	`e763a589dce1ffce66215243d85490b0`
BLAKE2b-256	`4a888fe6e107281a8461c5868c67cd725dfa876cee7ea936c12830ffbf94ff44`

Provenance

The following attestation bundles were made for llm_min-0.1.3-py3-none-any.whl:

Publisher: publish.yml on marv1nnnnn/llm-min.txt

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llm_min-0.1.3-py3-none-any.whl
- Subject digest: 49d5f0ee845444891b036f0aa7ce094456200194515fd8966759a90f860cbe5f
- Sigstore transparency entry: 204506478
- Sigstore integration time: Apr 30, 2025
Source repository:
- Permalink: marv1nnnnn/llm-min.txt@ca9eae796f22ad77810f201dcacecc71cf584a45
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/marv1nnnnn
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@ca9eae796f22ad77810f201dcacecc71cf584a45
- Trigger Event: push
