Convert software documentation to one-file prompt for LLMs

docs2prompt 📜→🤖

Fetch open-source documentation from GitHub, or closed-source documentation from a publisher's website, and combine it into a single LLM-friendly file.

Features

  • GitHub Integration: Extracts documentation files (e.g., README.md, docs.md, files within a docs/ folder) from a GitHub repository using heuristics.
  • External Documentation Heuristic: Optionally, fetches and converts external documentation links found in the root README.
  • URL Crawling: Supports crawling a top-level documentation URL for content.
  • Customizable Output: Serialize documentation in various formats (default plain text, XML, or Markdown).
  • CLI and API: Use as an importable Python package or as a standalone command-line tool.
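
The GitHub integration is described as heuristic-based. As a rough illustration only, a path filter in that spirit might look like the sketch below. The rule set (`DOC_FILENAMES`, `DOC_DIRS`) and the function `looks_like_documentation` are hypothetical names chosen to mirror the examples in the feature list; they are not taken from docs2prompt's source, and the real heuristics may differ.

```python
from pathlib import PurePosixPath

# Hypothetical rules that only mirror the examples in the feature list above.
DOC_FILENAMES = {"readme.md", "docs.md"}
DOC_DIRS = {"docs"}


def looks_like_documentation(path: str) -> bool:
    """Return True if a repository-relative path resembles a documentation file."""
    parts = PurePosixPath(path.lower()).parts
    if not parts:
        return False
    # A well-known documentation file name, at any depth.
    if parts[-1] in DOC_FILENAMES:
        return True
    # Any file that sits somewhere under a docs/ directory.
    return any(part in DOC_DIRS for part in parts[:-1])


paths = ["README.md", "src/main.py", "docs/guide/intro.md", "docs.md", "setup.py"]
doc_paths = [p for p in paths if looks_like_documentation(p)]
```

Matching on lowercase path components keeps the check independent of how a repository capitalizes `README.md` or `Docs/`.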

Installation

You can install docs2prompt directly from PyPI:

pip install docs2prompt

Alternatively, clone the repository and install locally:

git clone https://github.com/rezabrizi/docs2prompt.git
cd docs2prompt
pip install .

Usage

Command-Line Interface

After installing, you can run the tool via the command line.

Example using a GitHub repository:

docs2prompt --repo owner/repo --token YOUR_GITHUB_TOKEN --format markdown --full_repo --external_documentation --output docs.txt
  • --repo: GitHub repository in the format owner/repo (required if not using --url).
  • --token: Your GitHub authentication token (only used if --repo is provided). HIGHLY RECOMMENDED: without a token you can make at most 60 requests per hour, a limit that a single query can easily reach.
  • --url: Documentation page URL (required if not using --repo).
  • --format: Output format (default, xml, or markdown).
  • --output: File name to write the serialized documentation.
  • --full_repo: Performs a full recursive search of the repository. Enabling this flag makes the query take longer to finish.
  • --external_documentation: Enables the external documentation heuristic, which fetches external docs linked from the root README. Enabling this flag makes the query take longer to finish.

Example using a documentation URL:

docs2prompt --url https://example.com/documentation --format xml --output output.xml

Note: You must provide exactly one of --repo or --url.
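
When the CLI is driven from a script, the "exactly one of --repo or --url" rule can be checked before the tool is launched. The sketch below builds an argument list from the flags documented above. The helper `build_command` is a hypothetical name, not part of docs2prompt, and attaching --token, --full_repo, and --external_documentation only in repository mode is an assumption based on the option descriptions.

```python
def build_command(repo=None, url=None, token=None, output_format="default",
                  output=None, full_repo=False, external_documentation=False):
    """Build a docs2prompt argument list (hypothetical helper, not part of the package)."""
    # Exactly one of repo or url must be given, as the note above requires.
    if (repo is None) == (url is None):
        raise ValueError("provide exactly one of repo or url")

    cmd = ["docs2prompt"]
    if repo is not None:
        cmd += ["--repo", repo]
        # Assumption: these options are only meaningful for a GitHub repository.
        if token:
            cmd += ["--token", token]
        if full_repo:
            cmd.append("--full_repo")
        if external_documentation:
            cmd.append("--external_documentation")
    else:
        cmd += ["--url", url]

    cmd += ["--format", output_format]
    if output:
        cmd += ["--output", output]
    return cmd


url_cmd = build_command(url="https://example.com/documentation",
                        output_format="xml", output="output.xml")
```

The resulting list can be passed to `subprocess.run(url_cmd, check=True)`, which avoids shell quoting issues with tokens and URLs.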

As a Python Package

You can also import docs2prompt as a module in your own Python code:

from docs2prompt import get_github_documentation


repo_id = "owner/repo"
token = "YOUR_GITHUB_TOKEN"  # Optional, but highly recommended
content = get_github_documentation(repo_id, token, full_repo=False, external_documentation=False, output_format="XML")
print(content)
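
To keep the token out of source code, it can be read from the environment and the result written to a file. This is a minimal sketch under several assumptions: the environment variable name `GITHUB_TOKEN` and the helpers `load_github_token` and `fetch_and_save` are hypothetical, `get_github_documentation` is assumed to return a string (the example above prints its result), and passing `None` as the token is assumed to be accepted because the token is described as optional.

```python
import os
from pathlib import Path


def load_github_token(env_var: str = "GITHUB_TOKEN"):
    """Return the token stored in env_var, or None if it is unset or blank."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def fetch_and_save(repo_id: str, path: str, output_format: str = "XML") -> Path:
    """Fetch documentation for repo_id and write it to path (hypothetical helper)."""
    # Imported lazily so load_github_token works without docs2prompt installed.
    from docs2prompt import get_github_documentation

    content = get_github_documentation(
        repo_id,
        load_github_token(),  # assumption: None is accepted when no token is set
        full_repo=False,
        external_documentation=False,
        output_format=output_format,
    )
    out = Path(path)
    # Assumption: the returned documentation is a plain string.
    out.write_text(content, encoding="utf-8")
    return out
```

A call such as `fetch_and_save("owner/repo", "docs.xml")` would then produce the one-file prompt on disk.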

Contributing

Contributions are welcome! If you'd like to contribute:

  1. Fork the repository.
  2. Create a feature branch (git checkout -b feature/my-feature).
  3. Commit your changes (git commit -am 'Add some feature').
  4. Push to the branch (git push origin feature/my-feature).
  5. Create a new Pull Request.

Please ensure that your changes include appropriate tests and documentation.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contact

For any questions or suggestions, please open an issue on the GitHub repository.
