
LLM plugin to load entire folder contents as fragments


llm-fragments-folder

An LLM plugin that loads entire folder contents as fragments, turning any directory into a chat-ready knowledge base.

Installation

llm install llm-fragments-folder

Or install from source:

git clone https://github.com/michael-borck/llm-fragments-folder.git
cd llm-fragments-folder
pip install -e .

Usage

Two fragment loaders are provided: folder: for general document collections and project: for software projects.

folder: - Load documents from a directory

# Chat against all docs in a folder
llm chat -f folder:./docs

# Ask a question about files in the current directory
llm -f folder:. "What are these documents about?"

# Combine with a specific model
llm -f folder:~/notes -m claude-sonnet-4-5 "Find all action items"

# Use with system fragments for custom instructions
llm -f folder:./research --sf "You are a research assistant" "Summarize the key findings"

# Only load specific file types
llm -f "folder:./docs?glob=*.md,*.txt" "Summarize the docs"
llm -f "folder:.?glob=*.json,*.yaml" "Explain these configs"

project: - Load a software project (respects .gitignore)

# Explain a codebase
llm chat -f project:.

# Ask about a specific project
llm -f project:./my-app "What framework does this use?"

# Code review
llm -f project:. "Review this code for security issues"

# Architecture overview
llm -f project:~/repos/my-api -m claude-sonnet-4-5 "Describe the architecture"

# Only Python files
llm -f "project:.?glob=*.py" "Review this code"

The project: loader:

  • Uses git ls-files when inside a git repo (most accurate)
  • Falls back to parsing .gitignore patterns if git is not available
  • Prepends a file tree summary as the first fragment
  • Automatically skips node_modules, __pycache__, .git, venv, dist, build, etc.
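The listing strategy described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's actual code: the names `list_project_files`, `walk_fallback` and `SKIP_DIRS` are hypothetical, plain `git ls-files` (tracked files only) is assumed, and the fallback here only prunes well-known directories instead of parsing .gitignore patterns.

```python
import os
import subprocess

# Hypothetical subset of the skipped directories; the plugin's real list is longer.
SKIP_DIRS = {".git", "node_modules", "__pycache__", "venv", "dist", "build"}


def walk_fallback(root):
    """Walk root, pruning skipped directories; return sorted relative paths."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def list_project_files(root):
    """Prefer `git ls-files`; fall back to a plain walk if git cannot be used."""
    try:
        result = subprocess.run(
            ["git", "-C", root, "ls-files"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or root is not inside a repository.
        return walk_fallback(root)
    return sorted(line for line in result.stdout.splitlines() if line)
```

Asking git first means ignore rules are applied by git itself, which is presumably why the list above calls that path the most accurate.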

Combining with other fragments

Fragments compose naturally with each other and with LLM's other features:

# Folder + URL context
llm -f folder:./docs -f https://example.com/api-spec "Compare our docs to the spec"

# Folder + system prompt
llm -f folder:./meeting-notes --system "Extract action items with owners and dates" ""

# Project + GitHub issue
llm install llm-fragments-github
llm -f project:. -f issue:user/repo/42 "Implement this feature"

What gets loaded

Text file detection is based on file extension and filename. Supported types include:

  • Documents: .md, .qmd, .txt, .rst, .adoc, .tex, .org
  • Code: .py, .js, .ts, .go, .rs, .java, .rb, .c, .cpp, and many more
  • Config: .json, .yaml, .yml, .toml, .ini, .env, .cfg
  • Web: .html, .css, .scss, .svg, .xml
  • Data: .csv, .tsv, .sql, .graphql
  • Dotfiles: .bashrc, .zshrc, .vimrc, .gitconfig, .tmux.conf, .profile, .npmrc, etc.
  • Special files: Makefile, Dockerfile, LICENSE, etc.
  • Shebang scripts: extensionless files starting with #!

Always skipped directories: .git, node_modules, __pycache__, .venv, venv, dist, build, .idea, .vscode, .mypy_cache, .pytest_cache, etc.
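As a rough illustration of detection by extension, filename and shebang, the check might look like the sketch below. The function name `looks_like_text` and both sets are hypothetical and contain only a small sample of the types listed above; the plugin's real tables and order of checks may differ.

```python
from pathlib import Path

# Small hypothetical samples of the categories listed above.
TEXT_EXTENSIONS = {".md", ".txt", ".py", ".json", ".yaml", ".html", ".csv"}
TEXT_FILENAMES = {"Makefile", "Dockerfile", "LICENSE", ".bashrc", ".gitconfig"}


def looks_like_text(path):
    """Decide by extension, then by filename, then by a shebang check."""
    p = Path(path)
    if p.suffix.lower() in TEXT_EXTENSIONS:
        return True
    if p.name in TEXT_FILENAMES:
        return True
    if p.suffix == "":
        # Extensionless file: treat it as a script if it starts with "#!".
        try:
            with open(p, "rb") as handle:
                return handle.read(2) == b"#!"
        except OSError:
            return False
    return False
```

Note that `Path(".bashrc").suffix` is an empty string, so in this sketch dotfiles have to be matched by full filename rather than by extension.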

Filtering with glob patterns

Use ?glob= to filter files using gitignore-style glob patterns. Patterns are comma-separated and support negation with !.

# Only markdown files
llm -f "folder:./docs?glob=*.md" "Summarize these"

# Python files, excluding tests
llm -f "project:.?glob=*.py,!*_test.py,!tests/**" "Review the code"

# All dotfiles
llm -f "folder:~?glob=.*" "Explain my shell config"

# Multiple file types
llm -f "folder:.?glob=*.md,*.txt,*.json" "What's in here?"

# Files whose names contain a keyword, excluding one file type
llm -f "folder:.?glob=*finance*,!*.txt" "Summarize the finance docs"

When ?glob= is specified, it replaces the default text file detection entirely: only files matching the glob patterns are included.

When no ?glob= is specified, the default text file detection (by extension and filename) is used.
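To make the filter semantics concrete, here is a simplified sketch of splitting the loader argument and applying comma-separated patterns with ! negation. It uses Python's `fnmatch` module rather than a real gitignore matcher, so edge cases (pattern ordering, anchoring, `**` handling) will differ from the plugin; `parse_argument` and `matches` are hypothetical names.

```python
from fnmatch import fnmatchcase


def parse_argument(argument):
    """Split 'path?glob=a,b' into (path, [patterns]); patterns may be empty."""
    path, _, query = argument.partition("?glob=")
    patterns = [p.strip() for p in query.split(",") if p.strip()]
    return path, patterns


def matches(rel_path, patterns):
    """Include if any positive pattern matches and no '!' pattern does."""
    name = rel_path.rsplit("/", 1)[-1]

    def hit(pattern):
        # Patterns with '/' are tested against the path, others against the name.
        return fnmatchcase(rel_path if "/" in pattern else name, pattern)

    positives = [p for p in patterns if not p.startswith("!")]
    negatives = [p[1:] for p in patterns if p.startswith("!")]
    included = any(hit(p) for p in positives) if positives else True
    return included and not any(hit(p) for p in negatives)
```

For example, with the patterns `*.py,!*_test.py,!tests/**`, this sketch keeps `src/app.py` and drops both `src/app_test.py` and `tests/conftest.py`.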

Binary file detection: Files containing null bytes are automatically detected as binary and skipped, even if matched by a glob pattern. This prevents garbled output from PDFs, images, Word docs, etc.
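A null-byte check of this kind is short to write. The sketch below is one way to do it; the name `is_binary` and the choice to sample only the first 8 KiB are assumptions, and the plugin may scan more or all of the file.

```python
def is_binary(path, sample_size=8192):
    """Return True if the first sample_size bytes contain a null byte."""
    with open(path, "rb") as handle:
        return b"\x00" in handle.read(sample_size)
```

Text encodings such as UTF-8 do not normally produce null bytes, whereas PDFs, images and Office files usually contain them early on, which is what makes this cheap heuristic work.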

Safety limits: Files larger than 1 MB are skipped, and each loader call loads at most 500 files.
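The two limits could be applied along these lines. This is a sketch only: `apply_limits` and the constant names are hypothetical, the exact byte count behind "1MB" is assumed, and whether the plugin silently stops at 500 files or reports an error is not stated here.

```python
import os

MAX_FILE_BYTES = 1_000_000  # assumed byte count for the 1MB limit
MAX_FILES = 500


def apply_limits(paths, max_bytes=MAX_FILE_BYTES, max_files=MAX_FILES):
    """Drop oversized files, then keep at most max_files of the remainder."""
    kept = []
    for path in paths:
        if os.path.getsize(path) > max_bytes:
            continue  # too large: skip this file and keep scanning
        kept.append(path)
        if len(kept) >= max_files:
            break  # cap reached: stop collecting
    return kept
```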

How it works

Each file becomes a separate LLM fragment, wrapped with a filename header:

--- path/to/file.py ---
<file contents>

This means LLM's fragment deduplication works at the file level. If you reference the same folder across multiple prompts, files that haven't changed won't be stored again in the log database.
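The effect of per-file fragments on deduplication can be illustrated with a small simulation. It assumes a fragment is identified by a hash of its content; the function names and the in-memory `seen` set are hypothetical stand-ins for LLM's log database, not its real storage code.

```python
import hashlib


def wrap_file(rel_path, contents):
    """Prefix file contents with the '--- path ---' header shown above."""
    return f"--- {rel_path} ---\n{contents}"


def fragment_id(fragment_text):
    """Content hash used here as a stand-in for a fragment's identity."""
    return hashlib.sha256(fragment_text.encode("utf-8")).hexdigest()


files = [("a.py", "print('a')\n"), ("b.py", "print('b')\n")]
seen = set()   # stand-in for fragments already in the log database
stored = []    # paths that would be written as new fragments

for _ in range(2):  # simulate referencing the same folder in two prompts
    for rel_path, contents in files:
        key = fragment_id(wrap_file(rel_path, contents))
        if key not in seen:
            seen.add(key)
            stored.append(rel_path)
```

After both passes `stored` holds each file once. If only `b.py` changed between prompts, only its fragment would get a new hash and be stored again.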

Development

# Clone and install for development
git clone https://github.com/michael-borck/llm-fragments-folder.git
cd llm-fragments-folder
uv sync

# Run tests
uv run pytest

# Lint and format
uv run ruff check .
uv run ruff format .

# Type checking
uv run mypy llm_fragments_folder.py

Acknowledgments

License

Apache 2.0

