Concatenate a directory full of files into a single prompt for use with LLMs

files-to-prompt

For background on this project see Building files-to-prompt entirely using Claude 3 Opus.

Installation

Install this tool using pip:

pip install files-to-prompt

Usage

To use files-to-prompt, provide the path to one or more files or directories you want to process:

files-to-prompt path/to/file_or_directory [path/to/another/file_or_directory ...]

This will output the contents of every file, with each file preceded by its relative path and separated by ---.
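
The default layout described above can be sketched in a few lines of Python. This is an illustration only, not the tool's own code: render_default is a hypothetical helper, it takes (path, contents) pairs instead of reading from disk, and the exact blank-line placement in the real output may differ.

```python
def render_default(files):
    """Render (relative_path, contents) pairs in the path / --- / contents / --- layout.

    Hypothetical helper for illustration; not part of files-to-prompt.
    """
    lines = []
    for path, contents in files:
        lines.append(path)      # relative path of the file
        lines.append("---")     # separator between path and contents
        lines.append(contents)  # the file body
        lines.append("---")     # separator before the next file
    return "\n".join(lines)


print(render_default([
    ("my_directory/file1.txt", "Contents of file1.txt"),
    ("my_directory/file2.txt", "Contents of file2.txt"),
]))
```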

Options

  • -e/--extension <extension>: Only include files with the specified extension. Can be used multiple times.

    files-to-prompt path/to/directory -e txt -e md
    
  • --include-hidden: Include files and folders starting with . (hidden files and directories).

    files-to-prompt path/to/directory --include-hidden
    
  • --ignore <pattern>: Specify one or more patterns to ignore. Can be used multiple times. Patterns may match file names and directory names, unless you also specify --ignore-files-only. Pattern syntax uses fnmatch, which supports *, ?, [anychar] and [!notchars]. To match a special character literally, wrap it in brackets, as in [?].

    files-to-prompt path/to/directory --ignore "*.log" --ignore "temp*"
    
  • --ignore-files-only: Apply --ignore patterns to file names only, so directory paths which would otherwise be ignored by an --ignore pattern are still included.

    files-to-prompt path/to/directory --ignore-files-only --ignore "*dir*"
    
  • --ignore-gitignore: Disregard .gitignore files, so that files they would otherwise exclude are included.

    files-to-prompt path/to/directory --ignore-gitignore
    
  • -c/--cxml: Output in Claude XML format.

    files-to-prompt path/to/directory --cxml
    
  • -m/--markdown: Output as Markdown with fenced code blocks.

    files-to-prompt path/to/directory --markdown
    
  • -o/--output <file>: Write the output to a file instead of printing it to the console.

    files-to-prompt path/to/directory -o output.txt
    
  • -n/--line-numbers: Include line numbers in the output.

    files-to-prompt path/to/directory -n
    

    Example output:

    files_to_prompt/cli.py
    ---
      1  import os
      2  from fnmatch import fnmatch
      3
      4  import click
      ...
    
  • -0/--null: Use NUL character as separator when reading paths from stdin. Useful when filenames may contain spaces.

    find . -name "*.py" -print0 | files-to-prompt --null
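
The -n/--line-numbers output shown earlier in this list right-aligns each number ahead of its line. A minimal sketch of that kind of numbering follows. It assumes the number column is padded to the width of the largest line number and followed by two spaces; number_lines is a hypothetical name, and the tool's actual padding rule may differ.

```python
def number_lines(content):
    """Prefix each line with a right-aligned, 1-based line number.

    Hypothetical helper: pads to the width of the last line number.
    """
    lines = content.splitlines()
    width = len(str(len(lines)))  # digits needed for the largest number
    return "\n".join(
        f"{number:{width}}  {line}"
        for number, line in enumerate(lines, start=1)
    )


print(number_lines("import os\nfrom fnmatch import fnmatch\n\nimport click"))
```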
    

Example

Suppose you have a directory structure like this:

my_directory/
├── file1.txt
├── file2.txt
├── .hidden_file.txt
├── temp.log
└── subdirectory/
    └── file3.txt

Running files-to-prompt my_directory will output:

my_directory/file1.txt
---
Contents of file1.txt
---
my_directory/file2.txt
---
Contents of file2.txt
---
my_directory/subdirectory/file3.txt
---
Contents of file3.txt
---

If you run files-to-prompt my_directory --include-hidden, the output will also include .hidden_file.txt:

my_directory/.hidden_file.txt
---
Contents of .hidden_file.txt
---
...

If you run files-to-prompt my_directory --ignore "*.log", the output will exclude temp.log:

my_directory/file1.txt
---
Contents of file1.txt
---
my_directory/file2.txt
---
Contents of file2.txt
---
my_directory/subdirectory/file3.txt
---
Contents of file3.txt
---

If you run files-to-prompt my_directory --ignore "sub*", the output will exclude all files in subdirectory/ (unless you also specify --ignore-files-only):

my_directory/file1.txt
---
Contents of file1.txt
---
my_directory/file2.txt
---
Contents of file2.txt
---
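
To check which names a pattern would catch before running the tool, Python's standard fnmatch module can be called directly. The sketch below assumes each pattern is compared against a single file or directory name, not the full path; is_ignored is a hypothetical helper, not a function exported by files-to-prompt.

```python
from fnmatch import fnmatch


def is_ignored(name, patterns):
    """Return True if a file or directory name matches any ignore pattern."""
    return any(fnmatch(name, pattern) for pattern in patterns)


# Names taken from the example directory structure above.
for name in ["file1.txt", "temp.log", "subdirectory"]:
    print(name, is_ignored(name, ["*.log", "sub*"]))
```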

Reading from stdin

The tool can also read paths from standard input. This can be used to pipe in the output of another command:

# Find files modified in the last day
find . -mtime -1 | files-to-prompt

When using the --null (or -0) option, paths are expected to be NUL-separated (useful when dealing with filenames containing spaces):

find . -name "*.txt" -print0 | files-to-prompt --null
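
The difference between the two input modes can be sketched as follows. This assumes that, without --null, the input is split on any whitespace, which is why a filename containing a space would be broken into two paths. parse_paths is a hypothetical helper for illustration, not the tool's own function.

```python
def parse_paths(data, null_separated=False):
    """Split text read from stdin into a list of paths.

    Hypothetical helper: NUL-separated mode keeps spaces inside a path,
    while the default mode splits on any run of whitespace.
    """
    if null_separated:
        # Drop the empty string left by a trailing NUL.
        return [path for path in data.split("\0") if path]
    return data.split()


print(parse_paths("my file.txt\nnotes.txt\n"))
print(parse_paths("my file.txt\0notes.txt\0", null_separated=True))
```

In the first call the name "my file.txt" is split into "my" and "file.txt"; in the second it survives intact.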

You can mix and match paths from command line arguments and stdin:

# Include files modified in the last day, and also include README.md
find . -mtime -1 | files-to-prompt README.md

Claude XML Output

Anthropic has provided specific guidelines for optimally structuring prompts to take advantage of Claude's extended context window.

To structure the output in this way, use the optional --cxml flag, which will produce output like this:

<documents>
<document index="1">
<source>my_directory/file1.txt</source>
<document_content>
Contents of file1.txt
</document_content>
</document>
<document index="2">
<source>my_directory/file2.txt</source>
<document_content>
Contents of file2.txt
</document_content>
</document>
</documents>
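
For readers who want to produce the same structure in their own scripts, here is a rough sketch that mirrors the sample above. render_cxml is a hypothetical helper and takes (path, contents) pairs. It does not escape characters such as < or & inside the contents; whether the tool itself does so is not stated here.

```python
def render_cxml(files):
    """Wrap (path, contents) pairs in the <documents> structure shown above.

    Hypothetical helper; performs no XML escaping.
    """
    parts = ["<documents>"]
    for index, (path, contents) in enumerate(files, start=1):
        parts.append(f'<document index="{index}">')
        parts.append(f"<source>{path}</source>")
        parts.append("<document_content>")
        parts.append(contents)
        parts.append("</document_content>")
        parts.append("</document>")
    parts.append("</documents>")
    return "\n".join(parts)


print(render_cxml([
    ("my_directory/file1.txt", "Contents of file1.txt"),
    ("my_directory/file2.txt", "Contents of file2.txt"),
]))
```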

--markdown fenced code block output

The --markdown option will output the files as fenced code blocks, which can be useful for pasting into Markdown documents.

files-to-prompt path/to/directory --markdown

The language tag will be guessed based on the filename.

If the code itself contains triple backticks the wrapper around it will use one additional backtick.

Example output:

myfile.py
```python
def my_function():
    return "Hello, world!"
```
other.js
```javascript
function myFunction() {
    return "Hello, world!";
}
```
file_with_triple_backticks.md
````markdown
This file has its own
```
fenced code blocks
```
Inside it.
````
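
The "one additional backtick" rule can be sketched like this. It assumes the wrapper must be at least three backticks and one longer than the longest run of backticks in the file; fence_for and render_markdown are hypothetical names, and the language tag is passed in by the caller instead of being guessed from the filename.

```python
import re


def fence_for(content):
    """Return a backtick fence longer than any backtick run in content."""
    runs = re.findall(r"`+", content)
    longest = max((len(run) for run in runs), default=0)
    return "`" * max(3, longest + 1)


def render_markdown(path, content, lang=""):
    """Render one file as a path line followed by a fenced code block."""
    fence = fence_for(content)
    return f"{path}\n{fence}{lang}\n{content}\n{fence}"


inner = "`" * 3  # a triple-backtick fence inside the file itself
print(render_markdown("notes.md", f"Intro\n{inner}\ncode\n{inner}", "markdown"))
```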

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd files-to-prompt
python -m venv venv
source venv/bin/activate

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
