A tool to compress file structures for LLMs.
File Structure Compressor
Dramatically reduce the token count of your project's file structure before sending it to a Large Language Model (LLM).
Overview
When working with Large Language Models like GPT-4, Claude, or Gemini, providing the context of a project's file structure is crucial for tasks like code generation, debugging, and architectural analysis. However, sending a simple list of file paths for a large project consumes an enormous number of tokens, quickly exhausting the context window and increasing API costs.
File Structure Compressor is a lightweight, zero-dependency Python utility designed to intelligently compress a directory structure into several token-efficient formats, each with its own balance of compactness and LLM readability.
Key Features
- Massive Token Savings: Reduce the character count of your file structure by up to 70% compared to a plain file list.
- Multiple Compression Formats: Choose the best representation for your needs:
  - ASCII Tree: The recommended default. Highly readable for both humans and LLMs, offering excellent compression.
  - JSON Tree: A structured, machine-readable format.
  - Custom Compact Format: An ultra-dense format for maximum token savings.
- Flexible Input Sources:
  - Scan a project directory from the filesystem.
  - Build a structure from a pre-existing list of file paths (e.g., from git ls-files).
- Intelligent Filtering: Easily exclude irrelevant files and directories (like .git, __pycache__, node_modules) using .gitignore-style patterns.
- Depth Control: Limit the recursion depth to show only the most relevant parts of a complex project.
- Simple CLI & API: Use it as a command-line tool or integrate it directly into your Python scripts.
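As a rough illustration of the filtering and depth-control ideas in the list above, the sketch below walks a directory using only the standard library. It is not the package's implementation: the helper `list_files`, its parameters, and the fnmatch-based matching (which covers only a simplified subset of .gitignore syntax) are assumptions made for this example.

```python
import fnmatch
import os
import tempfile
from pathlib import Path


def list_files(root, exclude_patterns=(), max_depth=None):
    """Return sorted relative POSIX paths under root, skipping excluded names.

    Hypothetical helper: each pattern is matched against a single file or
    directory name with fnmatch, not against the full path.
    """
    root = Path(root)
    found = []
    for current, dirs, files in os.walk(root):
        rel_dir = Path(current).relative_to(root)
        depth = len(rel_dir.parts)  # 0 for the root itself
        # Prune excluded directories in place so os.walk never descends into them.
        dirs[:] = sorted(
            d for d in dirs
            if not any(fnmatch.fnmatch(d, p) for p in exclude_patterns)
        )
        # Stop descending once the depth limit is reached.
        if max_depth is not None and depth + 1 >= max_depth:
            dirs[:] = []
        for name in sorted(files):
            if any(fnmatch.fnmatch(name, p) for p in exclude_patterns):
                continue
            found.append((rel_dir / name).as_posix())
    return sorted(found)


# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "src" / "api").mkdir(parents=True)
    (base / ".git").mkdir()
    (base / ".git" / "config").touch()
    (base / "README.md").touch()
    (base / "src" / "main.py").touch()
    (base / "src" / "main.pyc").touch()
    (base / "src" / "api" / "routes.py").touch()
    demo_all = list_files(base, exclude_patterns=[".git", "*.pyc"])
    demo_shallow = list_files(base, exclude_patterns=[".git", "*.pyc"], max_depth=2)

print(demo_all)
print(demo_shallow)
```

In this sketch, `max_depth=1` would list only the files directly in the root, and `max_depth=2` adds one level of subdirectories. Pruning `dirs` in place avoids ever reading large excluded directories such as node_modules.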
Why File Structure Compressor?
Sending a raw file list is inefficient:
# Costly and redundant
D:/project/src/main.py
D:/project/src/api/routes.py
D:/project/src/api/models.py
D:/project/src/utils/helpers.py
This tool transforms it into a clear and concise representation that LLMs can easily understand, without the redundant path prefixes.
# Efficient and readable (ASCII Tree)
D:/project/src/
├── main.py
├── api/
│   ├── routes.py
│   └── models.py
└── utils/
    └── helpers.py
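To make the saving concrete, here is a minimal standard-library sketch that turns the four raw paths listed above into an ASCII tree and compares character counts. The functions `build_tree` and `render` are hypothetical names for this illustration and are not the package's API; the ordering (files first, then directories, each alphabetical) is an arbitrary choice.

```python
import posixpath


def build_tree(paths):
    """Nest relative paths into dicts: directories map to dicts, files to None."""
    tree = {}
    for path in paths:
        parts = path.split("/")
        node = tree
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = None
    return tree


def render(node, prefix=""):
    """Render one directory level with box-drawing connectors."""
    lines = []
    # Files first, then directories, each group alphabetical.
    names = sorted(node, key=lambda n: (node[n] is not None, n))
    for i, name in enumerate(names):
        last = i == len(names) - 1
        child = node[name]
        label = name + "/" if child is not None else name
        lines.append(prefix + ("└── " if last else "├── ") + label)
        if child is not None:
            lines.extend(render(child, prefix + ("    " if last else "│   ")))
    return lines


paths = [
    "D:/project/src/main.py",
    "D:/project/src/api/routes.py",
    "D:/project/src/api/models.py",
    "D:/project/src/utils/helpers.py",
]

# The shared prefix is written once as the tree root instead of on every line.
root = posixpath.commonpath(paths)
relative = [p[len(root) + 1:] for p in paths]

flat_text = "\n".join(paths)
tree_text = "\n".join([root + "/"] + render(build_tree(relative)))

print(tree_text)
print(len(flat_text), "characters flat vs", len(tree_text), "as a tree")
```

With only four files the difference is small; the saving grows as more files share long directory prefixes. Character counts are only a proxy for token counts, which depend on the tokenizer of the model in use.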
Installation
pip install file-structure-compressor
Usage
Method 1: From a Project Directory
This is the most common use case. Simply import FileStructureCompressor, point it to your project root, and generate the desired format.
from pathlib import Path
from file_structure_compressor import FileStructureCompressor
# --- 1. Set up a dummy project structure for demonstration ---
project_root = Path("my_temp_project")
project_root.mkdir(exist_ok=True)
(project_root / "src").mkdir(exist_ok=True)
(project_root / "src" / "api").mkdir(exist_ok=True)
(project_root / ".git").mkdir(exist_ok=True)
(project_root / "README.md").touch()
(project_root / "src" / "main.py").touch()
(project_root / "src" / "api" / "routes.py").touch()
# --- 2. Initialize the compressor with filtering rules ---
compressor = FileStructureCompressor(
    root_dir=project_root,
    exclude_dirs=[".git", ".idea", "node_modules"],
)
# --- 3. Generate the ASCII tree ---
ascii_tree = compressor.generate_ascii_tree()
print("--- ASCII Tree Generated from Directory ---")
print(ascii_tree)
Method 2: From a List of File Paths
If you already have a list of files (e.g., from a version control system or a build tool), you can use the .from_paths() class method to avoid re-scanning the filesystem.
from file_structure_compressor import FileStructureCompressor
# --- 1. Assume you have a list of file paths from another command ---
file_paths = [
    "/app/src/main.py",
    "/app/src/utils/parser.py",
    "/app/config.json",
    "/app/README.md",
    "/app/src/api/v1/endpoint.py",
    "/app/tests/test_main.py",
]
# --- 2. Initialize the compressor using the .from_paths() class method ---
# The tool will automatically infer the common root path `/app`
compressor_from_list = FileStructureCompressor.from_paths(file_paths)
# --- 3. Generate your desired format ---
ascii_tree_from_list = compressor_from_list.generate_ascii_tree()
print("--- ASCII Tree Generated from List ---")
print(ascii_tree_from_list)
# You can generate other formats as well
# compact_format = compressor_from_list.generate_custom_format()
# print("\n--- Custom Compact Format from List ---")
# print(compact_format)
Expected Output
--- ASCII Tree Generated from Directory ---
my_temp_project/
├── README.md
└── src/
    ├── main.py
    └── api/
        └── routes.py
--- ASCII Tree Generated from List ---
app/
├── README.md
├── config.json
├── src/
│   ├── main.py
│   ├── utils/
│   │   └── parser.py
│   └── api/
│       └── v1/
│           └── endpoint.py
└── tests/
    └── test_main.py
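Method 2 relies on inferring a common root (`/app`) from the path list. How the package does this internally is not documented here; one plausible way, sketched below under that assumption, is the standard library's `commonpath`, which compares whole path components. The character-based `commonprefix` is shown for contrast because it can cut a directory name in half.

```python
import posixpath

file_paths = [
    "/app/src/main.py",
    "/app/src/utils/parser.py",
    "/app/config.json",
    "/app/README.md",
    "/app/src/api/v1/endpoint.py",
    "/app/tests/test_main.py",
]

# commonpath works on whole components, so the result is always a real directory.
common_root = posixpath.commonpath(file_paths)
relative_paths = [posixpath.relpath(p, common_root) for p in file_paths]

# commonprefix works on characters and may return a partial name.
siblings = ["/app/src/a.py", "/app/srv/b.py"]
by_component = posixpath.commonpath(siblings)
by_character = posixpath.commonprefix(siblings)

print(common_root)
print(relative_paths)
print(by_component, by_character)
```

`posixpath` is used instead of `os.path` so that forward-slash paths, such as those printed by git ls-files, are handled the same way on every platform.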
Format Comparison
Choose the format that best fits your use case.
| Format | Token Efficiency | LLM Readability | Best For |
|---|---|---|---|
| ASCII Tree | High | Excellent | Most use cases; provides clear structure that LLMs understand well. |
| JSON Tree | Medium | Good | Programmatic use or when the LLM task involves JSON manipulation. |
| Custom | Very High | Low (Requires prompt explanation) | Extreme cases of context window limitation where every token matters. |
To use the Custom format effectively, you should instruct the LLM on how to read it, for example:
"The following string represents a file structure where directories are followed by parentheses containing their contents: root(file1,subdir(file2))."
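The parenthesized notation in that prompt can be produced with a short recursive function. The sketch below is an assumption about how such a format could be generated, not the package's own generate_custom_format() implementation; `build_tree` and `to_compact` are hypothetical names.

```python
def build_tree(paths):
    """Nest relative paths into dicts: directories map to dicts, files to None."""
    tree = {}
    for path in paths:
        parts = path.split("/")
        node = tree
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = None
    return tree


def to_compact(name, node):
    """Serialize a node as name(child,child,...); files are written as bare names."""
    if node is None:
        return name
    inner = ",".join(to_compact(child, node[child]) for child in node)
    return f"{name}({inner})"


compact = to_compact("root", build_tree(["file1", "subdir/file2"]))
print(compact)  # root(file1,subdir(file2))
```

Each directory name appears exactly once and there is no indentation or connector overhead, which is where the extra density comes from. The notation becomes ambiguous if file names themselves contain commas or parentheses, so a real implementation would need an escaping rule.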
Command-Line Interface (CLI)
For quick use in your terminal:
# Generate an ASCII tree, excluding common directories, up to a depth of 3
file-structure-compressor . --format ascii --exclude .git,node_modules,build --depth 3
# Generate a compact representation and copy it to the clipboard
# (pbcopy is macOS-only; substitute your platform's clipboard tool elsewhere)
file-structure-compressor /path/to/your/project --format compact | pbcopy
Contributing
Contributions are welcome! If you have ideas for new features, optimizations, or formats, please open an issue or submit a pull request.
- Fork the repository.
- Create a new branch (git checkout -b feature/your-feature).
- Commit your changes (git commit -am 'Add some feature').
- Push to the branch (git push origin feature/your-feature).
- Create a new Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.