Concatenate text-like files in a directory tree with a Typer-powered CLI.
Project Combiner (combine-files)
project-combiner is a powerful and flexible command-line tool for concatenating text-based files within a directory tree. It's designed to be intuitive, fast, and highly configurable, making it easy to bundle source code, documentation, or any text-like files for analysis, distribution, or large language model contexts.
Highlights
- Intuitive CLI: Powered by Typer, providing a rich --help experience and shell completion.
- Cross-Platform: Uses pathlib.Path for seamless operation on Windows, macOS, and Linux.
- Highly Configurable: Control everything with command-line flags; no hard-coding required. Specify what to include, what to skip, file encodings, output location, and more.
- .gitignore Aware: Automatically respects your project's .gitignore rules (requires pathspec).
- Smart File Handling: Skips binary files based on MIME types to prevent garbage output and, by default, any directory whose name starts with a dot (override with --include-dot-dirs).
- Performance-Oriented: Features optional multithreaded file reading and a tqdm progress bar for large projects.
- Flexible Output: Stream combined content to standard output (stdout) or save it directly to a file.
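The MIME-based binary check mentioned above can be illustrated with the standard library. This is only a minimal sketch of the general technique, not project-combiner's actual code: the function name `looks_like_text`, the `TEXT_LIKE_TYPES` allow-list, and the choice to treat unknown extensions as text are all assumptions made for illustration.

```python
import mimetypes
from pathlib import Path

# Hypothetical allow-list: MIME types outside text/* that are still plain text.
TEXT_LIKE_TYPES = {"application/json", "application/xml"}


def looks_like_text(path: Path) -> bool:
    """Guess from the file name alone whether a file is text-like."""
    mime, encoding = mimetypes.guess_type(path.name)
    if encoding is not None:
        # Compressed wrappers such as .gz hold binary data on disk.
        return False
    if mime is None:
        # Unknown extension: assumed to be text in this sketch.
        return True
    return mime.startswith("text/") or mime in TEXT_LIKE_TYPES


for name in ("notes.txt", "logo.png", "notes.txt.gz"):
    print(name, looks_like_text(Path(name)))
```

A name-based guess is cheap but imperfect; a stricter check could also inspect the first bytes of each file.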
Installation
You can install project-combiner directly from PyPI.
Full Feature Set
For all features, including .gitignore support and a progress bar, install with the [all] extra:
pip install "project-combiner[all]"
This installs typer, pathspec, and tqdm.
Minimal Installation
For the core functionality without optional dependencies:
pip install project-combiner
Usage
The basic command is combine-files, followed by one or more directories to process and any desired options.
combine-files [ROOT_DIRS]... [OPTIONS]
Command-Line Options
| Option | Alias | Description | Default |
|---|---|---|---|
| `--output-file` | `-o` | Path to the output file. Use `-` for stdout. | `-` (stdout) |
| `--skip-dirs` | | Space-separated list of directory names to skip. | `.git .hg __pycache__` |
| `--skip-files` | | Space-separated list of file names to skip. | |
| `--skip-exts` | | Space-separated list of file extensions to skip. | |
| `--preview-exts` | | Space-separated list of extensions to preview instead of including their full content. | |
| `--encoding` | | The encoding to use for reading files. | `utf-8` |
| `--jobs` | `-j` | Number of parallel threads for reading files. | `2` |
| `--progress` | | Show a progress bar during file processing (requires tqdm). | |
| `--follow-symlinks` | | Follow symbolic links. | `False` |
| `--skip-dot-dirs` / `--include-dot-dirs` | | Skip directories that start with `.` (dot). Use the second form to include them. | `--skip-dot-dirs` |
| `--log-level` | | Set the logging level (e.g., DEBUG, INFO). | `WARNING` |
| `--version` | | Show the version and exit. | |
| `--help` | | Show the help message and exit. | |
Example Scenario
Let's walk through how to use project-combiner with a typical project structure.
Sample Project Structure
Imagine you have a project with the following layout:
my_project/
├── .gitignore
├── src/
│   ├── main.py
│   ├── utils.py
│   └── data/
│       ├── data.csv
│       └── notes.txt
├── tests/
│   ├── test_main.py
│   └── test_utils.py
├── docs/
│   ├── guide.md
│   └── reference.md
├── .venv/
│   └── ... (virtual environment files)
└── README.md
Your .gitignore file might look like this:
# .gitignore
.venv/
__pycache__/
*.log
Use Cases
1. Combine All Relevant Files
To combine all text-based files in the project while respecting the .gitignore file, simply run:
combine-files my_project
- What it does: It will walk through my_project, skip the .venv directory (as specified in .gitignore), and concatenate the contents of all other text files (.py, .csv, .txt, .md).
- Output: The combined content is printed to the terminal (stdout).
2. Save the Combined Output to a File
To save the output into a single file named combined_output.txt:
combine-files my_project -o combined_output.txt
- What it does: Same as the first example, but the result is written to combined_output.txt instead of the console.
3. Exclude the tests Directory
If you want to combine only the application source code and documentation, excluding the tests:
combine-files my_project --skip-dirs tests
- What it does: This command will skip the tests/ directory in addition to the patterns in .gitignore. The output will contain files from src/ and docs/.
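The directory skipping described in use cases 1 and 3 can be sketched with `os.walk`. This is an illustration under stated assumptions rather than the tool's real implementation: `iter_files` and its parameters are hypothetical names, the default skip list is copied from the options table, and .gitignore matching (which the tool delegates to pathspec) is deliberately left out.

```python
import os
from pathlib import Path


def iter_files(root, skip_dirs=(".git", ".hg", "__pycache__"), skip_dot_dirs=True):
    """Yield files under root in sorted order, pruning unwanted directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Assigning to dirnames[:] in place stops os.walk from descending
        # into the directories that were filtered out.
        dirnames[:] = sorted(
            d for d in dirnames
            if d not in skip_dirs and not (skip_dot_dirs and d.startswith("."))
        )
        for name in sorted(filenames):
            yield Path(dirpath) / name
```

Calling `iter_files(Path("my_project"), skip_dirs=("tests",))` on the sample layout would then leave out both tests/ and the dot-prefixed .venv/ directory.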
4. Combine Only Python Source Files
To isolate just the Python source code from the src directory:
combine-files my_project/src --skip-exts .csv .txt .md
Or, more simply, if you only want to process the src folder:
combine-files my_project/src
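Note that the second command still includes the text files under data/, such as data.csv and notes.txt. Non-Python files are left out only if they are binary or if you explicitly skip their extensions, as in the first command.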
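To see why the extension list matters, here is a small sketch of suffix-based filtering applied to the files under src/. The helper `keep_file` is a hypothetical name, and matching extensions case-insensitively is an assumption; the source does not say how project-combiner compares them.

```python
from pathlib import Path


def keep_file(path: Path, skip_exts=frozenset()) -> bool:
    """Return False when the file's extension is listed in skip_exts."""
    # Case-insensitive comparison is an assumption made for this sketch.
    return path.suffix.lower() not in {ext.lower() for ext in skip_exts}


files = [Path("main.py"), Path("utils.py"), Path("data/data.csv"), Path("data/notes.txt")]
kept = [p.as_posix() for p in files if keep_file(p, {".csv", ".txt", ".md"})]
print(kept)  # ['main.py', 'utils.py']
```

With an empty skip list every file in the example passes the filter, which mirrors the behaviour of the second command.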
5. Preview Large Data or Markdown Files
Sometimes you don't want the full content of large data files or verbose documentation. You can "preview" them instead.
combine-files . --preview-exts .md .csv -j 4 --progress
- What it does:
  - It processes the entire project (.).
  - For any file ending in .md or .csv, it will only include a header indicating the file's path and a "preview" message, rather than its full content.
  - It uses 4 threads (-j 4) for faster reading and shows a progress bar (--progress).
The output for a previewed file like docs/guide.md would look like this:
---
File: docs/guide.md (preview)
---
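The preview stand-in shown above is easy to reproduce in a few lines. The sketch below only mirrors the three-line format from this example; `is_previewed` and `preview_block` are hypothetical helper names, and the header used for fully included files is not documented here, so it is not modelled.

```python
from pathlib import PurePosixPath


def is_previewed(rel_path: str, preview_exts) -> bool:
    """True when the file's extension is in the preview list."""
    return PurePosixPath(rel_path).suffix in set(preview_exts)


def preview_block(rel_path: str) -> str:
    """Render the three-line stand-in shown above for a previewed file."""
    return f"---\nFile: {rel_path} (preview)\n---"


if is_previewed("docs/guide.md", [".md", ".csv"]):
    print(preview_block("docs/guide.md"))
```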
Advanced Usage
Working with Encodings
If your project uses a different file encoding, you can specify it with the --encoding flag. For example, for projects using legacy Windows encodings:
combine-files . --encoding cp1252
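The short example below shows why the encoding matters, using only Python's built-in codecs. It is independent of project-combiner itself: the same bytes that decode cleanly as cp1252 are rejected by the default utf-8 decoder.

```python
# "é" is the single byte 0xE9 in cp1252, which is not valid UTF-8 on its own.
raw = "café".encode("cp1252")

try:
    raw.decode("utf-8")
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False

text = raw.decode("cp1252")
print(decoded_as_utf8, text)  # False café
```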
Performance
For very large projects with thousands of files, you can speed up the process by increasing the number of threads. A good starting point is the number of cores on your CPU.
# Use 8 threads to read files
combine-files . -j 8 --progress
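As a rough picture of what multithreaded reading involves, the sketch below reads a list of files with a thread pool from the standard library. It is an illustration under assumptions, not the tool's implementation: `read_all` is a hypothetical function, and falling back to `os.cpu_count()` when no job count is given simply follows the rule of thumb above (the CLI itself defaults to 2).

```python
import os
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_all(paths, jobs=None, encoding="utf-8"):
    """Read files concurrently; results come back in input order."""
    workers = jobs or os.cpu_count() or 1

    def read_one(path):
        return Path(path).read_text(encoding=encoding)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of its input, so the combined
        # output stays deterministic regardless of which read finishes first.
        return list(pool.map(read_one, paths))
```

Threads help here because file reads are I/O-bound; past a certain point the disk rather than the thread count becomes the limit.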
Contributing
Contributions are welcome! If you have ideas for new features, bug fixes, or improvements, feel free to open an issue or submit a pull request on the project's repository.
Project Links
- Source & Issue Tracker: https://github.com/muhammad-luay/context-tools
- PyPI: https://pypi.org/project/project-combiner/
License
This project is licensed under the MIT License. See the LICENSE file for details.