Concatenate text-like files in a directory tree with a Typer-powered CLI.

Project Combiner (combine-files)

project-combiner is a powerful and flexible command-line tool for concatenating text-based files within a directory tree. It's designed to be intuitive, fast, and highly configurable, making it easy to bundle source code, documentation, or any text-like files for analysis, distribution, or large language model contexts.

Highlights

  • Intuitive CLI: Powered by Typer, providing a rich --help experience and shell completion.
  • Cross-Platform: Uses pathlib.Path for seamless operation on Windows, macOS, and Linux.
  • Highly Configurable: Control everything with command-line flags—no hard-coding required. Specify what to include, what to skip, file encodings, output location, and more.
  • .gitignore Aware: Automatically respects your project's .gitignore rules (requires pathspec).
  • Smart File Handling: Skips binary files (detected by MIME type) to prevent garbage output. By default it also skips any directory whose name starts with a dot (override with --include-dot-dirs).
  • Performance-Oriented: Features optional multithreaded file reading and a tqdm progress bar for large projects.
  • Flexible Output: Stream combined content to standard output (stdout) or save it directly to a file.
  • Clipboard‑Ready: Use -c/--clipboard to copy the combined output straight to your system clipboard (requires pyperclip).
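
To illustrate the MIME-based binary check mentioned under Smart File Handling, here is a minimal sketch using the standard library's mimetypes module. The helper name looks_like_text and the set of text-like application types are assumptions made for illustration; the tool's actual detection logic may differ.

```python
import mimetypes
from pathlib import Path

# Hypothetical allow-list: structured text formats that are not under "text/".
TEXT_LIKE_APPLICATION_TYPES = {"application/json", "application/xml"}


def looks_like_text(path: Path) -> bool:
    """Guess from the file name alone whether a file is text-like.

    Illustrative only: not the tool's real implementation.
    """
    mime, _encoding = mimetypes.guess_type(path.name)
    if mime is None:
        # Unknown extension: this sketch assumes text. A stricter check
        # could inspect the first bytes of the file instead.
        return True
    return mime.startswith("text/") or mime in TEXT_LIKE_APPLICATION_TYPES
```

A name-based guess is cheap because it never opens the file, but it can misjudge files with missing or misleading extensions.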

Installation

You can install project-combiner directly from PyPI.

Full Feature Set

For all features, including .gitignore support and a progress bar, install with the [all] extra:

pip install "project-combiner[all]"

This installs typer, pathspec, and tqdm.

Minimal Installation

For the core functionality without optional dependencies:

pip install project-combiner

ℹ️ Add clipboard support later with:

pip install pyperclip

Usage

The basic command is combine-files, followed by one or more root directories to process and any desired options.

combine-files [ROOT_DIRS]... [OPTIONS]

Command-Line Options

Option (alias) | Description | Default
--- | --- | ---
--output-file, -o | Path to the output file. Use - for stdout. | - (stdout)
--clipboard, -c | Copy the combined output to the system clipboard (requires pyperclip). |
--skip-dirs | Space-separated list of directory names to skip. | .git .hg __pycache__
--skip-files | Space-separated list of file names to skip. |
--skip-exts | Space-separated list of file extensions to skip. |
--preview-exts | Space-separated list of extensions to preview instead of including their full content. |
--encoding | The encoding to use for reading files. | utf-8
--jobs, -j | Number of parallel threads for reading files. | 2
--progress | Show a progress bar during file processing (requires tqdm). |
--follow-symlinks | Follow symbolic links. | False
--skip-dot-dirs / --include-dot-dirs | Skip directories whose names start with . (dot). Use the second form to include them. | --skip-dot-dirs
--log-level | Set the logging level (e.g., DEBUG, INFO). | WARNING
--version | Show the version and exit. |
--help | Show the help message and exit. |
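
The directory-related options (--skip-dirs, --skip-dot-dirs / --include-dot-dirs, --follow-symlinks) suggest a pruned directory walk. The sketch below shows one way such a walk could be written with os.walk. The function iter_files and its parameter names are hypothetical and only mirror the option names; this is not the tool's actual code.

```python
import os
from pathlib import Path
from typing import Iterable, Iterator

# Defaults taken from the --skip-dirs row of the options table.
DEFAULT_SKIP_DIRS = frozenset({".git", ".hg", "__pycache__"})


def iter_files(
    root: Path,
    skip_dirs: Iterable[str] = DEFAULT_SKIP_DIRS,
    skip_dot_dirs: bool = True,
    follow_symlinks: bool = False,
) -> Iterator[Path]:
    """Yield files under root, pruning skipped directories (illustrative)."""
    skip = set(skip_dirs)
    for dirpath, dirnames, filenames in os.walk(root, followlinks=follow_symlinks):
        # Assigning to dirnames[:] prunes in place, so os.walk never
        # descends into the directories that were filtered out.
        dirnames[:] = sorted(
            d for d in dirnames
            if d not in skip and not (skip_dot_dirs and d.startswith("."))
        )
        for name in sorted(filenames):
            yield Path(dirpath) / name
```

Pruning during the walk, rather than filtering afterwards, avoids ever listing the contents of large directories such as .venv.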

Example Scenario

Let's walk through how to use project-combiner with a typical project structure.

Sample Project Structure

Imagine you have a project with the following layout:

my_project/
├── .gitignore
├── src/
│   ├── main.py
│   ├── utils.py
│   └── data/
│       ├── data.csv
│       └── notes.txt
├── tests/
│   ├── test_main.py
│   └── test_utils.py
├── docs/
│   ├── guide.md
│   └── reference.md
├── .venv/
│   └── ... (virtual environment files)
└── README.md

Your .gitignore file might look like this:

# .gitignore
.venv/
__pycache__/
*.log

Use Cases

1. Combine All Relevant Files

To combine all text-based files in the project while respecting the .gitignore file, simply run:

combine-files my_project
  • What it does: It walks through my_project, skips the .venv directory (listed in .gitignore, and also skipped by default because its name starts with a dot), and concatenates the contents of all other text files (.py, .csv, .txt, .md).
  • Output: The combined content is printed to the terminal (stdout).
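
To make the .gitignore filtering in this use case concrete, here is a deliberately simplified stand-in written with the standard library's fnmatch. It is not pathspec and does not implement full .gitignore semantics (no negation, no anchored patterns, no **). The function names parse_ignore_lines and is_ignored are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import List


def parse_ignore_lines(text: str) -> List[str]:
    """Return non-empty, non-comment lines from .gitignore-style text."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def is_ignored(rel_path: str, patterns: List[str]) -> bool:
    """Simplified matcher: NOT a full implementation of .gitignore rules."""
    parts = PurePosixPath(rel_path).parts
    if not parts:
        return False
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything below a matching directory.
            name = pattern.rstrip("/")
            if any(fnmatchcase(part, name) for part in parts[:-1]):
                return True
        elif fnmatchcase(parts[-1], pattern):
            # File pattern: compare against the file name only.
            return True
    return False
```

With the sample .gitignore above, this sketch ignores .venv/lib/site.py and build.log but keeps src/main.py.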

2. Save the Combined Output to a File

To save the output into a single file named combined_output.txt:

combine-files my_project -o combined_output.txt
  • What it does: Same as the first example, but the result is written to combined_output.txt instead of the console.
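
The convention that - means stdout while any other value is a file path can be sketched with a small context manager. The name open_output is hypothetical and the tool may handle output differently; the point is that stdout must not be closed when the block exits.

```python
import sys
from contextlib import contextmanager
from typing import Iterator, TextIO


@contextmanager
def open_output(target: str, encoding: str = "utf-8") -> Iterator[TextIO]:
    """Yield a writable text stream: stdout for "-", otherwise a file."""
    if target == "-":
        # Hand out stdout as-is; closing it would break later printing.
        yield sys.stdout
    else:
        with open(target, "w", encoding=encoding) as handle:
            yield handle
```

A caller would write `with open_output("combined_output.txt") as out: out.write(text)` and get the same behaviour for "-" without special-casing it.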

3. Exclude the tests Directory

If you want to combine only the application source code and documentation, excluding the tests:

combine-files my_project --skip-dirs tests
  • What it does: Skips the tests/ directory in addition to the patterns in .gitignore. The output contains the files from src/ and docs/, plus the top-level README.md.

4. Combine Only Python Source Files

To isolate just the Python source code from the src directory:

combine-files my_project/src --skip-exts .csv .txt .md

Or, more simply, process only the src folder:

combine-files my_project/src

Note that this simpler form still includes the text files under src/data/ (data.csv and notes.txt). Non-Python files are left out only if they are binary or if you skip their extensions explicitly with --skip-exts.
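
The effect of --skip-exts and --skip-files on the sample project can be sketched as a simple predicate. The helper keep_file is hypothetical, and comparing extensions case-insensitively is an assumption of this sketch, not documented behaviour of the tool.

```python
from pathlib import Path
from typing import Set


def keep_file(path: Path, skip_exts: Set[str], skip_files: Set[str]) -> bool:
    """Return True if the file passes the name and extension filters.

    Assumption: extensions are compared case-insensitively (".CSV" == ".csv").
    """
    if path.name in skip_files:
        return False
    return path.suffix.lower() not in {ext.lower() for ext in skip_exts}


sample = [
    Path("src/main.py"),
    Path("src/utils.py"),
    Path("src/data/data.csv"),
    Path("src/data/notes.txt"),
]
# Mirrors: combine-files my_project/src --skip-exts .csv .txt .md
kept = [p for p in sample if keep_file(p, {".csv", ".txt", ".md"}, set())]
```

Here kept contains only main.py and utils.py, which is why the first command isolates the Python sources while the plain `combine-files my_project/src` does not.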

5. Preview Large Data or Markdown Files

Sometimes you don't want the full content of large data files or verbose documentation. You can "preview" them instead.

combine-files . --preview-exts .md .csv -j 4 --progress
  • What it does:
    • It processes the entire project (.).
    • For any file ending in .md or .csv, it will only include a header indicating the file's path and a "preview" message, rather than its full content.
    • It uses 4 threads (-j 4) for faster reading and shows a progress bar (--progress).

The output for a previewed file like docs/guide.md would look like this:

---
File: docs/guide.md (preview)
---
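
The preview behaviour can be sketched as a rendering function that emits only the header for previewed extensions. The preview header below follows the sample output shown above; the header used for fully included files is an assumption, and render_entry is a hypothetical name.

```python
from pathlib import PurePosixPath
from typing import Set


def render_entry(rel_path: str, content: str, preview_exts: Set[str]) -> str:
    """Render one file as a header plus content, or header only for previews."""
    if PurePosixPath(rel_path).suffix in preview_exts:
        # Previewed file: header only, content omitted.
        return f"---\nFile: {rel_path} (preview)\n---\n"
    # Assumed format for full entries: same header without the preview marker.
    return f"---\nFile: {rel_path}\n---\n{content}\n"
```

For example, `render_entry("docs/guide.md", "# Guide", {".md", ".csv"})` produces the three-line preview header, while a .py file is rendered with its full content.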

6. Copy Output Directly to Clipboard

combine-files . -c

  • What it does: Copies the combined content to the system clipboard, ready to paste, without writing a file or printing to the terminal (requires pyperclip).


Advanced Usage

Working with Encodings

If your project uses a different file encoding, you can specify it with the --encoding flag. For example, for projects using legacy Windows encodings:

combine-files . --encoding cp1252
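
The following sketch shows why the encoding matters: bytes written as cp1252 generally fail to decode as UTF-8. The helper read_text_file is hypothetical, and returning None on a decode error is an assumption of this sketch; the tool may instead raise, log, or skip the file.

```python
from pathlib import Path
from typing import Optional


def read_text_file(path: Path, encoding: str = "utf-8") -> Optional[str]:
    """Read a file with the given encoding; return None if decoding fails.

    Returning None is this sketch's choice, not documented tool behaviour.
    """
    try:
        return path.read_text(encoding=encoding)
    except UnicodeDecodeError:
        return None
```

A file containing "café" saved as cp1252 stores the last character as the single byte 0xE9, which is not valid UTF-8 on its own, so the default encoding fails while cp1252 succeeds.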

Performance

For very large projects with thousands of files, you can speed up the process by increasing the number of threads. A good starting point is the number of cores on your CPU.

# Use 8 threads to read files
combine-files . -j 8 --progress
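
Threaded reading of this kind is commonly built on concurrent.futures. The sketch below is one plausible shape, not the tool's implementation; read_all and its parameters are hypothetical names that mirror the --jobs and --encoding options.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List


def read_all(paths: List[Path], jobs: int = 2, encoding: str = "utf-8") -> List[str]:
    """Read files concurrently and return their contents in input order."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # Executor.map preserves input order, so the combined output stays
        # deterministic regardless of which thread finishes first.
        return list(pool.map(lambda p: p.read_text(encoding=encoding), paths))


# One way to follow the "number of cores" advice; cpu_count() may return None.
suggested_jobs = os.cpu_count() or 2
```

Threads help here because file reads are I/O-bound, so the interpreter can overlap waiting on the disk across several files.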

Contributing

Contributions are welcome! If you have ideas for new features, bug fixes, or improvements, feel free to open an issue or submit a pull request on the project's repository.

License

This project is licensed under the MIT License. See the LICENSE file for details.
