A tool to concatenate folders into a single text file, respecting .gitignore and using optional config.
CodeConcat is a command-line tool to concatenate files within a directory into a single text file. It intelligently filters files based on common ignore patterns (like .git, node_modules), file extensions, and optional user-defined rules, making it ideal for preparing codebases for analysis or large language model (LLM) context stuffing.
Key Features
- Smart Filtering: Automatically excludes common unnecessary files and directories (e.g., .git, __pycache__, node_modules, hidden files) and prioritizes known text/code file extensions.
- Flexible Control: Use --exclude and --whitelist with simple glob patterns (like *.py or docs/*, not complex regex) to fine-tune which files are included or excluded.
- Configuration File: Define project-specific defaults in a .codeconcat_config.json file in your project root.
- Clear Output: Prepends each file's content with its relative path (--- File: path/to/file.py ---).
- Standard Output: Easily pipe the output to other commands or redirect it to a file (codeconcat . > output.txt).
- Modern Tooling: Built with modern Python practices, using pyproject.toml, ruff for linting/formatting, and mypy for type checking.
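The "Clear Output" format can be pictured with a minimal sketch. The helper name `concat_files` is hypothetical and is not part of codeconcat; only the header text `--- File: path ---` comes from this README, and the real tool's spacing between files may differ.

```python
from pathlib import Path
from typing import Iterable


def concat_files(root: Path, files: Iterable[Path]) -> str:
    """Join files into one string, each preceded by a relative-path header.

    Hypothetical helper for illustration; not codeconcat's actual code.
    """
    parts = []
    for path in sorted(files):
        # Use forward slashes so headers look the same on every platform.
        rel = path.relative_to(root).as_posix()
        # Header format taken from the README: --- File: path/to/file.py ---
        parts.append(f"--- File: {rel} ---")
        # Undecodable bytes are replaced rather than aborting the whole run.
        parts.append(path.read_text(encoding="utf-8", errors="replace"))
    return "\n".join(parts)
```

Returning a string (rather than writing to a file directly) keeps the sketch easy to print to stdout or redirect, which mirrors the "Standard Output" feature.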
Installation
Ensure you have Python 3.8+ installed.
pip install codeconcat
System Dependency: codeconcat uses python-magic for advanced file type detection, which relies on the libmagic library. You might need to install it separately:
- Debian/Ubuntu: sudo apt-get update && sudo apt-get install -y libmagic1
- macOS (Homebrew): brew install libmagic
- Windows: Installation can be more complex. Consider using WSL or consult the python-magic documentation.
If libmagic is not found, codeconcat will still work but rely solely on file extensions for filtering, which is often sufficient.
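The fallback described above could look roughly like the following sketch. It assumes the python-magic function `magic.from_file(path, mime=True)`; the names `looks_like_text`, `TEXT_EXTENSIONS`, and `BINARY_MIME_PREFIXES`, and the specific lists they hold, are hypothetical and not taken from codeconcat.

```python
from pathlib import Path

try:
    import magic  # python-magic; importing fails if libmagic is missing
except ImportError:
    magic = None

# Illustrative subsets only; codeconcat's real lists are not shown in this README.
TEXT_EXTENSIONS = {".py", ".md", ".txt", ".json", ".yaml", ".toml"}
BINARY_MIME_PREFIXES = ("image/", "audio/", "video/", "application/octet-stream")


def looks_like_text(path: Path) -> bool:
    """Extension check first, then an optional MIME check via libmagic."""
    if path.suffix.lower() not in TEXT_EXTENSIONS:
        return False
    if magic is None:
        # libmagic unavailable: rely on the file extension alone.
        return True
    try:
        mime = magic.from_file(str(path), mime=True)
    except Exception:
        # Any detection failure falls back to the extension result.
        return True
    return not mime.startswith(BINARY_MIME_PREFIXES)
```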
How to Use
Basic Command Structure
codeconcat <source_path> [output_file] [-e PATTERN] [-w PATTERN] [-v]
Parameters
- <source_path>: (Required) Path to the directory to process.
- [output_file]: (Optional) Path to save the concatenated output. If omitted, output is sent to standard output (stdout).
- -e PATTERN, --exclude PATTERN: (Optional) Add a glob pattern to exclude files/directories. Can be used multiple times (e.g., -e '*.log' -e 'temp/'). CLI excludes are added to defaults and config file excludes.
- -w PATTERN, --whitelist PATTERN: (Optional) Add a glob pattern to only include matching files/directories (after excludes are processed). If omitted, common text/code files are included by default. If used, only files matching these patterns (and not excluded) will be included. Can be used multiple times (e.g., -w '*.py' -w 'src/*'). CLI whitelists override config file whitelists.
- -v, --verbose: (Optional) Enable detailed logging output.
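For readers who want to see how such an interface maps onto code, here is a minimal sketch using the standard library's `argparse`. Whether codeconcat itself uses `argparse` is an assumption; `build_parser` is a hypothetical name, and only the argument names and flags come from the parameter list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented command structure."""
    parser = argparse.ArgumentParser(prog="codeconcat")
    parser.add_argument("source_path", help="directory to process")
    # nargs="?" makes the positional optional; None means "write to stdout".
    parser.add_argument("output_file", nargs="?", default=None,
                        help="file to write; stdout if omitted")
    # action="append" lets each flag be repeated, collecting patterns in a list.
    parser.add_argument("-e", "--exclude", action="append", default=[],
                        metavar="PATTERN", help="glob pattern to exclude")
    parser.add_argument("-w", "--whitelist", action="append", default=[],
                        metavar="PATTERN", help="glob pattern to include")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="enable detailed logging")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["./my-cool-project", "output.txt",
                                      "-e", "*.log", "-e", "dist/*"])
    print(args.exclude)
```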
Examples
Concatenate current directory to stdout:
codeconcat .
Concatenate a specific repo to a file:
codeconcat ./my-cool-project concatenated_code.txt
Concatenate to a file, excluding log files and the dist directory:
codeconcat ./my-cool-project output.txt -e "*.log" -e "dist/*"
Concatenate only Python and Markdown files:
codeconcat ./my-cool-project output.txt -w "*.py" -w "*.md"
Pipe output to less:
codeconcat . | less
Configuration File (.codeconcat_config.json)
You can place a .codeconcat_config.json file in the root of your <source_path> directory to define default patterns.
Example .codeconcat_config.json:
{
"exclude": [
"*.tmp",
"**/test_data/*",
".cache/"
],
"whitelist": [
"src/**/*.py",
"config/*.yaml",
"*.md"
]
}
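Reading such a file takes only a few lines of Python. The sketch below is an illustration, not codeconcat's actual loader: `load_config` and `CONFIG_NAME` are hypothetical names, and treating a missing file or missing key as an empty list is an assumption.

```python
import json
from pathlib import Path
from typing import Dict, List

CONFIG_NAME = ".codeconcat_config.json"


def load_config(source_path: Path) -> Dict[str, List[str]]:
    """Return the exclude/whitelist patterns found in the config file, if any."""
    config_file = source_path / CONFIG_NAME
    if not config_file.is_file():
        # No config file: nothing to add to the built-in defaults.
        return {"exclude": [], "whitelist": []}
    data = json.loads(config_file.read_text(encoding="utf-8"))
    return {
        "exclude": list(data.get("exclude", [])),
        "whitelist": list(data.get("whitelist", [])),
    }
```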
Precedence Rules:
- Default Excludes: Applied first (e.g., .git, node_modules).
- Config File Excludes: Added to the default excludes.
- CLI --exclude: Added to the combined default and config excludes.
- Config File Whitelist: If present, files must match these patterns after passing exclude checks.
- CLI --whitelist: If present, overrides the config file whitelist. Files must match these patterns after passing exclude checks.
- Default Whitelist (Extensions): If no CLI or config whitelist is active, common text/code file extensions are used as an implicit whitelist.
- MIME Type Check: As a final check (if libmagic is available), files identified as likely binary are excluded.
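The precedence rules above can be sketched as a single decision function. This is an illustration under stated assumptions: `is_included`, `matches_any`, and the two default lists are hypothetical, matching is done with the standard library's `fnmatch` (where `*` also matches `/`), and the final MIME check is left out. The real tool's matching semantics may differ.

```python
from fnmatch import fnmatch
from typing import Sequence

# Illustrative subsets; the real defaults are not listed in this README.
DEFAULT_EXCLUDES = [".git", "node_modules", "__pycache__"]
DEFAULT_EXTENSIONS = (".py", ".md", ".txt")


def matches_any(rel_path: str, patterns: Sequence[str]) -> bool:
    """True if the whole path or any single path component matches a pattern."""
    parts = rel_path.split("/")
    for pattern in patterns:
        pat = pattern.rstrip("/")  # treat "temp/" like "temp"
        if fnmatch(rel_path, pat) or any(fnmatch(part, pat) for part in parts):
            return True
    return False


def is_included(rel_path: str,
                config_exclude: Sequence[str] = (),
                cli_exclude: Sequence[str] = (),
                config_whitelist: Sequence[str] = (),
                cli_whitelist: Sequence[str] = ()) -> bool:
    # Excludes are additive: defaults, then config file, then CLI.
    excludes = DEFAULT_EXCLUDES + list(config_exclude) + list(cli_exclude)
    if matches_any(rel_path, excludes):
        return False
    # A CLI whitelist replaces the config whitelist instead of extending it.
    whitelist = list(cli_whitelist) or list(config_whitelist)
    if whitelist:
        return matches_any(rel_path, whitelist)
    # No explicit whitelist: fall back to known text/code extensions.
    return rel_path.endswith(DEFAULT_EXTENSIONS)
```

Note the asymmetry the rules describe: exclude patterns accumulate across sources, while a whitelist given on the command line overrides the one in the config file.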
Contributing
Contributions are welcome!
- Set up:

      git clone https://github.com/lguibr/codeconcat.git
      cd codeconcat
      python -m venv .venv
      source .venv/bin/activate  # or .venv\Scripts\activate on Windows
      pip install -r requirements-dev.txt  # Installs codeconcat in editable mode + dev tools
      pre-commit install  # Install pre-commit hooks

- Make your changes.
- Run checks: pre-commit run --all-files (includes ruff format/lint, mypy)
- (Optional but Recommended) Add tests using pytest.
- Submit a Pull Request.
License
CodeConcat is distributed under the MIT license. See the LICENSE file for more details.
File details
Details for the file codeconcat-2.2.2.tar.gz.
File metadata
- Download URL: codeconcat-2.2.2.tar.gz
- Upload date:
- Size: 18.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.11.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b85decb787281f27d1e22a03a6e10ad4f82856245567d5ee8c5b40a76317b99a |
| MD5 | 015a126c6869f1a45c8d10865e15c68a |
| BLAKE2b-256 | 49b25db7f030eec29f94cf7fd45af17c3a66709d88d3864360d796fc3157aeae |
File details
Details for the file codeconcat-2.2.2-py3-none-any.whl.
File metadata
- Download URL: codeconcat-2.2.2-py3-none-any.whl
- Upload date:
- Size: 16.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.11.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8be98d7267a4dc3b9a3055ec0753b32c2868c0f86665bbd0427ba3c0ec16a434 |
| MD5 | 85c9d4e9f17e316e4aebc19146c2ee67 |
| BLAKE2b-256 | 46123d3eaf32466c34dadd1e046fe2d1f2d2fdbf6e072c2085376875de1b7a3c |