A Python tool for concatenating code.
Catenator
Catenator is a Python tool for concatenating code files in a directory into a single output string.
Features
- Concatenate code files from a specified directory
- Include or exclude specific file extensions
- Include a directory tree structure
- Include README files in the output
- Output to file, clipboard, or stdout
- gitignore-style .catignore files
Installation
Install using pip
pip install catenator
Usage
As a Command-Line Tool
Basic usage:
catenator /path/to/your/project
Options:
- --output FILE: Write output to a file instead of stdout
- --clipboard: Copy output to clipboard
- --no-tree: Disable directory tree generation
- --no-readme: Exclude README files from the output
- --include EXTENSIONS: Comma-separated list of file extensions to include (replaces defaults)
- --ignore EXTENSIONS: Comma-separated list of file extensions to ignore
- --count-tokens: Print an approximate token count for the output (tiktoken cl100k_base)
- --watch: Watch for changes and update the output file automatically (requires --output)
- --ignore-tests: Leave tests out of the concatenated output
- --token-limit N: Keep output under N tokens by summarizing the least important files
- --llm: Use AI for richer summaries when using --token-limit (requires robot module)
Example:
catenator /path/to/your/project --output concatenated.md --include py,js,ts
As a Python Module
You can also use Catenator in your Python scripts:
from catenator import Catenator
catenator = Catenator(
    directory='/path/to/your/project',
    include_extensions=['py', 'js', 'ts'],
)
result = catenator.catenate()
print(result)
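To make the idea of concatenation concrete, here is a minimal standard-library sketch of joining matching files into one string. It is not Catenator's implementation: the function name `concatenate_sources` and the `# --- path ---` header format are invented for illustration, and the real output layout may differ.

```python
import os
from pathlib import Path


def concatenate_sources(directory, include_extensions):
    """Join matching files under `directory` into one string (illustrative only)."""
    wanted = {ext.lower().lstrip(".") for ext in include_extensions}
    chunks = []
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # visit subdirectories in a deterministic order
        for name in sorted(files):
            if Path(name).suffix.lstrip(".").lower() not in wanted:
                continue  # extension not in the include list
            path = Path(root) / name
            rel = path.relative_to(directory).as_posix()
            # The header format below is an assumption made for this sketch.
            body = path.read_text(encoding="utf-8")
            chunks.append(f"# --- {rel} ---\n{body}")
    return "\n\n".join(chunks)
```

Calling `concatenate_sources('/path/to/your/project', ['py', 'js', 'ts'])` returns a single string, roughly mirroring what the module usage above prints.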
.catignore File
The .catignore file lets you specify files and directories to exclude from the concatenation process. Its syntax follows that of .gitignore files.
Syntax
Lines starting with # are treated as comments. Blank lines are ignored. Patterns can include filenames, directories, or wildcard characters.
Examples
# Ignore all JavaScript files
*.js
# Ignore specific file
ignored_file.txt
# Ignore entire directory
ignored_dir/
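The rules above (comments, blank lines, filename, directory, and wildcard patterns) can be sketched with the standard library's `fnmatch`. This is a simplified illustration, not Catenator's actual matcher: the helper names `parse_ignore_lines` and `is_ignored` are hypothetical, and gitignore features such as negation or anchored patterns are not handled.

```python
from fnmatch import fnmatch


def parse_ignore_lines(text):
    """Return the patterns in a .catignore-style text, dropping comments and blanks."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(rel_path, patterns):
    """Check a POSIX-style relative path against the parsed patterns."""
    parts = rel_path.split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: match against any parent directory name.
            if any(fnmatch(part, pattern.rstrip("/")) for part in parts[:-1]):
                return True
        elif fnmatch(parts[-1], pattern) or fnmatch(rel_path, pattern):
            # File pattern: match the file name or the whole relative path.
            return True
    return False
```

With the example file above, `is_ignored("src/app.js", patterns)` and `is_ignored("ignored_dir/notes.md", patterns)` are both true, while `is_ignored("src/main.py", patterns)` is false.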
.catconfig.yaml for Custom Builds
For more complex configurations, you can define custom "builds" in a .catconfig.yaml file in your project's root directory. This allows you to specify multiple sets of whitelisted and blacklisted files.
--build Option
To use a build, use the --build command-line option:
catenator /path/to/your/project --build <build_name>
When you use the --build option, Catenator ignores .catignore and the other filtering flags and relies solely on the whitelist and blacklist defined in the specified build.
Example .catconfig.yaml
Here is an example of a .catconfig.yaml file with two builds, frontend and backend:
builds:
  frontend:
    whitelist:
      - "src/frontend/"
      - "README.md"
    blacklist:
      - "src/frontend/node_modules/"
  backend:
    whitelist:
      - "src/backend/"
      - "requirements.txt"
    blacklist:
      - "*.log"
In this example:
- catenator . --build frontend will concatenate all files in src/frontend/ (except node_modules) and the README.md file.
- catenator . --build backend will concatenate all files in src/backend/ and the requirements.txt file, excluding any .log files.
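The whitelist and blacklist behaviour described for these two builds can be sketched as follows. This is an assumption-laden illustration rather than Catenator's own logic: `matches`, `select_for_build`, and the `BUILDS` dictionary are hypothetical names, patterns ending in `/` are treated as directory prefixes, and the blacklist is assumed to take precedence over the whitelist.

```python
from fnmatch import fnmatch


def matches(rel_path, pattern):
    """True if a POSIX-style relative path matches one build pattern."""
    if pattern.endswith("/"):
        return rel_path.startswith(pattern)  # directory prefix
    # Otherwise compare against the whole path or just the file name.
    file_name = rel_path.rsplit("/", 1)[-1]
    return fnmatch(rel_path, pattern) or fnmatch(file_name, pattern)


def select_for_build(paths, build):
    """Keep whitelisted paths, then drop any that are blacklisted."""
    whitelist = build.get("whitelist", [])
    blacklist = build.get("blacklist", [])
    return [
        p for p in paths
        if any(matches(p, w) for w in whitelist)
        and not any(matches(p, b) for b in blacklist)
    ]


# The example .catconfig.yaml, written out as a plain dictionary.
BUILDS = {
    "frontend": {
        "whitelist": ["src/frontend/", "README.md"],
        "blacklist": ["src/frontend/node_modules/"],
    },
    "backend": {
        "whitelist": ["src/backend/", "requirements.txt"],
        "blacklist": ["*.log"],
    },
}
```

Given a project containing `README.md`, `src/frontend/app.js`, and `src/frontend/node_modules/lib/index.js`, `select_for_build(paths, BUILDS["frontend"])` keeps only the first two.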
Token Limit and Summarization
When a project exceeds a specified token limit, catenator uses a progressive approach to fit within the budget:
catenator /path/to/project --token-limit 10000
This will:
- Rank all files by importance to understanding the project
- Include full content for the most important files
- Add summaries for less important files until 90% of the budget is used
- Add only docstrings for the remaining files until 100% of the budget is used
- Truncate if still over the limit
By default, summaries are structural extracts (function/class signatures and docstrings). For richer AI-generated summaries, add the --llm flag:
catenator /path/to/project --token-limit 10000 --llm
Files are labeled in the output: (summary) for summarized files, (docstring) for docstring-only files. Summaries are cached in ~/.catenator/summaries/. Token counting requires tiktoken; AI summaries require the robot module.
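The progressive budget described above can be sketched as a greedy pass over files that are already ranked by importance. Everything here is an assumption for illustration: the function `plan_budget`, the per-file token counts, and the rule that a file which no longer fits is marked "omitted" (standing in for the final truncation step). Catenator's real ranking and cut-off rules may differ.

```python
TIERS = ("full", "summary", "docstring")


def plan_budget(ranked_files, token_limit):
    """Assign each file a tier so the running total stays within the budget.

    `ranked_files` is ordered from most to least important. Each entry is a
    dict with a "path" and hypothetical token counts for every tier.
    """
    # Full content and summaries may use 90% of the budget, docstrings 100%.
    soft_cap = token_limit * 9 // 10
    caps = {"full": soft_cap, "summary": soft_cap, "docstring": token_limit}
    plan, used, level = {}, 0, 0
    for f in ranked_files:
        # Once a file is downgraded, less important files never get a richer tier.
        while level < len(TIERS) and used + f[TIERS[level]] > caps[TIERS[level]]:
            level += 1
        if level == len(TIERS):
            plan[f["path"]] = "omitted"
        else:
            tier = TIERS[level]
            plan[f["path"]] = tier
            used += f[tier]
    return plan, used
```

The returned plan maps each path to `full`, `summary`, `docstring`, or `omitted`, which lines up with the (summary) and (docstring) labels mentioned above.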
License
This project is licensed under the Creative Commons Zero v1.0 Universal (CC0-1.0) License.