
A Python tool for concatenating code.


Catenator

Catenator is a Python tool for concatenating code files in a directory into a single output string.

Features

  • Concatenate code files from a specified directory
  • Include or exclude specific file extensions
  • Include a directory tree structure
  • Include README files in the output
  • Output to file, clipboard, or stdout
  • Exclude files with gitignore-style .catignore files

Installation

Install with pip:

pip install catenator

Usage

As a Command-Line Tool

Basic usage:

catenator /path/to/your/project

Options:

  • --output FILE: Write output to a file instead of stdout
  • --clipboard: Copy output to clipboard
  • --no-tree: Disable directory tree generation
  • --no-readme: Exclude README files from the output
  • --include EXTENSIONS: Comma-separated list of file extensions to include (replaces defaults)
  • --ignore EXTENSIONS: Comma-separated list of file extensions to ignore
  • --count-tokens: Print an approximate token count for the output (tiktoken cl100k_base)
  • --watch: Watch for changes and update output file automatically (requires --output)
  • --ignore-tests: Exclude test files from the concatenated output
  • --token-limit N: Keep output under N tokens by summarizing least important files
  • --llm: Use AI for richer summaries when using --token-limit (requires robot module)
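
The --count-tokens option reports an approximate count based on tiktoken's cl100k_base encoding. The sketch below shows one way such a count could be computed. The function name count_tokens and the fallback heuristic (about four characters per token) are assumptions made for illustration, not Catenator's own code.

```python
def count_tokens(text: str) -> int:
    """Approximate the number of tokens in `text` (illustrative helper)."""
    try:
        import tiktoken  # optional third-party dependency

        encoding = tiktoken.get_encoding("cl100k_base")
        return len(encoding.encode(text))
    except Exception:
        # Assumed fallback when tiktoken is missing or cannot load its
        # encoding data: a rough rule of thumb of four characters per token.
        return len(text) // 4


sample = "def add(a, b):\n    return a + b\n"
print(count_tokens(sample))
```

The exact number depends on whether tiktoken is available, so treat the result as an estimate either way.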

Example:

catenator /path/to/your/project --output concatenated.md --include py,js,ts

As a Python Module

You can also use Catenator in your Python scripts:

from catenator import Catenator

catenator = Catenator(
    directory='/path/to/your/project',
    include_extensions=['py', 'js', 'ts'],
)
result = catenator.catenate()
print(result)
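
To make the idea behind such a call concrete, here is a standard-library sketch of concatenating files filtered by extension. It is not Catenator's implementation: the helper name concat_files, the "# path" header line, and the sorted ordering are all assumptions chosen for illustration.

```python
import tempfile
from pathlib import Path


def concat_files(directory, include_extensions):
    """Join matching files under `directory` into one string (hypothetical helper)."""
    wanted = {"." + ext.lstrip(".").lower() for ext in include_extensions}
    parts = []
    # Sorted for a deterministic order; the real tool's ordering may differ.
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file() and path.suffix.lower() in wanted:
            rel = path.relative_to(directory).as_posix()
            text = path.read_text(encoding="utf-8", errors="replace")
            # Assumed header format: a comment line naming the file.
            parts.append(f"# {rel}\n{text}")
    return "\n".join(parts)


# Small demo on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "main.py").write_text("print('hi')\n", encoding="utf-8")
    Path(tmp, "notes.txt").write_text("not code\n", encoding="utf-8")
    demo_output = concat_files(tmp, ["py", "js", "ts"])

print(demo_output)
```

In the demo, only main.py is included because notes.txt does not have one of the requested extensions.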

.catignore File

The .catignore file lets you specify files and directories to exclude from the concatenation. Its syntax is similar to that of .gitignore files.

Syntax

Lines starting with # are treated as comments. Blank lines are ignored. Patterns can include filenames, directories, or wildcard characters.

Examples

# Ignore all JavaScript files
*.js

# Ignore specific file
ignored_file.txt

# Ignore entire directory
ignored_dir/
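
The rules above (comments, blank lines, wildcard patterns, trailing-slash directories) can be sketched with the standard library's fnmatch module. This is a simplified illustration, not Catenator's actual matcher: the helpers parse_catignore and is_ignored are hypothetical, and real gitignore semantics such as negation or anchored patterns are not covered.

```python
from fnmatch import fnmatch


def parse_catignore(text):
    """Return the patterns in a .catignore body, skipping comments and blanks."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(rel_path, patterns):
    """Check a POSIX-style relative path against the patterns (simplified rules)."""
    parts = rel_path.split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: assumed to match any parent directory by name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch(parts[-1], pattern) or fnmatch(rel_path, pattern):
            return True
    return False


CATIGNORE = """\
# Ignore all JavaScript files
*.js

# Ignore specific file
ignored_file.txt

# Ignore entire directory
ignored_dir/
"""

patterns = parse_catignore(CATIGNORE)
for candidate in ["src/app.js", "ignored_dir/keep.py", "src/main.py"]:
    print(candidate, is_ignored(candidate, patterns))
```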

.catconfig.yaml for Custom Builds

For more complex configurations, you can define custom "builds" in a .catconfig.yaml file in your project's root directory. This allows you to specify multiple sets of whitelisted and blacklisted files.

--build Option

To select a build, pass the --build command-line option:

catenator /path/to/your/project --build <build_name>

When you use the --build option, catenator ignores .catignore and the other filtering flags and relies solely on the whitelist and blacklist defined in the specified build.

Example .catconfig.yaml

Here is an example of a .catconfig.yaml file with two builds, frontend and backend:

builds:
  frontend:
    whitelist:
      - "src/frontend/"
      - "README.md"
    blacklist:
      - "src/frontend/node_modules/"
  backend:
    whitelist:
      - "src/backend/"
      - "requirements.txt"
    blacklist:
      - "*.log"

In this example:

  • catenator . --build frontend will concatenate all files in src/frontend/ (except node_modules) and the README.md file.
  • catenator . --build backend will concatenate all files in src/backend/ and the requirements.txt file, excluding any .log files.
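
The selection described in these two bullets can be sketched as "keep a path if it matches any whitelist entry and no blacklist entry". The code below mirrors the example config as a plain Python dict to avoid a YAML dependency. The helpers matches and select_files, and the exact matching rules (prefix match for entries ending in "/", fnmatch otherwise), are assumptions for illustration rather than Catenator's documented behavior.

```python
from fnmatch import fnmatch

# The example .catconfig.yaml, written out as a dict for this sketch.
BUILDS = {
    "frontend": {
        "whitelist": ["src/frontend/", "README.md"],
        "blacklist": ["src/frontend/node_modules/"],
    },
    "backend": {
        "whitelist": ["src/backend/", "requirements.txt"],
        "blacklist": ["*.log"],
    },
}


def matches(rel_path, pattern):
    """Assumed rule: trailing '/' means directory prefix, otherwise a glob."""
    if pattern.endswith("/"):
        return rel_path.startswith(pattern)
    basename = rel_path.rsplit("/", 1)[-1]
    return fnmatch(rel_path, pattern) or fnmatch(basename, pattern)


def select_files(rel_paths, build):
    """Keep paths that hit the whitelist and miss the blacklist."""
    return [
        path
        for path in rel_paths
        if any(matches(path, w) for w in build["whitelist"])
        and not any(matches(path, b) for b in build["blacklist"])
    ]


FILES = [
    "README.md",
    "requirements.txt",
    "src/frontend/app.js",
    "src/frontend/node_modules/lib.js",
    "src/backend/main.py",
    "src/backend/debug.log",
]

print(select_files(FILES, BUILDS["frontend"]))
print(select_files(FILES, BUILDS["backend"]))
```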

Token Limit and Summarization

When a project exceeds a specified token limit, catenator uses a progressive approach to fit within the budget:

catenator /path/to/project --token-limit 10000

This will:

  1. Rank all files by importance to understanding the project
  2. Include full content for the most important files
  3. Add summaries for less important files until 90% of the budget is used
  4. Add only docstrings for the remaining files until 100% of the budget is used
  5. Truncate if still over the limit
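
The steps above can be sketched as a greedy pass over files that are already ranked by importance. This is an interpretation, not Catenator's code. It assumes that full content shares the 90% cap with summaries, that the level of detail only ever moves downward, and that files which no longer fit are dropped in place of truncation. FileEntry, fit_to_budget, and the word-count tokenizer are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    name: str
    full: str       # complete file content
    summary: str    # structural or AI summary
    docstring: str  # docstrings only


def count_tokens(text):
    # Stand-in tokenizer for the sketch: whitespace-separated words.
    return len(text.split())


# (attribute to use, fraction of the budget that tier may fill up to)
TIERS = [("full", 0.9), ("summary", 0.9), ("docstring", 1.0)]


def fit_to_budget(ranked_files, token_limit):
    """Pick a detail level per file; `ranked_files` is most important first."""
    chosen, used, tier = [], 0, 0
    for entry in ranked_files:
        while tier < len(TIERS):
            label, fraction = TIERS[tier]
            cost = count_tokens(getattr(entry, label))
            if used + cost <= token_limit * fraction:
                chosen.append((entry.name, label))
                used += cost
                break
            tier += 1  # this level of detail no longer fits; downgrade for good
        else:
            break  # not even a docstring fits; drop the remaining files
    return chosen, used


def words(n):
    return " ".join(["tok"] * n)


ranked = [
    FileEntry("core.py", words(10), words(5), words(2)),
    FileEntry("utils.py", words(10), words(5), words(2)),
    FileEntry("cli.py", words(10), words(5), words(2)),
    FileEntry("extra.py", words(10), words(5), words(3)),
]
plan, used_tokens = fit_to_budget(ranked, token_limit=20)
print(plan, used_tokens)
```

With a limit of 20 tokens, the first file is kept in full, the second is summarized, and the last two are reduced to docstrings, which fills the budget exactly.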

By default, summaries are structural extracts (function/class signatures and docstrings). For richer AI-generated summaries, add the --llm flag:

catenator /path/to/project --token-limit 10000 --llm

Files are labeled in the output: (summary) for summarized files, (docstring) for docstring-only files. Summaries are cached in ~/.catenator/summaries/. Token counting requires tiktoken; AI summaries require the robot module.

License

This project is licensed under the Creative Commons Zero v1.0 Universal (CC0-1.0) License.

