A CLI tool to aggregate and organize code files from complex projects
CodeCollector
CodeCollector is a powerful CLI tool designed to help developers easily aggregate and organize code files from complex projects, specifically tailored for providing context to Large Language Models (LLMs) in development workflows.
Purpose
When working on large, complex projects, it can be challenging to provide comprehensive context to LLMs about your codebase. Manually copying and pasting relevant files from various parts of your project is time-consuming and error-prone. CodeCollector solves this problem by allowing you to easily select and aggregate the most relevant code files, creating a consolidated view of your project that can be readily shared with an LLM for more accurate and context-aware assistance.
Features
- Aggregate code files from specified directories
- Interactive mode for selecting specific files and directories
- Customizable file type filtering
- Recursive directory traversal
- Ignore patterns support (similar to .gitignore)
- Configuration file support
- Optimized for providing context to LLMs
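The core features above (directory aggregation, file type filtering, recursive traversal) can be sketched in a few lines of Python. This is a minimal illustration, not CodeCollector's actual implementation: the function names `collect_files` and `aggregate` and the `# File:` header format are assumptions made for this example.

```python
from pathlib import Path


def collect_files(base_dir, file_types=(".py",), recursive=True):
    """Return matching files under base_dir, sorted for stable output."""
    base = Path(base_dir)
    # rglob walks the whole tree; glob("*") stays in the top-level directory.
    candidates = base.rglob("*") if recursive else base.glob("*")
    return sorted(p for p in candidates if p.is_file() and p.suffix in file_types)


def aggregate(base_dir, output, file_types=(".py",), recursive=True):
    """Write every matching file into one output file and return the file count."""
    base = Path(base_dir)
    files = collect_files(base, file_types, recursive)
    with open(output, "w", encoding="utf-8") as out:
        for path in files:
            # Hypothetical header format so a reader can tell the files apart.
            out.write(f"# File: {path.relative_to(base).as_posix()}\n")
            out.write(path.read_text(encoding="utf-8", errors="replace"))
            out.write("\n\n")
    return len(files)
```

Labelling each file with its relative path keeps the consolidated output readable when it is pasted into an LLM prompt.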
Installation
You can install CodeCollector using pip:
pip install codecollector
Usage
Basic Usage
To run CodeCollector in its default mode:
codecollector
This will start the interactive mode in the current directory, allowing you to select the files you want to include in your LLM context.
Command-line Options
codecollector [OPTIONS]
Options:
-d, --directory TEXT: Base directory to start searching from (default: current directory)
-o, --output TEXT: Output file name (default: aggregated_output.txt)
-r, --recursive / --no-recursive: Enable/disable recursive search (default: recursive)
-t, --file-types TEXT: File types to include (can be used multiple times, default: .py)
-i, --interactive: Launch interactive mode (default: False in CLI, True when run without arguments)
--version: Show the version and exit
--help: Show this message and exit
Examples
- Interactive mode starting from the current directory:
  codecollector -i
  You should then be able to navigate through the project tree and select the files whose content you want to include.
- Collect Python files recursively from the current directory for LLM context:
  codecollector
- Collect JavaScript and TypeScript files from a specific project for LLM analysis:
  codecollector -d /path/to/project -t .js -t .ts
- Non-recursive collection of Ruby files with a custom output name for focused LLM input:
  codecollector --no-recursive -t .rb -o ruby_context.txt
- Interactive mode starting from a specific directory to selectively choose files for LLM context:
  codecollector -i -d /path/to/project
Interactive Mode
In interactive mode, use the following keys to select the most relevant files for your LLM context:
- ↑/k: Move cursor up
- ↓/j: Move cursor down
- Space: Expand/Collapse directory
- Enter: Select/Deselect file or directory
- f: Finish selection and process files
- q: Quit without processing
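The key bindings above can be modelled as a small state update function. The sketch below is only an illustration of that mapping, not CodeCollector's actual code: `SelectionState`, `handle_key`, and the key names `"UP"`, `"DOWN"`, `"SPACE"`, and `"ENTER"` are hypothetical, and the entry list is assumed to be non-empty with directories marked by a trailing slash.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SelectionState:
    entries: list[str]                      # visible tree entries, top to bottom
    cursor: int = 0
    selected: set[str] = field(default_factory=set)
    expanded: set[str] = field(default_factory=set)
    outcome: str | None = None              # "finish" or "quit" once the session ends


def _toggle(items: set[str], entry: str) -> None:
    """Add the entry if absent, remove it if present."""
    if entry in items:
        items.remove(entry)
    else:
        items.add(entry)


def handle_key(state: SelectionState, key: str) -> SelectionState:
    """Apply one key press to the selection state."""
    current = state.entries[state.cursor]
    if key in ("k", "UP"):
        state.cursor = max(0, state.cursor - 1)
    elif key in ("j", "DOWN"):
        state.cursor = min(len(state.entries) - 1, state.cursor + 1)
    elif key == "SPACE" and current.endswith("/"):
        _toggle(state.expanded, current)    # only directories expand or collapse
    elif key == "ENTER":
        _toggle(state.selected, current)
    elif key == "f":
        state.outcome = "finish"            # process the selected files
    elif key == "q":
        state.outcome = "quit"              # discard the selection
    return state
```

Keeping the key handling separate from terminal drawing makes the selection logic easy to test without a real terminal.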
Configuration File
You can create a codecollector.yaml file in your project root to set default options:
directory: /path/to/project
output: llm_context.txt
recursive: true
file_types:
- .py
- .js
- .ts
interactive: true
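One way to think about how such a file interacts with command-line options is as layered defaults. The sketch below assumes that command-line values override the configuration file, which in turn overrides the built-in defaults listed under Command-line Options; that precedence, and the names `DEFAULTS` and `resolve_options`, are assumptions made for illustration rather than documented behaviour. The configuration is passed in as an already parsed dictionary, for example the result of a YAML loader.

```python
# Built-in defaults, taken from the option descriptions in this README.
DEFAULTS = {
    "directory": ".",
    "output": "aggregated_output.txt",
    "recursive": True,
    "file_types": [".py"],
    "interactive": False,
}


def resolve_options(config, cli_overrides):
    """Merge defaults, config file values, and CLI values (assumed precedence)."""
    options = dict(DEFAULTS)
    for source in (config, cli_overrides):
        for key, value in source.items():
            if key not in DEFAULTS:
                raise KeyError(f"unknown option: {key}")
            # None means "not provided", so the earlier layer is kept.
            if value is not None:
                options[key] = value
    return options
```

With the configuration shown above and a command-line override of `{"output": "ruby_context.txt"}`, the merged options would keep the three configured file types and use the overridden output name.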
Ignore Patterns
Create a .ccignore file in your project root to specify ignore patterns:
**/.git/**
**/__pycache__/**
**/*.egg-info/**
**/.pytest_cache/**
**/.vscode/**
**/.idea/**
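To get a feel for what patterns like these exclude, the following sketch approximates the matching with Python's standard `fnmatch` module. It is not CodeCollector's matcher and does not implement full .gitignore semantics; `load_patterns` and `is_ignored` are hypothetical helpers, and skipping `#` comment lines is an assumption.

```python
from fnmatch import fnmatchcase


def load_patterns(text):
    """Parse ignore-file text into patterns, skipping blanks and # comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(relative_path, patterns):
    """Return True if the project-relative path matches any pattern."""
    path = relative_path.replace("\\", "/")
    # Also try the path with a leading slash so that a "**/" prefix
    # matches entries at the top level of the project.
    return any(
        fnmatchcase(path, pattern) or fnmatchcase("/" + path, pattern)
        for pattern in patterns
    )
```

Under this approximation, `.git/config` and `src/__pycache__/mod.pyc` are ignored by the patterns above, while `src/main.py` is kept.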
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Thanks to all contributors who have helped shape CodeCollector
- Inspired by the need for better context provision to LLMs in complex development projects
File details
Details for the file codecollector-0.1.0.tar.gz.
File metadata
- Download URL: codecollector-0.1.0.tar.gz
- Upload date:
- Size: 8.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 599bf77da5b859d594f68b75b8820597c8269c7013f494b66ae326d25116c413 |
| MD5 | 499b055efbb35b826649b13440358dda |
| BLAKE2b-256 | 3cdb46864988fe668d8b71aa2bdc53a7df21889aa154244a9b6b1a87aebd5fc6 |
File details
Details for the file codecollector-0.1.0-py3-none-any.whl.
File metadata
- Download URL: codecollector-0.1.0-py3-none-any.whl
- Upload date:
- Size: 8.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.11.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8002688971ebda904e3b527d1ad6b289ddf010c606604e88e1453be99d58db14 |
| MD5 | 61db28f9f074dfed068282ae7d0b747e |
| BLAKE2b-256 | 92422cdea33e1f40dbb634e42306f40bfcbe8fbd2acb47c3393f943d01380fa4 |