A CLI tool to scan a code project and create a single file with the entire project code, ideal for feeding context to LLMs.

Code Context Compiler

Code Context Compiler is a CLI tool that scans a code project and writes a single file containing all of the project's code, separated file by file. It also masks sensitive information such as passwords, API keys, and tokens, and offers several customization options.

Features

  • Scans an entire project directory
  • Processes all file types by default
  • Compiles all code files into a single output file
  • Masks sensitive information such as passwords, API keys, and tokens
  • Respects .gitignore patterns and custom ignore patterns
  • Supports custom configuration via YAML files
  • Asynchronous file processing for improved performance
  • Progress indicator for large projects
  • Multiple output formats: text, JSON, and YAML
  • Option to only process files tracked by Git
  • Customizable file extension filtering (optional)
  • Customizable masking patterns
  • AI-friendly output format, ideal for feeding project context to language models

Use Cases

Feeding Project Context to Language Models (LLMs)

Code Context Compiler is designed to be AI-friendly, making it an excellent tool for preparing entire project contexts for language models. Some key benefits include:

  1. Comprehensive Context: By compiling the entire project into a single file, you provide LLMs with a complete view of your codebase, enabling more accurate and context-aware responses.

  2. Structured Output: The file-wise separation in the output allows LLMs to understand the project structure and relationships between different files.

  3. Sensitive Information Protection: The built-in masking feature reduces the risk of exposing sensitive data when you share project context with AI models. Masking only covers values that match the configured patterns, so review the output before sharing it.

  4. Customizable Content: Use the configuration options to include only the files and information relevant to your AI-related tasks.

  5. Multiple Output Formats: Choose between text, JSON, or YAML output to best suit your LLM integration needs.

By using Code Context Compiler to prepare your project data, you can enhance the effectiveness of AI-powered code analysis, documentation generation, code review assistance, and other AI-driven development tools.

LLM-Friendly Output

Code Context Compiler generates output that is specifically designed to be easily understood by Large Language Models (LLMs). Each compiled output includes:

  1. A comprehensive prompt at the beginning of the file, explaining:

    • The structure of the document
    • How to interpret file markers
    • The presence and meaning of masked sensitive information
    • Guidelines for analyzing the code

  2. Clear file demarcation using "File: " prefixes before each file's content.

  3. Consistent formatting across all files in the project.

This structure allows LLMs to easily parse and understand the entire project context, making it ideal for tasks such as:

  • Code analysis and review
  • Documentation generation
  • Answering questions about the project structure and functionality
  • Identifying patterns and potential improvements across the codebase

When using the JSON or YAML output formats, the LLM prompt is included as a separate field, making it even easier for automated systems to leverage this information.
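
The layout described above can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the prompt wording, the function names, and the JSON field names (llm_prompt, files) are assumptions, since this README documents only the "File: " prefix and the fact that the prompt is a separate field.

```python
import json
from typing import Dict

# Hypothetical prompt text; the tool's real prompt is not shown in this README.
LLM_PROMPT = (
    "This document contains a project's source files. "
    "Each file starts with a 'File: <path>' line. "
    "Sensitive values may have been masked."
)


def render_text(files: Dict[str, str]) -> str:
    """Join the prompt and every file into one text document."""
    parts = [LLM_PROMPT, ""]
    for path, content in files.items():
        parts.append(f"File: {path}")  # file marker documented in the README
        parts.append(content)
        parts.append("")  # blank line between files
    return "\n".join(parts)


def render_json(files: Dict[str, str]) -> str:
    """Keep the prompt in its own field (field names are assumptions)."""
    return json.dumps({"llm_prompt": LLM_PROMPT, "files": files}, indent=2)
```

Keeping the prompt in a dedicated field means an automated consumer can read or replace it without parsing the file contents.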

Installation

Using Pip

You can install Code Context Compiler directly from PyPI:

pip install code_context_compiler

Usage

After installation, you can use the tool directly from the command line:

code_context_compiler [OPTIONS] PROJECT_PATH OUTPUT_FILE

...

Using Git clone

To install Code Context Compiler from source, you need Python 3.8 or later and Poetry. Follow these steps:

  1. Clone the repository:

    git clone https://github.com/yourusername/code_context_compiler.git
    cd code_context_compiler
    
  2. Install dependencies using Poetry:

    poetry install
    

Usage

When installed from source with Poetry, run the tool through poetry run:

poetry run code_context_compiler [OPTIONS] PROJECT_PATH OUTPUT_FILE

Arguments:

  • PROJECT_PATH: Path to the project to scan
  • OUTPUT_FILE: Path to the output file

Options:

  • --config-file PATH: Path to the configuration file
  • --output-format [text|json|yaml]: Output format (default: text)
  • --help: Show the help message and exit

Example:

poetry run code_context_compiler /path/to/your/project /path/to/output/file.txt --config-file config.yaml --output-format json
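
For scripted use, the same invocation can be assembled from Python. The sketch below uses only the arguments and options documented above; the helper names build_command and run_compiler are hypothetical, and it assumes the code_context_compiler executable is on PATH (for a Poetry checkout, prefix the list with "poetry", "run").

```python
import subprocess
from typing import List, Optional

VALID_FORMATS = ("text", "json", "yaml")


def build_command(
    project_path: str,
    output_file: str,
    config_file: Optional[str] = None,
    output_format: str = "text",
) -> List[str]:
    """Build the argument list for the documented CLI options."""
    if output_format not in VALID_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    cmd = ["code_context_compiler", project_path, output_file]
    if config_file is not None:
        cmd += ["--config-file", config_file]
    cmd += ["--output-format", output_format]
    return cmd


def run_compiler(*args, **kwargs) -> None:
    """Run the tool and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(build_command(*args, **kwargs), check=True)
```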

Configuration

You can customize the behavior of Code Context Compiler by creating a YAML configuration file. Here's an example configuration:

ignore_patterns:
  - "*.log"
  - "*.tmp"
  - "poetry.lock"
  - "package-lock.json"
file_extensions:
  - ".py"
  - ".js"
  - ".java"
mask_patterns:
  - 'password\s*=\s*["''].*?["'']'
  - 'api[_-]?key\s*=\s*["''].*?["'']'
use_git: true

  • ignore_patterns: List of file patterns to ignore. By default, common lock files (like poetry.lock, package-lock.json, yarn.lock, etc.) are ignored.
  • file_extensions: List of file extensions to process (if empty, all files are processed)
  • mask_patterns: List of regex patterns to mask sensitive information
  • use_git: Boolean to only process Git-tracked files

Note: If file_extensions is not specified or is an empty list, the tool will process all file types.
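
The interaction between ignore_patterns and file_extensions can be sketched like this. It is a simplified approximation using shell-style globs from Python's fnmatch module, not the tool's actual matching logic (real .gitignore semantics are richer); the function name should_process is hypothetical.

```python
import fnmatch
from pathlib import PurePosixPath
from typing import Sequence


def should_process(
    path: str,
    ignore_patterns: Sequence[str],
    file_extensions: Sequence[str],
) -> bool:
    """Return True if a file passes the ignore and extension filters."""
    name = PurePosixPath(path).name
    for pattern in ignore_patterns:
        # Match against both the full relative path and the bare file name.
        if fnmatch.fnmatchcase(path, pattern) or fnmatch.fnmatchcase(name, pattern):
            return False
    if not file_extensions:
        # An empty or missing list means every file type is processed.
        return True
    return PurePosixPath(path).suffix in file_extensions
```

With the example configuration, src/app.py would be processed, while debug.log and poetry.lock would be skipped.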

The tool comes with sensible defaults, including ignoring common lock files. You can override or extend these defaults in your configuration file.
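
To see what the example mask_patterns match, the sketch below applies them with Python's re module. The replacement token and the choice to replace the whole match are assumptions made for illustration; this README does not specify how the tool renders masked values.

```python
import re
from typing import Sequence

# The two patterns from the example configuration above.
MASK_PATTERNS = [
    r'password\s*=\s*["\'].*?["\']',
    r'api[_-]?key\s*=\s*["\'].*?["\']',
]

MASK_TOKEN = "********"  # hypothetical placeholder, not the tool's documented output


def mask_sensitive(text: str, patterns: Sequence[str] = MASK_PATTERNS) -> str:
    """Replace every match of every pattern with the mask token."""
    for pattern in patterns:
        text = re.sub(pattern, MASK_TOKEN, text)
    return text
```

Only assignments that match a configured pattern are masked, so a secret stored under another name (for example a variable called secret) would pass through unchanged unless a pattern is added for it.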

Development

To set up the development environment:

  1. Ensure you have Python 3.8+ and Poetry installed.
  2. Clone the repository and navigate to the project directory.
  3. Install dependencies:
    poetry install
    
  4. Run tests:
    poetry run pytest
    
  5. Run tests with coverage:
    poetry run pytest --cov=code_context_compiler
    

Contributing

Contributions to Code Context Compiler are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
