Skip to main content

A CLI tool to dump projects into an LLM ideal format.

Project description

ctxl: Contextual

PyPI version License: MIT

ctxl is a CLI tool designed to transform project directories into a structured XML format suitable for language models and AI analysis. It intelligently extracts file contents and directory structures while respecting gitignore rules and custom filters. A key feature of ctxl is its ability to automatically detect project types (such as Python, JavaScript, or web projects) based on the files present in the directory. This auto-detection enables ctxl to make smart decisions about which files to include or exclude, ensuring that the output is relevant and concise. Users can also override this auto-detection with custom presets if needed.

The tool creates a comprehensive project snapshot that can be easily parsed by LLMs, complete with a customizable task specification. This task specification acts as a prompt, priming the LLM to provide more targeted and relevant assistance with your project.

ctxl was developed through a bootstrapping process, where each version was used to generate context for an LLM in developing the next version.

Table of Contents

Why ctxl?

ctxl streamlines the process of providing project context to LLMs. Instead of manually copying and pasting file contents or explaining your project structure, ctxl automatically generates a comprehensive, structured representation of your project. This allows LLMs to have a more complete understanding of your codebase, leading to more accurate and context-aware assistance.

Installation

To install ctxl, you can use pip:

pip install ctxl

Quick Start

After installation, you can quickly generate an XML representation of your project:

ctxl /path/to/your/project > project_context.xml

This command will create an XML file containing your project's structure and file contents, which you can then provide to an LLM for analysis or assistance. The XML output includes file contents, directory structure, and a default task description that guides the LLM in analyzing your project.

Usage

Basic Usage

To use ctxl, simply run the following command in your terminal:

ctxl /path/to/your/project

By default, this will output the XML representation of your project to stdout. This allows for piping the output into other CLI tools, for example with LLM:

ctxl /path/to/your/project | llm

To output to a file:

ctxl /path/to/your/project > context.xml

or

ctxl /path/to/your/project -o context.xml

Command-line Options

ctxl offers several command-line options to customize its behavior:

  • -o, --output: Specify the output file path (default: stdout)
  • --presets: Choose preset project types to combine (default: auto-detect)
  • --suffixes: Specify allowed file suffixes (overrides presets)
  • --ignore: Add additional folders/files to ignore
  • --include-dotfiles: Include dotfiles and folders in the output
  • --gitignore: Specify a custom .gitignore file path
  • --task: Include a custom task description in the output
  • --no-auto-detect: Disable auto-detection of project types

Example:

ctxl /path/to/your/project --presets python javascript --output project_context.xml --task "Analyze this project for potential security vulnerabilities"

Presets

ctxl includes presets for common project types:

  • python: Includes .py, .pyi, .pyx, .ipynb files, ignores common Python-specific directories and files
  • javascript: Includes .js, .jsx, .mjs, .cjs files, ignores node_modules and other JS-specific files
  • typescript: Includes .ts, .tsx files, similar ignores to JavaScript
  • web: Includes .html, .css, .scss, .sass, .less files
  • misc: Includes common configuration and documentation file types

The tool can automatically detect project types, or you can specify them manually.

Features

  • Extracts project structure and file contents
  • Supports multiple programming languages and project types
  • Customizable file inclusion/exclusion
  • Respects .gitignore rules
  • Generates XML output for easy parsing
  • Auto-detects project types
  • Allows custom task specifications for LLM priming

How It Works

ctxl operates in several steps:

  1. It scans the specified directory to detect the project type(s).
  2. Based on the detected type(s) or user-specified presets, it determines which files to include or exclude.
  3. It reads the contents of included files and constructs a directory structure.
  4. All this information is then formatted into an XML structure, along with the specified task.
  5. The resulting XML is output to stdout or a specified file.

Output Example

Here's an example of what the XML output might look like when run on the ctxl project itself.

To generate this output, you would run:

ctxl /path/to/ctxl/repository --output ctxl_context.xml

The resulting ctxl_context.xml would look something like this:

<root>
  <project_context>
    <file path="src/ctxl/ctxl.py">
      <content>
        ...truncated for examples sake...
      </content>
    </file>
    <file path="src/ctxl/__init__.py">
      <content>
        ...truncated for examples sake...
      </content>
    </file>
    <file path="pyproject.toml">
      <content>
        ...truncated for examples sake...
      </content>
    </file>
    <directory_structure>
      <directory path=".">
        <file path="README.md" />
        <file path="pyproject.toml" />
        <directory path="src">
          <directory path="src/ctxl">
            <file path="src/ctxl/ctxl.py" />
            <file path="src/ctxl/__init__.py" />
          </directory>
        </directory>
      </directory>
    </directory_structure>
  </project_context>
  <task>Describe this project in detail. Pay special attention to the structure of the code, the design of the project, any frameworks/UI frameworks used, and the overall structure/workflow.</task>
</root>

The XML output provides a comprehensive view of the ctxl project, including file contents, structure, and a task description. This format allows LLMs to easily parse and understand the project context, enabling them to provide more accurate and relevant assistance.

Project Structure

The ctxl project has the following structure:

ctxl/
├── src/
│   └── ctxl/
│       ├── __init__.py
│       └── ctxl.py
├── README.md
└── pyproject.toml

The main functionality is implemented in src/ctxl/ctxl.py.

Troubleshooting

  • Issue: ctxl is not detecting my project type correctly. Solution: Use the --presets option to manually specify the project type(s).

  • Issue: ctxl is including/excluding files I don't want. Solution: Use the --suffixes and --ignore options to customize file selection.

  • Issue: The XML output is too large for my LLM to process. Solution: Try using more specific presets or custom ignore patterns to reduce the amount of included content.

Contributing

Contributions to ctxl are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ctxl-0.0.6.tar.gz (12.0 kB view hashes)

Uploaded Source

Built Distribution

ctxl-0.0.6-py3-none-any.whl (7.3 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page