A CLI tool to dump projects into an LLM ideal format.
Project description
ctxl: Contextual
ctxl is a CLI tool designed to transform project directories into a structured XML format suitable for language models and AI analysis. It intelligently extracts file contents and directory structures while respecting gitignore rules and custom filters. A key feature of ctxl is its ability to automatically detect project types (such as Python, JavaScript, or web projects) based on the files present in the directory. This auto-detection enables ctxl to make smart decisions about which files to include or exclude, ensuring that the output is relevant and concise. Users can also override this auto-detection with custom presets if needed.
The tool creates a comprehensive project snapshot that can be easily parsed by LLMs, complete with a customizable task specification. This task specification acts as a prompt, priming the LLM to provide more targeted and relevant assistance with your project.
ctxl was developed through a bootstrapping process, where each version was used to generate context for an LLM in developing the next version.
Table of Contents
- Why ctxl?
- Installation
- Quick Start
- Usage
- Features
- How It Works
- Output Example
- Integration with LLMs
- Project Structure
- Troubleshooting
- Contributing
- License
Why ctxl?
ctxl streamlines the process of providing project context to LLMs. Instead of manually copying and pasting file contents or explaining your project structure, ctxl automatically generates a comprehensive, structured representation of your project. This allows LLMs to have a more complete understanding of your codebase, leading to more accurate and context-aware assistance.
Installation
To install ctxl, you can use pip:
pip install ctxl
Quick Start
After installation, you can quickly generate an XML representation of your project:
ctxl /path/to/your/project > project_context.xml
This command will create an XML file containing your project's structure and file contents, which you can then provide to an LLM for analysis or assistance. The XML output includes file contents, directory structure, and a default task description that guides the LLM in analyzing your project.
Usage
Basic Usage
To use ctxl, simply run the following command in your terminal:
ctxl /path/to/your/project
By default, this will output the XML representation of your project to stdout. This allows for piping the output into other CLI tools, for example with LLM:
ctxl /path/to/your/project | llm
To output to a file:
ctxl /path/to/your/project > context.xml
or
ctxl /path/to/your/project -o context.xml
Command-line Options
ctxl offers several command-line options to customize its behavior:
-o, --output
: Specify the output file path (default: stdout)--presets
: Choose preset project types to combine (default: auto-detect)--suffixes
: Specify allowed file suffixes (overrides presets)--ignore
: Add additional folders/files to ignore--include-dotfiles
: Include dotfiles and folders in the output--gitignore
: Specify a custom .gitignore file path--task
: Include a custom task description in the output--no-auto-detect
: Disable auto-detection of project types
Example:
ctxl /path/to/your/project --presets python javascript --output project_context.xml --task "Analyze this project for potential security vulnerabilities"
Presets
ctxl includes presets for common project types:
- python: Includes .py, .pyi, .pyx, .ipynb files, ignores common Python-specific directories and files
- javascript: Includes .js, .jsx, .mjs, .cjs files, ignores node_modules and other JS-specific files
- typescript: Includes .ts, .tsx files, similar ignores to JavaScript
- web: Includes .html, .css, .scss, .sass, .less files
- misc: Includes common configuration and documentation file types
The tool can automatically detect project types, or you can specify them manually.
Features
- Extracts project structure and file contents
- Supports multiple programming languages and project types
- Customizable file inclusion/exclusion
- Respects .gitignore rules
- Generates XML output for easy parsing
- Auto-detects project types
- Allows custom task specifications for LLM priming
How It Works
ctxl operates in several steps:
- It scans the specified directory to detect the project type(s).
- Based on the detected type(s) or user-specified presets, it determines which files to include or exclude.
- It reads the contents of included files and constructs a directory structure.
- All this information is then formatted into an XML structure, along with the specified task.
- The resulting XML is output to stdout or a specified file.
Output Example
Here's an example of what the XML output might look like when run on the ctxl project itself.
To generate this output, you would run:
ctxl /path/to/ctxl/repository --output ctxl_context.xml
The resulting ctxl_context.xml
would look something like this:
<root>
<project_context>
<file path="src/ctxl/ctxl.py">
<content>
...truncated for examples sake...
</content>
</file>
<file path="src/ctxl/__init__.py">
<content>
...truncated for examples sake...
</content>
</file>
<file path="pyproject.toml">
<content>
...truncated for examples sake...
</content>
</file>
<directory_structure>
<directory path=".">
<file path="README.md" />
<file path="pyproject.toml" />
<directory path="src">
<directory path="src/ctxl">
<file path="src/ctxl/ctxl.py" />
<file path="src/ctxl/__init__.py" />
</directory>
</directory>
</directory>
</directory_structure>
</project_context>
<task>Describe this project in detail. Pay special attention to the structure of the code, the design of the project, any frameworks/UI frameworks used, and the overall structure/workflow.</task>
</root>
The XML output provides a comprehensive view of the ctxl project, including file contents, structure, and a task description. This format allows LLMs to easily parse and understand the project context, enabling them to provide more accurate and relevant assistance.
Project Structure
The ctxl project has the following structure:
ctxl/
├── src/
│ └── ctxl/
│ ├── __init__.py
│ └── ctxl.py
├── README.md
└── pyproject.toml
The main functionality is implemented in src/ctxl/ctxl.py
.
Troubleshooting
-
Issue: ctxl is not detecting my project type correctly. Solution: Use the
--presets
option to manually specify the project type(s). -
Issue: ctxl is including/excluding files I don't want. Solution: Use the
--suffixes
and--ignore
options to customize file selection. -
Issue: The XML output is too large for my LLM to process. Solution: Try using more specific presets or custom ignore patterns to reduce the amount of included content.
Contributing
Contributions to ctxl are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.