A tool to convert code snippets into AI prompts for documentation or explanation purposes.
Code2Prompt
Code2Prompt is a powerful command-line tool that generates comprehensive prompts from codebases, designed to streamline interactions between developers and Large Language Models (LLMs) for code analysis, documentation, and improvement tasks.
Table of Contents
- Why Code2Prompt?
- Features
- Installation
- Quick Start
- Usage
- Options
- Examples
- Templating System
- Integration with LLM CLI
- GitHub Actions Integration
- Configuration File
- Troubleshooting
- Contributing
- License
Why Code2Prompt?
When working with Large Language Models on software development tasks, providing extensive context about the codebase is crucial. Code2Prompt addresses this need by:
- Offering a holistic view of your project, enabling LLMs to better understand the overall structure and dependencies.
- Allowing for more accurate recommendations and suggestions from LLMs.
- Maintaining consistency in coding style and conventions across the project.
- Facilitating better interdependency analysis and refactoring suggestions.
- Enabling more contextually relevant documentation generation.
- Helping LLMs learn and apply project-specific patterns and idioms.
Features
- Process single files or entire directories
- Support for multiple programming languages
- Gitignore integration
- Comment stripping
- Line number addition
- Custom output formatting using Jinja2 templates
- Token counting for AI model compatibility
- Clipboard copying of generated content
- Automatic traversal of directories and subdirectories
- File filtering based on patterns
- File metadata inclusion (extension, size, creation time, modification time)
- Graceful handling of binary files and encoding issues
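To make the traversal, metadata, and binary-handling features above more concrete, here is a minimal Python sketch. It is not Code2Prompt's actual implementation: the names `looks_binary` and `collect_files`, the NUL-byte heuristic, and the record layout are all assumptions chosen for illustration. Creation time is left out because it is not reported consistently across platforms.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def looks_binary(path: Path, sample_size: int = 1024) -> bool:
    """Heuristic (an assumption): a file is binary if its first bytes contain NUL."""
    with path.open("rb") as handle:
        return b"\x00" in handle.read(sample_size)


def collect_files(root: str) -> list[dict]:
    """Walk root recursively; return metadata and text for each non-binary file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = Path(dirpath) / name
            if looks_binary(path):
                continue  # skip binary files instead of failing on them
            stat = path.stat()
            modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
            records.append({
                "path": str(path.relative_to(root)),
                "extension": path.suffix,
                "size": stat.st_size,
                "modified": modified.isoformat(),
                # errors="replace" keeps going when a file is not valid UTF-8
                "content": path.read_text(encoding="utf-8", errors="replace"),
            })
    return records
```

Calling `collect_files("/path/to/your/project")` would return one dictionary per text file, which a template could then render into a prompt.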
Installation
Choose one of the following methods to install Code2Prompt:
Using pip (recommended)
pip install code2prompt
Using Poetry
- Ensure you have Poetry installed:
curl -sSL https://install.python-poetry.org | python3 -
- Install Code2Prompt:
poetry add code2prompt
Using pipx
pipx install code2prompt
Quick Start
- Generate a prompt from a single Python file:
  code2prompt --path /path/to/your/script.py
- Process an entire project directory and save the output:
  code2prompt --path /path/to/your/project --output project_summary.md
- Generate a prompt for multiple files, excluding tests:
  code2prompt --path /path/to/src --path /path/to/lib --exclude "*/tests/*" --output codebase_summary.md
Usage
The basic syntax for Code2Prompt is:
code2prompt --path /path/to/your/code [OPTIONS]
For multiple paths:
code2prompt --path /path/to/dir1 --path /path/to/file2.py [OPTIONS]
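If you call Code2Prompt from a script, the repeated `--path` syntax above can be assembled programmatically. The following is a small sketch, not part of Code2Prompt itself: `build_command` and `run` are hypothetical helper names, and only flags listed in this README are used.

```python
import shlex
import subprocess


def build_command(paths, output=None, filter_patterns=None, exclude_patterns=None):
    """Build an argv list for code2prompt from documented flags only."""
    if not paths:
        raise ValueError("at least one path is required")
    argv = ["code2prompt"]
    for path in paths:
        argv += ["--path", path]  # --path may be given several times
    if filter_patterns:
        argv += ["--filter", ",".join(filter_patterns)]
    if exclude_patterns:
        argv += ["--exclude", ",".join(exclude_patterns)]
    if output:
        argv += ["--output", output]
    return argv


def run(argv):
    """Run the command; this requires code2prompt to be installed and on PATH."""
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    command = build_command(
        ["/path/to/dir1", "/path/to/file2.py"],
        exclude_patterns=["*/tests/*"],
    )
    # Print a shell-quoted version for inspection instead of executing it.
    print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting problems, since the patterns never go through a shell.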
Options
Option | Short | Description |
---|---|---|
--path | -p | Path(s) to the directory or file to process (required, multiple allowed) |
--output | -o | Name of the output Markdown file |
--gitignore | -g | Path to the .gitignore file |
--filter | -f | Comma-separated filter patterns to include files (e.g., ".py,.js") |
--exclude | -e | Comma-separated patterns to exclude files (e.g., ".txt,.md") |
--case-sensitive | | Perform case-sensitive pattern matching |
--suppress-comments | -s | Strip comments from the code files |
--line-number | -ln | Add line numbers to source code blocks |
--no-codeblock | | Disable wrapping code inside markdown code blocks |
--template | -t | Path to a Jinja2 template file for custom prompt generation |
--tokens | | Display the token count of the generated prompt |
--encoding | | Specify the tokenizer encoding to use (default: "cl100k_base") |
--create-templates | | Create a templates directory with example templates |
--version | -v | Show the version and exit |
Command Parameters
--filter or -f and --exclude or -e
The --filter and --exclude options allow you to specify patterns for files or directories that should be included in or excluded from processing, respectively.
Syntax:
--filter "PATTERN1,PATTERN2,..."
--exclude "PATTERN1,PATTERN2,..."
or
-f "PATTERN1,PATTERN2,..."
-e "PATTERN1,PATTERN2,..."
Description:
- Both options accept a comma-separated list of patterns.
- Patterns can include wildcards (*) and directory indicators (**).
- Pass the --case-sensitive flag to make pattern matching case-sensitive.
- --exclude patterns take precedence over --filter patterns.
Examples:
- Include only Python files:
  --filter "**.py"
- Exclude all Markdown files:
  --exclude "**.md"
- Include specific file types in the src directory:
  --filter "src/**.{js,ts}"
- Exclude multiple file types and a specific directory:
  --exclude "**.log,**.tmp,**/node_modules/**"
- Include all files except those in 'test' directories:
  --filter "**" --exclude "**/test/**"
- Complex filtering (include JavaScript files, exclude minified and test files):
  --filter "**.js" --exclude "**.min.js,**test**.js"
- Include specific files across all directories:
  --filter "**/config.json,**/README.md"
- Exclude temporary files and directories:
  --exclude "**/.cache/**,**/tmp/**,**.tmp"
- Include source files but exclude build output:
  --filter "src/**/*.{js,ts}" --exclude "**/dist/**,**/build/**"
- Exclude version control and IDE-specific files:
  --exclude "**/.git/**,**/.vscode/**,**/.idea/**"
Important Notes:
- Always use double quotes around patterns to prevent shell interpretation of special characters.
- Patterns are matched against the full path of each file, relative to the project root.
- The ** wildcard matches any number of directories.
- Single * matches any characters within a single directory or filename.
- Use commas to separate multiple patterns within the same option.
- Combine --filter and --exclude for fine-grained control over which files are processed.
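The interaction between the two options can be sketched in a few lines of Python. This is an illustration of the rules described above, not Code2Prompt's own matching code: `split_patterns` and `should_process` are hypothetical names, and the sketch uses the standard library's `fnmatch`, whose `*` also matches path separators, so it does not reproduce the `*` versus `**` distinction exactly. Brace patterns such as `{js,ts}` are not handled here either.

```python
from fnmatch import fnmatchcase


def split_patterns(raw: str) -> list[str]:
    """Split a comma-separated option value into individual patterns."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def should_process(path: str, *, include: str = "", exclude: str = "",
                   case_sensitive: bool) -> bool:
    """Decide whether a relative path passes the include and exclude patterns."""
    includes = split_patterns(include)
    excludes = split_patterns(exclude)
    if not case_sensitive:
        # Compare everything in lower case when case is not significant.
        path = path.lower()
        includes = [pattern.lower() for pattern in includes]
        excludes = [pattern.lower() for pattern in excludes]
    # Exclusions are checked first, so they win over any include pattern.
    if any(fnmatchcase(path, pattern) for pattern in excludes):
        return False
    # With no include patterns, everything that is not excluded is accepted.
    return not includes or any(fnmatchcase(path, pattern) for pattern in includes)
```

For instance, `should_process("src/tests/test_app.py", include="**.py", exclude="**/tests/**", case_sensitive=True)` is rejected because the exclude pattern matches, even though the include pattern matches too.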
Best Practices:
- Start with broader patterns and refine as needed.
- Test your patterns on a small subset of your project first.
- Use the --case-sensitive flag if you need to distinguish between similarly named files with different cases.
- When working with complex projects, consider using a configuration file to manage your filter and exclude patterns.
By combining the --filter and --exclude options, and quoting every pattern, you can control precisely which files are processed and avoid accidental shell expansion of your patterns.
Examples
- Generate documentation for a Python library:
  code2prompt --path /path/to/library --output library_docs.md --suppress-comments --line-number --filter "*.py"
- Prepare a codebase summary for a code review, focusing on JavaScript and TypeScript files:
  code2prompt --path /path/to/project --filter "*.js,*.ts" --exclude "node_modules/*,dist/*" --template code_review.j2 --output code_review.md
- Create input for an AI model to suggest improvements, focusing on a specific directory:
  code2prompt --path /path/to/src/components --suppress-comments --tokens --encoding cl100k_base --output ai_input.md
- Analyze comment density across a multi-language project:
  code2prompt --path /path/to/project --template comment_density.j2 --output comment_analysis.md --filter "*.py,*.js,*.java"
- Generate a prompt for a specific set of files, adding line numbers:
  code2prompt --path /path/to/important_file1.py --path /path/to/important_file2.js --line-number --output critical_files.md
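To give a feel for what line-numbered source looks like once it is placed in a prompt, here is a tiny Python sketch. The exact format Code2Prompt emits with --line-number is not specified in this README, so the `number | line` layout and the function name `add_line_numbers` are assumptions.

```python
def add_line_numbers(source: str) -> str:
    """Prefix each line with a right-aligned, 1-based line number."""
    lines = source.splitlines()
    width = len(str(len(lines)))  # pad numbers so the separators line up
    return "\n".join(
        f"{number:>{width}} | {line}"
        for number, line in enumerate(lines, start=1)
    )


if __name__ == "__main__":
    print(add_line_numbers("import os\n\nprint(os.getcwd())"))
```

Numbered lines make it easier to ask an LLM about a specific location, for example "explain line 3".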
Templating System
Code2Prompt supports custom output formatting using Jinja2 templates. To use a custom template:
code2prompt --path /path/to/code --template /path/to/your/template.j2
Creating Template Examples
Use the --create-templates option to generate example templates:
code2prompt --create-templates
This creates a templates directory with sample Jinja2 templates, including:
- default.j2: A general-purpose template
- analyze-code.j2: For detailed code analysis
- code-review.j2: For thorough code reviews
- create-readme.j2: To assist in generating README files
- improve-this-prompt.j2: For refining AI prompts
For full template documentation, see Documentation Templating.
Integration with LLM CLI
Code2Prompt can be integrated with Simon Willison's llm CLI tool for enhanced code analysis.
Installation
pip install code2prompt llm
Basic Usage
- Generate a code summary and analyze it with an LLM:
  code2prompt --path /path/to/your/project | llm "Analyze this codebase and provide insights on its structure and potential improvements"
- Process a specific file and get refactoring suggestions:
  code2prompt --path /path/to/your/script.py | llm "Suggest refactoring improvements for this code"
For more advanced use cases, refer to the Integration with LLM CLI section in the full documentation.
GitHub Actions Integration
You can integrate Code2Prompt into your GitHub Actions workflow. Here's an example:
name: Code Analysis
on: [push]
jobs:
  analyze-code:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: |
          pip install code2prompt llm
      - name: Analyze codebase
        run: |
          code2prompt --path . | llm "Perform a comprehensive analysis of this codebase. Identify areas for improvement, potential bugs, and suggest optimizations." > analysis.md
      - name: Upload analysis
        uses: actions/upload-artifact@v2
        with:
          name: code-analysis
          path: analysis.md
Configuration File
Code2Prompt supports a .code2promptrc configuration file in JSON format for setting default options. Place this file in your project or home directory.
Example .code2promptrc:
{
  "suppress_comments": true,
  "line_number": true,
  "encoding": "cl100k_base",
  "filter": "*.py,*.js",
  "exclude": "tests/*,docs/*"
}
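A configuration file like this is typically read once and merged over built-in defaults. The sketch below shows one way that could work. It is an assumption-laden illustration, not Code2Prompt's loader: the function name `load_config`, the rule that a project file takes priority over a home-directory file, and the `DEFAULTS` dictionary are all hypothetical.

```python
import json
from pathlib import Path
from typing import Optional

# The only default stated in this README is the tokenizer encoding.
DEFAULTS = {"encoding": "cl100k_base"}


def load_config(project_dir: str, home_dir: Optional[str] = None) -> dict:
    """Merge defaults with the first .code2promptrc found."""
    home = Path(home_dir) if home_dir else Path.home()
    config = dict(DEFAULTS)
    for directory in (Path(project_dir), home):
        candidate = directory / ".code2promptrc"
        if candidate.is_file():
            config.update(json.loads(candidate.read_text(encoding="utf-8")))
            break  # assumption: the project file wins over the home file
    return config
```

Command-line options would normally be applied on top of the returned dictionary, so that an explicit flag overrides the file.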
Troubleshooting
- Issue: Code2Prompt is not recognizing my .gitignore file.
  Solution: Run Code2Prompt from the project root, or specify the .gitignore path with --gitignore.
- Issue: The generated output is too large for my AI model.
  Solution: Use --tokens to check the count, and refine the --filter or --exclude options.
- Issue: Encoding-related errors when processing files.
  Solution: Check that the affected files are valid text, and exclude any that are not with --exclude. Note that --encoding selects the tokenizer used for token counting (default: "cl100k_base"), not the character encoding used to read files.
- Issue: Some files are not being processed.
  Solution: Check for binary files or exclusion patterns. Use --case-sensitive if needed.
Contributing
Contributions to Code2Prompt are welcome! Please read our Contributing Guide for details on our code of conduct and the process for submitting pull requests.
License
Code2Prompt is released under the MIT License. See the LICENSE file for details.
⭐ If you find Code2Prompt useful, please give us a star on GitHub! It helps us reach more developers and improve the tool. ⭐
Made with ❤️ by Raphaël MANSUY