A tool to format repository content into a single Markdown file.
Repository Formatter
A command-line tool to format repository content into a single Markdown file (repository.md). It includes options for filtering, anonymization, and different processing modes.
- There are many other options for this.
Features
- Generates a Markdown file with repository structure and file contents.
- Filters files/directories based on paths and extensions via a config file.
- Anonymizes specified strings in file paths and content.
- Supports different modes:
  - normal: Process the entire repository (respecting filters).
  - class: Include only files containing a specific class name.
  - patch: Include a git diff instead of file contents.
- Estimates the token count of the generated Markdown using tiktoken.
- Configurable via a .repo_formatter.yaml file.
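To illustrate class mode, here is a minimal sketch of how files mentioning a class name could be selected. The helper name `files_mentioning` and the whole-word matching rule are assumptions made for illustration; the tool's actual matching logic may differ.

```python
import re
from pathlib import Path


def files_mentioning(root: Path, class_name: str) -> list[Path]:
    """Return files under root whose text contains class_name as a whole word.

    Hypothetical helper: whole-word matching is an assumption, not the
    tool's documented behaviour.
    """
    pattern = re.compile(rf"\b{re.escape(class_name)}\b")
    matches = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        if pattern.search(text):
            matches.append(path)
    return matches
```

Calling `files_mentioning(Path("."), "UserManager")` would list the files that a class-mode run might keep under this assumption.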
Installation
From PyPI (Recommended):
pip install repo-formatter
From Source (for Development):
- Clone the repository:
git clone https://github.com/your-username/repo-formatter.git # <-- UPDATE URL
cd repo-formatter
- Install in editable mode (includes development dependencies):
pip install -e ".[dev]"
Usage
repo-formatter [OPTIONS] [DIRECTORY]
Arguments:
DIRECTORY: The path to the repository/directory to process (default: current directory).
Options:
- -m MODE, --mode MODE: Processing mode (normal, class, patch). Default: normal.
- --class-name NAME: Required for class mode. The name to search for.
- --diff-target TARGET: Required for patch mode. Use current for uncommitted changes, or a git ref/range (e.g., main, HEAD~2, v1.0..v1.1).
- -c PATH, --config PATH: Path to the YAML configuration file. If not provided, searches for .repo_formatter.yaml in the target directory and its parents.
- -a, --anonymize: Enable anonymization using rules from the config file.
- -o FILENAME, --output FILENAME: Output Markdown filename (default: repository.md, or mode-specific names like diff_....md, class_....md).
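The config lookup described for -c (target directory first, then its parents) can be sketched as follows. The function name `find_config` is hypothetical, and the sketch assumes the search stops at the first match; the tool's real implementation may behave differently.

```python
from pathlib import Path
from typing import Optional

CONFIG_NAME = ".repo_formatter.yaml"


def find_config(start: Path) -> Optional[Path]:
    """Look for CONFIG_NAME in start, then in each parent directory."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / CONFIG_NAME
        if candidate.is_file():
            return candidate  # assumption: the nearest config wins
    return None  # reached the filesystem root without finding a config
```

With this approach, running the tool from a subdirectory would still pick up a config placed at the repository root.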
Examples:
# Process the current directory with default settings
repo-formatter
# Process a specific directory
repo-formatter ../my-other-project
# Use a specific config file and enable anonymization
repo-formatter -c /path/to/my_config.yaml -a .
# Find all files containing "UserManager"
repo-formatter --mode class --class-name UserManager
# Get uncommitted changes as a patch file (diff_current.md)
repo-formatter --mode patch --diff-target current
# Get the diff between 'develop' branch and 'main' branch (diff_develop..main.md)
repo-formatter --mode patch --diff-target develop..main
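As a rough illustration of patch mode, the sketch below maps a --diff-target value to a git command and to the output names shown in the examples. Both function names are hypothetical, and treating current as `git diff HEAD` (staged plus unstaged changes) is an assumption, not confirmed behaviour of the tool.

```python
def build_diff_command(diff_target: str) -> list[str]:
    """Map a --diff-target value to a git command line (assumed mapping)."""
    if diff_target == "current":
        # Assumption: "uncommitted changes" means staged plus unstaged changes.
        return ["git", "diff", "HEAD"]
    # Any other value is passed through as a git ref or range.
    return ["git", "diff", diff_target]


def default_output_name(diff_target: str) -> str:
    """Mirror the output names shown in the examples, e.g. diff_current.md."""
    return f"diff_{diff_target}.md"
```

The resulting list could be passed to `subprocess.run(..., capture_output=True, text=True, check=True)` with `cwd` set to the repository directory to capture the diff text.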
Configuration (.repo_formatter.yaml)
Create a .repo_formatter.yaml file in the root of your repository (or specify with -c).
# Paths to exclude, relative to the repository root.
# This matches the full path, so 'data' excludes the 'data' directory at the root,
# and 'app/logs' excludes 'logs' inside 'app'.
exclude_paths:
- .git
- .vscode
- node_modules
- build
- dist
- venv
- __pycache__
- specific_file_to_ignore.log
- app/content # Excludes the 'content' directory inside 'app'
# Force the inclusion of specific files or directories, even if they are in an excluded path.
# This is useful for including a specific file from an otherwise excluded directory.
force_include:
- docs/IMPORTANT.md # Include this file even if 'docs' is in exclude_paths.
# List of file extensions to include (lowercase, including the dot).
# If empty or omitted, all extensions (not excluded by path) are included.
include_extensions:
- .py
- .js
- .html
- .css
- .md
# Dictionary of strings to anonymize (case-insensitive keys).
# The replacement value's case will try to mimic the original match.
anonymize:
"CompanyName": "ClientProject"
"internal_api_key": "REDACTED_KEY"
"ProjectX": "CodenameZephyr"
Development (using Devcontainer)
- Make sure you have Docker and the VS Code "Dev Containers" extension installed.
- Open the repo-formatter folder in VS Code.
- When prompted, click "Reopen in Container".
- VS Code will build the container and install dependencies.
- You can now run/debug the tool within the isolated container environment. The terminal in VS Code will be inside the container.
# Inside the devcontainer terminal
repo-formatter --help
repo-formatter sample_project # Test on the sample project
repo-formatter sample_project -a # Test anonymization
# Initialize git in sample_project to test patch mode
cd sample_project
git init
git add .
git commit -m "Initial commit"
echo "// New comment" >> src/main.cpp
cd ..
repo-formatter sample_project --mode patch --diff-target current
Token Estimation
The tool uses the tiktoken library to provide accurate token counts for OpenAI models. The library is installed automatically as a dependency.
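For reference, counting tokens with tiktoken might look like the sketch below. The function name `estimate_tokens`, the choice of the `cl100k_base` encoding, and the character-based fallback are assumptions for illustration; the encoding the tool actually uses is not stated here.

```python
def estimate_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Count tokens with tiktoken, or fall back to a rough estimate."""
    try:
        import tiktoken

        encoding = tiktoken.get_encoding(encoding_name)
        # disallowed_special=() treats text such as "<|endoftext|>" as plain text
        # instead of raising an error when it appears in repository content.
        return len(encoding.encode(text, disallowed_special=()))
    except Exception:
        # Fallback (assumption): roughly four characters per token, used when
        # tiktoken is missing or its encoding data cannot be loaded.
        return (len(text) + 3) // 4
```

Token counts depend on the encoding, so a count produced with one encoding is only an approximation for models that use another.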