Consolidate your codebase into structured Markdown context for LLMs (Claude, GPT, Gemini).
🏛️ Code Assembler Pro
Turn your codebase into structured, LLM-ready context—and rebuild it from AI suggestions.
Code Assembler Pro is a high-grade engineering utility designed to bridge the gap between your source code and Large Language Models (Claude, GPT-4o, Gemini, DeepSeek).
It doesn't just concatenate files; it generates a contextual technical document optimized for LLM ingestion, and provides a reliable rebuild engine to reconstruct projects from AI-modified Markdown files.
🎯 Why Code Assembler Pro?
Copy-pasting raw files into a chat window leads to context loss. Code Assembler Pro solves this by:
- 🗺️ Project Mapping: Automatically generates a clickable Table of Contents and architectural overview.
- ♻️ Bidirectional Workflow: Use `--rebuild` to turn an AI's Markdown response back into a physical directory structure.
- ⏱️ Token Efficiency: Use `--since` (Delta Mode) to send only modified files, saving thousands of tokens.
- ✂️ Smart Compression: Use `--compress` to reduce a dependency's code to signatures + docstrings only, dramatically shrinking token count while preserving full structural context.
- 🛡️ Metadata Manifest: Injects a hidden JSON manifest for 100% reliable project reconstruction and change tracking.
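One way a hidden manifest like the one mentioned above could travel inside a Markdown file is an HTML comment, which Markdown renderers do not display. The sketch below is illustrative only: the marker name, the manifest fields, and both function names are assumptions, not the format Code Assembler Pro actually writes.

```python
import json
import re

MARKER = "code-assembler-manifest"  # hypothetical marker, not the tool's real one


def embed_manifest(markdown: str, manifest: dict) -> str:
    """Append the manifest as an HTML comment so renderers keep it invisible."""
    payload = json.dumps(manifest, sort_keys=True)
    return f"{markdown.rstrip()}\n\n<!-- {MARKER} {payload} -->\n"


def extract_manifest(markdown: str):
    """Return the embedded manifest as a dict, or None if no comment is found."""
    match = re.search(rf"<!-- {re.escape(MARKER)} (.*?) -->", markdown, re.DOTALL)
    return json.loads(match.group(1)) if match else None
```

A payload whose strings contain ` -->` would end the comment early, so a real implementation would need to escape or encode the JSON.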
✨ Key Features
- ♻️ Rebuild Mode (`--rebuild`): Reconstruct an entire project from a Markdown snapshot. Perfect for applying AI-generated refactors instantly.
- ⏱️ Delta Mode (`--since`): Generate updates containing only files modified, added, or deleted since a previous assembly.
- 🗜️ Compression Mode (`--compress`): Reduce source files to structural skeletons (signatures and docstrings only). Python always works out of the box; other languages use individually installed tree-sitter packages.
- 📋 Clipboard Integration (`--clip`): Direct copy to system clipboard for instant ingestion into LLMs.
- 🧠 Architecture Analysis: Detects design patterns (MVC, API, Testing) and provides file distribution stats.
- 📊 Token Metrics: Real-time estimation of token count to stay within model context windows.
- 📝 Enhanced Syntax Highlighting: Support for 50+ extensions including Jinja2, Terraform, and smart detection for `Dockerfile`, `Makefile`, and `.env`.
- 🖥️ Cross-Platform: Native support for Windows, macOS, and Linux with automatic emoji/ASCII adaptation.
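A token estimate like the one listed above can be approximated without loading a tokenizer. This is a minimal sketch that assumes the common rule of thumb of roughly four characters per token; the ratio and the function names are illustrative and are not the tool's actual estimator.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the 4-characters-per-token ratio is a heuristic."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


def fits_in_window(text: str, context_window: int, reserve: int = 0) -> bool:
    """True if the estimate plus a reserved reply budget fits the context window."""
    return estimate_tokens(text) + reserve <= context_window
```

Real tokenizers differ by model and by language, so an estimate of this kind is best treated as a budget check rather than an exact count.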
🚀 Installation
Standard install (no compression)
pip install code-assembler-pro
With compression support
Python files are always supported via stdlib ast — no extra install needed.
For other languages, install the corresponding extra:
# JavaScript + TypeScript
pip install "code-assembler-pro[compress-web]"
# Rust + Go + C + C++
pip install "code-assembler-pro[compress-systems]"
# A single language
pip install "code-assembler-pro[compress-js]"
pip install "code-assembler-pro[compress-rust]"
# Everything
pip install "code-assembler-pro[compress-all]"
From source (development)
git clone https://github.com/xmehaut/code-assembler-pro.git
cd code-assembler-pro
pip install -e .
💻 Quick Start (CLI)
1. Assemble & Copy (The "One-Shot" Workflow)
Consolidate your code and copy it directly to your clipboard:
code-assembler . --ext py md --clip
2. Iterative Update (The "Token-Saver" Workflow)
Only send what changed since your last assembly:
code-assembler . --ext py --since codebase.md --clip
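Conceptually, Delta Mode boils down to comparing two file inventories. The sketch below assumes each snapshot can be reduced to a mapping of relative path to content hash; `hash_tree` and `diff_snapshots` are hypothetical names, not part of the package's API.

```python
import hashlib
from pathlib import Path


def hash_tree(root: Path, extensions: set[str]) -> dict[str, str]:
    """Map each matching file (relative POSIX path) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in extensions:
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_snapshots(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths as added, deleted, or modified between two inventories."""
    shared = current.keys() & previous.keys()
    return {
        "added": sorted(current.keys() - previous.keys()),
        "deleted": sorted(previous.keys() - current.keys()),
        "modified": sorted(p for p in shared if current[p] != previous[p]),
    }
```

Only the paths listed under `added` and `modified` would need their content re-sent; `deleted` paths only need to be named.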
3. Rebuild from AI (The "Round-Trip" Workflow)
Restore a project from a Markdown file (e.g., after an AI refactor):
code-assembler --rebuild refactored_codebase.md --output-dir ./restored_project
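To make the round trip concrete, here is a minimal sketch of how a rebuild step could work. It assumes a hypothetical snapshot layout in which each file appears as a `### relative/path` heading followed by one fenced block; the real snapshot format and the tool's own parser may differ.

```python
import re
from pathlib import Path

FENCE = "`" * 3  # built indirectly so this example can itself sit inside a fence

# Assumed layout (hypothetical): "### <relative/path>", then one fenced block.
FILE_BLOCK = re.compile(
    r"^### (?P<path>[^\n]+)\n+" + FENCE + r"[^\n]*\n(?P<body>.*?)\n" + FENCE + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def parse_snapshot(markdown: str) -> dict[str, str]:
    """Return {relative_path: file_content} for every heading + fence pair."""
    return {m["path"].strip(): m["body"] + "\n" for m in FILE_BLOCK.finditer(markdown)}


def rebuild(markdown: str, output_dir: Path, dry_run: bool = False) -> list[Path]:
    """Write each parsed file under output_dir; refuse paths that escape it."""
    root = output_dir.resolve()
    targets = []
    for rel_path, content in parse_snapshot(markdown).items():
        target = (root / rel_path).resolve()
        if not target.is_relative_to(root):
            raise ValueError(f"unsafe path in snapshot: {rel_path}")
        if not dry_run:
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content, encoding="utf-8")
        targets.append(target)
    return targets
```

The path check matters because the Markdown comes from an AI response and should be treated as untrusted input. This sketch does not handle files whose content itself contains a fence line.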
4. Compress a Dependency (The "Skeleton" Workflow)
Generate a lightweight snapshot of a third-party package — full structure, minimal tokens:
# Your own code — full detail
code-assembler src/ --ext py --output my_package.md
# A dependency — signatures + docstrings only
code-assembler .venv/lib/some_dep/ --ext py --compress --output dep_skeleton.md
📖 CLI Options Reference
| Option | Description |
|---|---|
| `paths` | Files or directories to analyze |
| `--ext` / `-e` | Extensions and filenames to include (e.g., `py md Dockerfile`) |
| `--output` / `-o` | Output file name (default: `codebase.md`) |
| `--since` / `-s` | Delta Mode: Only include changes since this snapshot |
| `--rebuild` | Reconstruct project from a Markdown file |
| `--output-dir` | Target directory for reconstruction |
| `--clip` / `-k` | Copy result directly to clipboard |
| `--dry-run` | Preview rebuild without writing files |
| `--compress` / `-z` | (v4.5) Compress to signatures + docstrings only |
| `--compress-level` | (v4.5) `signatures` (default) or `docstrings_only` |
| `--interactive` / `-i` | Launch the interactive wizard |
| `--config` / `-c` | Load a JSON configuration file |
| `--exclude` / `-x` | Patterns to exclude (added to defaults) |
| `--max-size` | Maximum file size in MB (default: 10.0) |
| `--version` | Show version and exit |
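The `--exclude` option adds patterns to a built-in list. A minimal sketch of that kind of filtering, assuming patterns are matched against each path component with shell-style globs; the default list and the function name here are illustrative guesses, not the tool's actual defaults.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

DEFAULT_EXCLUDES = ["__pycache__", ".git", "node_modules"]  # assumed defaults


def is_excluded(path: str, extra_patterns=()) -> bool:
    """True if any component of the path matches a default or extra pattern."""
    patterns = [*DEFAULT_EXCLUDES, *extra_patterns]
    parts = PurePosixPath(path).parts
    return any(fnmatchcase(part, pattern) for part in parts for pattern in patterns)
```

Matching per component lets a bare directory name such as `migrations` and a file glob such as `*.test.ts` share one pattern list.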
🗜️ Compression Mode — How It Works
--compress reduces each file to its structural skeleton. The goal is to give an LLM
full context about a codebase's shape and API surface without the implementation noise.
# Original (full file): ~80 tokens
def connect(host: str, port: int, timeout: float = 30.0) -> Connection:
    """Establish a TCP connection to the server."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    sock.connect((host, port))
    return Connection(sock)

# Compressed: ~15 tokens
def connect(host: str, port: int, timeout: float = 30.0) -> Connection:
    """Establish a TCP connection to the server."""
    ...
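For Python, a transformation of this shape can be written with the standard library alone. The following is a minimal sketch of the idea using `ast` (it needs Python 3.9+ for `ast.unparse`); it is not the package's implementation, and `compress_source` is a hypothetical name.

```python
import ast


def compress_source(source: str) -> str:
    """Keep signatures and docstrings; replace every function body with `...`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node, clean=False)
            body = []
            if doc is not None:
                # Re-attach the original docstring as the first statement.
                body.append(ast.Expr(value=ast.Constant(value=doc)))
            # An Ellipsis expression stands in for the removed implementation.
            body.append(ast.Expr(value=ast.Constant(value=...)))
            node.body = body
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

Because the output is regenerated from the syntax tree, comments are dropped and formatting is normalized. Class bodies keep their attributes and method signatures, while nested functions disappear together with the body that contained them.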
Language support:
| Language | Requirement |
|---|---|
| Python | ✅ Always available (stdlib ast) |
| JavaScript / JSX | pip install "code-assembler-pro[compress-js]" |
| TypeScript / TSX | pip install "code-assembler-pro[compress-ts]" |
| Rust | pip install "code-assembler-pro[compress-rust]" |
| Go | pip install "code-assembler-pro[compress-go]" |
| Java | pip install "code-assembler-pro[compress-java]" |
| C | pip install "code-assembler-pro[compress-c]" |
| C++ | pip install "code-assembler-pro[compress-cpp]" |
Missing parsers are reported at startup with the exact install command, and files that cannot be compressed are passed through unchanged.
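A startup check of this kind can be done without importing anything. The sketch below uses `importlib.util.find_spec`; the extra names come from the table above, while the module names (the standalone tree-sitter grammar packages) and the function name are assumptions about how such a check might look.

```python
from importlib.util import find_spec

# language -> (importable module, pip extra); module names are assumed.
PARSERS = {
    "javascript": ("tree_sitter_javascript", "compress-js"),
    "typescript": ("tree_sitter_typescript", "compress-ts"),
    "rust": ("tree_sitter_rust", "compress-rust"),
    "go": ("tree_sitter_go", "compress-go"),
}


def missing_parsers(languages, registry=PARSERS) -> dict[str, str]:
    """Return {language: install command} for every parser that is not importable."""
    hints = {}
    for lang in languages:
        module, extra = registry[lang]
        if find_spec(module) is None:
            hints[lang] = f'pip install "code-assembler-pro[{extra}]"'
    return hints
```

Languages that show up in the returned mapping would then be left uncompressed.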
🔌 Programmatic API
Code Assembler Pro can be integrated into your Python pipelines (CI/CD, custom AI agents).
Basic Assembly
from code_assembler import assemble_codebase
markdown = assemble_codebase(
    paths=["./src"],
    extensions=[".py", ".js"],
    output="context.md"
)
Compressed snapshot of a dependency
markdown = assemble_codebase(
    paths=[".venv/lib/requests"],
    extensions=[".py"],
    output="requests_skeleton.md",
    compress=True,
)
Incremental Update (Delta Mode)
assemble_codebase(
    paths=["./src"],
    extensions=[".py"],
    since="previous_snapshot.md",
    output="delta_update.md"
)
Project Reconstruction
from code_assembler.rebuilder import CodebaseRebuilder
rebuilder = CodebaseRebuilder("ai_response.md", "./new_src")
rebuilder.rebuild()
⚙️ Advanced Configuration (JSON)
For complex projects, use a JSON configuration file:
{
  "paths": ["./src", "./infra"],
  "extensions": [".py", ".ts", ".j2", "Dockerfile", ".env"],
  "exclude_patterns": ["migrations", "__pycache__", "*.test.ts"],
  "output": "project_context.md",
  "recursive": true,
  "include_readmes": true,
  "max_file_size_mb": 2.0,
  "truncate_large_files": true,
  "truncation_limit_lines": 500,
  "compress": false,
  "compress_level": "signatures"
}
Run it using: code-assembler --config assembler_config.json
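To show how settings such as `truncate_large_files` and `truncation_limit_lines` might interact, here is a minimal sketch that merges a config over defaults and truncates long content. The `output` and `max_file_size_mb` defaults are the ones stated in the CLI reference; every other default, the truncation marker, and both function names are assumptions.

```python
import json

DEFAULTS = {
    "output": "codebase.md",        # stated in the CLI reference
    "max_file_size_mb": 10.0,       # stated in the CLI reference
    "recursive": True,              # assumed
    "truncate_large_files": False,  # assumed
    "truncation_limit_lines": 500,  # assumed
}


def load_config(text: str) -> dict:
    """Parse JSON config text and fill in defaults for missing keys."""
    return {**DEFAULTS, **json.loads(text)}


def truncate(content: str, limit: int) -> str:
    """Keep the first `limit` lines and append a marker noting what was cut."""
    lines = content.splitlines()
    if len(lines) <= limit:
        return content
    kept = lines[:limit]
    kept.append(f"[truncated: {len(lines) - limit} more lines]")
    return "\n".join(kept) + "\n"
```

Leaving a visible marker tells the LLM that the file continues, which helps prevent it from treating the truncated view as the complete source.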
💡 Recommended Use Cases
1. Massive Refactoring Loop
- Assemble your project: `code-assembler . -e py --clip`
- Paste into Claude: "Refactor this project to use Pydantic v2."
- Save Claude's response as `refactor.md`.
- Apply changes: `code-assembler --rebuild refactor.md --output-dir .`
2. Dependency Context (new in v4.5)
Give the AI full structural context of a library without burning your token budget:
code-assembler .venv/lib/pydantic/ -e py --compress --output pydantic_api.md
3. Incremental Debugging
After fixing a bug, send only the delta to the AI to verify the fix without re-sending the whole codebase:
code-assembler . -e py --since previous_snapshot.md --clip
4. Infrastructure Audit
Include Dockerfile, Makefile, and .tf files to give the AI a full view of your deployment stack.
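Because files such as `Dockerfile` and `Makefile` have no extension, a selector list has to match on either the extension or the exact file name. A minimal sketch of that rule, assuming case-insensitive matching and optional leading dots; `matches` is a hypothetical helper, not the tool's own logic.

```python
from pathlib import PurePosixPath


def matches(path: str, selectors: list[str]) -> bool:
    """True if the file's extension or its exact name appears in selectors."""
    p = PurePosixPath(path)
    wanted = {s.lstrip(".").lower() for s in selectors if s.strip(".")}
    ext = p.suffix.lstrip(".").lower()
    name = p.name.lstrip(".").lower()
    return (ext != "" and ext in wanted) or name in wanted


infra_files = [
    f for f in ["Dockerfile", "Makefile", "infra/main.tf", "src/app.py", ".env"]
    if matches(f, ["tf", "Dockerfile", "Makefile", ".env"])
]
```

Here `infra_files` keeps everything except `src/app.py`.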
🤝 Contributing
Contributions are welcome!
- Fork the Project.
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`).
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`).
- Push to the Branch.
- Open a Pull Request.
📄 License
Distributed under the MIT License. See LICENSE for more information.
Code Assembler Pro — Give your AI the context it deserves, then take the code back. 🚀