LLM-optimized codebase snapshot generator

mdrepoatlas: Codebase to Markdown (LLM-Optimized Snapshot Generator)

mdrepoatlas converts a software project into a single structured Markdown document (code_base.md) designed for Large Language Models to navigate efficiently.

Instead of uploading repositories, zipping folders, or pasting fragments, mdrepoatlas produces a deterministic, navigable, AI-ready snapshot of your codebase.

Why mdrepoatlas Exists

LLMs do not understand repositories.

They understand documents.

Traditional repo exports create problems:

❌ Too many irrelevant files (node_modules, binaries)
❌ No navigation structure
❌ Context fragmentation
❌ Token waste
❌ Hard for LLMs to reason globally

mdrepoatlas solves this by generating a single authoritative document:


code_base.md

containing:

✅ Metadata header
✅ Project fingerprint detection
✅ Directory tree
✅ Language-grouped index
✅ Deterministic file ordering
✅ Binary/build exclusion
✅ Size-safe embedding
✅ Stable navigation anchors

The result is a document an LLM can read, internalize, and navigate efficiently.

Example Output


code_base.md
├── Metadata Header
├── Project Navigation Guide
├── Directory Structure
├── File Index (grouped by language)
└── Full Source Files
    └── ### FILE: src/main.py (...)
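
As an illustration of what a language-grouped file index might involve, here is a minimal sketch. The extension table and the function name `group_by_language` are hypothetical and are not taken from mdrepoatlas itself; the tool's real detection logic may differ.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List

# Hypothetical extension table for illustration only.
EXTENSION_LANGUAGES = {
    ".py": "Python",
    ".js": "JavaScript",
    ".ts": "TypeScript",
    ".c": "C",
    ".rs": "Rust",
    ".go": "Go",
    ".f90": "Fortran",
}


def group_by_language(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group POSIX-style relative paths by language, in sorted order."""
    groups = defaultdict(list)
    # Sorting first keeps the order of files within each group deterministic.
    for path in sorted(paths):
        suffix = PurePosixPath(path).suffix.lower()
        groups[EXTENSION_LANGUAGES.get(suffix, "Other")].append(path)
    # Emit languages in sorted order so repeated runs produce the same index.
    return {language: groups[language] for language in sorted(groups)}
```

Sorting both the paths and the language names is one simple way to get the same index on every run.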

Supported Projects

mdrepoatlas is framework-agnostic.

Works with:

Python / Django / FastAPI
React / Next.js / Node
C / C++
Fortran
Rust / Go
Mixed monorepos
Research repositories
Scientific computing projects
Enterprise platforms

Installation

Clone:

git clone https://github.com/DavidoffichW/mdrepoatlas.git
cd mdrepoatlas

Install (editable)

pip install -e .

Usage

Interactive mode:

mdrepoatlas

Non-interactive:

mdrepoatlas /path/to/repo -t /path/to/output -o code_base.md

Exclude patterns (comma-separated; supports globs):

mdrepoatlas /path/to/repo -x "node_modules/**,dist/**,*.pdf"
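
To make the pattern semantics concrete, the sketch below shows one plausible way a comma-separated glob list could be matched against relative paths. It is an assumption about the behavior, not the tool's actual implementation; `parse_patterns` and `is_excluded` are hypothetical names.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import List


def parse_patterns(raw: str) -> List[str]:
    """Split a comma-separated pattern string, dropping blank entries."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def is_excluded(rel_path: str, patterns: List[str]) -> bool:
    """Return True if a POSIX-style relative path matches any pattern."""
    name = PurePosixPath(rel_path).name
    for pattern in patterns:
        if pattern.endswith("/**"):
            # Assumed meaning of "dir/**": everything under that directory.
            prefix = pattern[:-3]
            if rel_path == prefix or rel_path.startswith(prefix + "/"):
                return True
        elif fnmatchcase(name, pattern) or fnmatchcase(rel_path, pattern):
            # Plain globs are tried against the file name and the full path.
            return True
    return False
```

With the patterns from the command above, `node_modules/react/index.js` and `docs/manual.pdf` would be excluded, while `src/main.py` would be kept.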

Disable default exclusions:

mdrepoatlas /path/to/repo --no-default-excludes

Size limits:

mdrepoatlas /path/to/repo --max-file-bytes 1048576 --max-total-bytes 0
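
The following sketch shows how a per-file limit and a total budget could interact. It assumes that a limit of 0 means "no limit", which the example above suggests but this README does not state; `select_files` is a hypothetical helper, not part of mdrepoatlas.

```python
from typing import List, Tuple


def select_files(
    files: List[Tuple[str, int]],
    max_file_bytes: int,
    max_total_bytes: int,
) -> Tuple[List[str], List[str]]:
    """Split (path, size) pairs into embedded and skipped paths.

    Assumption: a limit of 0 disables that limit.
    """
    embedded: List[str] = []
    skipped: List[str] = []
    total = 0
    for path, size in files:
        too_big = max_file_bytes > 0 and size > max_file_bytes
        over_budget = max_total_bytes > 0 and total + size > max_total_bytes
        if too_big or over_budget:
            skipped.append(path)
        else:
            embedded.append(path)
            total += size
    return embedded, skipped
```

Under these assumptions, a 2 MB file is skipped with `--max-file-bytes 1048576`, while smaller files are still embedded.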

No dependencies required.

Python 3.8+ recommended.

In interactive mode, you will be prompted for:

Prompt	Description
Source directory	Project root
Target directory	Output location
Excludes	Optional glob patterns
Default excludes	Skip builds/binaries
Size limits	Prevent huge files

Example:

Source directory:
~/projects/my_app

Target directory:
~/exports

Exclude:
docs/build/**, *.csv

Output:

exports/code_base.md

Default Smart Exclusions

Automatically removes noise:

.git/
node_modules/
virtual environments
build artifacts
binaries
media files
compiled objects
caches

The LLM receives signal only.
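
A directory walk that prunes noisy folders and visits files in a stable order could look roughly like this. The directory set below is illustrative, not the tool's actual default list, and `iter_source_files` is a hypothetical name.

```python
import os
from typing import Iterator

# Illustrative set only; the real defaults in mdrepoatlas may differ.
EXCLUDED_DIRS = {".git", "node_modules", ".venv", "venv", "__pycache__", "build", "dist"}


def iter_source_files(root: str) -> Iterator[str]:
    """Yield POSIX-style relative paths, skipping excluded directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into
        # excluded directories; sorting makes the traversal deterministic.
        dirnames[:] = sorted(d for d in dirnames if d not in EXCLUDED_DIRS)
        for name in sorted(filenames):
            full_path = os.path.join(dirpath, name)
            yield os.path.relpath(full_path, root).replace(os.sep, "/")
```

Pruning at the directory level means excluded trees such as `node_modules/` are never opened, which keeps the walk fast on large repositories.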

Why This Works Well For LLMs

The generated document teaches the model how to read it.

Key design principles:

1. Deterministic Structure

Every file appears as:

### FILE: path/to/file.py (metadata)

LLMs can jump directly to any file by searching for its header.
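
The same header convention also makes the snapshot easy to index with ordinary tooling. The sketch below assumes headers look exactly like the line shown above, with an optional parenthesized metadata suffix; `index_snapshot` is a hypothetical helper, not part of mdrepoatlas.

```python
import re
from typing import Dict

# Matches "### FILE: <path>" with an optional trailing "(metadata)" part.
FILE_HEADER = re.compile(r"^### FILE: (.+?)(?: \((.*)\))?\s*$")


def index_snapshot(text: str) -> Dict[str, int]:
    """Map each file path to the 1-based line number of its header."""
    index: Dict[str, int] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = FILE_HEADER.match(line)
        if match:
            # Keep the first occurrence if a path appears more than once.
            index.setdefault(match.group(1), lineno)
    return index
```

Note that an embedded file whose own content contains a line starting with `### FILE:` would also match, so a real parser would need to account for that.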

2. Navigation Before Content

Models first learn:

project structure
entrypoints
languages
priorities

before reading implementation.

3. Context Efficiency

Instead of making the model scan thousands of irrelevant files:

binaries are omitted
minified bundles are skipped
oversized files are summarized

Example Prompt for ChatGPT / Claude

After generating code_base.md, upload it and start with:

🔹 Recommended Initialization Prompt

You are now analyzing a full project snapshot.

The uploaded file `code_base.md` is an authoritative
LLM-optimized export of the repository.

Instructions:
1. Read the metadata header first.
2. Use the Directory Structure and Index sections to build a mental map.
3. Treat each "### FILE:" section as an independent module.
4. Do NOT assume missing files exist outside the snapshot.
5. Prefer entrypoints and core modules when reasoning.

First task:
Summarize the system architecture and identify primary subsystems.

🔹 Example Follow-up Prompts

Architecture understanding:

Explain the project architecture using only the snapshot.

Refactoring:

Identify architectural weaknesses and propose improvements.

Bug investigation:

Search for potential concurrency or state-management issues.

Feature design:

Design a new feature consistent with existing patterns.

Recommended LLM Workflow

Run mdrepoatlas
Upload code_base.md
Initialize the model using the prompt above
Work normally

You now have full-repo reasoning.

Design Philosophy

mdrepoatlas treats an LLM as:

a deterministic reader of structured technical documents.

The goal is not compression.

The goal is cognitive alignment between repository and model.

Comparison

Method	Result
Upload repo	❌ inconsistent
Paste files	❌ fragmented
Zip archive	❌ opaque
`mdrepoatlas`	✅ structured understanding

Roadmap

Planned improvements:

pip-installable CLI
gitignore parsing
incremental snapshots
diff snapshots
multi-document mode
token estimation
IDE integration
local LLM pipeline support

Contributing

PRs welcome.

Good areas:

language detection
ordering heuristics
performance
additional exclusions
LLM workflow research

License

MIT License.

Author

Created to bridge software engineering and AI reasoning workflows.

Project details

This version

0.1.2

Mar 3, 2026

Download files

Source Distribution

mdrepoatlas-0.1.2.tar.gz (14.5 kB view details)

Uploaded Mar 3, 2026 Source

Built Distribution

mdrepoatlas-0.1.2-py3-none-any.whl (12.2 kB view details)

Uploaded Mar 3, 2026 Python 3

File details

Details for the file mdrepoatlas-0.1.2.tar.gz.

File metadata

Download URL: mdrepoatlas-0.1.2.tar.gz
Upload date: Mar 3, 2026
Size: 14.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.2

File hashes

Hashes for mdrepoatlas-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`b04a7204dd4fd40ab5707f5e3784921bcb500b09b9c9c94a011537a178a83bd4`
MD5	`4aa9e056fd4a853086f184d6f4108ab5`
BLAKE2b-256	`51bbd293aab32ffe4611c508379ea70e240e448c5d64b014444c9adba050b356`

File details

Details for the file mdrepoatlas-0.1.2-py3-none-any.whl.

File metadata

Download URL: mdrepoatlas-0.1.2-py3-none-any.whl
Upload date: Mar 3, 2026
Size: 12.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.2

File hashes

Hashes for mdrepoatlas-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fb401f5ef536935e8b5a1c5c2657e87a4be67b11e20307684d37d3203dbba0a5`
MD5	`e89480f5cf93b3820b362d421296c7c2`
BLAKE2b-256	`2a93101b62087470ca4f3bba163cc91b1e920bbb37b22f1ca077dc7fd4110745`
