
A tool to generate comprehensive Markdown artifacts of directory structures and file contents


codemapper


Overview

CodeMapper is a Python script that generates a single Markdown document representing the structure and contents of a directory or GitHub repository. It gives developers, AI systems, and analysts a thorough overview of a codebase so they can quickly understand a project's layout and content.

See audio explainers for this project:

  • podcasts: auto-generated by Gemini (NotebookLM)

Features

  • Generates a hierarchical table of contents based on file structure
  • Creates an accurate file tree representation of the directory structure
  • Produces code blocks for each file's contents with appropriate syntax highlighting
  • Respects .gitignore rules when processing files and directories
  • Excludes .git directories by default
  • Supports various file types with appropriate code fence highlighting
  • Handles file encoding detection for accurate content reading
  • Provides an option to include files normally ignored by .gitignore
  • Can clone and analyze GitHub repositories
  • Saves output in a '_mapped' directory
  • Automatically acknowledges large and binary files without printing their contents
  • Displays file type and size information for large and binary files
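The large and binary file handling listed above could be approximated as follows. This is a minimal sketch, not CodeMapper's actual implementation: the 1 MB threshold, the NUL-byte heuristic, the notice wording, and the names `is_binary`, `fence_language`, `describe_skipped_file`, and `FENCE_LANGUAGES` are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional

# Hypothetical size limit; the tool's real threshold is not stated in this README.
LARGE_FILE_BYTES = 1_000_000

# Hypothetical extension-to-fence mapping used for syntax highlighting.
FENCE_LANGUAGES = {".py": "python", ".md": "markdown", ".yml": "yaml", ".json": "json"}


def is_binary(path: Path, sample_size: int = 8192) -> bool:
    """Heuristic: treat a file as binary if its first bytes contain a NUL."""
    with path.open("rb") as handle:
        return b"\x00" in handle.read(sample_size)


def fence_language(path: Path) -> str:
    """Return the code fence label for a file, or an empty string if unknown."""
    return FENCE_LANGUAGES.get(path.suffix.lower(), "")


def describe_skipped_file(path: Path) -> Optional[str]:
    """Return a one-line notice for large or binary files, else None."""
    size = path.stat().st_size
    kind = path.suffix or "no extension"
    if is_binary(path):
        return f"[Binary file: {kind}, {size} bytes]"
    if size > LARGE_FILE_BYTES:
        return f"[Large file: {kind}, {size} bytes]"
    return None
```

A caller would print the notice when `describe_skipped_file` returns a string and otherwise emit the file's contents inside a fence labelled with `fence_language(path)`.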

Requirements

  • Python 3.6+
  • pathspec library (for handling .gitignore rules)
  • chardet library (for file encoding detection)

Installation

From PyPI

You can install CodeMapper directly from PyPI using pip:

pip install codemapper

From Source

  1. Clone this repository:

    git clone https://github.com/yourusername/codemapper.git
    
  2. Install the required dependencies:

    pip install pathspec chardet
    

Usage

Run the script from the command line, providing the path to the directory or GitHub repository URL you want to analyze:

codemapper <path_to_directory_or_github_url> [--include-ignored]

Options

  • <path_to_directory_or_github_url>: The path to the directory or GitHub repository URL you want to analyze (required)
  • --include-ignored: Include files that are normally ignored by .gitignore (optional)
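For readers who want to see how a command line of this shape is typically parsed, here is a minimal sketch using the standard library's `argparse`. It mirrors the one positional argument and the one flag documented above; the function names `build_parser` and `is_github_url`, the help strings, and the URL prefixes checked are assumptions, not taken from CodeMapper's source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching: codemapper <path_or_url> [--include-ignored]."""
    parser = argparse.ArgumentParser(
        prog="codemapper",
        description="Generate a Markdown map of a directory or GitHub repository.",
    )
    parser.add_argument(
        "path",
        metavar="path_to_directory_or_github_url",
        help="local directory or GitHub repository URL to analyze",
    )
    parser.add_argument(
        "--include-ignored",
        action="store_true",
        help="include files normally ignored by .gitignore",
    )
    return parser


def is_github_url(target: str) -> bool:
    """Assumed check for deciding whether to clone before mapping."""
    return target.startswith(("https://github.com/", "git@github.com:"))


if __name__ == "__main__":
    args = build_parser().parse_args()
    source = "GitHub repository" if is_github_url(args.path) else "local directory"
    print(f"Mapping {source}: {args.path} (include ignored: {args.include_ignored})")
```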

Output

The script generates a Markdown file named <directory_name>codemap.md in the '_mapped' directory. This file contains:

  1. A table of contents for easy navigation
  2. A file tree representation of the directory structure
  3. The contents of each file, formatted with appropriate syntax highlighting
  4. Information about large and binary files (type and size) without their contents
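The first three parts of the output listed above can be sketched in a few lines. This is an illustration under stated assumptions rather than the tool's real code: `tree_lines` and `build_codemap` are hypothetical names, the heading layout is invented, every file is assumed to be UTF-8 text, the fence label is simply the file extension, and `.gitignore` handling is omitted (only `.git` is skipped).

```python
from pathlib import Path
from typing import List


def tree_lines(root: Path, prefix: str = "") -> List[str]:
    """Render the entries under root as an indented tree, directories first."""
    entries = sorted(
        (p for p in root.iterdir() if p.name != ".git"),
        key=lambda p: (p.is_file(), p.name.lower()),
    )
    lines = []
    for index, entry in enumerate(entries):
        last = index == len(entries) - 1
        connector = "└── " if last else "├── "
        lines.append(prefix + connector + entry.name)
        if entry.is_dir():
            lines.extend(tree_lines(entry, prefix + ("    " if last else "│   ")))
    return lines


def build_codemap(root: Path) -> str:
    """Assemble a Markdown document: contents list, file tree, then each file."""
    fence = "`" * 3  # built this way to avoid a literal fence inside this example
    files = sorted(
        p for p in root.rglob("*")
        if p.is_file() and ".git" not in p.relative_to(root).parts
    )
    rel_paths = [p.relative_to(root).as_posix() for p in files]
    parts = ["# " + root.name, "", "## Table of Contents", ""]
    parts += ["- " + rel for rel in rel_paths]
    parts += ["", "## File Tree", "", fence, root.name + "/"]
    parts += tree_lines(root)
    parts += [fence, ""]
    for path, rel in zip(files, rel_paths):
        body = path.read_text(encoding="utf-8").rstrip("\n")
        parts += ["## " + rel, "", fence + path.suffix.lstrip("."), body, fence, ""]
    return "\n".join(parts)
```

Calling `build_codemap(Path("my_project"))` returns the document as a string, which a caller could then write into the output directory.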

Example use and output:

codemapper https://github.com/shaneholloman/ansible-role-apache

For example output, see here
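When the argument is a repository URL like the one in the example above, the tool has to fetch the repository before mapping it. One plausible way to do that is to shell out to `git`, as in the sketch below. The helper names `repo_name_from_url` and `clone_repo` and the use of a shallow clone (`--depth 1`) are assumptions; the README does not say how CodeMapper performs the clone.

```python
import subprocess
from pathlib import Path
from urllib.parse import urlparse


def repo_name_from_url(url: str) -> str:
    """Derive a directory name from a repository URL, dropping any '.git' suffix."""
    name = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".git") else name


def clone_repo(url: str, parent: Path) -> Path:
    """Shallow-clone url into parent/<repo_name> and return that path.

    Requires the git executable on PATH; raises CalledProcessError on failure.
    """
    target = parent / repo_name_from_url(url)
    subprocess.run(["git", "clone", "--depth", "1", url, str(target)], check=True)
    return target
```

With these helpers, `clone_repo("https://github.com/shaneholloman/ansible-role-apache", Path("."))` would create a local `ansible-role-apache` directory that can then be mapped like any other path.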

Use Cases

  • Quickly understand the structure of new or unfamiliar projects
  • Generate documentation for your codebase
  • Facilitate code reviews by providing a comprehensive overview
  • Aid in onboarding new team members to a project
  • Assist AI systems in analyzing and understanding codebases
  • Analyze GitHub repositories without needing to clone them manually

TODO

  • Add support for creating directly from repo url
  • Implement a way to include images in the artifacts, perhaps by base64-encoding them directly into the Markdown file, although that could consume a large number of tokens at prompt time. Suggestions welcome.
  • Add support for other Git hosting services (e.g., GitLab, Bitbucket)
  • Implement a progress indicator for cloning/analyzing large repositories
  • Improve the table of contents, which needs work in some cases; we may add some ignore rules
  • Use changelog.md for version history

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Thanks to the pathspec and chardet libraries for making this tool possible.

Version History

  • 3.2.0 (2024-09-23):
    • Improved handling of large and binary files:
      • Large and binary files are now always acknowledged without attempting to print their contents
      • File type and size information is displayed for large and binary files
    • Removed option to include large file contents as it's not practical for binary files
    • Simplified command-line options by removing flags related to large file handling
    • Added PyPI installation support

[For full version history, see changelog.md]


Don't forget to star this repository if you find it useful!


