A tool to generate comprehensive Markdown artifacts of directory structures and file contents

codemapper

Overview

CodeMapper is a Python script that generates a single Markdown document representing the structure and contents of a directory or GitHub repository. It gives developers, AI systems, and analysts a quick, thorough overview of a codebase's layout and content.

See audio explainers for this project:

  • Podcasts, auto-generated by Gemini (NotebookLM)

Features

  • Generates a hierarchical table of contents based on file structure
  • Creates an accurate file tree representation of the directory structure
  • Produces code blocks for each file's contents with appropriate syntax highlighting
  • Respects .gitignore rules when processing files and directories
  • Excludes .git directories by default
  • Supports various file types with appropriate code fence highlighting
  • Handles file encoding detection for accurate content reading
  • Provides an option to include files normally ignored by .gitignore
  • Can clone and analyze GitHub repositories
  • Saves output in a '_mapped' directory
  • Automatically acknowledges large and binary files without printing their contents
  • Displays file type and size information for large and binary files
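As an illustration of the file tree feature above, here is a minimal sketch of how such a tree could be rendered using only the standard library. The function name `render_tree` and the ordering rule (directories first, then files) are assumptions made for this example, not CodeMapper's actual implementation, and the sketch does not apply .gitignore rules.

```python
from pathlib import Path
from typing import List


def render_tree(root: Path, prefix: str = "") -> List[str]:
    """Return the lines of a text tree for everything under `root`.

    Hypothetical helper for illustration; not part of CodeMapper's API.
    """
    # Skip .git, then list directories before files, each sorted by name.
    entries = sorted(
        (p for p in root.iterdir() if p.name != ".git"),
        key=lambda p: (p.is_file(), p.name.lower()),
    )
    lines = []
    for index, entry in enumerate(entries):
        is_last = index == len(entries) - 1
        connector = "└── " if is_last else "├── "
        suffix = "/" if entry.is_dir() else ""
        lines.append(f"{prefix}{connector}{entry.name}{suffix}")
        if entry.is_dir():
            # Children of the last entry get blank padding instead of a rail.
            child_prefix = prefix + ("    " if is_last else "│   ")
            lines.extend(render_tree(entry, child_prefix))
    return lines
```

Calling `"\n".join(render_tree(Path(".")))` would produce a tree for the current directory.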

Requirements

  • Python 3.6+
  • pathspec library (for handling .gitignore rules)
  • chardet library (for file encoding detection)
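To show what the pathspec dependency is typically used for, the sketch below filters a list of paths against .gitignore-style patterns. The pattern list and candidate paths are made up for this example, and this is not necessarily how CodeMapper calls the library internally.

```python
import pathspec

# Hypothetical .gitignore content, used only for this illustration.
gitignore_lines = ["*.pyc", "build/"]

# "gitwildmatch" is pathspec's name for Git's wildcard pattern syntax.
spec = pathspec.PathSpec.from_lines("gitwildmatch", gitignore_lines)

# Keep only the paths that no ignore pattern matches.
candidates = ["app.py", "app.pyc", "build/output.txt", "docs/index.md"]
kept = [path for path in candidates if not spec.match_file(path)]
```

With these inputs, `kept` contains `app.py` and `docs/index.md`.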

Installation

From PyPI

You can install CodeMapper directly from PyPI using pip:

pip install codemapper

From Source

  1. Clone this repository:

    git clone https://github.com/yourusername/codemapper.git
    
  2. Install the required dependencies:

    pip install pathspec chardet
    

Usage

Run the script from the command line, providing the path to the directory or GitHub repository URL you want to analyze:

codemapper <path_to_directory_or_github_url> [--include-ignored]

Options

  • <path_to_directory_or_github_url>: The path to the directory or GitHub repository URL you want to analyze (required)
  • --include-ignored: Include files that are normally ignored by .gitignore (optional)
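The command-line interface described above could be modeled with `argparse` roughly as follows. This is a sketch under stated assumptions: the names `build_parser`, `is_github_url`, and `target` are hypothetical, and the real script may parse its arguments and detect URLs differently.

```python
import argparse
from urllib.parse import urlparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented usage line (illustrative only)."""
    parser = argparse.ArgumentParser(prog="codemapper")
    parser.add_argument(
        "target",
        metavar="path_to_directory_or_github_url",
        help="directory path or GitHub repository URL to analyze",
    )
    parser.add_argument(
        "--include-ignored",
        action="store_true",
        help="include files that .gitignore would normally exclude",
    )
    return parser


def is_github_url(target: str) -> bool:
    """Return True when the target looks like an http(s) GitHub URL."""
    parsed = urlparse(target)
    return parsed.scheme in ("http", "https") and parsed.netloc.lower() in (
        "github.com",
        "www.github.com",
    )


args = build_parser().parse_args(
    ["https://github.com/shaneholloman/ansible-role-apache"]
)
```

Here `args.include_ignored` is `False` because the flag was not passed, and `is_github_url(args.target)` is `True`, so a tool built this way would clone the repository before mapping it.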

Output

The script generates a Markdown file named <directory_name>codemap.md in the '_mapped' directory. This file contains:

  1. A table of contents for easy navigation
  2. A file tree representation of the directory structure
  3. The contents of each file, formatted with appropriate syntax highlighting
  4. Information about large and binary files (type and size) without their contents
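Items 3 and 4 above can be sketched as a single function that emits either a fenced code block or a short placeholder for each file. Everything named here (`MAX_BYTES`, `FENCE_LANGUAGES`, `looks_binary`, `render_file_section`, the heading level, and the placeholder wording) is an assumption for illustration; the tool's real size threshold, extension mapping, and binary detection are not documented in this README.

```python
from pathlib import Path

# Illustrative values only; not CodeMapper's actual settings.
MAX_BYTES = 1000000
FENCE_LANGUAGES = {".py": "python", ".md": "markdown", ".yml": "yaml", ".json": "json"}


def looks_binary(data: bytes) -> bool:
    """Simple heuristic: treat data containing a NUL byte as binary."""
    return b"\x00" in data


def render_file_section(path: Path, root: Path) -> str:
    """Return the Markdown section for one file (hypothetical helper)."""
    relative = path.relative_to(root).as_posix()
    size = path.stat().st_size
    kind = path.suffix or "unknown"
    heading = f"### {relative}"
    if size > MAX_BYTES:
        # Large files are acknowledged without reading their contents.
        return f"{heading}\n\n[Large file ({kind}, {size} bytes): contents omitted]\n"
    data = path.read_bytes()
    if looks_binary(data):
        return f"{heading}\n\n[Binary file ({kind}, {size} bytes): contents omitted]\n"
    language = FENCE_LANGUAGES.get(path.suffix.lower(), "")
    text = data.decode("utf-8", errors="replace")
    fence = "`" * 3  # built at runtime so this example nests cleanly in Markdown
    return f"{heading}\n\n{fence}{language}\n{text.rstrip()}\n{fence}\n"
```

This sketch decodes as UTF-8 with replacement characters for simplicity, and it does not handle files whose contents already include a triple-backtick fence.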

Example use and output:

codemapper https://github.com/shaneholloman/ansible-role-apache

For example output, see here.

Use Cases

  • Quickly understand the structure of new or unfamiliar projects
  • Generate documentation for your codebase
  • Facilitate code reviews by providing a comprehensive overview
  • Aid in onboarding new team members to a project
  • Assist AI systems in analyzing and understanding codebases
  • Analyze GitHub repositories without needing to clone them manually

TODO

  • Add support for creating artifacts directly from a repository URL
  • Find a practical way to include images in the artifacts. Base64-encoding them directly into the Markdown file is one option, but that could consume a lot of tokens at prompt time. Suggestions welcome.
  • Add support for other Git hosting services (e.g., GitLab, Bitbucket)
  • Implement a progress indicator for cloning/analyzing large repositories
  • Improve the table of contents, which needs work in some cases. We may add some ignore rules.
    • For the TOC, consider a more robust library such as md_toc (no user complaints yet)
  • Use changelog.md for version history

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Thanks to the pathspec and chardet libraries for making this tool possible.

Version History

[For full version history, see changelog.md]


Don't forget to star this repository if you find it useful!
