
A utility to extract and format a codebase into Markdown format


Git2Text - Codebase Extraction Utility

Git2Text is a utility that extracts the entire structure of a codebase and formats it into a single text file. It works with a local Git project, a remote Git repository, or any other codebase, and the output is ready to paste into ChatGPT or other large language models (LLMs), so you avoid copying source files by hand.

Features

  • Extract Complete Codebase: Convert your entire codebase into a Markdown-formatted text.
  • Support for Local and Remote Repositories: Work with local directories or clone remote Git repositories on-the-fly.
  • Tree View Representation: Automatically generate a directory structure to provide context.
  • Code Block Formatting: Files are formatted with appropriate syntax highlighting for better readability.
  • Easy Copy to Clipboard: Quickly copy the output for pasting into LLMs like ChatGPT.
  • GLOB Pattern Support: Use powerful GLOB patterns for fine-grained control over file inclusion and exclusion.
  • .gitignore Integration: Respect .gitignore rules by default, with an option to override.
  • Cross-Platform Compatibility: Works on Windows, macOS, and Linux.

Prerequisites

  • Python 3.6+
  • pathspec library for .gitignore parsing (install via pip install pathspec)
  • Git (for cloning remote repositories)
  • xclip or xsel (Linux only): one of the two must be installed if you want clipboard support.
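
The command-line prerequisites above can be checked before installing. The following is a minimal sketch, not part of Git2Text: the function name missing_prerequisites is hypothetical, it only looks for executables on the PATH, and it does not check for the pathspec library.

```python
import shutil
import sys


def missing_prerequisites(platform=sys.platform, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python 3.6 or newer is required")
    if which("git") is None:
        problems.append("git not found (needed only for remote repositories)")
    # Clipboard helpers matter only on Linux, and either one is enough.
    if platform.startswith("linux") and not (which("xclip") or which("xsel")):
        problems.append("neither xclip nor xsel found (needed only for clipboard support)")
    return problems
```

Calling missing_prerequisites() with no arguments inspects the current machine. The two parameters exist only so the check can be exercised without touching the real PATH.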

Installation

  1. Clone the repository:
    git clone https://github.com/mrauter1/git2text.git
    cd git2text
    

Option 1: Manual Installation

  1. Install the package and dependencies:

    python install.py
    

    This will install the package and attempt to automatically add the git2text executable to your system's PATH.

    If the script cannot modify your PATH automatically, it will prompt you to add it manually or, on Unix-based systems, provide instructions for creating a symlink in /usr/local/bin.

Option 2: Installation Script

Use the provided installation scripts to install the package and ensure git2text is added to your system's PATH automatically.

Windows

Run the following command in Command Prompt:

install.bat

macOS/Linux

Run the following command in your terminal:

chmod +x install.sh
./install.sh

Usage

Once installed, you can run git2text from any terminal or command prompt.

Running the Script

git2text <path-or-url> [options]

The <path-or-url> can be a path to a local directory or the URL of a remote Git repository.

Options

  • -o, --output: Specify the output file path.
  • -ig, --ignore: List of files or directories to ignore (supports GLOB patterns).
  • -inc, --include: List of files or directories to include (supports GLOB patterns). If specified, only these paths will be processed.
  • -se, --skip-empty-files: Skip empty files during extraction.
  • -cp, --clipboard: Copy the generated content to the clipboard.
  • -igi, --ignoregitignore: When specified, do not apply .gitignore rules.
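
To make the option list concrete, here is one way such a command line could be declared with Python's argparse. This is an illustrative sketch and not the actual git2text source: the function name build_parser, the destination name source, and the choice of nargs="*" for the list options are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented git2text options."""
    parser = argparse.ArgumentParser(prog="git2text")
    parser.add_argument("source", metavar="path-or-url",
                        help="local directory or remote Git repository URL")
    parser.add_argument("-o", "--output", help="output file path")
    # nargs="*" lets several patterns follow a single flag.
    parser.add_argument("-ig", "--ignore", nargs="*", default=[],
                        help="GLOB patterns to exclude")
    parser.add_argument("-inc", "--include", nargs="*", default=[],
                        help="GLOB patterns to include exclusively")
    parser.add_argument("-se", "--skip-empty-files", action="store_true")
    parser.add_argument("-cp", "--clipboard", action="store_true")
    parser.add_argument("-igi", "--ignoregitignore", action="store_true")
    return parser
```

With this declaration, build_parser().parse_args(["/path/to/codebase", "-inc", "*.py", "-cp"]) yields an object whose include attribute is a one-element list and whose clipboard attribute is True.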

Example Usage

Extract Entire Codebase from a Local Directory to a Markdown File

git2text /path/to/local/codebase -o output.md

Clone and Extract a Remote Git Repository

git2text https://github.com/username/repo.git -o output.md

This command will clone the specified repository to a temporary directory, extract its contents, and save the output to output.md.
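
The clone-then-extract flow described above can be sketched roughly as follows. This is not Git2Text's own code: the helper names is_remote, clone_command, and with_checkout are hypothetical, the URL detection is a simple heuristic, and the shallow clone (--depth 1) is an assumption made to keep the download small.

```python
import subprocess
import tempfile


def is_remote(source):
    """Heuristic (an assumption): URL-like or scp-like strings are remote repositories."""
    return source.startswith(("http://", "https://", "ssh://", "git@"))


def clone_command(url, target_dir):
    """Build the git invocation; --depth 1 fetches only the latest snapshot."""
    return ["git", "clone", "--depth", "1", url, target_dir]


def with_checkout(source, process):
    """Call process(directory) on a local path, or on a temporary clone of a URL."""
    if not is_remote(source):
        return process(source)
    # The temporary directory is deleted as soon as processing finishes.
    with tempfile.TemporaryDirectory() as tmp:
        subprocess.run(clone_command(source, tmp), check=True)
        return process(tmp)
```

Passing the extraction step as a callback keeps the temporary clone alive exactly as long as it is needed and guarantees cleanup even if extraction raises an error.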

Ignore .gitignore and Skip Empty Files

git2text https://github.com/username/repo.git -igi -se -o output.md

Include Only Specific Files and Copy to Clipboard

git2text /path/to/codebase -inc "*.py" -cp

Ignore Specific Files and Directories

git2text /path/to/codebase -ig "*.log" "__pycache__" -o output.md
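
As a rough illustration of how include and ignore patterns interact, the sketch below filters a list of relative paths with the standard-library fnmatch module. It is a simplification: Git2Text lists pathspec as its dependency for .gitignore parsing, and its exact matching rules may differ. The function names matches_any and select are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def matches_any(path, patterns):
    """True if the whole path, or any single component of it, matches a pattern."""
    parts = PurePosixPath(path).parts
    return any(
        fnmatchcase(path, pattern) or any(fnmatchcase(part, pattern) for part in parts)
        for pattern in patterns
    )


def select(paths, include=(), ignore=()):
    """Keep paths that match include (when given) and do not match ignore."""
    kept = []
    for path in paths:
        if include and not matches_any(path, include):
            continue  # an include list restricts processing to matching paths
        if matches_any(path, ignore):
            continue
        kept.append(path)
    return kept
```

Matching against individual path components is what lets a bare name such as __pycache__ exclude everything beneath that directory.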

.globalignore Support

Git2Text also supports a .globalignore file located in the same directory as the git2text.py script. This file works similarly to a .gitignore file but applies globally across any codebase you process.

If a .globalignore file is present, it will be used to exclude files or directories specified in it, in addition to .gitignore.

To ignore the .globalignore file, use the -igi flag:

git2text /path/to/codebase -igi

Modifying .globalignore

To change the global ignore rules, edit the .globalignore file located alongside the script. Common entries include directories such as node_modules/ and dist/, and file patterns such as *.log.

Example .globalignore:

node_modules/
dist/
*.log
*.tmp
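
A file like this can be read into a pattern list and combined with a codebase's .gitignore. The sketch below shows one plausible way to do that and is not taken from the Git2Text source: the function names are hypothetical, and the parsing handles only blank lines and # comments, not the full .gitignore syntax.

```python
from pathlib import Path


def parse_patterns(text):
    """Return the non-empty, non-comment lines of an ignore file's contents."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def read_patterns(path):
    """Read patterns from a file, or return [] if the file does not exist."""
    file = Path(path)
    if not file.is_file():
        return []
    return parse_patterns(file.read_text(encoding="utf-8"))


def collect_patterns(repo_dir, script_dir, use_ignore_files=True):
    """Combine .gitignore from the codebase with .globalignore next to the script."""
    if not use_ignore_files:
        return []
    return (read_patterns(Path(repo_dir) / ".gitignore")
            + read_patterns(Path(script_dir) / ".globalignore"))
```

The use_ignore_files parameter models a single switch that turns off both files, on the assumption that this is how the -igi flag behaves.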

Example Output

The output of Git2Text follows a Markdown structure for easy readability. Here's a sample of how it formats the files:

├── main.py
├── folder/
│   ├── file.json

# File: main.py
```python
print("Hello, World!")
```
# End of file: main.py

# File: folder/file.json
```json
{"name": "example"}
```
# End of file: folder/file.json
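
The per-file layout shown above can be reproduced with a few lines of Python. This is a minimal sketch of the format only, not Git2Text's implementation: the function name render_file and the LANGUAGES extension map are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical extension-to-language map used to label the code fence.
LANGUAGES = {".py": "python", ".json": "json", ".md": "markdown", ".sh": "bash"}

FENCE = "`" * 3  # built at runtime so this example contains no literal fence


def render_file(path, content):
    """Wrap one file's content in the '# File' / '# End of file' markers."""
    lang = LANGUAGES.get(PurePosixPath(path).suffix, "")
    body = content.rstrip("\n")  # avoid a blank line before the closing fence
    return (
        f"# File: {path}\n"
        f"{FENCE}{lang}\n"
        f"{body}\n"
        f"{FENCE}\n"
        f"# End of file: {path}\n"
    )
```

Files with an extension missing from the map get an unlabeled fence, which keeps the output valid Markdown even for unknown file types.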

Contributing

Feel free to contribute to the project by opening an issue or submitting a pull request. We welcome feedback and suggestions to improve Git2Text!

License

This project is licensed under the MIT License.

Contact

For any questions or support, please open an issue on the GitHub repository.
