Get Text of Your Repo for AI, LLMs & Docs!

🚀 Gittxt: Get Text of Your Repo for AI, LLMs & Docs!

Gittxt is a lightweight CLI tool that extracts text from Git repositories and formats it into AI-friendly outputs (.txt, .json, .md).
Whether you use ChatGPT, Grok, Ollama, or any other LLM, Gittxt helps you process repositories for insights, training, and documentation.

✨ Why Use Gittxt?

Extract Readable Text from Git Repos
Convert Code & Docs into AI-Friendly Formats
Generate JSON for LLM Training (Ideal for AI Preprocessing)
Create Markdown Files for Documentation
Summarize & Analyze GitHub Repositories


📌 Installation (From PyPI)

pip install gittxt

Verify installation:

gittxt --help

Expected Output:

Usage: gittxt [OPTIONS] SOURCE
Options:
  --include TEXT
  --exclude TEXT
  --size-limit INTEGER
  --branch TEXT
  --output-dir TEXT
  --output-format [txt|json|md]
  --max-lines INTEGER
  --summary
  --debug
  --help  Show this message and exit.
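
To make the filtering options more concrete, here is a minimal Python sketch of how include patterns, exclude patterns, and a size limit could be combined. It is not Gittxt's implementation: the function name `keep_file`, the glob-style pattern matching, and the byte-based size limit are all assumptions made for illustration.

```python
from fnmatch import fnmatch


def keep_file(path, size_bytes, include=(), exclude=(), size_limit=None):
    """Return True if a file passes the (hypothetical) filters."""
    # Exclude patterns take priority over everything else.
    if any(fnmatch(path, pattern) for pattern in exclude):
        return False
    # When include patterns are given, the path must match at least one.
    if include and not any(fnmatch(path, pattern) for pattern in include):
        return False
    # Assumption: the size limit is expressed in bytes.
    if size_limit is not None and size_bytes > size_limit:
        return False
    return True


# Hypothetical file listing: path -> size in bytes.
files = {"src/app.py": 1200, "README.md": 800, "data/big.csv": 5_000_000}
kept = [
    path
    for path, size in files.items()
    if keep_file(path, size, include=["*.py", "*.md"],
                 exclude=["tests/*"], size_limit=1_000_000)
]
print(kept)  # ['src/app.py', 'README.md']
```

Letting exclude patterns win over include patterns is one reasonable design choice; the real tool may resolve conflicts differently.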

📌 How to Use Gittxt

1️⃣ Extract Text from a Local Repository

gittxt .

✅ Extracts all readable text from your repo into gittxt-outputs/text/.


2️⃣ Extract from a Remote GitHub Repo

gittxt https://github.com/sandy-sp/sandy-sp

✅ Automatically clones the repo, scans it, and extracts text.
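
The clone-then-scan flow can be sketched roughly as follows. This is an illustration under stated assumptions, not Gittxt's code: `is_remote_source` and `clone_to_temp` are hypothetical helper names, and the shallow clone (`--depth 1`) is a choice made here to keep the download small.

```python
import subprocess
import tempfile
from urllib.parse import urlparse


def is_remote_source(source):
    """Guess whether SOURCE is a remote Git URL rather than a local path."""
    scheme = urlparse(source).scheme
    # SCP-style addresses (git@host:user/repo.git) have no URL scheme.
    return scheme in {"http", "https", "ssh", "git"} or source.startswith("git@")


def clone_to_temp(url, branch=None):
    """Shallow-clone URL into a fresh temporary directory and return its path."""
    dest = tempfile.mkdtemp(prefix="repo-")
    cmd = ["git", "clone", "--depth", "1"]
    if branch:
        cmd += ["--branch", branch]
    cmd += [url, dest]
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return dest


print(is_remote_source("https://github.com/sandy-sp/sandy-sp"))  # True
print(is_remote_source("."))  # False
```

`clone_to_temp` is only defined here, not called, so the snippet runs without network access or a Git installation.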


3️⃣ Use AI-Friendly Output Formats

🧠 JSON (Best for AI & LLM Training)

gittxt . --output-format json

Why JSON?

  • Structured format that is easy to feed into LLM tooling (GPT-4, Grok, LLaMA).
  • Prepares structured data for AI training.
  • Can be used to fine-tune models with repository insights.
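
As an illustration of what a structured export might look like, the sketch below builds a JSON document from a mapping of file paths to text. The schema (`repository`, `files`, `path`, `size_bytes`, `content`) and the function name `build_json_dump` are assumptions made for this example; Gittxt's actual JSON layout may differ.

```python
import json


def build_json_dump(repo_name, files):
    """Serialize a mapping of path -> text into a (hypothetical) JSON layout."""
    records = [
        {
            "path": path,
            "size_bytes": len(text.encode("utf-8")),
            "content": text,
        }
        for path, text in sorted(files.items())  # sorted for stable output
    ]
    return json.dumps({"repository": repo_name, "files": records}, indent=2)


# Hypothetical in-memory repository contents.
dump = build_json_dump("demo", {"b.md": "# hi\n", "a.py": "x = 1\n"})
print(dump)
```

One record per file keeps each path next to its content, which makes it simple to filter or chunk files before passing them to a model.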

📜 TXT (For AI Chat & Analysis)

gittxt . --output-format txt

Why TXT?

  • Extracts pure text, making it easy for AI-powered chat analysis.
  • Good for summarization and AI-assisted code review.

📝 Markdown (Best for Documentation)

gittxt . --output-format md

Why Markdown?

  • Great for GitHub docs & project READMEs.
  • LLMs like ChatGPT use Markdown for structured responses.
  • Retains headings, code snippets, and structure.
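
To show how headings and code snippets can be preserved, here is a small sketch that renders files as Markdown, with one heading and one fenced block per file. The layout and the function name `render_markdown` are assumptions for illustration, not a description of Gittxt's actual Markdown output.

```python
from pathlib import PurePosixPath

FENCE = "`" * 3  # built at runtime to avoid a literal fence in this example


def render_markdown(repo_name, files):
    """Render a mapping of path -> text as a (hypothetical) Markdown document."""
    parts = [f"# {repo_name}", ""]
    for path, text in sorted(files.items()):
        # Use the file extension as the fence's language hint, e.g. "py".
        lang = PurePosixPath(path).suffix.lstrip(".")
        parts += [f"## {path}", "", f"{FENCE}{lang}", text.rstrip("\n"), FENCE, ""]
    return "\n".join(parts)


print(render_markdown("demo", {"a.py": "print(1)\n"}))
```

A file that itself contains triple backticks would need a longer fence; this sketch does not handle that case.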

4️⃣ Get a Summary Report

gittxt . --summary

Example Output:

📊 Summary Report:
 - Scanned 105 text files
 - Total Size: 3.2 MB
 - File Types: .py, .md, .txt
 - Saved in: gittxt-outputs/text/repo_dump.txt

The summary helps you quickly assess a repository before using it for AI training.
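
The figures in a report like the one above can be derived from a simple file listing. The following is a minimal sketch, assuming a mapping of paths to sizes in bytes; `summarize` and `format_report` are hypothetical names, and the megabyte conversion (1024 × 1024 bytes) is an assumption.

```python
from pathlib import PurePosixPath


def summarize(files):
    """Compute file count, total size, and extensions from path -> size_bytes."""
    extensions = {PurePosixPath(path).suffix or "(none)" for path in files}
    return {
        "file_count": len(files),
        "total_bytes": sum(files.values()),
        "file_types": sorted(extensions),
    }


def format_report(summary):
    """Format the summary in a layout similar to the example output."""
    size_mb = summary["total_bytes"] / (1024 * 1024)
    return "\n".join([
        "Summary Report:",
        f" - Scanned {summary['file_count']} text files",
        f" - Total Size: {size_mb:.1f} MB",
        f" - File Types: {', '.join(summary['file_types'])}",
    ])


# Hypothetical file listing: path -> size in bytes.
summary = summarize({"src/app.py": 2048, "README.md": 1024, "notes.txt": 512})
print(format_report(summary))
```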


🆕 What's New in v1.2.0?

Bug Fixes & Enhancements

  • Better file filtering (--include, --exclude).
  • Faster processing with improved caching.
  • More accurate MIME-type detection.
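
For readers curious about what MIME-based text detection can involve, here is a minimal sketch that combines Python's standard `mimetypes` module with a content check. It is not Gittxt's detection logic: the function name `looks_like_text`, the `TEXT_LIKE` set, and the fallback rules are assumptions chosen for illustration.

```python
import mimetypes

# Non-"text/" MIME types that still hold readable text (illustrative subset).
TEXT_LIKE = {"application/json", "application/xml", "application/javascript"}


def looks_like_text(path, sample):
    """Guess whether a file is text from its name and a sample of its bytes."""
    mime, _ = mimetypes.guess_type(path)
    if mime is not None:
        if mime.startswith("text/") or mime in TEXT_LIKE:
            return True
        if mime.startswith(("image/", "audio/", "video/")):
            return False
    # Unknown or ambiguous type: fall back to inspecting the content.
    if b"\x00" in sample:
        return False  # NUL bytes almost never appear in text files
    try:
        # Note: a sample cut in the middle of a multi-byte character
        # would be rejected here; a real tool would need to handle that.
        sample.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(looks_like_text("app.py", b"print('hi')\n"))
print(looks_like_text("logo.png", b"\x89PNG\r\n\x1a\n\x00"))
```

Checking the extension first is cheap, while the byte-level fallback catches files whose extension is missing or misleading.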

🚀 New Features

  • ✅ Markdown Output (--output-format md) → Generates AI-friendly structured docs.
  • 📊 Summary Reports (--summary) → Instantly view repo insights.
  • 🔍 Debug Mode (--debug) → See detailed logs of the extraction process.

📌 Contribute & Develop

1️⃣ Run Tests

pytest tests/

2️⃣ Format Code

black src/

3️⃣ Submit a PR

  1. Fork the repo
  2. Create a new branch (feature/my-change)
  3. Push changes
  4. Submit a PR! 🚀

📜 License

Gittxt is licensed under the MIT License.


💡 Next Features Coming Soon!

  • Interactive CLI for easy selection
  • Web UI for scanning repositories visually
  • Smarter AI-based file summarization

📌 Made by Sandeep Paidipati

