Gittxt: Get Text from Git — Optimized for AI.

🚀 AI-Ready Text Extractor for Git Repos | CLI tool for dataset prep, summaries, reverse engineering & bundling

Gittxt is an open-source tool that transforms GitHub repositories into LLM-compatible datasets.

Built for developers, data scientists, and AI engineers, Gittxt extracts the text in a repository and structures it as clean, analyzable .txt, .json, or .md output for use in:

  • Prompt engineering
  • Fine-tuning & retrieval
  • Codebase summarization
  • Open-source LLM workflows

💡 Why Gittxt?

Large Language Models often expect input in very specific formats. Many tools (e.g., ChatGPT, Gemini, Ollama) struggle with arbitrary GitHub URLs, complex folders, or non-text assets.

Gittxt bridges this gap by:

  • Extracting all usable text from a repo
  • Organizing it for easy ingestion by LLMs
  • Offering structured .txt, .json, .md, .zip outputs
  • Giving you full control with filtering, formatting, and plugin support

✨ Features at a Glance

  • ✅ Text extractor for code, docs, config files
  • ✅ Output: .txt, .json, .md, .zip
  • ✅ CLI and plugin system (FastAPI, Streamlit)
  • ✅ AI-ready summaries (OpenAI / Ollama)
  • ✅ Reverse engineer .txt/.json reports back into repo structure
  • ✅ .gittxtignore support
  • ✅ Async scanning for large projects
  • ✅ Works offline and in constrained compute environments
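
To give a feel for what ignore-file filtering involves, here is a minimal Python sketch. It is not Gittxt's implementation, and the pattern syntax of .gittxtignore is not described here, so the sketch assumes simple gitignore-like globs. The helper names `parse_ignore_lines` and `is_ignored` are hypothetical.

```python
from fnmatch import fnmatch


def parse_ignore_lines(lines):
    """Keep non-empty lines that are not comments (assumed '#' comment style)."""
    return [s for s in (line.strip() for line in lines) if s and not s.startswith("#")]


def is_ignored(path, patterns):
    """Return True if a POSIX-style relative path matches any pattern.

    Assumption: a pattern ending in "/" names a directory anywhere in the path;
    any other pattern is a glob tried against the full path and the file name.
    """
    name = path.rsplit("/", 1)[-1]
    for pattern in patterns:
        if pattern.endswith("/"):
            # Anchor on "/" so "build/" does not match "rebuild/".
            if ("/" + pattern) in ("/" + path):
                return True
        elif fnmatch(path, pattern) or fnmatch(name, pattern):
            return True
    return False


patterns = parse_ignore_lines(["# logs and build output", "*.log", "", "build/"])
for candidate in ["src/app.py", "src/app.log", "build/out.txt"]:
    print(candidate, "ignored" if is_ignored(candidate, patterns) else "kept")
```

Real ignore-file semantics (negation, anchoring, `**`) are richer than this; check the Gittxt docs for the rules the tool actually applies.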

📁 Output Types

outputs/
├── txt/         # Plain text report
├── json/        # Structured metadata
├── md/          # Markdown-formatted summary
└── zip/         # Bundled results + manifest
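
If you want to pick up results from a script, a small sketch like the following can list what a scan produced. It assumes only the folder layout shown above; the function name `list_reports` is hypothetical, and the file names inside each folder depend on the repository you scanned.

```python
from pathlib import Path

# Subfolder names taken from the layout above; everything else is illustrative.
OUTPUT_KINDS = ("txt", "json", "md", "zip")


def list_reports(outputs_dir):
    """Map each output kind to the sorted file names found in its subfolder."""
    root = Path(outputs_dir)
    found = {}
    for kind in OUTPUT_KINDS:
        folder = root / kind
        if folder.is_dir():
            found[kind] = sorted(p.name for p in folder.iterdir() if p.is_file())
        else:
            # A format that was not requested may have no folder at all.
            found[kind] = []
    return found


if __name__ == "__main__":
    for kind, names in list_reports("outputs").items():
        print(f"{kind}: {names}")
```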

🚀 Quickstart

Install

pip install gittxt

Run your first scan

gittxt scan https://github.com/sandy-sp/gittxt --output-format txt,json --lite --zip
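
To drive the same scan from Python, one option is to wrap the CLI with `subprocess`. This is a sketch, not an official Gittxt API: it uses only the flags shown in the command above, and the helper names `build_scan_command` and `run_scan` are hypothetical.

```python
import shlex
import subprocess


def build_scan_command(repo, formats=("txt", "json"), lite=False, make_zip=False):
    """Assemble the argv list for `gittxt scan` from the flags shown above."""
    cmd = ["gittxt", "scan", repo, "--output-format", ",".join(formats)]
    if lite:
        cmd.append("--lite")
    if make_zip:
        cmd.append("--zip")
    return cmd


def run_scan(repo, **options):
    """Run the scan; raises CalledProcessError if gittxt exits non-zero."""
    subprocess.run(build_scan_command(repo, **options), check=True)


# Print the command rather than running it, so this works without gittxt installed.
command = build_scan_command(
    "https://github.com/sandy-sp/gittxt", lite=True, make_zip=True
)
print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting problems with unusual repository URLs. Call `run_scan(...)` once `gittxt` is on your PATH.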

Reverse engineer a summary

gittxt re outputs/project.md -o ./restored

🌐 Explore the Visual Web App

Try the hosted version (no install required!)

👉 Launch Streamlit App


📈 Gittxt for AI Workflows

  • Use it to build structured input for LLMs
  • Ideal for prompt chaining, document agents, code summarization
  • Helps transform messy repos into single-file, AI-consumable reports
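
As an illustration of the last step, here is a hedged sketch of turning a single-file text report into prompt-sized pieces. It makes no assumption about the report's internal structure and simply splits on character counts. The function names and the default sizes are hypothetical, so tune them to your model's context window.

```python
def chunk_text(text, max_chars=8000, overlap=200):
    """Split text into pieces of at most max_chars, sharing `overlap` characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must satisfy 0 <= overlap < max_chars")
    if not text:
        return []
    chunks = []
    start = 0
    step = max_chars - overlap
    while True:
        chunks.append(text[start:start + max_chars])
        # Stop once this chunk reaches the end, so no redundant tail is emitted.
        if start + max_chars >= len(text):
            break
        start += step
    return chunks


def build_prompts(report_text, instruction, max_chars=8000, overlap=200):
    """Prefix every chunk with the instruction and a part counter."""
    chunks = chunk_text(report_text, max_chars, overlap)
    total = len(chunks)
    return [
        f"{instruction}\n\n[Report part {i} of {total}]\n{chunk}"
        for i, chunk in enumerate(chunks, start=1)
    ]


if __name__ == "__main__":
    sample = "line of repository text\n" * 1000
    prompts = build_prompts(sample, "Summarize this part of the repository.")
    print(len(prompts), "prompts built")
```

In practice you would read the report with `Path(...).read_text()` from wherever your scan wrote its .txt output, then send each prompt to the model of your choice.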

📖 Full Documentation

All CLI flags, plugins, formats, and filters are documented here:

📚 Explore Gittxt Docs


🔧 Plugin Support

Gittxt supports modular plugins:

  • gittxt-api: Run via FastAPI backend
  • gittxt-streamlit: Interactive dashboard

Install & run with:

gittxt plugin install gittxt-streamlit
gittxt plugin run gittxt-streamlit

🧠 Built for Developers & AI Engineers

Created by Sandeep Paidipati, Gittxt was born out of a need to:

  • Quickly preview and summarize GitHub repos with LLMs
  • Avoid manual copying, filtering, and converting files
  • Create AI-ready datasets for learning and experimentation

🙏 Support the Project

  • ⭐️ Star this repo if it helped you
  • 🧵 Share it with your dev/AI community
  • 🤝 Contact me for collaboration or sponsorship

🔒 License

MIT License © Sandeep Paidipati

