Gittxt: Get Text from Git — Optimized for AI.
🚀 AI-Ready Text Extractor for Git Repos | CLI tool for dataset prep, summaries, reverse engineering & bundling
Gittxt is an open-source tool that transforms GitHub repositories into LLM-compatible datasets.
Perfect for developers, data scientists, and AI engineers, Gittxt helps you extract and structure .txt, .json, .md content into clean, analyzable formats for use in:
- Prompt engineering
- Fine-tuning & retrieval
- Codebase summarization
- Open-source LLM workflows
💡 Why Gittxt?
Large Language Models often expect input in very specific formats. Many tools (e.g., ChatGPT, Gemini, Ollama) struggle with arbitrary GitHub URLs, complex folders, or non-text assets.
Gittxt bridges this gap by:
- Extracting all usable text from a repo
- Organizing it for easy ingestion by LLMs
- Offering structured .txt, .json, .md, and .zip outputs
- Giving you full control with filtering, formatting, and plugin support
✨ Features at a Glance
- ✅ Text extractor for code, docs, config files
- ✅ Output: .txt, .json, .md, .zip
- ✅ CLI and plugin system (FastAPI, Streamlit)
- ✅ AI-ready summaries (OpenAI / Ollama)
- ✅ Reverse engineer .txt/.json reports back into repo structure
- ✅ .gittxtignore support
- ✅ Async scanning for large projects
- ✅ Works offline and in constrained compute environments
📁 Output Types
outputs/
├── txt/ # Plain text report
├── json/ # Structured metadata
├── md/ # Markdown-formatted summary
└── zip/ # Bundled results + manifest
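As a rough illustration of working with this layout, the sketch below lists whatever files exist under each output folder. It assumes only the four subdirectory names shown above; the helper name `list_outputs` is hypothetical and is not part of Gittxt.

```python
from pathlib import Path

# Subdirectory names taken from the layout shown above.
OUTPUT_TYPES = ("txt", "json", "md", "zip")


def list_outputs(root):
    """Map each output type to the sorted file names found in root/<type>/.

    Hypothetical helper for illustration; not part of the Gittxt package.
    """
    root = Path(root)
    found = {}
    for kind in OUTPUT_TYPES:
        folder = root / kind
        if folder.is_dir():
            found[kind] = sorted(p.name for p in folder.iterdir() if p.is_file())
        else:
            # A missing folder is treated as "this format was not produced".
            found[kind] = []
    return found
```

Calling `list_outputs("outputs")` after a scan would return a dictionary keyed by format, which can be handy for checking which reports were actually written.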
🚀 Quickstart
Install
pip install gittxt
Run your first scan
gittxt scan https://github.com/sandy-sp/gittxt --output-format txt,json --lite --zip
Reverse engineer a summary
gittxt re outputs/project.md -o ./restored
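If you want to drive these commands from a script, one possible approach is a thin wrapper around the CLI. The sketch below only assembles the subcommands and flags that appear in the Quickstart above (`scan`, `--output-format`, `--lite`, `--zip`, `re`, `-o`); the function names are hypothetical, and it assumes `gittxt` is installed and on your PATH.

```python
import subprocess


def build_scan_command(repo_url, formats=("txt", "json"), lite=True, make_zip=True):
    """Assemble the argument list for `gittxt scan` using the Quickstart flags."""
    cmd = ["gittxt", "scan", repo_url, "--output-format", ",".join(formats)]
    if lite:
        cmd.append("--lite")
    if make_zip:
        cmd.append("--zip")
    return cmd


def build_reverse_command(report_path, destination):
    """Assemble the argument list for `gittxt re <report> -o <destination>`."""
    return ["gittxt", "re", report_path, "-o", destination]


def run(cmd):
    """Run a command list and raise CalledProcessError on a non-zero exit."""
    return subprocess.run(cmd, check=True)
```

For example, `run(build_scan_command("https://github.com/sandy-sp/gittxt"))` would issue the same scan as the command shown above. Passing arguments as a list avoids shell quoting issues.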
🌐 Explore the Visual Web App
Try the hosted version (no install required!)
📈 Gittxt for AI Workflows
- Use it to build structured input for LLMs
- Ideal for prompt chaining, document agents, code summarization
- Helps transform messy repos into single-file, AI-consumable reports
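A single-file report can still exceed a model's context window, so a common follow-up step is to split it into overlapping chunks before prompting. The sketch below is a generic character-based splitter and is not something Gittxt provides; the function name and the default sizes are assumptions chosen for illustration.

```python
def chunk_text(text, max_chars=4000, overlap=200):
    """Split text into chunks of at most max_chars, each sharing `overlap`
    characters with the previous chunk.

    Generic illustration; not part of Gittxt.
    """
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    step = max_chars - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        # Stop once the current chunk has reached the end of the text.
        if start + max_chars >= len(text):
            break
    return chunks
```

Character counts are only a rough proxy for tokens; if you need exact limits, swap in the tokenizer of the model you are targeting.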
📖 Full Documentation
All CLI flags, plugins, formats, and filters are covered in the project documentation.
🔧 Plugin Support
Gittxt supports modular plugins:
- gittxt-api: Run via FastAPI backend
- gittxt-streamlit: Interactive dashboard
Install & run with:
gittxt plugin install gittxt-streamlit
gittxt plugin run gittxt-streamlit
🧠 Built for Developers & AI Engineers
Created by Sandeep Paidipati, Gittxt was born out of a need to:
- Quickly preview and summarize GitHub repos with LLMs
- Avoid manual copying, filtering, and converting files
- Create AI-ready datasets for learning and experimentation
🙏 Support the Project
- ⭐️ Star this repo if it helped you
- 🧵 Share it with your dev/AI community
- 🤝 Contact me for collaboration or sponsorship
🔒 License
MIT License © Sandeep Paidipati
File details
Details for the file gittxt-1.7.7.tar.gz.
File metadata
- Download URL: gittxt-1.7.7.tar.gz
- Upload date:
- Size: 32.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.2 CPython/3.13.2 Linux/6.8.0-1021-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8b3bf19790489d6e941eddc49bdf7c51c6da6de6b98ba7c323a47b7e95ad9316 |
| MD5 | bda8a4d1ad9e099d432765b40e712346 |
| BLAKE2b-256 | f8e5ee0e42dc32b78ec59739df40828ce859b14e2638ec4881e74b11ad857326 |
File details
Details for the file gittxt-1.7.7-py3-none-any.whl.
File metadata
- Download URL: gittxt-1.7.7-py3-none-any.whl
- Upload date:
- Size: 45.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.2 CPython/3.13.2 Linux/6.8.0-1021-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 738a397581d878796be4d2d1d9629be2fbe8a3be363da85ef8b0b6a20327f472 |
| MD5 | 4aed1db73205920cb09b26a271494959 |
| BLAKE2b-256 | 6f3c4bb93af16923a90ef9b726462166c6804df16005059a14808f37813fc2aa |
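To check a downloaded file against the SHA256 digests listed above, a short script using Python's standard `hashlib` module is usually enough. This is a minimal sketch: the helper names are hypothetical, and the local file path in the usage note is an assumption about where you saved the download.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, `matches("gittxt-1.7.7.tar.gz", "8b3bf19790489d6e941eddc49bdf7c51c6da6de6b98ba7c323a47b7e95ad9316")` should return `True` if the file is intact, assuming it sits in the current directory.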