Git history analyzer with LLM-powered narrative generation
GitView
GitView extracts your repository's git history and uses AI to generate compelling narratives about how your codebase evolved. Instead of manually reading through thousands of commits, get a comprehensive story of your project's journey.
Example run on this repository:
https://github.com/carstenbund/gitview/blob/main/output/history_story.md
Features
- **Comprehensive History Extraction**: Extracts commit metadata, LOC changes, language breakdown, README evolution, comment analysis, and more
- **Smart Chunking**: Automatically divides history into meaningful "phases" or "epochs" based on significant changes
- **LLM-Powered Summaries**: Uses Claude by default (OpenAI and Ollama backends are also supported) to generate narrative summaries for each phase
- **Global Story Generation**: Combines phase summaries into executive summaries, timelines, technical retrospectives, and deletion stories
- **Multiple Output Formats**: Generates markdown reports, JSON data, and timelines
Installation
Option 1: Install with pip (recommended)
This creates a gitview command in your PATH:
# Clone the repository
git clone https://github.com/carstenbund/gitview.git
cd gitview
# Install in editable mode with dependencies
pip install -e .
# The gitview command is now available in the active Python environment
gitview --version
gitview --help
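How it works: pip install -e . reads pyproject.toml and setup.py, which declare a console-script entry point. pip then creates a gitview executable in your environment's scripts directory (for example /usr/local/bin/gitview, or a Scripts\gitview.exe launcher on Windows) that calls gitview.cli:main.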
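To see where the command ended up, the entry point can be inspected from Python. The following is a minimal sketch using only the standard library; the helper name find_console_script is hypothetical and is not part of GitView.

```python
import shutil
from importlib import metadata  # available in Python 3.8+


def find_console_script(name):
    """Return (path_on_PATH, entry_point_target); either value may be None."""
    eps = metadata.entry_points()
    # Python 3.10+ exposes .select(); 3.8 and 3.9 return a dict keyed by group.
    if hasattr(eps, "select"):
        scripts = eps.select(group="console_scripts")
    else:
        scripts = eps.get("console_scripts", [])
    target = next((ep.value for ep in scripts if ep.name == name), None)
    return shutil.which(name), target


if __name__ == "__main__":
    path, target = find_console_script("gitview")
    print("executable:", path or "not on PATH")
    print("entry point:", target or "not registered")
```

If the entry point is registered but the executable is not on PATH, the scripts directory of the environment is probably missing from PATH.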
Option 2: Run directly from repo (no installation)
Use the executable wrapper in bin/:
# Clone the repository
git clone https://github.com/carstenbund/gitview.git
cd gitview
# Install dependencies only
pip install -r requirements.txt
# Run directly from the repo
./bin/gitview --version
./bin/gitview analyze
# Or add bin/ to your PATH
export PATH="$PWD/bin:$PATH"
gitview analyze
Option 3: Run as Python module
# Install dependencies
pip install -r requirements.txt
# Run as a module
python -m gitview.cli --help
python -m gitview.cli analyze
Verify Installation
Run the verification script to check everything is set up correctly:
python verify_installation.py
This will check:
- Python version (3.8+ required)
- All required dependencies
- gitview command availability
- LLM backend configuration (API keys, Ollama server)
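The checks listed above could look roughly like the sketch below. This is not the contents of verify_installation.py; the function name run_checks and the import names are assumptions (in particular, the gitpython distribution is imported as git), and the Ollama server check is left out.

```python
import importlib.util
import os
import shutil
import sys

# Import names are assumptions; the gitpython package is imported as "git".
REQUIRED_MODULES = ["git", "anthropic", "openai", "requests", "click", "rich", "pydantic"]


def run_checks(environ=os.environ):
    """Return a dict mapping a check description to True or False."""
    results = {"python>=3.8": sys.version_info >= (3, 8)}
    for module in REQUIRED_MODULES:
        results["module:" + module] = importlib.util.find_spec(module) is not None
    results["gitview on PATH"] = shutil.which("gitview") is not None
    results["ANTHROPIC_API_KEY set"] = bool(environ.get("ANTHROPIC_API_KEY"))
    results["OPENAI_API_KEY set"] = bool(environ.get("OPENAI_API_KEY"))
    return results


if __name__ == "__main__":
    for name, ok in run_checks().items():
        print("OK  " if ok else "FAIL", name)
```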
Troubleshooting Installation
If the gitview command is not found after installation:
# Option 1: Run as a Python module instead
python -m gitview.cli analyze
# Option 2: Reinstall in editable mode
pip uninstall gitview -y
pip install -e .
# Option 3: Check if it's in your PATH
which gitview # Unix/Linux/Mac
where gitview # Windows
Quick Start
# Using Anthropic Claude (default)
export ANTHROPIC_API_KEY="your-api-key-here"
gitview analyze
# Using OpenAI GPT
export OPENAI_API_KEY="your-api-key-here"
gitview analyze --backend openai
# Using local Ollama (no API key needed)
gitview analyze --backend ollama --model llama3
# Skip LLM summarization (just extract and chunk)
gitview analyze --skip-llm
Usage
Full Analysis Pipeline
The main command runs the complete pipeline: extract → chunk → summarize → story → output
gitview analyze [OPTIONS]
Options:
  -r, --repo PATH          Path to git repository (default: current directory)
  -o, --output PATH        Output directory (default: "output")
  -s, --strategy STRATEGY  Chunking strategy: fixed, time, or adaptive (default: adaptive)
  --chunk-size INTEGER     Chunk size for fixed strategy (default: 50)
  --max-commits INTEGER    Maximum commits to analyze
  --branch TEXT            Branch to analyze (default: HEAD)
  -b, --backend BACKEND    LLM backend: anthropic, openai, or ollama (auto-detected)
  -m, --model TEXT         Model identifier (uses backend defaults if not specified)
  --api-key TEXT           API key for the backend (defaults to env var)
  --ollama-url TEXT        Ollama API URL (default: http://localhost:11434)
  --repo-name TEXT         Repository name for output
  --skip-llm               Skip LLM summarization (extract and chunk only)
Extract Only
Extract git history to a JSONL file without LLM processing:
gitview extract --repo /path/to/repo --output history.jsonl
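A JSONL file holds one JSON object per line, so the extracted history can be loaded with the standard library alone. The sketch below makes no assumption about which fields GitView writes; read_jsonl is a hypothetical helper name.

```python
import json


def read_jsonl(path):
    """Yield one dict per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)
```

For example, `commits = list(read_jsonl("history.jsonl"))` loads every record into memory, while iterating over the generator directly keeps memory use flat for large histories.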
Chunk Only
Chunk an extracted JSONL file into phases:
gitview chunk history.jsonl --output ./phases --strategy adaptive
Chunking Strategies
GitView supports three chunking strategies:
1. Adaptive (Recommended)
Automatically splits history when significant changes occur:
- LOC changes by >30%
- Large deletions/additions detected
- README rewrites
- Major refactorings
gitview analyze --strategy adaptive
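The LOC criterion can be illustrated with a short sketch. It covers only the 30% rule, not the deletion, README, or refactoring signals, and it is not GitView's actual chunker; the loc_total field name and the choice of comparing against the LOC at the start of the current phase are assumptions.

```python
def adaptive_chunks(commits, threshold=0.30):
    """Split commits (oldest first) into phases.

    A new phase starts when total LOC differs from the LOC at the start of
    the current phase by more than `threshold` (a fraction, 0.30 = 30%).
    """
    phases, current, baseline = [], [], None
    for commit in commits:
        loc = commit["loc_total"]  # hypothetical field name
        if current and abs(loc - baseline) / max(baseline, 1) > threshold:
            phases.append(current)
            current, baseline = [], None
        if baseline is None:
            baseline = loc  # first commit of a phase sets the reference point
        current.append(commit)
    if current:
        phases.append(current)
    return phases
```

Comparing against the phase baseline rather than the previous commit means slow, steady growth eventually triggers a split as well.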
2. Fixed Size
Splits history into fixed-size chunks (e.g., 50 commits per phase):
gitview analyze --strategy fixed --chunk-size 50
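Fixed-size chunking amounts to slicing the commit list. A minimal sketch, with fixed_chunks as a hypothetical name:

```python
def fixed_chunks(commits, chunk_size=50):
    """Split a list of commits into consecutive chunks of at most chunk_size."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [commits[i:i + chunk_size] for i in range(0, len(commits), chunk_size)]
```

The last chunk may be shorter than the others when the commit count is not a multiple of the chunk size.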
3. Time-Based
Splits by time periods (week, month, quarter, year):
gitview analyze --strategy time --period quarter
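Time-based chunking groups commits by a calendar key. The sketch below assumes each commit carries an ISO 8601 timestamp in a field named date; both the field name and the function names are hypothetical.

```python
from datetime import datetime


def period_key(when, period="quarter"):
    """Map a datetime to a label such as '2024-Q1' for the given period."""
    if period == "week":
        iso = when.isocalendar()
        return f"{iso[0]}-W{iso[1]:02d}"
    if period == "month":
        return f"{when.year}-{when.month:02d}"
    if period == "quarter":
        return f"{when.year}-Q{(when.month - 1) // 3 + 1}"
    if period == "year":
        return str(when.year)
    raise ValueError(f"unknown period: {period}")


def time_chunks(commits, period="quarter"):
    """Group commits into a dict of period label -> list of commits."""
    phases = {}
    for commit in commits:
        when = datetime.fromisoformat(commit["date"])  # hypothetical field name
        phases.setdefault(period_key(when, period), []).append(commit)
    return phases
```

Unlike the fixed strategy, phases produced this way vary in size: a quiet quarter yields a small phase and a busy one yields a large phase.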
Output Files
GitView generates several output files:
output/
├── repo_history.jsonl # Raw commit data
├── phases/ # Phase data
│ ├── phase_01.json
│ ├── phase_02.json
│ └── phase_index.json
├── history_story.md # Main narrative report
├── timeline.md # Simple timeline
└── history_data.json # Complete data in JSON
Main Report (history_story.md)
Contains:
- Executive Summary: High-level overview for stakeholders
- Timeline: Chronological phases with descriptive headings
- Full Narrative: Complete story of the codebase evolution
- Technical Evolution: Architectural journey and key decisions
- Story of Deletions: What was removed and why
- Phase Details: Detailed breakdown of each phase
- Statistics: Comprehensive metrics
How It Works
Phase 1: Extract Raw History
Analyzes git commits and extracts:
- Commit metadata (hash, author, date, message)
- Lines of code changes (insertions/deletions)
- File statistics
- Language breakdown
- README state and changes
- Code comments and density
- Detection of large changes, refactors, etc.
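The first two items in the list above (commit metadata and insertions/deletions) can be gathered with plain git. The sketch below shells out to git log --numstat instead of using GitPython, which GitView lists as a dependency, so it is an illustration of the idea rather than the extractor's implementation. The marker string, dictionary keys, and function names are assumptions.

```python
import subprocess

MARK = "@@commit@@"
# %H hash, %an author name, %aI ISO 8601 author date, %s subject; %x1f is a separator byte.
LOG_FORMAT = MARK + "%H%x1f%an%x1f%aI%x1f%s"


def parse_numstat_log(text):
    """Parse `git log --numstat` output produced with LOG_FORMAT."""
    commits = []
    for line in text.split("\n"):
        if line.startswith(MARK):
            sha, author, date, subject = line[len(MARK):].split("\x1f", 3)
            commits.append({"hash": sha, "author": author, "date": date,
                            "message": subject, "insertions": 0,
                            "deletions": 0, "files": 0})
            continue
        parts = line.split("\t", 2)
        if len(parts) != 3 or not commits:
            continue  # blank separator lines and anything unexpected
        added, deleted, _path = parts
        current = commits[-1]
        current["files"] += 1
        # Binary files report "-" instead of a line count.
        current["insertions"] += int(added) if added.isdigit() else 0
        current["deletions"] += int(deleted) if deleted.isdigit() else 0
    return commits


def extract(repo=".", branch="HEAD"):
    """Run git in `repo` and return parsed commits, oldest first."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--reverse", "--numstat",
         "--pretty=format:" + LOG_FORMAT, branch],
        check=True, capture_output=True, text=True,
    )
    return parse_numstat_log(result.stdout)
```

Language breakdown, README state, and comment density need the file contents at each commit and are beyond this sketch.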
Phase 2: Chunk into Epochs
Divides history into meaningful phases based on:
- Significant LOC changes
- Large deletions or additions
- Language mix changes
- README rewrites
- Major refactorings
Phase 3: Summarize Each Phase
Uses the configured LLM backend (Claude by default) to generate narrative summaries for each phase, answering:
- What were the main activities?
- Why were changes made?
- What was deleted/added and why?
- How did documentation evolve?
- What do commit messages reveal?
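One way to turn these questions into an LLM request is to assemble them into a prompt together with the phase data. The sketch below is illustrative only: the prompt wording, the function name build_phase_prompt, and its parameters are assumptions, not GitView's actual prompt.

```python
QUESTIONS = [
    "What were the main activities?",
    "Why were changes made?",
    "What was deleted/added and why?",
    "How did documentation evolve?",
    "What do commit messages reveal?",
]


def build_phase_prompt(phase_number, commit_messages, loc_delta):
    """Assemble a plain-text prompt for summarizing one phase."""
    lines = [
        f"Summarize phase {phase_number} of this repository's history.",
        f"Net LOC change: {loc_delta:+d}",
        "Commit messages:",
    ]
    lines += [f"- {message}" for message in commit_messages]
    lines.append("Answer the following:")
    lines += [f"{i}. {question}" for i, question in enumerate(QUESTIONS, 1)]
    return "\n".join(lines)
```

The resulting string would then be sent to whichever backend is configured.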
Phase 4: Generate Global Story
Combines phase summaries to create:
- Executive summary for non-technical readers
- Chronological timeline with meaningful headings
- Technical retrospective
- Story of code deletions and cleanups
- Full detailed narrative
Examples
Analyze a Large Open Source Project
gitview analyze \
--repo /path/to/large-project \
--output ./project-analysis \
--strategy adaptive \
--repo-name "My Project"
Quick Analysis Without LLM
Perfect for quick exploration or when you don't have an API key:
gitview analyze --skip-llm --output ./quick-analysis
Extract and Process Later
# Extract once
gitview extract --repo /path/to/repo --output history.jsonl
# Experiment with different chunking strategies
gitview chunk history.jsonl --strategy adaptive --output ./adaptive-phases
gitview chunk history.jsonl --strategy fixed --chunk-size 25 --output ./fixed-phases
Architecture
┌─────────────────────┐
│   Git Repository    │
└──────────┬──────────┘
           │
           v
┌─────────────────────┐
│     Extractor       │  Analyzes commits, extracts metadata
│   (extractor.py)    │  Output: repo_history.jsonl
└──────────┬──────────┘
           │
           v
┌─────────────────────┐
│      Chunker        │  Splits into meaningful phases
│    (chunker.py)     │  Strategies: adaptive, fixed, time
└──────────┬──────────┘
           │
           v
┌─────────────────────┐
│    Summarizer       │  LLM summarizes each phase
│  (summarizer.py)    │  Uses Claude API
└──────────┬──────────┘
           │
           v
┌─────────────────────┐
│    StoryTeller      │  Generates global narratives
│  (storyteller.py)   │  Multiple story formats
└──────────┬──────────┘
           │
           v
┌─────────────────────┐
│      Writer         │  Outputs markdown, JSON, etc.
│    (writer.py)      │
└─────────────────────┘
Requirements
- Python 3.8+
- Git repository with commit history
- One of the following LLM backends:
  - Anthropic Claude (requires API key)
  - OpenAI GPT (requires API key)
  - Ollama (runs locally, no API key needed)
- Dependencies: gitpython, anthropic, openai, requests, click, rich, pydantic
LLM Backend Configuration
GitView supports three LLM backends with automatic detection based on environment variables:
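Automatic detection of this kind usually comes down to checking environment variables in a fixed order. The sketch below assumes an explicit --backend choice wins, then ANTHROPIC_API_KEY, then OPENAI_API_KEY, with Ollama as the fallback; that precedence and the function name detect_backend are assumptions, not confirmed GitView behavior.

```python
import os


def detect_backend(environ=None, explicit=None):
    """Pick an LLM backend name from an explicit choice or the environment."""
    environ = os.environ if environ is None else environ
    if explicit:
        return explicit  # an explicit --backend value takes priority
    if environ.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    if environ.get("OPENAI_API_KEY"):
        return "openai"
    return "ollama"  # assumed fallback: local backend needs no key
```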
Anthropic Claude (Default)
Get an API key from Anthropic
export ANTHROPIC_API_KEY="your-api-key-here"
gitview analyze
Default models:
- claude-sonnet-4-5-20250929 (default)
- claude-3-opus-20240229 (more powerful)
- claude-3-haiku-20240307 (faster)
OpenAI GPT
Get an API key from OpenAI
export OPENAI_API_KEY="your-api-key-here"
gitview analyze --backend openai
Default models:
- gpt-4 (default)
- gpt-4-turbo-preview (faster)
- gpt-3.5-turbo (cheaper)
Ollama (Local)
Install Ollama and pull a model:
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a model
ollama pull llama3
# Start Ollama server
ollama serve
# Use with GitView (no API key needed)
gitview analyze --backend ollama --model llama3
Popular Ollama models:
- llama3 (default, balanced)
- mistral (fast, good quality)
- codellama (optimized for code)
- mixtral (large, powerful)
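Before running an analysis against Ollama, it can help to confirm that the server is reachable and the model has been pulled. The sketch below queries Ollama's /api/tags endpoint, which lists locally installed models, using only the standard library; the function name list_ollama_models is hypothetical and this is not how GitView itself performs the check.

```python
import http.client
import json
import urllib.request


def list_ollama_models(base_url="http://localhost:11434", timeout=2.0):
    """Return installed model names, or None if the server cannot be reached."""
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            payload = json.load(response)
    except (OSError, ValueError, http.client.HTTPException):
        return None  # connection refused, timeout, or a non-JSON reply
    return [model.get("name") for model in payload.get("models", [])]
```

A return value of None suggests ollama serve is not running at that URL, while an empty list means the server is up but no model has been pulled yet.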
Custom Configuration
# Specify custom model
gitview analyze --backend anthropic --model claude-3-opus-20240229
# Use custom Ollama URL
gitview analyze --backend ollama --ollama-url http://192.168.1.100:11434
# Pass API key directly (instead of env var)
gitview analyze --backend openai --api-key "your-key"
Use Cases
- Technical Documentation: Automatically generate project history documentation
- Onboarding: Help new developers understand codebase evolution
- Retrospectives: Review what worked and what didn't
- Project Reports: Create compelling narratives for stakeholders
- Code Archaeology: Understand why code evolved the way it did
- Cleanup Planning: Identify what to remove based on deletion history
Contributing
Contributions welcome! Please open an issue or submit a pull request.
License
MIT License - see LICENSE file for details