
Analyse the network of forked git repositories


git-fork-recon

Summarise key changes in forked repositories.

Synopsis:

git(hub) repository 🠲 pull forks 🠲 ✨LLM✨ 🠲 summary report

Why ?

A popular repository may have many forks.

Most of them are pointless.

A handful have just the bugfix or feature that matters to you.

Through the dark magic of large language models ✨, git-fork-recon helps find these interesting forks.


Features

  • Filters and prioritizes forks based on number of commits ahead of parent, stars, recent activity and PRs. Ignores forks with no changes.

  • Use locally hosted or remote LLMs with an OpenAI-compatible API.

  • Local caching of git repositories and forks (as remotes)

  • Detailed Markdown reports with:

    • Repository overview
    • Analysis of significant forks
    • Commit details and statistics
    • Links to GitHub commits and repositories
    • Overall summary of changes highlighting bugfixes, new features and innovations in the most interesting forks
  • REST API server for programmatic access with:

    • Asynchronous analysis with background processing
    • Versioned caching with filesystem storage
    • Authentication support with Bearer tokens
    • Configurable concurrency and rate limiting
    • Simple web UI
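The fork filtering and prioritisation described in the first feature can be pictured with a small sketch. This is not git-fork-recon's actual algorithm: the `Fork` record, its field names and the scoring weights are all hypothetical, chosen only to illustrate "ignore forks with no changes, then rank by commits ahead, stars, PRs and recent activity".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Fork:
    # Hypothetical record; field names are illustrative, not the tool's data model.
    name: str
    commits_ahead: int
    stars: int
    open_prs: int
    last_pushed: datetime


def rank_forks(forks, now=None, recent_days=365):
    """Drop forks with no changes, then sort the rest by a simple score."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=recent_days)

    def score(fork):
        # Arbitrary illustrative weights, not taken from git-fork-recon.
        recent_bonus = 5 if fork.last_pushed >= cutoff else 0
        return fork.commits_ahead + 2 * fork.stars + 3 * fork.open_prs + recent_bonus

    changed = [fork for fork in forks if fork.commits_ahead > 0]
    return sorted(changed, key=score, reverse=True)
```

A fork with zero commits ahead never reaches the ranking step, however many stars it has.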

Quickstart

The first time you run git-fork-recon, it starts the first-time configuration wizard (see Configuration below). You'll need a GitHub access token and the details of an OpenAI-compatible endpoint.

(using uv for convenience)

# Install uv if you haven't already (or use pip and a virtualenv etc if you prefer)
curl -LsSf https://astral.sh/uv/install.sh | sh

# This will start the first-time configuration wizard, or show --help
uvx git-fork-recon

# Once configured, analyse a specific repository
uvx git-fork-recon https://github.com/DunbrackLab/IPSAE

This will generate a Markdown report in the current directory ({username}-{repo}-forks.md).
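As a rough illustration of the {username}-{repo}-forks.md naming pattern, the sketch below derives the file name from a repository URL. The function name `default_report_name` is hypothetical, and the handling of a trailing `.git` is an assumption rather than documented behaviour.

```python
from urllib.parse import urlparse


def default_report_name(repo_url: str) -> str:
    """Build '{username}-{repo}-forks.md' from a GitHub repository URL."""
    parts = urlparse(repo_url).path.strip("/").split("/")
    if len(parts) < 2 or not all(parts[:2]):
        raise ValueError(f"expected https://github.com/<user>/<repo>, got {repo_url!r}")
    username, repo = parts[0], parts[1]
    if repo.endswith(".git"):  # assumption: clone-style URLs are tolerated
        repo = repo[: -len(".git")]
    return f"{username}-{repo}-forks.md"


print(default_report_name("https://github.com/DunbrackLab/IPSAE"))
# DunbrackLab-IPSAE-forks.md
```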

Tip: you can view the report in the terminal with frogmouth:

uvx frogmouth DunbrackLab-IPSAE-forks.md

Web interface

To run the simple local web UI:

uvx --from 'git-fork-recon[server]' git-fork-recon-server

Go to http://localhost:8000/ui to see the web UI.

Installation (quick)

uv tool install 'git-fork-recon[server]'

Now you can run: git-fork-recon or git-fork-recon-server like any other command.

Installation (development)

Quick: using uv sync (automatically creates a .venv and installs server and dev dependencies)

# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

uv sync --all-extras

source .venv/bin/activate

Note: The first time you run uv sync, it will create a uv.lock file for reproducible builds. This file should be committed to version control to ensure all developers use the exact same dependency versions.


Or: using uv with manual venv creation:

# Create and activate a new virtual environment
uv venv
source .venv/bin/activate

# Install the package in editable mode, with optional server and dev dependencies
uv pip install -e '.[server,dev]'

Configuration

Configuration is stored in a TOML file located in the platform-specific config directory (e.g., ~/.config/git-fork-recon/config.toml on Linux). On first run, an interactive setup wizard will guide you through configuration.

First-Time Setup

When you run git-fork-recon for the first time, you'll be prompted for some configuration values.

You will need:

  1. A GitHub API token to READ repositories and their forks.
  2. Access to an OpenAI-compatible endpoint (the wizard options below list the supported choices).

First-time configuration wizard options:

  • OpenAI-Compatible Endpoint: Choose from:
    • Environment variable (OPENAI_BASE_URL)
    • Local Ollama server (http://localhost:11434)
    • OpenRouter
    • Google AI Studio
    • Custom URL
  • Endpoint API Key: Enter your API key or choose to always use an environment variable
  • Model: Select a model based on your chosen endpoint
  • GitHub Token: Enter your GitHub token or choose to always use an environment variable
  • Cache Directories: Configure repository and report cache locations (defaults to $HOME/.cache/git-fork-recon/repos and $HOME/.cache/git-fork-recon/reports)

Config File Structure

The config file (config.toml) has the following structure:

# Configure an OpenAI-compatible endpoint
[endpoint]
base_url = "https://openrouter.ai/api/v1"
api_key = "sk-..."
model = "deepseek/deepseek-v3.2"
# context_length = 64000  # Optional: Override default context length

[github]
# Get a token at https://github.com/settings/tokens with permissions: public_repo, user:email
token = "ghp_..."

[cache]
repo = "$HOME/.cache/git-fork-recon/repos"
report = "$HOME/.cache/git-fork-recon/reports"

[server]
# Server configuration options (commented out by default)

Environment Variable References

You can use environment variable references in the config file by prefixing with $:

[endpoint]
base_url = "$OPENAI_BASE_URL"  # Reads from OPENAI_BASE_URL env var
api_key = "$OPENAI_API_KEY"    # Reads from OPENAI_API_KEY env var

[github]
token = "$GITHUB_TOKEN"        # Reads from GITHUB_TOKEN env var

Custom Config File

You can specify a custom config file location using the --config option:

git-fork-recon --config /path/to/config.toml https://github.com/user/repo

Command-line options

$ git-fork-recon --help

 Usage: git-fork-recon [OPTIONS] [REPO_URL]

 Analyze a GitHub repository's fork network and generate a summary report.


╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│   repo_url      [REPO_URL]  URL of the GitHub repository to analyze          │
│                             [default: None]                                  │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --output              -o      PATH     Output file path (defaults to         │
│                                        {repo_name}-forks.md)                 │
│                                        [default: None]                       │
│ --active-within               TEXT     Only consider forks with activity     │
│                                        within this time period (e.g. '1      │
│                                        hour', '2 days', '6 months', '1       │
│                                        year')                                │
│                                        [default: None]                       │
│ --config                      PATH     Path to config.toml file [default:    │
│                                        None]                                 │
│ --model                       TEXT     OpenRouter model to use (overrides    │
│                                        MODEL env var)                        │
│                                        [default: None]                       │
│ --context-length              INTEGER  Override model context length         │
│                                        (overrides CONTEXT_LENGTH env var)    │
│                                        [default: None]                       │
│ --api-base-url                TEXT     OpenAI-compatible API base URL        │
│                                        [default: None]                       │
│ --api-key-env-var             TEXT     Environment variable containing the   │
│                                        API key                               │
│                                        [default: None]                       │
│ --parallel            -p      INTEGER  Number of parallel requests           │
│                                        [default: 5]                          │
│ --verbose             -v               Enable verbose logging                │
│ --clear-cache                          Clear cached repository data before   │
│                                        analysis                              │
│ --force-fetch                          Force fetch updates from cached       │
│                                        repositories and remotes              │
│ --force                                Force overwrite existing output file  │
│ --max-forks                   INTEGER  Maximum number of forks to analyze    │
│                                        (default: no limit)                   │
│                                        [default: None]                       │
│ --output-formats              TEXT     Comma-separated list of additional    │
│                                        formats to generate (html,pdf)        │
│                                        [default: None]                       │
│ --install-completion                   Install completion for the current    │
│                                        shell.                                │
│ --show-completion                      Show completion for the current       │
│                                        shell, to copy it or customize the    │
│                                        installation.                         │
│ --help                                 Show this message and exit.           │
╰──────────────────────────────────────────────────────────────────────────────╯
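The --active-within option takes free-text periods such as '1 hour', '2 days', '6 months' or '1 year'. The sketch below shows one plausible way to turn such a string into a `timedelta`. The function `parse_period` and the approximate month and year lengths are assumptions made for illustration, not the parser git-fork-recon uses.

```python
import re
from datetime import timedelta

# Approximate unit lengths in hours; months and years vary, so this is a rough cutoff.
_UNIT_HOURS = {"hour": 1, "day": 24, "week": 168, "month": 720, "year": 8760}


def parse_period(text: str) -> timedelta:
    """Parse strings like '2 days' or '6 months' into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]+?)s?\s*", text.lower())
    if not match or match.group(2) not in _UNIT_HOURS:
        raise ValueError(f"unrecognised time period: {text!r}")
    return timedelta(hours=int(match.group(1)) * _UNIT_HOURS[match.group(2)])


print(parse_period("2 days"))  # 2 days, 0:00:00
```

A fork would then count as active if its last activity is later than the current time minus the parsed period.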

Server-mode configuration

For the REST API server, additional environment variables are available:

  • ALLOWED_MODELS: Comma-separated list of allowed LLM models (default: unrestricted)
  • SERVER_HOST: Host to bind the server to (default: 127.0.0.1)
  • SERVER_PORT: Port to bind the server to (default: 8000)
  • REPORT_CACHE_DIR: Directory for server report cache (defaults to ~/.cache/git-fork-recon/reports using platformdirs)
  • DISABLE_AUTH: Set to 1 to disable authentication (default: authentication enabled)
  • AUTH_BEARER_TOKEN: Bearer token for API authentication
  • PARALLEL_TASKS: Maximum concurrent analysis tasks (default: 2)
  • DISABLE_UI: Set to 1 to disable the web UI at the /ui endpoint (default: UI enabled)
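To make the defaults above concrete, here is a minimal sketch that reads these environment variables into a settings object. The `ServerSettings` class and `load_settings` function are hypothetical and do not reflect the server's internal code; only the variable names and defaults come from the list above.

```python
import os
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ServerSettings:
    host: str
    port: int
    parallel_tasks: int
    auth_enabled: bool
    bearer_token: Optional[str]
    ui_enabled: bool
    allowed_models: Tuple[str, ...]  # empty tuple means unrestricted


def load_settings(env=None) -> ServerSettings:
    """Read server settings from environment variables, applying the documented defaults."""
    env = os.environ if env is None else env
    models = tuple(m.strip() for m in env.get("ALLOWED_MODELS", "").split(",") if m.strip())
    return ServerSettings(
        host=env.get("SERVER_HOST", "127.0.0.1"),
        port=int(env.get("SERVER_PORT", "8000")),
        parallel_tasks=int(env.get("PARALLEL_TASKS", "2")),
        auth_enabled=env.get("DISABLE_AUTH") != "1",
        bearer_token=env.get("AUTH_BEARER_TOKEN"),
        ui_enabled=env.get("DISABLE_UI") != "1",
        allowed_models=models,
    )
```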

Run the server

Start the server:

# Using installed package
git-fork-recon-server --host 127.0.0.1 --port 8000

# Using uvx
uvx --from 'git-fork-recon[server]' git-fork-recon-server --host 127.0.0.1 --port 8000

Go to http://localhost:8000/ui to see the web UI.

REST API Endpoints

  • POST /analyze - Start repository analysis
  • GET /report/{owner}/{repo}/{timestamp}/report.{format} - Get cached report
  • GET /report/{owner}/{repo}/latest/report.{format} - Get latest cached report
  • GET /report/{owner}/{repo}/{timestamp}/status - Get status for specific report version
  • GET /report/{owner}/{repo}/latest/status - Get status for latest report
  • GET /metadata/{owner}/{repo}/{timestamp} - Get metadata for specific report version
  • GET /metadata/{owner}/{repo}/latest - Get metadata for latest report
  • GET /health - Health check endpoint
  • GET /health/ready - Readiness check endpoint
  • GET /ui - Web UI for repository analysis (unless disabled with DISABLE_UI=1)
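The report and status routes share one path layout, so a client can build them with a small helper. The sketch below is illustrative only: `report_path` and `status_path` are hypothetical names, and the mapping from the "markdown" format to the `.md` extension is inferred from the example URLs in this document.

```python
from urllib.parse import quote

# Extensions inferred from the example URLs; "markdown" is served as report.md.
_EXTENSIONS = {"markdown": "md", "json": "json", "html": "html", "pdf": "pdf"}


def report_path(owner, repo, fmt="markdown", timestamp="latest"):
    """Return the path of a cached report, e.g. /report/{owner}/{repo}/latest/report.md."""
    ext = _EXTENSIONS[fmt]
    return f"/report/{quote(owner, safe='')}/{quote(repo, safe='')}/{timestamp}/report.{ext}"


def status_path(owner, repo, timestamp="latest"):
    """Return the path of the status resource for a report version."""
    return f"/report/{quote(owner, safe='')}/{quote(repo, safe='')}/{timestamp}/status"


print(report_path("martinpacesa", "BindCraft"))
# /report/martinpacesa/BindCraft/latest/report.md
```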

Example Request

curl -X POST "http://localhost:8000/analyze" \
  -H "Authorization: Bearer your-token" \
  -H "Content-Type: application/json" \
  -d '{
    "repo_url": "https://github.com/martinpacesa/BindCraft",
    "model": "deepseek/deepseek-chat-v3-0324:free",
    "format": "markdown"
  }'

Example Response

{
  "status": "generating",
  "retry-after": "2025-10-04T12:35:00Z"
}

When analysis is complete:

{
  "status": "available",
  "link": "/report/martinpacesa/BindCraft/latest/report.md",
  "last-updated": "2025-10-04T12:34:56Z"
}
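Because analysis runs in the background, a client typically polls until the status becomes "available". The following is a minimal standard-library sketch of such a loop. It assumes the status endpoint returns the same JSON shape as the responses above, which this document does not confirm. The function names, polling interval and poll limit are all hypothetical.

```python
import json
import time
import urllib.request


def http_json(method, url, token, body=None):
    """Send a request with a Bearer token and decode the JSON reply."""
    data = json.dumps(body).encode() if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_report(base_url, owner, repo, token, fetch=http_json, delay=5.0, max_polls=60):
    """Poll the latest status endpoint until the report is available; return its link."""
    url = f"{base_url}/report/{owner}/{repo}/latest/status"
    for _ in range(max_polls):
        reply = fetch("GET", url, token)
        # Assumption: the reply carries "status" and, once finished, "link".
        if reply.get("status") == "available":
            return reply["link"]
        time.sleep(delay)
    raise TimeoutError(f"report for {owner}/{repo} not available after {max_polls} polls")
```

The `fetch` parameter exists so the loop can be exercised with a stub instead of a live server.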

Retrieving the Generated Report

Once the analysis is complete, you can retrieve the report using the provided link:

# Get the latest report
curl -X GET "http://localhost:8000/report/martinpacesa/BindCraft/latest/report.md" \
  -H "Authorization: Bearer your-token" \
  -o martinpacesa-BindCraft-forks.md

# Or get a specific version by timestamp
curl -X GET "http://localhost:8000/report/martinpacesa/BindCraft/2025-10-04T12-34-56Z/report.md" \
  -H "Authorization: Bearer your-token" \
  -o martinpacesa-BindCraft-forks-v2025-10-04.md

# Get report in different formats (markdown, json, html, pdf)
curl -X GET "http://localhost:8000/report/martinpacesa/BindCraft/latest/report.json" \
  -H "Authorization: Bearer your-token" \
  -o martinpacesa-BindCraft-forks.json

Checking Analysis Status

If you request a report while it's still being generated, you'll receive a 202 Accepted response with a Retry-After header:

curl -X GET "http://localhost:8000/report/martinpacesa/BindCraft/latest/report.md" \
  -H "Authorization: Bearer your-token"

Response (while generating):

{
  "status": "generating",
  "retry-after": "Sun, 05 Oct 2025 12:35:00 GMT"
}
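The examples in this document show retry-after values in two shapes, an ISO 8601 timestamp and an HTTP-style date. A client that wants to sleep until that moment could normalise both as sketched below. `parse_retry_after` is a hypothetical helper, and treating a value without a time zone as UTC is an assumption.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value: str) -> datetime:
    """Parse a retry-after value given as ISO 8601 or as an HTTP date, returning UTC."""
    try:
        # fromisoformat() only accepts a trailing "Z" from Python 3.11 onwards.
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        parsed = parsedate_to_datetime(value)
    if parsed.tzinfo is None:
        # Assumption: values without a zone are interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


print(parse_retry_after("2025-10-04T12:35:00Z").isoformat())
# 2025-10-04T12:35:00+00:00
```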

When run from the command line, git-fork-recon writes the report to {username}-{repo}-forks.md by default (use -o to specify a different file name, or -o - to print to stdout).
