AI-powered PR impact analysis — risk scoring, blast radius, test gaps, deployment strategy
Project description
pull-assist
         Multi-agent PR impact analysis — a LangGraph-orchestrated pipeline of specialist agents scores merge risk, maps blast radius, surfaces test gaps, simulates runtime breakage, and advises on rollback and deployment strategy.
Point it at a GitHub pull request or a local diff; get a structured report with risk score, top concerns, propagation chains, and next steps before you merge.
Table of contents
- How it works
- Architecture
- Agent pipeline
- Graph layer
- Install via pip
- Custom CLI (
pa) - Backend: AMD GPU + vLLM
- Quick start
- Input modes
- Configuration
- Admin & GPU registry
- Reports & memory
- Development
- Project structure
- License
How it works
- Ingest — Fetch a PR from GitHub (or read a local
.patch/ diff file). - Parse — Extract changed files, languages, symbols, and test coverage signals from the diff.
- Analyze — Run a LangGraph orchestration of specialized LLM agents (with optional GitHub tool calls).
- Reason — Build an evidence graph, failure propagation chains, and deployment advice (no extra LLM calls).
- Report — Print a Rich terminal summary and save Markdown + JSON under
reports/. - Remember — Persist results in a local SQLite memory store for repo history context on future runs.
Architecture
┌─────────────────────────────────────────────────────────────────────────┐
│ Your machine │
│ ┌──────────────┐ ┌─────────────┐ ┌────────────────────────────┐ │
│ │ pa / │───▶│ GitHub API │ │ LangGraph orchestrator │ │
│ │ pullassist │ │ (PR, diff, │ │ + 7 specialized agents │ │
│ │ main.py │ │ search) │ │ + graph layer │ │
│ └──────┬───────┘ └─────────────┘ └─────────────┬──────────────┘ │
│ │ │ │
│ │ OpenAI-compatible API │ │
│ └──────────────────────────────────────────────┘ │
└───────────────────────────────┬─────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────┐
│ GPU server (AMD + ROCm) │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐ │
│ │ Nginx │───▶│ FastAPI │───▶│ vLLM │ │
│ │ :443 / :80 │ │ Proxy :9000 │ │ (DeepSeek-Coder-V2) │ │
│ │ │ │ auth, rate │ │ :8000 │ │
│ │ │ │ limits, SSE │ │ AMD ROCm / rocm/vllm │ │
│ └──────────────┘ └──────────────┘ └──────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
| Layer | Role |
|---|---|
CLI (pa, pullassist) |
User-facing commands, config, registry discovery |
| Agents | LLM reasoning + GitHub tools (symbol search, file fetch, test discovery) |
| Graph layer | Deterministic evidence graph, propagation chains, deployment advice |
Proxy (server/proxy.py) |
API keys, rate limiting, request queuing, SSE coalescing for LangChain |
| vLLM | Model inference on AMD GPUs via ROCm; OpenAI-compatible /v1 API |
Agent pipeline
Orchestration lives in agents/orchestrator.py as a LangGraph StateGraph. All agents share a typed state dict (PRAnalysisState) — a shared “whiteboard” each node reads and writes.
Execution flow
START
→ Dependency Mapper (GitHub tools: symbol search, file tree)
→ Change Simulator (GitHub tools: fetch caller context)
→ Test Gap Agent (GitHub tools: find tests, fetch test files)
→ Rollback Advisor (LLM only)
→ Business Impact (deterministic path patterns — no LLM)
→ Risk Evaluator (LLM only — synthesizes all findings)
→ Critic (LLM only — challenges other agents)
→ [conditional] if SIGNIFICANT_ISSUES and reruns < 2:
re-run flagged agents with Critic objections
→ Risk Evaluator (re-score)
→ Critic (re-check)
→ Graph Layer (deterministic)
END
Agents
| Agent | Tools | Purpose |
|---|---|---|
| Dependency Mapper | search_symbol, file_tree |
Blast radius: direct/indirect dependents of changed symbols |
| Change Simulator | fetch_file |
Before/after runtime behavior; breaking scenarios per caller |
| Test Gap Agent | find_test_files, fetch_file |
Uncovered functions and missing test scenarios |
| Rollback Advisor | — | Rollback difficulty, risks, and step-by-step guidance |
| Business Impact | — | Classifies changed paths into business domains (auth, payments, etc.) via pattern rules |
| Risk Evaluator | — | Weighted 0–10 risk score across blast radius, tests, runtime, complexity |
| Critic | — | Flags inconsistencies; can trigger re-runs and score corrections |
Tool-calling agents use LangChain’s AgentExecutor. By default, legacy prompt-based tools work with plain vLLM. Set USE_NATIVE_TOOL_CALLING=true only when vLLM is started with --enable-auto-tool-choice and a matching --tool-call-parser.
After agents finish, outputs are validated against JSON schemas in config/settings.py (AGENT_OUTPUT_SCHEMAS).
Graph layer (deterministic)
No additional LLM calls — built from agent outputs and diff metadata:
| Module | Output |
|---|---|
graph/evidence_graph.py |
Symbol-centric graph of callers and transitive deps (confidence degrades per hop) |
graph/propagation_engine.py |
Failure propagation chains with arrow diagrams |
graph/deployment_advisor.py |
Deployment strategy recommendation (e.g. canary vs full rollout) |
Static diff analysis (github/diff_static_risks.py) augments test-gap findings from the raw patch.
Install via pip
The package is published as pull-assist on PyPI-style metadata (pyproject.toml). After install, two console scripts are available: pa and pullassist.
From PyPI
pip install pull-assist
Optional extras:
pip install "pull-assist[server]" # FastAPI proxy (GPU host)
pip install "pull-assist[dev]" # pytest, build, twine
From source (GitHub)
git clone https://github.com/Rohan-Julius/pull-assist.git
cd pull-assist
pip install -e .
Requirements
- Python 3.10+
- A GitHub personal access token (for PR URLs and remote repo tool calls)
- Access to an LLM endpoint (local vLLM or shared GPU server via the proxy)
Verify installation:
pa --version
pullassist --version
Custom CLI (pa)
The CLI is built with Click and Rich (cli/app.py). It is the recommended interface for day-to-day use.
Commands
| Command | Description |
|---|---|
pa review <PR_URL> |
Analyze a GitHub pull request |
pa review --diff FILE --repo owner/repo |
Local patch + GitHub context for tools |
pa review --diff FILE --local PATH |
Fully offline (local git only) |
pa history [repo] |
Past analyses from the memory store |
pa config show | set | reset |
Manage ~/.pull-assist/config.json |
pa status |
Connectivity checks (GitHub, LLM, memory, deps) |
pa admin … |
GPU registry management (admin only) |
Examples
# Configure once
pa config set --token ghp_xxxxxxxx --key pa-your-api-key
pa config set --server http://your-gpu-host:9000/v1 # optional if using registry
# Verify setup
pa status
# Analyze a PR
pa review https://github.com/owner/repo/pull/123
# Local diff (offline)
git diff main..feature > changes.patch
pa review --diff changes.patch --local .
# Compact vs full report
pa review https://github.com/owner/repo/pull/123 --full
# Data pipeline only (no GPU)
pa review https://github.com/owner/repo/pull/123 --day1-only
VS Code integration
When run inside VS Code (TERM_PROGRAM=vscode), pa can open a dedicated integrated terminal with a clean pull-assist> prompt (see cli/app.py).
Legacy entry point
main.py remains available for scripting and direct python main.py usage with the same three input modes and --day1-only / --verbose flags.
Backend: AMD GPU + vLLM
Inference runs on AMD GPUs using ROCm and vLLM. vLLM exposes an OpenAI-compatible API (/v1/chat/completions), which LangChain’s ChatOpenAI uses via config/settings.py.
Default model
deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct (configurable via LLM_MODEL / pa config set --model)
Two deployment options
Option A — vLLM already running (proxy only):
# On the GPU host: start vLLM on :8000 (see docker-compose.yml comments)
docker compose up -d
Option B — Full stack (vLLM + proxy + nginx):
docker compose -f docker-compose.full.yml up -d
For AMD ROCm, use the ROCm vLLM image in docker-compose.full.yml:
image: rocm/vllm:latest
devices:
- /dev/kfd
- /dev/dri
Install ROCm drivers on the host first: AMD ROCm install guide.
Proxy layer
server/proxy.py is a FastAPI app that sits in front of vLLM:
- Per-user API key authentication
- Rate limiting and concurrency caps per key
- Usage logging
- SSE coalescing so LangChain clients that disable streaming still work through port
:9000
CLI (:pa) → Proxy (:9000) → vLLM (:8000)
Start the proxy:
uvicorn server.proxy:app --host 0.0.0.0 --port 9000
Environment variables:
| Variable | Default | Description |
|---|---|---|
VLLM_BACKEND_URL |
http://localhost:8000 |
Upstream vLLM base URL |
PROXY_PORT |
9000 |
Proxy listen port |
RATE_LIMIT_PER_MINUTE |
30 |
Requests per API key per minute |
MAX_CONCURRENT_PER_KEY |
2 |
Parallel requests per key |
API_KEYS_FILE |
server/api_keys.json |
Key store (gitignored) |
Running vLLM manually (AMD host)
python -m vllm.entrypoints.openai.api_server \
--model deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct \
--host 0.0.0.0 --port 8000 \
--dtype float16 --max-model-len 16384 \
--gpu-memory-utilization 0.85 --trust-remote-code
Set LLM_MAX_TOKENS (default 1024) so prompt + completion fit within --max-model-len.
Quick start
End users (CLI + shared GPU)
pip install pull-assist
pa config set --token ghp_xxx --key pa-xxx
pa status
pa review https://github.com/owner/repo/pull/1
If your team uses the GPU registry (GitHub Gist), the active server URL is discovered automatically — you only need your API key and GitHub token.
Developers (local)
python -m venv venv && source venv/bin/activate
pip install -e .
cp .env.example .env # create and fill in tokens (see below)
python main.py --pr https://github.com/owner/repo/pull/1
Example .env:
GITHUB_TOKEN=ghp_...
LLM_BASE_URL=http://localhost:8000/v1
LLM_API_KEY=not-needed
LLM_MODEL=deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
Input modes
| Mode | Command | GitHub API | GPU |
|---|---|---|---|
| GitHub PR URL | pa review <URL> |
Yes (PR + diff + reviews) | Yes |
| Local diff + remote repo | pa review --diff f.patch --repo o/r |
Yes (tools only) | Yes |
| Local diff + local repo | pa review --diff f.patch --local . |
No | Yes |
| Context only | --day1-only |
As above | No |
Supported languages for symbol extraction include Python, JavaScript/TypeScript, Java, Go, Ruby, Rust, and more (see config/settings.py → SUPPORTED_LANGUAGES).
Configuration
Settings are layered:
~/.pull-assist/config.json(CLI — preferred for end users)- Environment variables (
PA_*orLLM_*,GITHUB_TOKEN) .env(local development)- GPU registry Gist (auto-discovers
LLM_BASE_URLwhen active)
pa config show
pa config set --token ghp_... --server http://host:9000/v1 --key pa-abc --model deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
pa config reset
| Setting | Env vars |
|---|---|
| LLM server | PA_SERVER, LLM_BASE_URL |
| API key | PA_API_KEY, LLM_API_KEY |
| GitHub token | PA_GITHUB_TOKEN, GITHUB_TOKEN |
| Model | PA_MODEL, LLM_MODEL |
Admin & GPU registry
For teams sharing one GPU server, a GitHub Gist registry (cli/registry.py) publishes the current proxy URL so users do not need the raw IP.
Admin (one-time setup):
pa admin init
git add cli/registry.py && git commit -m "Add registry gist ID"
# When GPU is up
pa admin set-gpu http://<GPU_IP>:9000/v1
pa admin deactivate # mark offline
pa admin status
Users: pa config set --token … --key … — registry resolves the server when active: true.
Reports & memory
| Output | Location |
|---|---|
| Terminal summary | Rich panels via output/formatter.py |
| Markdown report | reports/<repo>-pr-<n>-<timestamp>.md |
| JSON report | reports/<repo>-pr-<n>-<timestamp>.json |
| Analysis history | memory/pr_history.db (SQLite) |
pa history and pa history owner/repo browse past runs. Prior analyses are injected into agent prompts as repo context.
Development
# Install with dev deps
pip install -e ".[dev]" # or: pip install -r requirements.txt
# Run tests
pytest
# Run without agents (diff parsing + context only)
python main.py --pr <URL> --day1-only
pa review <URL> --day1-only
Key test modules: tests/test_agents.py, tests/test_graph_layer.py, tests/test_diff_parser.py, tests/test_orchestrator_patch.py, tests/test_proxy_sse.py.
Project structure
pull-assist/
├── agents/ # LangGraph nodes + specialist agents
│ ├── orchestrator.py # StateGraph definition
│ ├── dependency_mapper.py
│ ├── change_simulator.py
│ ├── test_gap.py
│ ├── risk_evaluator.py
│ ├── critic.py
│ ├── rollback_advisor.py
│ └── business_impact.py
├── cli/ # `pa` / `pullassist` Click CLI
├── config/settings.py # LLM, GitHub, prompts, risk weights
├── github/ # PR client, diff parser, static risks
├── graph/ # Evidence graph, propagation, deployment
├── memory/ # SQLite PR history store
├── output/ # Report builder + Rich formatter
├── server/proxy.py # FastAPI auth proxy for vLLM
├── tools/ # LangChain GitHub tools
├── main.py # Script entry point
├── docker-compose.yml # Proxy-only (vLLM external)
├── docker-compose.full.yml # vLLM + proxy + nginx (AMD ROCm)
└── pyproject.toml # Package metadata + console scripts
License
MIT — see LICENSE.
Publishing to PyPI (maintainers)
- Create accounts: pypi.org and optionally test.pypi.org.
- Enable 2FA on PyPI (required for uploads).
- Create an API token: Account → API tokens → scope Entire account (first upload) or project
pull-assist. - Bump
versioninpyproject.tomlandcli/__init__.pyfor each release. - Build and upload:
pip install build twine
export TWINE_USERNAME=__token__
export TWINE_PASSWORD=pypi-AgENdHlwaS5vcmcC... # your token — never commit this
# Dry run on TestPyPI first
./scripts/publish-pypi.sh test
pip install -i https://test.pypi.org/simple/ pull-assist
# Production
./scripts/publish-pypi.sh
The name pull-assist is not taken on PyPI yet (verified). Package builds pass twine check.
Links
- Repository: github.com/Rohan-Julius/pull-assist
- Issues & contributions: use GitHub Issues and Pull Requests on the repo above
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file pull_assist-0.1.0.tar.gz.
File metadata
- Download URL: pull_assist-0.1.0.tar.gz
- Upload date:
- Size: 734.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
29a5455dbb04fd71c769412b8b7f376170c36a3228b8d39bdfb127fcacad623c
|
|
| MD5 |
e4d438e757dc37bd03f9abf115287cb7
|
|
| BLAKE2b-256 |
530d1fc2fda310e1356dc189d2045032ab721e2cf4fc5e2e839230458a1f9a64
|
File details
Details for the file pull_assist-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pull_assist-0.1.0-py3-none-any.whl
- Upload date:
- Size: 132.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d462855fe88fe55e02aee3086da21ae6714a3f28e68e11a65b0a97e4a973bd5b
|
|
| MD5 |
6e8ee5b184d85bf6ff292ac780d0ec98
|
|
| BLAKE2b-256 |
681335561eb4fde0fe6adfa2a524cfe735cfa19cc63178aafb7d6155057040fb
|