
Normalized GitHub-CLI bridge for Claude Code and other MCP clients — typed JSON envelopes, structured error codes, audit log, mode-rich field rosters. CLI + MCP server.


github-inside-claude-code

License: MIT

A normalized wrapper around the GitHub CLI (gh) that makes GitHub operations as integral to Claude Code as Bash tool calls.

The CLI is installed as gh-tool_use: the gh- prefix is there because it wraps the GitHub CLI. gicc is the acronym of the repository name. Throughout this document the project is referred to as the bridge for readability; the canonical project name is the repository name above.

Status: v0.9.3 (Beta). 88 ops across 6 families ship.

| Family | Count | Examples |
|---|---|---|
| gh_write_* | 29 | pr_create / pr_merge / pr_review / pr_ready / pr_close / pr_edit / pr_comment / issue_create / issue_close / issue_edit / issue_reopen / issue_comment / label_create / label_delete / secret_set / variable_set / release_create / release_upload / project_item_add / tasks_add / tasks_done / aw_compile / aw_run / aw_enable / aw_disable / aw_add / aw_update / run_rerun / run_cancel |
| gh_read_* | 30 | pr / pr_list / pr_checks / pr_diff / issue / issue_list / repo / file / search / run / run_list / workflow_list / release_list / release_view / release_download / project_list / project_view / project_item_list / project_field_list / aw_status / aw_list / aw_checks / aw_domains / aw_validate / tasks_list / tasks_today / tasks_triage / secret_list / variable_list / label_list |
| gh_session_* | 6 | whoami / ensure_scope / rate_limit / config_get / extensions_list / audit_summary |
| gh_watch_* | 4 | pr / ci / workflow / pr_checks |
| gh_query_* | 11 | pr_reviews / repo_activity / repo_stats / models_list / models_view / models_run / tasks_standup / aw_health / aw_audit / my_dashboard / notifications |
| gh_research_* | 8 | stackexchange_search / stackexchange_question / wikipedia_search / wikidata_sparql / hackernews_search / hackernews_item / tavily_search / tavily_research |
| Total | 88 | |

The problem

gh is a great tool for humans. It is a difficult tool for AI agents.

Bash is integral to Claude Code because four properties are taken for granted:

  1. Availability — Claude types ls without first asking "is ls installed?"
  2. Predictable output schema — stdout/stderr/exit-code, parseable.
  3. Composability — pipes, output-of-A as input-to-B.
  4. Visible failure semantics — exit code ≠ 0 means something, without guessing.

gh only really delivers on (1). The other three are inconsistent across its surface, because the CLI grew over years across many authors. Six representative calls show the friction an agent hits every session:

gh pr list                          # TTY table, parseable only by column-counting
gh pr list --json number,title      # JSON array, but no error info when auth fails
gh pr view 42                       # TTY with ANSI; body at end without delimiter
gh pr view 42 --json body           # JSON, but body is raw markdown without meta
gh api repos/X/Y/pulls/42           # raw GitHub API — different schema than `gh pr view`
gh issue close 42                   # silent on success; English prose on failure

For a human, this is fine — you skim, you read, you move on. For an agent, every call requires guessing which output shape will come back and how to interpret a failure. The result is fragile multi-step workflows and a lot of re-reading the same docs.

The bridge is not a replacement for gh. gh stays. The bridge wraps the subset of operations an agent actually performs and gives them one predictable shape.
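To make "one predictable shape" concrete, here is a minimal sketch of folding a raw (exit code, stdout, stderr) triple into a single dictionary. The function name normalize and the exact dictionary layout are assumptions for illustration, not the bridge's actual code; only the success/data/error split, the unknown code, and the raw_stderr field are taken from this document.

```python
import json
from typing import Any


def normalize(returncode: int, stdout: str, stderr: str) -> dict[str, Any]:
    """Fold a finished subprocess into one success-or-error dictionary."""
    if returncode != 0:
        # Classification into the closed error enum would happen here; this
        # sketch reports every failure as "unknown" and keeps the raw text.
        return {"success": False, "data": None,
                "error": {"code": "unknown", "raw_stderr": stderr.strip()}}
    text = stdout.strip()
    try:
        # Commands run with --json print JSON; silent commands print nothing.
        data = json.loads(text) if text else None
    except json.JSONDecodeError:
        data = {"text": text}  # plain-text output, kept verbatim
    return {"success": True, "data": data, "error": None}


# Three of the shapes listed above, all reduced to the same envelope.
listing = normalize(0, '[{"number": 42, "title": "Fix typo"}]', "")
closed = normalize(0, "", "")
failed = normalize(1, "", "some English prose\n")
```

Whatever the underlying command printed, the caller checks the success key first and never has to guess between a table, a JSON array, or silence.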


Naming convention

Tool-call names must match the Anthropic regex ^[a-zA-Z0-9_-]{1,64}$ — dots are not allowed. The bridge exposes each operation in three equivalent forms:

| Form | Where | Example |
|---|---|---|
| Tool-call | API / MCP exposure (canonical) | gh_write_pr_merge |
| CLI | Bash invocation | gh-tool_use write pr merge --number 42 |
| Python API | optional library binding | gh.write.pr.merge(number=42) |

The tool-call form is the canonical reference. CLI and Python forms are mechanical transformations of it.
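As an illustration of the naming rule, the following sketch validates a canonical tool-call name against the regex quoted above and splits it into family and operation. The helper name split_tool_name and the FAMILIES set are hypothetical. The sketch deliberately stops at the split, because the CLI spelling of the operation part (spaces or hyphens) differs between examples in this document.

```python
import re

# The pattern quoted above; fullmatch() makes the ^ and $ anchors implicit.
TOOL_NAME_RE = re.compile(r"[a-zA-Z0-9_-]{1,64}")

# Family names as listed in the status table of this document.
FAMILIES = {"read", "write", "watch", "query", "session", "research"}


def split_tool_name(name: str) -> tuple[str, str]:
    """Return (family, operation) for a canonical tool-call name."""
    if not TOOL_NAME_RE.fullmatch(name):
        raise ValueError(f"not a valid tool-call name: {name!r}")
    prefix, _, rest = name.partition("_")
    family, _, operation = rest.partition("_")
    if prefix != "gh" or family not in FAMILIES or not operation:
        raise ValueError(f"not a bridge tool name: {name!r}")
    return family, operation


family, operation = split_tool_name("gh_write_pr_merge")
```

The dotted Python form gh.write.pr.merge is rejected by the same check, which is why it cannot serve as the tool-call name.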

The five tool families

The core surface is five families; the status table above also lists gh_research_* as a sixth. Each tool returns the same envelope (GhResult) with the same error taxonomy.

gh_read_* — single-entity reads

| Tool-call | CLI | Returns |
|---|---|---|
| gh_read_pr | gh-tool_use read pr | PR with state, head/base, mergeable, reviews |
| gh_read_issue | gh-tool_use read issue | issue with state, labels, assignees |
| gh_read_repo | gh-tool_use read repo | repo metadata |
| gh_read_file | gh-tool_use read file | file content + sha at ref |
| gh_read_search | gh-tool_use read search | repo/code/issue/pr search results |

One tool per entity. Predictable shape. Replaces the maze of gh pr view / gh api repos/.../pulls/N / gh search prs.

gh_write_* — mutations, every one runs through one confirm hook

| Tool-call | CLI | Risk |
|---|---|---|
| gh_write_pr_create | gh-tool_use write pr create | low |
| gh_write_pr_merge | gh-tool_use write pr merge | high |
| gh_write_pr_comment | gh-tool_use write pr comment | low |
| gh_write_issue_create | gh-tool_use write issue create | low |
| gh_write_issue_close | gh-tool_use write issue close | medium |
| gh_write_issue_comment | gh-tool_use write issue comment | low |

Every gh_write_* carries a static risk label and routes through one configurable confirm strategy. See Confirm protocol.

gh_watch_* — long-running observers

| Tool-call | CLI | Streams |
|---|---|---|
| gh_watch_pr | gh-tool_use watch pr | PR state changes, CI events, review events |
| gh_watch_ci | gh-tool_use watch ci | check-run status until success/failure |
| gh_watch_workflow | gh-tool_use watch workflow | workflow-run progress |

Stream-assembler shape: a sequence of typed events with a terminal event setting the exit. Wraps gh-pr-await / gh-watch / custom polling loops.
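The stream-assembler shape can be sketched as a polling loop that collects typed events until a terminal one sets the exit code. Everything named here (Event, assemble, the event kinds, the exit code 2 for a timeout) is an assumption for illustration. The injected sleep callable plays the role of the Clock abstraction mentioned under Next steps, so the loop can be tested without waiting.

```python
from dataclasses import dataclass
from typing import Callable

TERMINAL = {"success": 0, "failure": 1}  # terminal event kind -> exit code


@dataclass(frozen=True)
class Event:
    kind: str        # hypothetical kinds: "pending", "success", "failure"
    detail: str = ""


def assemble(poll: Callable[[], Event], sleep: Callable[[float], None],
             interval: float = 5.0, max_polls: int = 100) -> tuple[list[Event], int]:
    """Poll until a terminal event arrives; return (events, exit code)."""
    events: list[Event] = []
    for _ in range(max_polls):
        event = poll()
        # Record only state changes so the event list stays compact.
        if not events or events[-1] != event:
            events.append(event)
        if event.kind in TERMINAL:
            return events, TERMINAL[event.kind]
        sleep(interval)
    events.append(Event("timeout"))
    return events, 2  # assumption: a distinct exit code for "gave up"


# A scripted feed and a recording sleep stand in for the real API and clock.
feed = iter([Event("pending"), Event("pending"), Event("success", "all checks passed")])
slept: list[float] = []
events, exit_code = assemble(lambda: next(feed), slept.append)
```

In this run the two identical pending polls collapse into one event, the success event ends the loop, and the exit code is 0.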

gh_query_* — composed reads (GraphQL-shaped)

| Tool-call | CLI | Joins |
|---|---|---|
| gh_query_pr_reviews | gh-tool_use query pr-reviews | threads + comments + resolutions |
| gh_query_repo_activity | gh-tool_use query repo-activity | commits + PRs + issues since cutoff |
| gh_query_dependents | gh-tool_use query dependents | repos depending on this one |

The gap between gh_read_* (single entity) and "I need four API calls and a join". One call, one composed result.

gh_session_* — auth, capability, quota

| Tool-call | CLI | Returns |
|---|---|---|
| gh_session_whoami | gh-tool_use session whoami | user, scopes, rate limit |
| gh_session_ensure_scope | gh-tool_use session ensure-scope | success or actionable error |
| gh_session_rate_limit | gh-tool_use session rate-limit | remaining, reset_at, used |

Capability probe so an agent can verify before attempting, rather than try-and-handle-the-error.
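A minimal sketch of the "verify before attempting" idea: compare the scopes an operation needs with the scopes the token has, and return an actionable error instead of letting the call fail. The function name ensure_scope and the result layout are assumptions; the error code and the gh auth refresh -s suggestion come from the error table in this document.

```python
from typing import Any


def ensure_scope(required: set[str], granted: set[str]) -> dict[str, Any]:
    """Return success, or an auth_insufficient_scope error naming the fix."""
    missing = sorted(required - granted)
    if not missing:
        return {"success": True, "error": None}
    return {
        "success": False,
        "error": {
            "code": "auth_insufficient_scope",
            "recoverable": True,
            # gh accepts a comma-separated list after -s / --scopes.
            "suggested_action": "gh auth refresh -s " + ",".join(missing),
        },
    }


probe = ensure_scope(required={"repo", "workflow"}, granted={"repo", "read:org"})
```

Here the probe reports that workflow is missing and carries the command that fixes it, before any write has been attempted.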

Why not one tool per family with an action parameter?

Anthropic's tool-design guidance suggests consolidating related operations into fewer tools with an action parameter (e.g. one gh_write_pr tool with action: "create" | "merge" | "comment"). The bridge deliberately goes the other way for gh_write_* because:

  1. The confirm hook needs static, per-operation risk labels. A single tool with an action parameter would have to derive risk dynamically, which is harder to audit.
  2. The input schemas are substantially different (merge takes strategy + delete_branch, comment takes body, create takes title + body + base + head + ...). A discriminated-union schema is harder for Claude to fill correctly than six narrow schemas.
  3. Each operation deserves its own description because the failure modes differ (merge can hit branch_protection, create cannot).

If after real-world use the operations turn out to share most of their schema, that's the signal to consolidate. Not before.
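To illustrate points 1 and 2, here are two narrow tool definitions next to a static risk table. The definitions follow the name / description / input_schema layout of Anthropic tool definitions; the parameter names come from this document, while the enum values for strategy and the descriptions are assumptions.

```python
# Two narrow schemas: each lists only the fields its operation needs.
MERGE_TOOL = {
    "name": "gh_write_pr_merge",
    "description": "Merge a pull request. May fail with branch_protection.",
    "input_schema": {
        "type": "object",
        "properties": {
            "number": {"type": "integer"},
            # Assumed values, mirroring gh's --merge / --squash / --rebase.
            "strategy": {"type": "string", "enum": ["merge", "squash", "rebase"]},
            "delete_branch": {"type": "boolean"},
        },
        "required": ["number"],
    },
}

COMMENT_TOOL = {
    "name": "gh_write_pr_comment",
    "description": "Add a comment to a pull request.",
    "input_schema": {
        "type": "object",
        "properties": {
            "number": {"type": "integer"},
            "body": {"type": "string"},
        },
        "required": ["number", "body"],
    },
}

# Risk is keyed by tool name alone, so it can be audited without running code.
STATIC_RISK = {"gh_write_pr_merge": "high", "gh_write_pr_comment": "low"}


def risk_of(tool: dict) -> str:
    """Look up the static risk label; no inspection of arguments is needed."""
    return STATIC_RISK[tool["name"]]
```

With a single gh_write_pr tool, risk_of would have to read the action argument of each call, which is the dynamic derivation the list above argues against.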


Error-code taxonomy

Every GhResult is either success=True with typed data, or success=False with a structured error. The error code is one of a closed enum — Claude branches on the code, not on stderr text.

| Code | When | Recoverable | Suggested action |
|---|---|---|---|
| auth_missing | No token | yes | gh auth login |
| auth_insufficient_scope | Token lacks scope | yes | gh auth refresh -s <scope> |
| auth_expired | Token rejected as expired | yes | gh auth refresh |
| not_found | 404 | no | Verify entity exists |
| permission_denied | 403 (exists, but caller can't access) | no | Caller lacks permission |
| validation_failed | 422 (fields invalid) | no | Fix input, retry |
| merge_conflict | PR has conflicts | yes | Rebase, push, retry |
| branch_protection | Required checks/reviews | yes | Wait for checks or request reviews |
| stale_state | Optimistic-concurrency-like | yes | Refresh state, retry |
| rate_limited | 429 or X-RateLimit | yes | Wait until meta.rate_limit_reset_at |
| network | Connection-level | yes | Retry with backoff |
| user_cancelled | Confirm hook rejected | no | Respect user decision |
| unknown | Unmatched error | maybe | Inspect raw_stderr; report if reproducible |
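The HTTP-related rows of the table can be sketched as a small classifier for a failed call. This is not the project's error_classify.py; the function name classify_status, its parameters, and the reading of "X-RateLimit" as "the remaining-requests header is 0" are assumptions.

```python
from typing import Optional

# (code, recoverable) pairs copied from the table above.
STATUS_TO_CODE = {
    404: ("not_found", False),
    403: ("permission_denied", False),
    422: ("validation_failed", False),
}


def classify_status(status: Optional[int],
                    rate_limit_remaining: Optional[int] = None
                    ) -> tuple[str, Optional[bool]]:
    """Map a failed call's HTTP status to (error code, recoverable)."""
    if status is None:
        return ("network", True)       # no HTTP response: connection-level
    if status == 429 or rate_limit_remaining == 0:
        return ("rate_limited", True)  # checked before 403, which it can mimic
    # None as the recoverable flag mirrors the table's "maybe" for unknown.
    return STATUS_TO_CODE.get(status, ("unknown", None))
```

Checking the rate-limit condition before the status table matters because an exhausted quota can otherwise be misreported as permission_denied.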

Each error carries:

  • message_for_user — human-readable
  • message_for_claude — actionable: "Run X, then retry"
  • suggested_action — the literal command, when one applies
  • recoverable — whether retry is plausibly useful
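On the consuming side, the four fields above are enough to decide the next step without reading stderr. The sketch below assumes a GhError dataclass with exactly those fields plus the code; the class name, the next_step helper, and its return strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GhError:
    code: str
    message_for_user: str
    message_for_claude: str
    suggested_action: Optional[str]
    recoverable: bool


def next_step(error: GhError) -> str:
    """Decide what to do from structured fields only, never from stderr text."""
    if error.code == "user_cancelled":
        return "stop"  # respect the user's decision, do not retry
    if error.recoverable and error.suggested_action:
        return f"run: {error.suggested_action}"
    if error.recoverable:
        return "retry"
    return f"report: {error.message_for_user}"


scope_error = GhError(
    code="auth_insufficient_scope",
    message_for_user="The token is missing the workflow scope.",
    message_for_claude="Run gh auth refresh -s workflow, then retry.",
    suggested_action="gh auth refresh -s workflow",
    recoverable=True,
)
step = next_step(scope_error)
```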

Confirm protocol

All gh_write_* calls route through a single confirm hook. The risk label is static per operation:

| Tool-call | Risk |
|---|---|
| gh_write_pr_create | low |
| gh_write_pr_comment | low |
| gh_write_issue_create | low |
| gh_write_issue_comment | low |
| gh_write_issue_close | medium |
| gh_write_pr_merge | high |

Three strategies, picked by config (not code):

  • AlwaysConfirm — default; every write prompts. Used in interactive sessions.
  • RiskBasedConfirm(threshold="medium") — prompts at medium+. Semi-autonomous.
  • AllowList — per-repo + per-operation allowlist. CI / cron.

Config never lowers the risk label; it only changes whether a prompt is issued. Audit logs always record the static risk, so gh_write_pr_merge shows as "high-risk operation auto-allowed by AllowList rule X" rather than hiding it.
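The protocol can be sketched in a few lines. The strategy names and the risk labels are taken from this section; the needs_prompt method, the audit_record helper, and the (repo, op) allowlist layout are assumptions, not the bridge's actual interface.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Static labels from the table above; configuration never changes these.
RISK = {
    "gh_write_pr_create": Risk.LOW,
    "gh_write_pr_comment": Risk.LOW,
    "gh_write_issue_create": Risk.LOW,
    "gh_write_issue_comment": Risk.LOW,
    "gh_write_issue_close": Risk.MEDIUM,
    "gh_write_pr_merge": Risk.HIGH,
}


class AlwaysConfirm:
    def needs_prompt(self, op: str, repo: str) -> bool:
        return True


@dataclass
class RiskBasedConfirm:
    threshold: str = "medium"

    def needs_prompt(self, op: str, repo: str) -> bool:
        return RISK[op] >= Risk[self.threshold.upper()]


@dataclass
class AllowList:
    allowed: set = field(default_factory=set)  # assumed layout: {(repo, op)}

    def needs_prompt(self, op: str, repo: str) -> bool:
        return (repo, op) not in self.allowed


def audit_record(strategy, op: str, repo: str) -> dict:
    """The static risk is always logged, whether or not a prompt is shown."""
    return {
        "op": op,
        "repo": repo,
        "risk": RISK[op].name.lower(),
        "prompted": strategy.needs_prompt(op, repo),
        "strategy": type(strategy).__name__,
    }


ci_rules = AllowList({("octo/demo", "gh_write_pr_merge")})
record = audit_record(ci_rules, "gh_write_pr_merge", "octo/demo")
```

The record for the allowlisted merge still says risk "high" with prompted False, which is the "auto-allowed, not hidden" behavior described above.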


What the bridge is not

Three deliberate non-features for day 1, listed so future-me doesn't drift:

Not an MCP server

The bridge is a Python CLI that wraps gh and returns GhResult as JSON. Reasons:

  • Single-shot subprocess to a local Python CLI is faster than the MCP roundtrip for the most common case (one tool call, one response).
  • No MCP server lifecycle to manage.
  • An MCP wrapper over the bridge can be added trivially later. The reverse is much harder.

Not a 30-tool plugin marketplace

Five families, ~25-30 methods total, all under five mental slots. If you find yourself building a seventh family, that's the signal you've drifted from "make gh integral" to "build a meta-tool".

Not a guardrails engine

The confirm hook has three strategies and a static risk table. That's the whole guardrails surface. If you need per-method overrides, per-time-of-day rules, per-user policies — that's a different tool, build it separately. This one stays auditable and focused.


What's deliberately deferred

  • gh_watch_* streaming implementation — schema defined here, code later.
  • gh_query_* GraphQL composition — useful, but gh_write_* ships first.
  • Daemon for caching — only built if the first 100 real calls show meaningful latency in gh_session_whoami / rate-limit checks. No daemon for its own sake.
  • gh-aw workflow integration — belongs in a sixth family (gh_workflow_*), separate question.
  • Multi-repo defaults — day 1: current working directory's repo. That's it.

How this relates to the broader ecosystem

The GitHub-for-Claude-Code stack has many useful pieces:

  • TUI extensions (gh-s, gh-select, gh-repo-explore, gh-statusline, gh-markdown-preview) are explicitly for humans. They are not in scope for the bridge — Claude doesn't drive fzf.
  • gh-pr-await, gh-watch — the kind of long-running observer that gh_watch_* normalizes. The bridge wraps these; it doesn't replace them.
  • gh-sparkle, gh-models — useful upstream of gh_write_pr_create (generate the title/body), but not part of the bridge surface itself.
  • gh-aw — agentic workflows on GitHub-hosted runners. A peer system, not something the bridge embeds. Possible future gh.workflow family.
  • MCP servers (shuymn/gh-mcp etc.) — a parallel transport. They can coexist with the bridge; they don't replace its normalization work.

Installing for Claude Code

macOS users on python.org Python: before using federation ops (Tavily, Wikipedia, HackerNews, etc.), run once:

/Applications/Python\ 3.13/Install\ Certificates.command

See install.sh --help for details. System Python (/usr/bin/python3, Apple-installed) is unaffected.

Quickstart

pip install github-inside-claude-code
# or with MCP-server adapter:
pip install "github-inside-claude-code[mcp]"
# or run the MCP server ephemerally without install:
uvx gh-tool-use-mcp

For local development (editable install + skill / strict-mode / hooks):

git clone https://github.com/Kirchlive/github-inside-claude-code.git
cd github-inside-claude-code
pip install -e ".[dev]"
bash install.sh

Since v0.6.0, two installation paths are supported. Pick one:

Path A — MCP-first (recommended since v0.6.0)

Claude Code (and other MCP-capable clients) speak the MCP protocol natively. With the MCP server registered, all 88 ops appear in Claude's tool list as first-class mcp__gh-tool-use__* entries — no Bash steering, no permission-deny rules, no PreToolUse hooks needed. The MCP client's built-in per-tool-call approval replaces the bridge's CLI confirm-hook.

# Skill (for context) — lite install: no strict-mode, no hooks
./install.sh --lite

# Install the MCP server (pulls fastmcp as optional extra):
pip install "github-inside-claude-code[mcp]"     # from PyPI
# OR, from a local checkout:
pip install -e ".[mcp]"

# Then register the MCP server (prints the command list):
./install.sh --mcp-help

The [mcp] extra is required for the MCP server. Without it, python -m github_inside_claude_code.mcp_server and the gh-tool-use-mcp entry point fail with ModuleNotFoundError: fastmcp.

This is the structurally cleaner path: MCP is deterministic where the Bash-Wrapper was probabilistic (Phase-7.7.X spent eight iterations patching steering-via-permissions before MCP made it moot).

Path B — Legacy Bash-Wrapper (default-on, retained for backward compat)

The original install path: skill + strict-mode permissions.deny rules + 3 PreToolUse hooks that steer Claude through the wrapped CLI binary. This works without MCP, and remains available as defense-in-depth for users who want both paths active at once.

# Full install: skill + strict-mode + hooks
./install.sh

# Status check:
./install.sh --check

# Remove:
./install.sh --uninstall

install.sh symlinks skills/gh-tool-use/SKILL.md into ~/.claude/skills/gh-tool-use/SKILL.md (hyphenated since Phase 10.0 for gh skill spec compliance; the CLI binary gh-tool_use keeps the underscore for backward compatibility). Override the destination root via CLAUDE_SKILLS_DIR=/path ./install.sh. Edits to SKILL.md in this repo are live — no re-install needed when the skill text changes.

The skill description triggers on prompts mentioning pull requests, issues, CI/workflows, code search, repo activity, auth scopes, or any generic "do something on GitHub" phrasing. Claude then invokes gh-tool_use via the Bash tool and branches on the typed error.code from the returned GhResult.

Strict-Mode Enforcement (Path B — opt-in since v0.6.0, default-on before)

v0.6.0 note: when you install via Path A (--lite + MCP-server), this layer is NOT activated. MCP-Client's native approval replaces it. Both layers can coexist for users who want belt-and-braces.

./install.sh (without --lite) activates two layers:

  1. Skill: symlink at ~/.claude/skills/gh-tool-use/SKILL.md (hyphen, Phase 10.0 spec compliance)
  2. Strict mode: deny/allow entries in ~/.claude/settings.json that block raw gh pr/issue/repo/search/... subcommands and force the path through gh-tool_use instead

With this, Claude picks the bridge deterministically for the wrapped surface rather than stochastically (skill alone: ~30% trigger rate, measured empirically). Subcommands that gh-tool_use does not cover (gh codespace, gh secret, gh pr edit, etc.) remain unrestricted for raw gh.

Off/on toggle

./install.sh --strict-off    # disable; the skill stays installed
./install.sh --strict-on     # re-enable
./install.sh --skill-only    # initial install of the skill only, no strict mode
./install.sh --uninstall     # remove both layers; user settings stay unchanged
./install.sh --check         # status of both layers

How the deny list is maintained

data/policy.json is the source of truth. It is generated automatically from the op REGISTRY:

python scripts/regen_policy.py          # regenerate
python scripts/regen_policy.py --check  # CI drift detection

Every new op (e.g. Phase 4.2: gh_read_pr_list etc.) carries a gh_subcommand ClassVar and extends the deny list automatically. No manual JSON maintenance.
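A rough sketch of this generate-and-check pattern follows. It is not the contents of scripts/regen_policy.py: the class names, the mapping of ops to subcommands, the JSON layout, and the Bash(gh ...:*) rule spelling are all assumptions. Only the gh_subcommand ClassVar, the registry as input, and the drift check are taken from this section.

```python
import json
from typing import ClassVar


class GhOp:
    # Each op declares which raw gh subcommand it wraps.
    gh_subcommand: ClassVar[str] = ""


class ReadPr(GhOp):
    gh_subcommand = "pr view"   # assumed mapping, for illustration


class ReadPrList(GhOp):
    gh_subcommand = "pr list"   # assumed mapping, for illustration


REGISTRY = [ReadPr, ReadPrList]


def render_policy(registry) -> str:
    """Render the deny list deterministically so it can be diffed in CI."""
    deny = sorted({f"Bash(gh {op.gh_subcommand}:*)"
                   for op in registry if op.gh_subcommand})
    return json.dumps({"deny": deny}, indent=2) + "\n"


def has_drifted(registry, on_disk: str) -> bool:
    """What a --check mode boils down to: regenerate and compare."""
    return render_policy(registry) != on_disk


policy_text = render_policy(REGISTRY)
```

Sorting the rules and fixing the indentation keeps the output byte-stable, so a plain string comparison is enough for drift detection.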

Prerequisites

  • jq must be installed (brew install jq / apt install jq)
  • python3 and the venv from pip install -e ".[dev]" for policy regeneration

Reversibility guarantees

  • ~/.claude/.gh-tool_use-install-manifest.json tracks exactly what was added
  • --uninstall and --strict-off remove only these tracked entries; the user's own permissions.deny rules stay unchanged
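The guarantee reduces to a set difference against the manifest. A minimal sketch, assuming the manifest boils down to a list of the rule strings the installer added (the real manifest format is not shown in this document, and the rule strings below are made up):

```python
def remove_tracked(deny_rules: list[str], manifest_added: list[str]) -> list[str]:
    """Drop only the rules the installer recorded; keep order of the rest."""
    tracked = set(manifest_added)
    return [rule for rule in deny_rules if rule not in tracked]


# Hypothetical settings: one user-owned rule plus two installer-added rules.
current = ["Bash(rm -rf:*)", "Bash(gh pr view:*)", "Bash(gh pr list:*)"]
added_by_installer = ["Bash(gh pr view:*)", "Bash(gh pr list:*)"]
after_uninstall = remove_tracked(current, added_by_installer)
```

The user's own rule survives because removal is driven by what was recorded at install time, not by pattern-matching on anything that looks like a gh rule.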

Architecture rationale

Phase 4 showed empirically that Claude triggers skills only stochastically (2/6 productive live tests). Research confirmed that only permissions.deny gives a hard determinism guarantee: MCP tools have no better selection priority than raw Bash. For the full rationale see ADR-013 in docs/decisions.md.

MCP server adapter (Phase 10.1, PoC)

Since v0.5.21 the bridge exposes its operations as native MCP tools via the gh-tool-use-mcp binary. This is the structural alternative to the Bash-wrapper + permission-deny enforcement: when the MCP client (Claude Code, Claude Desktop, Cursor, Codex …) sees gh_read_search as a first-class tool in its tool list, there is no raw gh search for the LLM to misuse — pivot becomes deterministic instead of probabilistic.

PoC scope (Phase 10.1): one operation exposed end-to-end (gh_read_search). Phase 10.3 will bulk-migrate the remaining ops in priority batches, after which the permissions.deny + PreToolUse-Hook layer becomes opt-in instead of default.

Install

# Option A — uvx (recommended; no venv management for the user)
uvx gh-tool-use-mcp     # one-shot run; uvx handles isolated env

# Option B — from a clone
pip install -e ".[mcp]"     # pulls fastmcp on demand

Wire into Claude Code

Add to .mcp.json at your project root, or ~/.claude/mcp.json for user-scope:

{
  "mcpServers": {
    "gh-tool-use": {
      "command": "uvx",
      "args": ["gh-tool-use-mcp"]
    }
  }
}

Or one-shot via the Claude Code CLI:

claude mcp add gh-tool-use -- uvx gh-tool-use-mcp

/mcp inside Claude Code lists the registered servers and tool count.

Auth

The MCP server invokes gh as a subprocess, so authentication flows through gh auth login exactly as with the CLI binary. GITHUB_TOKEN in the spawned process environment also works.

WSL note — venv on native filesystem

If you're on WSL with the repo on /mnt/c/... (Windows-mount), put the venv on the native Linux filesystem instead of /mnt/c:

# Recommended: native venv (~10x faster cold-start)
python3 -m venv ~/.venv-gh-tool-use
~/.venv-gh-tool-use/bin/pip install -e ".[mcp]"
claude mcp add gh-tool-use -- ~/.venv-gh-tool-use/bin/gh-tool-use-mcp

/mnt/c filesystem operations are slow enough that Claude Code's MCP-stdio probe can time out on first connect, forcing a manual /mcp reconnect. Native-FS venv eliminates that. Phase 10.1.1 also lazy-imports the heavy app modules so the FastMCP server itself boots fast even when site-packages live on a slow filesystem.
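The lazy-import idea can be sketched with the standard library alone. This is a generic illustration of the pattern, not the bridge's code: lazy and handle_search are hypothetical names, and the stdlib json module stands in for a heavy application module.

```python
import importlib
from functools import lru_cache
from types import ModuleType


@lru_cache(maxsize=None)
def lazy(module_name: str) -> ModuleType:
    """Import on first use; later calls return the cached module object."""
    return importlib.import_module(module_name)


def handle_search(query: str) -> str:
    # Stand-in for a tool handler: the module is loaded when the tool is
    # first called, not when the server process starts.
    json_mod = lazy("json")  # "json" stands in for a heavy app module
    return json_mod.dumps({"query": query})


reply = handle_search("is:open label:bug")
```

Server startup then pays only for registering the handlers; the import cost moves to the first call of each tool.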

Coexistence with the CLI binary

The CLI binary gh-tool_use (underscore, backward compat) and the MCP binary gh-tool-use-mcp (hyphen, spec-aligned) ship side by side. CLI remains the right interface for shell scripts, CI pipelines, and direct human use; MCP is the right interface for autonomous agents that should not be reaching into Bash for GitHub data at all.

Coverage roadmap for future extensions: docs/coverage-roadmap.md.

Next steps

  1. Implement gh_write_* end-to-end — done (six tools, confirm hook, classifier, fixtures).
  2. Write gh_read_* against the same GhResult envelope — done (five tools, shared GhOp pipeline).
  3. Implement gh_session_* — done (three tools, separate GhSessionOp base because scope pre-flight would be circular here).
  4. Implement gh_watch_* — done (three tools, separate GhWatchOp base with a polling loop and Clock abstraction). Day 1 emits a single GhResult with data.events at termination; NDJSON streaming would be a follow-up.
  5. Implement gh_query_* — done (two tools as GhReadOp subclasses over gh api graphql; dependents deliberately deferred — see above).
  6. Validate the contract against a week of real gh usage. If code: unknown shows up repeatedly, the taxonomy is wrong and the classifiers in error_classify.py and per-op classify_error need extension. This is the most important next step before adding new surface.
  7. Documented day-2 candidates (only after validation):
    • NDJSON streaming for gh_watch_* (--stream flag, line-per-event)
    • Optional caching daemon (only if gh_session_whoami latency shows up)
    • MCP server wrapping the CLI (trivial layer; defer until there's demand)
    • Possible gh_workflow_* family for gh-aw integration

See SKILL.md for how Claude is taught to use the bridge.

For the design history — the original architecture conversation, the ADR-style decisions, the GitHub-CLI-extension inventory, and verbatim source quotes — see docs/.
