Reusable CLI runtime primitives for provider-backed automation workflows

coding-cli-runtime

A Python library for orchestrating LLM coding agent CLIs — Claude Code, Codex, Gemini CLI, and GitHub Copilot.

These CLIs each have different invocation patterns, output formats, error shapes, and timeout behaviors. This library normalizes all of that behind a common CliRunRequest/CliRunResult contract, so your automation code doesn't need provider-specific subprocess handling.

The package now exposes a stable core API plus preview provider adapters:

  • coding_cli_runtime — stable metadata, launch primitives, schema helpers, subprocess/session execution, and provider facts.
  • coding_cli_runtime.providers — preview provider-aware adapters that own run/parse/session/recovery flows for Claude, Codex, Copilot, and Gemini.

If you are inspecting only the root package, note that the preview adapter namespace is available as coding_cli_runtime.providers.

Maintainers changing provider/runtime contracts should start with:

What it does (and why not just subprocess.run):

  • Run any provider CLI with unified request/result types and timeout enforcement
  • Query the model catalog (with user-override and live-cache fallback)
  • Classify failures as retryable vs fatal per provider
  • Look up provider auth, config dirs, and headless launch flags
  • Build non-interactive launch commands without hardcoding provider flags
  • Find session logs and preserved conversations after a run
  • Run long-lived sessions with process-group cleanup and transcript mirroring
  • Preview provider-aware launch/result flows with typed adapter events and normalized raw execution views
  • No Python package dependencies — only requires the provider CLIs themselves

Installation

pip install coding-cli-runtime
# or
uv add coding-cli-runtime

Requires Python 3.10+.

Examples

Execute a provider CLI

import asyncio
from pathlib import Path
from coding_cli_runtime import CliRunRequest, run_cli_command

request = CliRunRequest(
    cmd_parts=("codex", "--model", "gpt-5.4", "--quiet", "exec", "fix the tests"),
    cwd=Path("/tmp/my-project"),
    timeout_seconds=120,
)
result = asyncio.run(run_cli_command(request))

print(result.returncode)        # 0
print(result.error_code)        # "none"
print(result.duration_seconds)  # 14.2
print(result.stdout_text[:200])

Swap codex for claude, gemini, or copilot — the request/result shape stays the same. A synchronous variant run_cli_command_sync is also available.
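
To make the normalized result shape concrete, here is a minimal standard-library sketch of how subprocess outcomes could be folded into the four error codes listed under Key types (none, spawn_failed, timed_out, non_zero_exit). SketchResult and run_sketch are hypothetical names used only for illustration; this is not the library's implementation, and the real CliRunResult may carry more fields.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SketchResult:
    """Hypothetical stand-in for a normalized result (not the library's CliRunResult)."""

    returncode: Optional[int]
    error_code: str  # "none", "spawn_failed", "timed_out", or "non_zero_exit"
    duration_seconds: float
    stdout_text: str


def run_sketch(cmd_parts: Sequence[str], timeout_seconds: float) -> SketchResult:
    """Run a command and map every outcome onto one result shape."""
    started = time.monotonic()

    def elapsed() -> float:
        return time.monotonic() - started

    try:
        proc = subprocess.run(
            list(cmd_parts), capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising the timeout.
        return SketchResult(None, "timed_out", elapsed(), "")
    except OSError:
        # Covers a missing binary (FileNotFoundError) and permission errors.
        return SketchResult(None, "spawn_failed", elapsed(), "")
    error_code = "none" if proc.returncode == 0 else "non_zero_exit"
    return SketchResult(proc.returncode, error_code, elapsed(), proc.stdout)


ok = run_sketch((sys.executable, "-c", "print('hello')"), timeout_seconds=30)
print(ok.error_code, ok.returncode)
```

The point of the single result type is that callers branch on error_code instead of catching different exception classes per failure mode.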

Pick a model from the provider catalog

from coding_cli_runtime import get_provider_spec

codex = get_provider_spec("codex")
print(codex.default_model)   # "gpt-5.3-codex"
print(codex.model_source)    # "codex_cli_cache", "override", or "code"

for model in codex.models:
    print(f"  {model.name}: {model.description}")

The catalog covers all four providers — each with model names, reasoning levels, default settings, and visibility flags.

Model lists are resolved with a three-tier fallback:

  1. User override — drop a JSON file at ~/.config/coding-cli-runtime/providers/<provider>.json to use your own model list immediately, without waiting for a package update.
  2. Live CLI cache — for Codex, the library reads ~/.codex/models_cache.json (auto-refreshed by the Codex CLI) when present. Other providers fall through because their CLIs don't expose a machine-readable model list.
  3. Hardcoded fallback — the model list shipped with the package.

Override file format:

{
  "default_model": "claude-sonnet-4-7",
  "models": [
    "claude-sonnet-4-7",
    {
      "name": "claude-opus-5",
      "description": "Latest opus model",
      "controls": [
        { "name": "effort", "kind": "choice", "choices": ["low", "high"], "default": "low" }
      ]
    }
  ]
}

Set CODING_CLI_RUNTIME_CONFIG_DIR to change the config directory (default: ~/.config/coding-cli-runtime).
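
The three-tier fallback and the config directory override described above can be sketched with the standard library alone. The helper names (resolve_config_dir, load_json, resolve_models) and the PACKAGED_FALLBACK contents are hypothetical, and the cache file is treated as opaque JSON because its schema is not described here; the source labels mirror the documented model_source values.

```python
import json
from pathlib import Path
from typing import Mapping, Optional, Tuple

# Hypothetical placeholder for the list shipped with a package.
PACKAGED_FALLBACK = {"default_model": "example-model", "models": ["example-model"]}


def resolve_config_dir(env: Mapping[str, str]) -> Path:
    """Honor CODING_CLI_RUNTIME_CONFIG_DIR, else use the documented default."""
    override = env.get("CODING_CLI_RUNTIME_CONFIG_DIR")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".config" / "coding-cli-runtime"


def load_json(path: Path) -> Optional[dict]:
    """Return the parsed JSON object, or None if the file is missing or invalid."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    return data if isinstance(data, dict) else None


def resolve_models(
    provider: str, env: Mapping[str, str], cli_cache: Optional[Path] = None
) -> Tuple[str, dict]:
    """Try user override, then the CLI cache, then the packaged fallback."""
    override = load_json(resolve_config_dir(env) / "providers" / f"{provider}.json")
    if override is not None:
        return "override", override
    if cli_cache is not None:
        cached = load_json(cli_cache)
        if cached is not None:
            return "codex_cli_cache", cached
    return "code", PACKAGED_FALLBACK


source, _ = resolve_models("codex", {"CODING_CLI_RUNTIME_CONFIG_DIR": "/nonexistent-sketch-dir"})
print(source)
```

Treating an unreadable or malformed file as "absent" keeps each tier from blocking the next one.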

Decide whether to retry a failed run

from coding_cli_runtime import classify_provider_failure

classification = classify_provider_failure(
    provider="gemini",
    stderr_text="429 Resource exhausted: rate limit exceeded",
)

if classification.retryable:
    print(f"Retryable ({classification.category}) — will retry")
else:
    print(f"Fatal ({classification.category}) — giving up")

Works for all four providers. Recognizes auth failures, rate limits, network transients, and other provider-specific error patterns.
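
A retryable/fatal classification is typically consumed by a retry loop. The sketch below shows one way to wire that up with exponential backoff, using only the standard library. SketchClassification, classify_sketch, and run_with_retries are hypothetical names, and the string patterns and category labels are illustrative assumptions, not the library's actual rules.

```python
import time
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class SketchClassification:
    retryable: bool
    category: str


def classify_sketch(stderr_text: str) -> SketchClassification:
    """Toy classifier; the patterns and categories are assumptions."""
    lowered = stderr_text.lower()
    if "429" in lowered or "rate limit" in lowered:
        return SketchClassification(True, "rate_limit")
    if "unauthorized" in lowered or "invalid api key" in lowered:
        return SketchClassification(False, "auth")
    return SketchClassification(False, "unknown")


def run_with_retries(
    attempt: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call attempt() until it succeeds, fails fatally, or attempts run out.

    attempt() returns (returncode, stderr_text). The result is
    (attempts_used, outcome), where outcome is "ok" or a failure category.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for n in range(1, max_attempts + 1):
        returncode, stderr_text = attempt()
        if returncode == 0:
            return n, "ok"
        classification = classify_sketch(stderr_text)
        if not classification.retryable or n == max_attempts:
            return n, classification.category
        sleep(base_delay * 2 ** (n - 1))  # 1x, 2x, 4x, ... the base delay
    raise AssertionError("unreachable")


outcomes = iter([(1, "429 Resource exhausted: rate limit exceeded"), (0, "")])
print(run_with_retries(lambda: next(outcomes), sleep=lambda seconds: None))
```

Passing sleep as a parameter keeps the loop testable without real delays.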

Use preview provider-aware adapters

from pathlib import Path

from coding_cli_runtime.providers import claude

request = claude.ClaudeExecRequest(
    model="claude-sonnet-4-6",
    prompt="Summarize the repository status as JSON.",
    cwd=Path("/tmp/my-project"),
    output_format="json",
    transcript_path=Path("/tmp/claude-conversation.jsonl"),
)

preview = claude.prepare_launch(request)
print(preview.display_text)

result = claude.run_sync(request)
print(result.raw_execution.returncode)
print(result.parsed_output.structured_output)

session = claude.find_session(
    request.cwd,
    result.raw_execution.started_at,
    prompt_text=request.prompt,
)
conversation = claude.get_conversation(session)
print(conversation.line_count)

These preview adapters are provider-aware and provider-specific on purpose. They are the right API when you want package-owned parsing, session lookup, conversation retrieval, launch preview, adapter events, and provider-specific recovery behavior.

These coding_cli_runtime.providers.* APIs are still preview surfaces and may evolve faster than the stable core metadata/helpers.

The stable root package also exposes low-level find_claude_session(...) and find_codex_session(...) helpers that return a best-effort session log path. Use coding_cli_runtime.providers.find_*_session(...) when you want the preview adapter layer's typed lookup results and provider-specific matching behavior.

For a concrete Copilot local image-evaluation recipe, see copilot_structured_multimodal.md.

Continue a previous CLI session

from pathlib import Path

from coding_cli_runtime.providers import ProviderContinuationMode, claude

workdir = Path("/tmp/my-project")

first = claude.run_sync(
    claude.ClaudeExecRequest(
        model="claude-sonnet-4-6",
        prompt="Reply with exactly FIRST.",
        cwd=workdir,
        output_format="text",
    )
)

resumed = claude.run_sync(
    claude.ClaudeExecRequest(
        model="claude-sonnet-4-6",
        prompt="Reply with exactly SECOND.",
        cwd=workdir,
        output_format="text",
        continuation_mode=ProviderContinuationMode.SESSION_ID,
        session_id=first.session_lookup.session_id,
    )
)

print(resumed.continuation_resolution.resolved_session_id)

Preview adapters support explicit multi-turn continuation via continue_latest, session_id, or session_path where the underlying CLI supports it. For Codex, fresh runs may use native output_schema_path enforcement, while resumed runs intentionally reject output_schema_path because codex exec resume does not support --output-schema.

Common integration tasks

Check whether a provider CLI is installed

from coding_cli_runtime import is_provider_installed

if not is_provider_installed("claude"):
    raise RuntimeError("Claude Code is not available on PATH")

This is intentionally minimal: it checks whether the provider binary exists on PATH. Deeper CLI drift validation belongs in maintainer tooling, not the runtime API.
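
For reference, a PATH-only check of this kind can be expressed with shutil.which from the standard library. The helper name binary_on_path is hypothetical, and whether the library uses shutil.which internally is an assumption.

```python
import os
import shutil
import sys
from typing import Optional


def binary_on_path(binary: str, path: Optional[str] = None) -> bool:
    """Return True if an executable named `binary` is found on PATH.

    `path` overrides the search path; None means the process's PATH.
    """
    return shutil.which(binary, path=path) is not None


# The running interpreter is always findable in its own directory.
interpreter_dir = os.path.dirname(sys.executable)
print(binary_on_path(os.path.basename(sys.executable), path=interpreter_dir))
print(binary_on_path("no-such-binary-for-this-sketch"))
```

A check like this says nothing about whether the CLI is authenticated or at a compatible version, which matches the deliberately narrow scope described above.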

Resolve workspace env vars and session search paths

from coding_cli_runtime import (
    get_provider_contract,
    resolve_session_search_paths,
    resolve_workspace_env,
)

gemini = get_provider_contract("gemini")

# Derive provider-specific workspace env vars from contract metadata
env = resolve_workspace_env(gemini, "/tmp/run-dir")
# {"GEMINI_CLI_IDE_WORKSPACE_PATH": "/tmp/run-dir"}

# Expand concrete host paths for session log searches
paths = resolve_session_search_paths(gemini)
# (Path.home() / ".gemini" / "tmp",)

Use these helpers when you want the contract facts turned into concrete filesystem/env values without rebuilding the same glue logic in your code.

Look up provider contract metadata

from coding_cli_runtime import get_provider_contract, build_env_overlay, resolve_config_paths, render_prompt

# Get structured metadata for any supported provider
contract = get_provider_contract("claude")
print(contract.binary)                        # "claude"
print(contract.auth.api_key_env_var)          # "CLAUDE_API_KEY"
print(contract.paths.config_dir)              # "~/.claude"
print(contract.headless.approval.flag)        # "--dangerously-skip-permissions"

# Build env var overlay for subprocess
env = build_env_overlay(contract, api_key="sk-...", base_url="https://custom.example.com")
# {"CLAUDE_API_KEY": "sk-...", "ANTHROPIC_BASE_URL": "https://custom.example.com"}

# Resolve config paths for container mounts
host_dir, container_dir = resolve_config_paths(contract, containerized=True)
# ("/home/user/.claude", "/root/.claude")

# Resolve prompt delivery (stdin vs flag vs activation)
payload = render_prompt(contract.headless.prompt, "Fix the bug")
# payload.args = ()            (stdin delivery for Claude)
# payload.stdin_text = "Fix the bug"

ProviderContract is structured as nested sub-contracts (AuthContract, PathContract, HeadlessContract, OutputContract, IoContract, SessionDiscoveryContract, DiagnosticsContract) so callers can drill into whichever aspect they need. This is reference metadata, not a command-construction control plane — callers keep their own command assembly and adopt contract fields selectively.

Query provider I/O conventions

from coding_cli_runtime import get_provider_contract

gemini = get_provider_contract("gemini")

# Workspace env vars with value semantics
for wev in gemini.io.workspace_env_vars:
    print(f"{wev.name} = {wev.value_source}")
    # GEMINI_CLI_IDE_WORKSPACE_PATH = execution_dir

# Session discovery (where session logs live)
sd = gemini.session_discovery
print(sd.session_roots)  # ("tmp",)
print(sd.session_glob)   # "*/chats/session-*.json"

# Output format support
codex = get_provider_contract("codex")
print(codex.output.output_path_flag)    # "-o"
print(codex.output.schema_path_flag)    # "--output-schema"

# Diagnostics (Copilot only)
copilot = get_provider_contract("copilot")
if copilot.diagnostics:
    print(copilot.diagnostics.log_glob)  # "logs/process-*.log"

WorkspaceEnvVar.value_source uses a closed vocabulary: "execution_dir" or "workspace_root".
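
Because the vocabulary is closed, a consumer can resolve each value_source with an exhaustive branch and reject anything else. This is a small sketch under that assumption; resolve_env_value is a hypothetical helper, and the meaning given to workspace_root here (a parent workspace directory) is an assumption.

```python
from pathlib import PurePosixPath


def resolve_env_value(
    value_source: str, execution_dir: PurePosixPath, workspace_root: PurePosixPath
) -> str:
    """Map a value_source keyword to a concrete path string."""
    if value_source == "execution_dir":
        return str(execution_dir)
    if value_source == "workspace_root":
        return str(workspace_root)
    # Closed vocabulary: fail loudly rather than guess.
    raise ValueError(f"unknown value_source: {value_source!r}")


declared = [("GEMINI_CLI_IDE_WORKSPACE_PATH", "execution_dir")]
env = {
    name: resolve_env_value(source, PurePosixPath("/tmp/run-dir"), PurePosixPath("/tmp"))
    for name, source in declared
}
print(env)
```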

Build headless launch commands

from pathlib import Path

from coding_cli_runtime import build_claude_headless_core, build_codex_headless_core

workdir = Path("/tmp/my-project")

# Claude: binary + --print + --permission-mode + --dangerously-skip-permissions + --model
cmd = build_claude_headless_core("claude-sonnet-4-6")
cmd.extend(["--output-format", "text", "--disallowedTools", "Bash,Task"])

# Codex: binary + exec + --full-auto + --sandbox + --skip-git-repo-check + --model
cmd = build_codex_headless_core("gpt-5.4", sandbox_mode="read-only")
cmd.extend(["-C", str(workdir)])

Headless core helpers emit the standard flags for non-interactive runs. Consumers append app-specific tails (tool restrictions, output paths, etc.).

Find session logs after a run

import time
from coding_cli_runtime import find_codex_session, find_claude_session

# Find the most recent Codex session log for a given working directory
session = find_codex_session("/path/to/project", since_ts=time.time() - 300)
if session:
    print(f"Session log: {session}")  # ~/.codex/sessions/.../conversation.jsonl

Works for Codex and Claude. Scans provider config directories for session files matching the working directory and time window.
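
The time-window part of such a scan can be sketched as a glob plus an mtime filter. newest_file_since is a hypothetical helper; it does not attempt the working-directory matching described above, which depends on each provider's log format.

```python
from pathlib import Path
from typing import Optional


def newest_file_since(root: Path, pattern: str, since_ts: float) -> Optional[Path]:
    """Return the most recently modified file under root matching pattern,
    ignoring files last modified before since_ts (seconds since the epoch)."""
    recent = []
    for path in root.glob(pattern):
        if not path.is_file():
            continue
        mtime = path.stat().st_mtime
        if mtime >= since_ts:
            recent.append((mtime, path))
    if not recent:
        return None
    # Compare on mtime only; the path is just carried along.
    return max(recent, key=lambda item: item[0])[1]
```

A call such as newest_file_since(Path.home() / ".codex" / "sessions", "**/*.jsonl", time.time() - 300) would return the newest log from the last five minutes, or None if there is none; the directory and pattern in that call are assumptions about the on-disk layout.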

Derive provider-agnostic transcript facts

from coding_cli_runtime import TranscriptFactsError, derive_transcript_facts

try:
    facts = derive_transcript_facts(
        provider="copilot",
        transcript_path="/tmp/copilot_conversation.jsonl",
    )
except TranscriptFactsError as exc:
    print(f"Transcript parse failed: {exc}")
else:
    print(facts.assistant_messages)
    print(facts.content_messages)
    print(facts.tool_execution_start_count)
    print(facts.final_content)

This keeps raw transcripts provider-native while exposing a small shared facts shape that callers can reuse without depending on provider event schemas directly.

Copilot has three separate artifact surfaces: stdout is the direct response payload, share_output_path is a human-readable Markdown sidecar, and transcript_path is structured JSONL copied from the provider session log. Use the structured JSONL for derive_transcript_facts(...); do not parse the share Markdown as assistant content.

For fresh transcript samples and the maintainer probe harness, see the transcript probe playground in the project repository.

Key types

Type                   Purpose
CliRunRequest          Command spec: cmd, cwd, env, timeout, stream paths
CliRunResult           Result: returncode, stdout/stderr, duration, error code
ErrorCode              none · spawn_failed · timed_out · non_zero_exit
ProviderSpec           Provider catalog entry with models, controls, defaults
ProviderContract       Structured provider CLI metadata (auth, paths, headless, I/O, sessions)
WorkspaceEnvVar        Env var with value-source semantics (execution_dir, workspace_root)
FailureClassification  Classified error with retryable flag and category

Run long-lived CLI sessions

For CLI runs that take minutes (e.g., full app generation), use run_interactive_session() instead of run_cli_command(). It adds:

  • Process-group cleanup (kills orphaned child processes on timeout)
  • Transcript mirroring (streams CLI output to a file while the process runs)
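  • Automatic retries on transient failures

import asyncio
import logging
from pathlib import Path

from coding_cli_runtime import run_interactive_session


async def main():
    # run_interactive_session is a coroutine, so it must be awaited inside an event loop.
    return await run_interactive_session(
        cmd_parts=("claude", "--print", "--model", "claude-sonnet-4-6"),
        cwd=Path("/tmp/my-project"),
        stdin_text="Fix the failing tests.",
        logger=logging.getLogger(__name__),
        timeout_seconds=600,
    )

result = asyncio.run(main())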

Only cmd_parts, cwd, stdin_text, and logger are required. Other parameters have sensible defaults.
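
Process-group cleanup is the least obvious of the three features, so here is a POSIX-only standard-library sketch of the general technique: start the child in its own session, then signal the whole group on timeout so that grandchildren are not orphaned. run_in_own_group is a hypothetical helper, and this is not claimed to be how run_interactive_session is implemented.

```python
import os
import signal
import subprocess
import sys
from typing import List, Optional, Tuple


def run_in_own_group(cmd: List[str], timeout_seconds: float) -> Tuple[Optional[int], str, bool]:
    """Run cmd in a new session; on timeout, kill its entire process group.

    Returns (returncode, combined_output, timed_out). POSIX only.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        start_new_session=True,  # child becomes leader of a new process group
    )
    try:
        output, _ = proc.communicate(timeout=timeout_seconds)
        return proc.returncode, output, False
    except subprocess.TimeoutExpired:
        try:
            # The group id equals the child's pid because it leads the new session.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group already exited
        output, _ = proc.communicate()  # reap the child and drain the pipe
        return proc.returncode, output, True


print(run_in_own_group([sys.executable, "-c", "print('ok')"], timeout_seconds=30))
```

Killing only proc.pid would leave any helper processes the CLI spawned still running; signalling the group is what prevents that.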

API summary

The full public API is listed in __init__.py. Key function groups:

Group               Functions
Execution           run_cli_command, run_cli_command_sync, run_interactive_session
Provider metadata   get_provider_contract, get_provider_spec, list_provider_specs
Contract helpers    build_env_overlay, resolve_config_paths, render_prompt, resolve_auth, resolve_workspace_env, resolve_session_search_paths
Headless launch     build_claude_headless_core, build_codex_headless_core, build_copilot_headless_core, build_gemini_headless_core
Codex batch         build_codex_exec_spec
Failure handling    classify_provider_failure
Installation check  is_provider_installed
Session logs        find_codex_session, find_claude_session
Schema              load_schema, validate_payload
Utilities           redact_text, build_model_id, normalize_path_str

Contributing

See CONTRIBUTING.md for development setup and quality checks.

Prerequisites

This package does not bundle any CLI binaries or credentials. You must install and authenticate the relevant provider CLI yourself before using the execution helpers.

Status

Pre-1.0. API may change between minor versions.

License

MIT

