Tooling for running AI coding agents in CI/CD environments

agentic-ci

Run AI coding agents in sandboxed CI environments with streaming output and telemetry. Supports multiple agent harnesses (Claude Code, OpenCode) and isolation backends so you can choose the right tradeoff between simplicity and security.

Backends

Podman (default)

Runs the agent inside a Podman container. Each run creates a fresh container that auto-deletes on exit. The work directory is mounted into the container and gcloud credentials are mounted read-only.

Important: The Podman backend provides only basic container-level isolation. It uses --network host, so the agent has unrestricted network access. There is no filesystem sandboxing beyond the container boundary itself and no network policy enforcement. Use the OpenShell backend if you need stronger security controls.

Good for: local development, CI runners that already have Podman, quick one-off runs in trusted environments.

Requires: podman, a container image with the agent CLI installed (e.g. ghcr.io/opendatahub-io/ai-helpers:latest).
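
As a rough illustration of what the Podman backend description implies, the sketch below assembles an equivalent podman command line in Python. It is not the command agentic-ci actually builds: the in-container paths (/workspace, /creds/adc.json) and the function name are assumptions made for the example. Only --rm, --network host, and the read-only credentials mount come from the description above.

```python
import shlex
from pathlib import Path


def podman_argv(workdir: str, image: str, adc_file: str, agent_cmd: list[str]) -> list[str]:
    """Build an illustrative `podman run` command line (not the tool's real one)."""
    work = Path(workdir).resolve()
    return [
        "podman", "run",
        "--rm",                                   # container is removed when it exits
        "--network", "host",                      # unrestricted network access
        "-v", f"{work}:/workspace",               # hypothetical mount point for the work directory
        "-v", f"{adc_file}:/creds/adc.json:ro",   # hypothetical path; credentials are read-only
        "-w", "/workspace",
        image,
        *agent_cmd,
    ]


argv = podman_argv(
    ".",
    "ghcr.io/opendatahub-io/ai-helpers:latest",
    "/home/ci/.config/gcloud/application_default_credentials.json",
    ["echo", "hello"],
)
print(shlex.join(argv))
```

Because the network is shared with the host, nothing in this command limits which endpoints the agent can reach.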

OpenShell

Runs the agent inside an OpenShell sandbox with network policy enforcement, Landlock-based filesystem access control, and fine-grained endpoint restrictions. Network policies limit which hosts the agent can reach (e.g. only Vertex AI, GitHub, PyPI) and filesystem policies restrict which paths are writable. An embedded gateway starts per CI job, so no external infrastructure is required.

Good for: production CI where you need to control what the agent can access on the network and filesystem.

Requires: openshell and openshell-gateway installed on the host.

Install

uv tool install agentic-ci
# OR
pip install agentic-ci

Usage

Run a prompt

# Podman (default backend)
agentic-ci run "Fix the flaky test in test_auth.py" \
    --image ghcr.io/opendatahub-io/ai-helpers:latest

# OpenShell
agentic-ci run --backend openshell "Fix the flaky test in test_auth.py"

Setup and stop

The setup command creates and starts the sandbox environment, and stop tears it down. The run command calls setup automatically if the sandbox isn't already running.

# Start the sandbox
agentic-ci setup --image ghcr.io/opendatahub-io/ai-helpers:latest

# Run multiple prompts in the same sandbox (use --keep to prevent auto-teardown)
agentic-ci run "Fix the flaky test" --keep \
    --image ghcr.io/opendatahub-io/ai-helpers:latest
agentic-ci run "Update the changelog" \
    --image ghcr.io/opendatahub-io/ai-helpers:latest

# Tear down the sandbox
agentic-ci stop

Options

agentic-ci {setup,run,stop} [options]
| Flag | Default | Description |
| --- | --- | --- |
| --backend | podman | Sandbox backend to use |
| --harness | claude-code | Agent harness (claude-code or opencode) |
| --workdir PATH | . | Working directory to mount |
| --image IMAGE | | Container or sandbox base image |
| --model MODEL | harness-dependent | Agent model (run only). Defaults to claude-opus-4-6 for Claude Code, google-vertex/claude-opus-4-6@default for OpenCode |
| --keep | off | Keep the sandbox running after the run completes (run only) |
| --no-streaming | off | Disable parsed stream output; agent output is printed raw (run only) |
| --no-otel | off | Disable OTEL telemetry collection (run only) |
| --pre-gates GATES | | Comma-separated pre-agent gates (run only) |
| --post-gates GATES | | Comma-separated post-agent gates (run only) |
| --policy PATH | | OpenShell policy file override (openshell backend only) |
| --timeout SECS | 1200 | Container timeout (podman backend only) |

Extra arguments after the prompt are passed through to the Claude CLI.

Examples

# Use a specific model
agentic-ci run "Update the changelog" \
    --image ghcr.io/opendatahub-io/ai-helpers:latest \
    --model claude-sonnet-4-6

# Disable parsed stream output (prints raw agent output)
agentic-ci run "Run the test suite" \
    --image ghcr.io/opendatahub-io/ai-helpers:latest \
    --no-streaming

# Disable telemetry
agentic-ci run "Fix lint errors" \
    --image ghcr.io/opendatahub-io/ai-helpers:latest \
    --no-otel

# Run with post-agent gates
export TICKET_KEY=AIPCC-123
export BOT_EMAIL=bot@ci.com
agentic-ci run "Fix the bug" \
    --image ghcr.io/opendatahub-io/ai-helpers:latest \
    --post-gates sensitive-files,commit-author,commit-message-key,gitleaks

# OpenShell with custom policy
agentic-ci run --backend openshell "Deploy staging" \
    --policy custom-policy.yml

# OpenShell with repo-level policy (auto-discovered from
# .agentic-ci/openshell-policy.yml in the workdir)
agentic-ci run --backend openshell "Add input validation"

Gates

Gates validate data before and after an AI agent runs. Pre-gates can block execution early; post-gates validate output to catch dangerous changes. Gates read their configuration from environment variables.

Built-in post-agent gates:

| Name | Required Env Vars | Description |
| --- | --- | --- |
| sensitive-files | | Block commits touching .env, *.pem, *.key, etc. |
| commit-author | BOT_EMAIL | Verify commit author matches expected bot email |
| commit-message-key | TICKET_KEY | Verify ticket key appears in commit message |
| gitleaks | | Scan new commits for secrets using gitleaks |
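
To make the sensitive-files idea concrete, here is a minimal sketch of glob-based matching on changed file names. The pattern list contains only the three patterns named in the table, and the function name is hypothetical. The gate's real pattern list and matching rules are not shown in this document.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Illustrative patterns only; the built-in gate may check more than these.
SENSITIVE_PATTERNS = [".env", "*.pem", "*.key"]


def sensitive_paths(changed_files: list[str]) -> list[str]:
    """Return the changed paths whose file name matches a sensitive pattern."""
    return [
        path
        for path in changed_files
        if any(fnmatch(PurePosixPath(path).name, pat) for pat in SENSITIVE_PATTERNS)
    ]


hits = sensitive_paths(["src/app.py", "deploy/.env", "certs/server.pem"])
print(hits)  # ['deploy/.env', 'certs/server.pem']
```

A gate built this way would fail the run whenever the returned list is non-empty.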

Pre-agent gates are supported via --pre-gates with custom implementations (e.g. filtering by comment domain or author).

All required environment variables are validated before any gate runs. If any are missing, the CLI exits immediately with a clear error listing every missing variable and which gate needs it.
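
The up-front validation described above can be sketched as follows. The mapping of gates to variables is copied from the table. The function name and the error wording are assumptions, not the CLI's actual implementation or output.

```python
import os

# Required variables per built-in gate, taken from the table above.
GATE_ENV = {
    "sensitive-files": [],
    "commit-author": ["BOT_EMAIL"],
    "commit-message-key": ["TICKET_KEY"],
    "gitleaks": [],
}


def missing_gate_env(gates: list[str], env=None) -> dict[str, list[str]]:
    """Map each selected gate to its required variables that are unset or empty."""
    env = os.environ if env is None else env
    missing = {}
    for gate in gates:
        unset = [name for name in GATE_ENV[gate] if not env.get(name)]
        if unset:
            missing[gate] = unset
    return missing


# Check every gate first, then report all problems at once.
problems = missing_gate_env(
    ["commit-author", "commit-message-key", "gitleaks"],
    env={"BOT_EMAIL": "bot@ci.com"},
)
for gate, names in problems.items():
    print(f"gate {gate!r} requires: {', '.join(names)}")
```

Collecting every missing variable before failing means one CI run surfaces all configuration gaps instead of one per attempt.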

Credentials

Two authentication modes are supported. The mode is auto-detected and logged at startup.

Anthropic API key (direct)

Set ANTHROPIC_API_KEY in the environment. No gcloud credentials are needed; the key is passed directly to the agent inside the container or sandbox. Vertex-specific env vars and credential mounts are skipped.

export ANTHROPIC_API_KEY=sk-ant-...
agentic-ci run "Fix the bug" --image ghcr.io/opendatahub-io/ai-helpers:latest

Vertex AI (default)

When ANTHROPIC_API_KEY is not set, both backends use Vertex AI for Claude API access via gcloud Application Default Credentials.

The podman backend checks credentials in this order:

  1. GCLOUD_CREDENTIALS env var (raw JSON or base64-encoded)
  2. GCP_SERVICE_ACCOUNT_KEY env var (file path, raw JSON, or base64-encoded)
  3. ~/.config/gcloud/application_default_credentials.json
  4. Path in GOOGLE_APPLICATION_CREDENTIALS env var

The openshell backend uploads the local ADC file (~/.config/gcloud/application_default_credentials.json or GOOGLE_APPLICATION_CREDENTIALS) into the sandbox.
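
The podman lookup order above can be approximated with the sketch below. The function names are hypothetical, and falling through to the next source when a value cannot be parsed is an assumption. The real backend may instead stop with an error.

```python
import base64
import binascii
import json
import os
from pathlib import Path


def _as_dict(text):
    """Parse JSON text or bytes; return the object only if it is a dict."""
    try:
        data = json.loads(text)
    except ValueError:
        return None
    return data if isinstance(data, dict) else None


def decode_inline(value: str):
    """Accept raw JSON or base64-encoded JSON; return None if it is neither."""
    parsed = _as_dict(value)
    if parsed is not None:
        return parsed
    try:
        raw = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return None
    return _as_dict(raw)


def resolve_credentials(env: dict, home: Path):
    """Return (source, credentials) for the first usable source, else None."""
    value = env.get("GCLOUD_CREDENTIALS")
    if value and (creds := decode_inline(value)) is not None:
        return "GCLOUD_CREDENTIALS", creds

    value = env.get("GCP_SERVICE_ACCOUNT_KEY")
    if value:
        creds = decode_inline(value)
        if creds is None and os.path.isfile(value):
            creds = _as_dict(Path(value).read_text())
        if creds is not None:
            return "GCP_SERVICE_ACCOUNT_KEY", creds

    candidates = [home / ".config/gcloud/application_default_credentials.json"]
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        candidates.append(Path(env["GOOGLE_APPLICATION_CREDENTIALS"]))
    for path in candidates:
        if path.is_file() and (creds := _as_dict(path.read_text())) is not None:
            return str(path), creds
    return None


encoded = base64.b64encode(b'{"type": "service_account"}').decode()
print(resolve_credentials({"GCLOUD_CREDENTIALS": encoded}, Path("/nonexistent-home")))
```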

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| ANTHROPIC_API_KEY | -- | Anthropic API key. When set, uses direct API auth instead of Vertex AI |
| CLAUDE_MODEL | claude-opus-4-6 | Default model for Claude Code harness (overridden by --model) |
| CLAUDE_CONTAINER_IMAGE | | Default container image for Claude Code harness |
| OPENCODE_MODEL | google-vertex/claude-opus-4-6@default | Default model for OpenCode harness (overridden by --model) |
| OPENCODE_CONTAINER_IMAGE | | Default container image for OpenCode harness |
| ANTHROPIC_VERTEX_PROJECT_ID | | Vertex AI project ID |
| GCP_PROJECT_ID | | Fallback for ANTHROPIC_VERTEX_PROJECT_ID |
| GOOGLE_CLOUD_PROJECT | | GCP project ID (OpenCode uses this before falling back to ANTHROPIC_VERTEX_PROJECT_ID) |
| CLOUD_ML_REGION | global | Vertex AI region |
| VERTEX_LOCATION | | Vertex AI region (OpenCode uses this before falling back to CLOUD_ML_REGION) |
| GCLOUD_CREDENTIALS | | Raw JSON or base64 gcloud credentials |
| GCP_SERVICE_ACCOUNT_KEY | | Service account key: file path, raw JSON, or base64-encoded JSON |
| GOOGLE_APPLICATION_CREDENTIALS | | Path to ADC credentials file |
| OPENSHELL_SUPERVISOR_IMAGE | openshell/supervisor:dev | OpenShell supervisor image (openshell backend only) |
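
The project and region fallbacks in the table can be read as an ordered lookup. The sketch below is one interpretation, with a hypothetical function name. It assumes the global region default also applies to OpenCode when neither region variable is set, which the table does not state explicitly.

```python
def resolve_vertex_settings(harness: str, env: dict) -> dict:
    """Resolve the Vertex AI project ID and region using the table's fallbacks."""
    project_vars = ["ANTHROPIC_VERTEX_PROJECT_ID", "GCP_PROJECT_ID"]
    region_vars = ["CLOUD_ML_REGION"]
    if harness == "opencode":
        # OpenCode consults its own variables first, then the shared ones.
        project_vars.insert(0, "GOOGLE_CLOUD_PROJECT")
        region_vars.insert(0, "VERTEX_LOCATION")

    def first_set(names, default=None):
        # First variable with a non-empty value wins.
        return next((env[n] for n in names if env.get(n)), default)

    return {
        "project": first_set(project_vars),
        "region": first_set(region_vars, default="global"),
    }


env = {"GCP_PROJECT_ID": "demo-project", "VERTEX_LOCATION": "us-east5"}
print(resolve_vertex_settings("claude-code", env))
print(resolve_vertex_settings("opencode", env))
```

With this environment, the Claude Code harness ignores VERTEX_LOCATION and falls back to the global region, while OpenCode picks up us-east5.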

Streaming Output

By default, agent output is parsed into human-readable CI logs with:

  • Colored ANSI output (thinking in red, tool calls in gray)
  • Tool call summaries (bash commands, file paths, agent dispatches)
  • Token count display with throughput rate
  • OTEL token/cost summary at completion

Use --no-streaming to skip the parsed output and print raw agent output, or --no-otel to skip the token/cost summary.

Python API

from agentic_ci.backends import create_backend
from agentic_ci.harness import create_harness

# Pick the agent harness, then bind it to a backend.
harness = create_harness("claude-code")
backend = create_backend(
    "podman",
    harness=harness,
    workdir="/path/to/repo",
    image="my-image:latest",
)

backend.setup()  # create and start the sandbox
rc = backend.run(prompt="Fix the bug", model="claude-sonnet-4-6")
backend.stop()   # tear the sandbox down
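
If the run raises, the sandbox in the snippet above is left running. One way to guard against that is a small context manager, sketched here. The running_sandbox helper and FakeBackend are hypothetical and not part of agentic_ci. The helper assumes only that a backend exposes the setup(), run(), and stop() methods used above.

```python
from contextlib import contextmanager


@contextmanager
def running_sandbox(backend):
    """Call backend.setup() on entry and backend.stop() on exit, even on error."""
    backend.setup()
    try:
        yield backend
    finally:
        backend.stop()


class FakeBackend:
    """Stand-in that records calls, used only to demonstrate the ordering."""

    def __init__(self):
        self.calls = []

    def setup(self):
        self.calls.append("setup")

    def run(self, prompt, model=None):
        self.calls.append("run")
        return 0

    def stop(self):
        self.calls.append("stop")


fake = FakeBackend()
with running_sandbox(fake) as backend:
    rc = backend.run(prompt="Fix the bug", model="claude-sonnet-4-6")
print(fake.calls, rc)  # ['setup', 'run', 'stop'] 0
```

In real use, the object returned by create_backend() would take the place of FakeBackend.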

Additional Modules

The package includes several library modules used by downstream pipelines:

  • agentic_ci.jira — Jira REST API client with acli delegation, ADF (Atlassian Document Format) conversion, and rate limiting.
  • agentic_ci.git — Git operations (clone, branch, push, diff, commit info extraction) with security hardening.
  • agentic_ci.pipeline — GitLab child pipeline YAML generation with hash-based slot distribution.
  • agentic_ci.verdict — Structured verdict JSON schema validation.
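
For readers unfamiliar with hash-based slot distribution, the general technique looks like the sketch below. This is not the agentic_ci.pipeline implementation: the hash function, the key format, and the slot_for name are all assumptions chosen for illustration.

```python
import hashlib


def slot_for(key: str, num_slots: int) -> int:
    """Map a key to a slot index in [0, num_slots) using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_slots


tickets = ["PROJ-101", "PROJ-102", "PROJ-103", "PROJ-104"]
slots = {}
for ticket in tickets:
    slots.setdefault(slot_for(ticket, 3), []).append(ticket)
print(slots)
```

A cryptographic digest is used instead of Python's built-in hash() because string hashing is randomized per process by default, so hash() would assign the same ticket to different slots across CI jobs.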

Building a Pipeline with the Generic Skill Runner

agentic-ci provides a generic skill runner framework that any project can use to build its own AI-powered CI pipeline. You define what happens at each stage via callable hooks; the framework handles container execution, retries, OTEL cost tracking, and gate orchestration.

Quick Start

import json
from pathlib import Path
from agentic_ci.skill import SkillConfig, run_skill

config = SkillConfig(
    skill_name="my-review",
    prompt_builder=lambda ticket_key, mode, skill_name, **kw: (
        f"Use the /{skill_name} skill to review ticket {ticket_key}."
    ),
    verdict_loader=lambda work_dir: json.loads(
        (work_dir / "verdict.json").read_text()
    ),
    label_applier=lambda ticket_key, verdict, **kw: (
        print(f"[{ticket_key}] verdict: {verdict}")
    ),
)

rc = run_skill(
    config,
    ticket_key="PROJ-123",
    work_dir=Path("/tmp/work"),
    config_dir=Path("/tmp/config"),
)

SkillConfig Hooks

All domain-specific behavior is injected via hooks on SkillConfig:

| Hook | Signature | Purpose |
| --- | --- | --- |
| prompt_builder | (ticket_key, mode, skill_name, **kw) -> str | Build the prompt sent to Claude |
| context_writer | (ticket_key, ticket, mode, work_dir, **kw) -> None | Write context files before the run |
| verdict_loader | (work_dir) -> dict | Load the agent's verdict after the run |
| verdict_path_fn | (work_dir) -> Path | Where to find the verdict file |
| label_applier | (ticket_key, verdict, mode, work_dir, **kw) -> None | Apply labels/transitions after the run |
| cost_formatter | (cost_data) -> str \| None | Format OTEL cost data for display |
| extension_config_writer | (ticket_key, ticket, config, work_dir, **kw) -> None | Write extra config (e.g. Claude extensions) |
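
As an example of what hook implementations might look like, the sketch below pairs a verdict_path_fn-style function with a verdict_loader-style function that tolerates a missing or malformed file. The signatures follow the table. The "status" key and the fallback values are hypothetical, and how the framework itself treats a missing verdict is not covered here.

```python
import json
import tempfile
from pathlib import Path


def verdict_path(work_dir: Path) -> Path:
    """verdict_path_fn-style hook: where the agent is expected to write its verdict."""
    return work_dir / "verdict.json"  # file name taken from the Quick Start


def load_verdict(work_dir: Path) -> dict:
    """verdict_loader-style hook that never raises on a missing or bad file."""
    try:
        data = json.loads(verdict_path(work_dir).read_text())
    except FileNotFoundError:
        return {"status": "missing"}  # hypothetical fallback shape
    except json.JSONDecodeError:
        return {"status": "invalid"}
    return data if isinstance(data, dict) else {"status": "invalid"}


with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    before = load_verdict(work)
    verdict_path(work).write_text('{"status": "pass"}')
    after = load_verdict(work)

print(before)  # {'status': 'missing'}
print(after)   # {'status': 'pass'}
```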

Pipeline Flow

run_skill() executes this sequence:

  1. Pre-gates -- each pre_gates callable can block the run early (returns a message to skip, None to continue)
  2. Context -- context_writer writes ticket data and supporting files
  3. Extension config -- extension_config_writer sets up Claude plugins/skills
  4. Prompt -- prompt_builder produces the prompt string
  5. Container -- launches Claude via PodmanBackend (or a custom container_runner)
  6. Retry -- transient failures (exit 124/137/143) retry once if mode is in retryable_modes
  7. Cost -- parses OTEL metrics from the run directory
  8. Post-gates -- each post_gates callable validates the output (e.g. sensitive file check, gitleaks)
  9. Verdict -- verdict_loader reads the agent's structured output
  10. Report -- label_applier applies labels, posts comments, transitions tickets
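
Step 6 can be sketched in isolation as follows. The exit codes come from the list above: 124 is what timeout(1) returns when a command times out, and 137 and 143 are 128 plus the signal numbers for SIGKILL and SIGTERM. The run_with_retry function and the mode name "resolve" are hypothetical, and the framework's own retry logic may differ.

```python
TRANSIENT_EXIT_CODES = {124, 137, 143}  # timeout, SIGKILL, SIGTERM


def run_with_retry(runner, mode: str, retryable_modes: set[str]) -> int:
    """Call runner() and retry once on a transient exit code for retryable modes."""
    rc = runner()
    if rc in TRANSIENT_EXIT_CODES and mode in retryable_modes:
        rc = runner()  # single retry; a second failure is returned as-is
    return rc


# Simulate a container that is killed once and then succeeds.
codes = iter([137, 0])
result = run_with_retry(lambda: next(codes), "resolve", {"resolve"})
print(result)  # 0
```

Non-transient failures, such as exit code 1, are returned immediately without a second attempt.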

Example: jira-autofix

The jira-autofix project uses this framework to build an automated Jira bug-fix pipeline:

config = SkillConfig(
    skill_name="autofix-resolve",
    prompt_builder=_build_prompt,         # Jira-specific prompt
    context_writer=_write_context,        # Writes ticket.json to .autofix-context/
    verdict_loader=_load_verdict,         # Reads .autofix-verdict.json
    label_applier=_apply_labels,          # Manages jira-autofix-* labels
    cost_formatter=_format_otel_cost,     # Formats cost for Jira comments
    post_gates=[_autofix_post_gate],      # Commit author check, sensitive files, gitleaks
)
