Declarative YAML-based framework for defining, managing, and orchestrating AI coding agent instances

scitex-agent-container

Declarative YAML-based AI agent lifecycle management

pip install scitex-agent-container


Problem

Managing AI coding agents (Claude Code, Cursor, Aider) in production requires manual script-writing, environment setup, and process monitoring for each agent instance. Scaling from one agent to a fleet across multiple machines means duplicating fragile shell scripts with no health checks, restart policies, remote deployment, or inter-agent communication.

Solution

scitex-agent-container provides declarative YAML definitions that fully specify an agent -- runtime, model, channels, environment, health checks, remote host, Orochi hub connection -- started with a single command:

YAML manifest --> scitex-agent-container start --> screen session
                                                   + remote SSH deploy
                                                   + Orochi auto-connect
                                                   + health monitor
                                                   + restart policy

Installation

Requires Python >= 3.10.

pip install scitex-agent-container

# With Orochi hub integration
pip install scitex-agent-container[orochi]

# With Telegram integration
pip install scitex-agent-container[telegram]

# Development
pip install scitex-agent-container[dev]

Quickstart

  1. Write a YAML manifest:
apiVersion: cld-agent/v1
kind: Agent
metadata:
  name: my-agent
  labels:
    role: worker
    machine: local
spec:
  runtime: claude-code
  model: sonnet
  workdir: ~/proj

  claude:
    flags:
      - --dangerously-skip-permissions
    session: new

  # Auto-connect to Orochi hub
  orochi:
    enabled: true
    hosts:
      - 192.168.11.22      # LAN (fast)
      - scitex-orochi.com   # domain (fallback)
    port: 8559
    token_env: SCITEX_OROCHI_TOKEN
    channels:
      - "#general"

  health:
    enabled: true
    interval: 30
    method: screen-alive

  restart:
    policy: on-failure
    max_retries: 3
    backoff:
      initial: 30
      max: 300
      multiplier: 2
  2. Start and monitor:
scitex-agent-container start agent.yaml
scitex-agent-container status my-agent
scitex-agent-container logs my-agent -n 100
scitex-agent-container attach my-agent   # Ctrl-A D to detach
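The restart block in the manifest above (initial: 30, max: 300, multiplier: 2, max_retries: 3) reads like capped exponential backoff. The following is a minimal Python sketch of such a schedule, assuming the values are seconds and that the delay is multiplied after each failed attempt. The function name backoff_delays is hypothetical and not part of the package's API; the actual schedule used by scitex-agent-container may differ.

```python
def backoff_delays(initial=30, maximum=300, multiplier=2, max_retries=3):
    """Return the wait (in seconds) before each restart attempt.

    Hypothetical helper: illustrates capped exponential backoff only.
    """
    delays = []
    delay = initial
    for _ in range(max_retries):
        # Never wait longer than the configured maximum.
        delays.append(min(delay, maximum))
        delay *= multiplier
    return delays


print(backoff_delays())               # [30, 60, 120]
print(backoff_delays(max_retries=6))  # [30, 60, 120, 240, 300, 300]
```

With the Quickstart values, three retries never reach the 300-second cap; the cap only matters when max_retries is raised.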

Remote SSH Deployment

Deploy agents to remote machines from a single YAML manifest:

spec:
  remote:
    host: spartan          # SSH hostname
    user: ywatanabe
    timeout: 120           # seconds (HPC module loads are slow)
    login_shell: true      # bash -l -c (needed for PATH on most hosts)

# Run preflight checks (SSH, screen, Python, disk), then start
scitex-agent-container start remote-agent.yaml

# Skip preflight for slow hosts (e.g. HPC with module loads)
scitex-agent-container start --no-preflight remote-agent.yaml

# Run preflight checks without starting
scitex-agent-container check remote-agent.yaml
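To make the login_shell option concrete, here is a small Python sketch of how a remote command could be wrapped in bash -l -c before being handed to ssh. This is an assumption about the general technique, not a description of how scitex-agent-container invokes SSH internally; build_ssh_command is a hypothetical name.

```python
import shlex


def build_ssh_command(host, user, remote_command, login_shell=True):
    """Build an argv list for ssh (hypothetical helper, for illustration).

    With login_shell=True the command runs under `bash -l -c`, so the
    remote login profile (and its PATH setup) is loaded first.
    """
    target = f"{user}@{host}" if user else host
    if login_shell:
        # Quote the command so it reaches bash as a single argument.
        remote = f"bash -l -c {shlex.quote(remote_command)}"
    else:
        remote = remote_command
    return ["ssh", target, remote]


print(build_ssh_command("spartan", "ywatanabe", "screen -ls"))
```

The resulting list can be passed to subprocess.run together with a timeout argument, which is one plausible way to honor a per-host timeout setting.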

Orochi Auto-Connect

Agents auto-register with the scitex-orochi WebSocket hub on startup:

spec:
  orochi:
    enabled: true
    hosts:                      # tried in order, first reachable wins
      - 127.0.0.1              # localhost (if hub runs here)
      - 192.168.11.22          # LAN IP
      - scitex-orochi.com      # domain (external fallback)
    port: 8559
    token_env: SCITEX_OROCHI_TOKEN
    channels: ["#general", "#research"]
    heartbeat_interval: 60
    reconnect_interval: 10
    reconnect_max_retries: 0    # 0 = infinite

No silent fallbacks -- every host attempt is logged:

INFO  Orochi connection report: [192.168.11.22:FAIL | scitex-orochi.com:OK]
      -- connected via scitex-orochi.com (my-agent@spartan channels=['#general'])
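The "tried in order, first reachable wins" rule and the connection report above can be sketched in a few lines of Python. This is an illustration under assumptions: reachability is approximated by a plain TCP connect, and the names tcp_reachable and pick_host are hypothetical rather than taken from the package.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_host(hosts, port, probe=tcp_reachable):
    """Try hosts in order; return (first reachable host or None, report)."""
    attempts = []
    chosen = None
    for host in hosts:
        ok = probe(host, port)
        # Record every attempt so failures are never silent.
        attempts.append(f"{host}:{'OK' if ok else 'FAIL'}")
        if ok:
            chosen = host
            break
    return chosen, "[" + " | ".join(attempts) + "]"


# Example with a fake probe so no network access is needed.
fake_probe = lambda host, port: host == "scitex-orochi.com"
print(pick_host(["192.168.11.22", "scitex-orochi.com"], 8559, fake_probe))
```

Passing the probe as a parameter keeps the ordering logic testable without a running hub.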

Four Interfaces

Python API

from scitex_agent_container import (
    AgentConfig, load_config, validate_config,
    agent_start, agent_stop, agent_restart, agent_status, agent_logs,
    Registry,
)

config = load_config("agent.yaml")      # Parse YAML manifest
agent_start("agent.yaml")               # Launch agent
info = agent_status("my-agent")         # Query status
agent_stop("my-agent")                  # Stop agent
agent_restart("my-agent")               # Restart agent
output = agent_logs("my-agent")         # Read logs
registry = Registry()                   # Access agent registry

CLI Commands

scitex-agent-container --help-recursive          # Show all commands

# Lifecycle
scitex-agent-container start <config.yaml>       # Start an agent
scitex-agent-container start --no-preflight ...  # Skip SSH preflight checks
scitex-agent-container stop <name>               # Stop an agent
scitex-agent-container restart <name>            # Restart an agent
scitex-agent-container attach <name>             # Attach to screen session

# Inspection
scitex-agent-container status [name] [--json]    # Show agent status
scitex-agent-container list [--json]             # List all agents
scitex-agent-container list --capability gpu     # Filter by capability
scitex-agent-container list --machine spartan    # Filter by machine
scitex-agent-container ps [--json]               # Alias for list
scitex-agent-container logs <name> [-n LINES]    # Show recent output
scitex-agent-container health <name> [--json]    # Run health check
scitex-agent-container find --capability gpu     # Find agents by label

# Configuration
scitex-agent-container validate <config.yaml>    # Validate YAML syntax
scitex-agent-container check <config.yaml>       # Run full preflight checks
scitex-agent-container build [--runtime docker|apptainer]

# Maintenance
scitex-agent-container cleanup                   # Remove stale entries
scitex-agent-container list-python-apis [-v]     # List public API tree
scitex-agent-container version                   # Show version
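The health command and the screen-alive method used in the Quickstart suggest a check that the agent's screen session still exists. A minimal Python sketch of that idea follows, assuming GNU screen is installed and that sessions appear in screen -ls output as PID.name. The function names are hypothetical, and the package's real health check may work differently.

```python
import re
import subprocess


def session_in_listing(name, listing):
    """Return True if `screen -ls` output lists a session called `name`."""
    # Session lines look like "\t12345.my-agent\t(Detached)".
    pattern = re.compile(r"^\s*\d+\.(\S+)\s", re.MULTILINE)
    return name in pattern.findall(listing)


def screen_alive(name):
    """Run `screen -ls` and look for the session.

    Raises FileNotFoundError if screen is not installed.
    """
    result = subprocess.run(
        ["screen", "-ls"], capture_output=True, text=True, check=False
    )
    return session_in_listing(name, result.stdout)


sample = (
    "There is a screen on:\n"
    "\t12345.my-agent\t(Detached)\n"
    "1 Socket in /run/screen/S-user.\n"
)
print(session_in_listing("my-agent", sample))  # True
```

Separating the parsing from the subprocess call lets the matching logic be checked against captured output.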

MCP Server -- for AI Agents

Not yet implemented. Planned for a future release.

Skills -- for AI Agent Discovery

Agent skills are declared in the YAML manifest and injected into the agent's CLAUDE.md at startup:

spec:
  skills:
    required:
      - python-scitex     # Auto-loaded at startup
      - data-analysis
    available:
      - scitex             # Available but not auto-loaded

YAML Spec Reference

Section          Key Fields                                   Description
---------------  -------------------------------------------  ------------------------------------
metadata         name, labels                                 Agent identity and capability labels
spec.runtime     claude-code, cursor, aider                   AI coding tool to use
spec.model       sonnet, opus[1m]                             Model selection
spec.remote      host, user, timeout, login_shell             SSH remote deployment
spec.orochi      hosts[], port, token_env, channels[]         Orochi hub auto-connect
spec.claude      channels[], flags[], session                 Claude Code-specific options
spec.health      enabled, interval, method                    Health monitoring
spec.restart     policy, max_retries, backoff                 Auto-restart on failure
spec.watchdog    enabled, interval, responses                 Auto-respond to prompts
spec.skills      required[], available[]                      Skill injection
spec.container   runtime, image, volumes                      Docker/Apptainer container
spec.telegram    bot_token_env, allowed_users                 Telegram integration
spec.screen      name                                         Screen session name override
spec.env         key-value pairs                              Environment variables
spec.hooks       pre_start, post_start, pre_stop, post_stop   Lifecycle hooks
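As a rough illustration of how the sections in this reference fit together, the Python sketch below checks an already-parsed manifest (for example, a dict produced by a YAML loader) for unknown spec sections. It is not the package's validate_config: the function check_manifest, the set of known sections (taken from the table plus the workdir key used in the Quickstart), and the choice of which fields count as required are all assumptions made for this example.

```python
KNOWN_SPEC_SECTIONS = {
    "runtime", "model", "workdir", "remote", "orochi", "claude", "health",
    "restart", "watchdog", "skills", "container", "telegram", "screen",
    "env", "hooks",
}


def check_manifest(manifest):
    """Return a list of problems found in a parsed manifest dict.

    Hypothetical helper: the required fields here are assumptions.
    """
    problems = []
    if manifest.get("kind") != "Agent":
        problems.append("kind must be 'Agent'")
    if not (manifest.get("metadata") or {}).get("name"):
        problems.append("metadata.name is required")
    spec = manifest.get("spec") or {}
    if "runtime" not in spec:
        problems.append("spec.runtime is required")
    # Flag typos such as "helth" instead of "health".
    for key in sorted(set(spec) - KNOWN_SPEC_SECTIONS):
        problems.append(f"unknown spec section: {key}")
    return problems


manifest = {
    "apiVersion": "cld-agent/v1",
    "kind": "Agent",
    "metadata": {"name": "my-agent"},
    "spec": {"runtime": "claude-code", "model": "sonnet", "workdir": "~/proj"},
}
print(check_manifest(manifest))  # []
```

For real manifests, the validate and check commands listed under CLI Commands remain the authoritative checks.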

Part of SciTeX

scitex-agent-container is part of SciTeX. It depends on scitex-container for container runtime abstractions and is used by scitex-orochi for multi-machine agent orchestration.

Four Freedoms for Research

  1. The freedom to run your research anywhere -- your machine, your terms.
  2. The freedom to study how every step works -- from raw data to final manuscript.
  3. The freedom to redistribute your workflows, not just your papers.
  4. The freedom to modify any module and share improvements with the community.

AGPL-3.0 -- because we believe research infrastructure deserves the same freedoms as the software it runs on.

