Skip to main content

Agent execution toolkit - tool execution, MCP integration, file operations, and shell commands

Project description

kollabor-agent

kollabor-agent is the tool and agent execution runtime for Kollabor.

It owns tool dispatch, MCP server connections, file operations, shell execution, agent/skill loading, background tasks, process management, and permission-aware execution. The CLI and engine both build on this package when a model wants to act on the local environment.

Current Role

  • Execute built-in tools such as terminal commands, file operations, and MCP tool calls.
  • Load agent definitions and skills from project, user, and bundled locations.
  • Manage MCP stdio servers and expose their tools to model providers.
  • Provide permission risk assessment and approval flow primitives.
  • Run background processes/tasks and provide process lifecycle helpers.
  • Bridge native/XML tool definitions into the unified tool registry.

Package Structure (~25K lines)

kollabor_agent/
├── agent_manager.py              (1744 lines) Agent/skill discovery and loading
├── runtime.py                    (651 lines)  AgentRuntime identity and lifecycle
├── queue_processor.py            (1286 lines) LLM turn execution and tool queue
├── tool_executor.py              (1121 lines) Unified tool dispatch engine
├── tool_definition.py            (173 lines)  ToolDefinition/ToolParameter dataclasses
├── tool_registry.py              (144 lines)  Singleton tool registry
├── tool_definitions/             (10 files)   Individual tool metadata modules
│   ├── file_ops.py               (542 lines)  read/edit/create/delete/move/copy/append
│   ├── hub.py                    (1039 lines) hub_msg, hub_spawn, hub_capture, etc.
│   ├── terminal.py               (188 lines)  terminal/shell execution
│   ├── scratchpad.py             (109 lines)  scratchpad get/append/clear
│   ├── task.py                   (148 lines)  task_checkpoint/complete/approve/reject
│   ├── context.py                (125 lines)  context_query, curate, evict
│   ├── git.py                    (52 lines)   git operations
│   └── wait.py                   (66 lines)   wait_for_user
├── tool_generators/              (3 files)    Schema/doc generators
│   ├── markdown.py               (154 lines)  ToolDefinition -> markdown docs
│   ├── xml_regex.py              (71 lines)   ToolDefinition -> XML regex pattern
│   └── native_json.py            (70 lines)   ToolDefinition -> JSON schema
├── file_operations_executor.py   (1471 lines) Safe file ops with backups/validation
├── shell_executor.py             (365 lines)  Async shell command execution
├── shell_command_service.py      (409 lines)  !-prefix shell command service
├── shell_utils.py                (230 lines)  Shell alias detection for prompt context
├── mcp_integration.py            (1221 lines) MCP JSON-RPC protocol over stdio
├── mcp_manager.py                (381 lines)  MCP config file management (no UI deps)
├── mentiko_adapter.py            (203 lines)  Mentiko platform spawn adapter
├── native_tools_handler.py       (214 lines)  Native API function-calling routing
├── process_manager.py            (782 lines)  Agent subprocess lifecycle management
├── background_task_manager.py    (507 lines)  Background task tracking with circuit breaker
├── permissions/                  (3 files)    Permission system
│   ├── manager.py                (734 lines)  Central permission manager
│   ├── risk_assessor.py          (109 lines)  Tool risk level assessment
│   └── response_handler.py       (106 lines)  User confirmation response handling

Architecture

Tool System

The tool system has four layers:

  1. ToolDefinition (tool_definition.py) — Single source of truth for one tool's metadata: name, description, parameters, XML form, risk level, examples, safety features, key rules. Everything else is generated from this.

  2. ToolRegistry (tool_registry.py) — Singleton holding all ToolDefinitions. Modules in tool_definitions/ auto-register on import via register_all(). Supports lookup by name, native_name (underscore form), or XML tag.

  3. ToolGenerators (tool_generators/) — Generate three artifacts from each ToolDefinition:

    • markdown.py — Markdown docs injected into agent system prompts
    • xml_regex.py — Regex pattern for parsing XML tool tags in LLM responses
    • native_json.py — JSON schema for native API function calling
  4. ToolExecutor (tool_executor.py) — Unified dispatch engine. Routes tool calls to the correct handler:

    • Terminal commands -> ShellExecutor
    • File operations -> FileOperationsExecutor
    • MCP tool calls -> MCPIntegration
    • Plugin handlers -> registered via register_plugin_handler()
    • Returns ToolExecutionResult (success/error, output, execution time, metadata)

Agent Loading

AgentManager (agent_manager.py) discovers and loads agents from three locations in priority order:

  1. Local project: .kollab/agents/
  2. Global user: ~/.kollab/agents/
  3. Bundled defaults: bundles/agents/

Each agent is a directory containing:

  • system_prompt.md (required)
  • agent.json (optional metadata: description, profile, tools, skills)
  • Skill .md files (optional)
  • sections/ directory (optional prompt fragments)

Skills follow the Agent Skills standard (agentskills.io): directories with SKILL.md containing YAML frontmatter + markdown instructions. Assigned to agents via the skills field in agent.json. Progressive disclosure: metadata at startup, full content on activation.

Agent Runtime

AgentRuntime (runtime.py) is the canonical runtime representation of an agent. It merges static definition (from disk) with live state (process, hub, vault). Lifecycle states: BOOTING -> READY -> WORKING -> THINKING -> BLOCKED -> DREAMING -> SUSPENDED -> DYING -> DEAD.

Queue Processing

QueueProcessor (queue_processor.py) manages the LLM turn loop:

  • Message queue with overflow strategies (drop_newest, drop_oldest, block)
  • Batch message processing
  • Conversation continuation (agentic multi-turn)
  • Deduped LLM turn execution
  • Tool result ingestion into context ledger for large outputs (>=8KB)

File Operations

FileOperationsExecutor (file_operations_executor.py) provides 14 safe file operations with:

  • Automatic .bak backups before destructive operations
  • Protected path checking (kollabor/, main.py, .git/, venv/)
  • Path traversal prevention
  • Binary file detection and rejection
  • Optional Python syntax validation with automatic rollback on errors
  • File size limits (10MB edit, 5MB create)
  • Three path access modes: PROJECT_ONLY, KOLLAB_CONFIG, ANYWHERE

Operations: read, edit, create, create_overwrite, delete, move, copy, copy_overwrite, append, insert_after, insert_before, mkdir, rmdir, grep.

Shell Execution

Three shell-related modules:

  • ShellExecutor — Low-level async subprocess execution with cancellation support, timeout handling, and per-instance state isolation.
  • ShellCommandService — High-level !-prefix command service for user input. Validates against dangerous patterns, blocks interactive commands (vim, ssh), handles cd warnings, strips ANSI, emits pre/post/cancel/error events.
  • ShellUtils — Detects user shell aliases (fd, rg, eza, etc.) and formats syntax hints for injection into AI system prompts.

MCP Integration

Two modules handle Model Context Protocol:

  • MCPIntegration (mcp_integration.py) — Live connection management. Implements MCP JSON-RPC 2.0 over stdio: server initialization handshake, tools/list for discovery, tools/call for execution. Manages MCPServerConnection instances (subprocess with piped stdio).

  • MCPManager (mcp_manager.py) — Pure business logic for config file management. Load/save mcp_settings.json, enable/disable servers, configure API keys, list servers/tools with status. No UI dependencies.

Permission System

Three modules under permissions/:

  • PermissionManager — Central permission manager. Integrates with event bus to intercept tool execution. Manages approval modes (auto_approve, ask, etc.), session-scoped and project-scoped approvals, pending confirmations via asyncio.Event, and approval statistics.

  • RiskAssessor — Evaluates tool risk levels based on configurable rules: blocked tools, trusted tools, high/medium risk command patterns (regex), and per-tool-type default risk levels.

  • ResponseHandler — Processes user responses to permission dialogs: deny, approve-once, approve-session, approve-project, approve-always, approve-tool-always. Records approvals in the appropriate scope.

Process Management

ProcessManager (process_manager.py) manages agent subprocess lifecycles:

  • Strategy pattern for spawn backends (default: SubprocessStrategy)
  • Circuit breaker for crash-loop prevention (3 failures in 120s opens circuit)
  • Ring buffer for stdout capture (thread-safe, default 2000 lines)
  • Resource tracking (RSS, uptime, restart count)
  • Pluggable strategies: SpawnStrategy ABC with spawn/kill/is_alive/stdio methods. Future: DockerStrategy, SSHStrategy.

Background Tasks

BackgroundTaskManager (background_task_manager.py) manages async background tasks:

  • Overflow strategies: drop_newest, drop_oldest, block (with configurable timeout)
  • Circuit breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED state machine
  • Configurable retry logic (attempts, delay)
  • Task monitoring and periodic cleanup
  • Per-task metrics (optional)

Other Modules

  • MentikoAdapter (mentiko_adapter.py) — Adapter for the Mentiko platform. Spawns kollabor agents via subprocess with --detached flag. Provides both Python API (spawn_agent()) and CLI entrypoint.

  • NativeToolsHandler (native_tools_handler.py) — Bridges native API function calling (OpenAI/Anthropic style) with the tool executor. Handles malformed tool names from LLM confusion, routes file/terminal/plugin/MCP tools to correct handlers.

Usage

from kollabor_agent import MCPIntegration, ToolExecutor
from kollabor_events import EventBus

bus = EventBus()
mcp = MCPIntegration(event_bus=bus)
executor = ToolExecutor(mcp_integration=mcp, event_bus=bus)

result = await executor.execute_tool({
    "id": "tool_1",
    "type": "terminal",
    "command": "pwd",
})

print(result.success, result.output)

Tool Definition Registration

Tool definitions live in tool_definitions/ as Python modules. Each module defines ToolDefinition instances and a register_all() function that auto-registers on import. The registry loads all modules at startup via ToolRegistry.get_global().

To add a new tool:

  1. Create a new module in tool_definitions/
  2. Define a ToolDefinition with name, description, parameters, XML form, examples, and metadata
  3. Add a register_all() function that calls registry.register(tool_def)
  4. Import the module in tool_registry.py _load_definitions()

The three generators (markdown, XML regex, native JSON) automatically pick up new tools. Agent system prompts include only tools listed in the agent's agent.json tools array.

Known Gaps

  • Tool execution depends on caller-provided context for workspace/cwd behavior; service callers must wire project boundaries explicitly until the runtime has a stronger workspace object.
  • MCP session connect behavior is mostly implemented through internal helpers; a smaller public connection API would reduce route-level coupling.
  • Permission scope, bundle scope, and tool registry behavior are powerful but spread across several modules; more contract tests would make changes safer.
  • Some diagnostics and legacy compatibility paths still live inside the runtime and should be made quieter or moved behind debug flags.
  • Tool definitions are hardcoded in Python modules — no user-overridable path from ~/.kollab exists yet.

Roadmap

Phase 1: Execution boundaries

  • Add a first-class workspace/project execution context used by shell and file operations.
  • Expose a public MCP connect/disconnect/list-tools API for service callers.
  • Tighten cancellation behavior across shell, MCP, background, and plugin tools.

Phase 2: Tool contract stabilization

  • Keep the unified tool registry as the canonical source for schemas, permissions, bundle scope, and prompt rendering.
  • Add regression tests for native JSON, XML, and markdown tool generation.
  • Document exact tool result metadata expected by context-service and display layers.

Phase 3: Agent runtime maturity

  • Clarify which runtime pieces are reusable library APIs versus CLI orchestration internals.
  • Harden process cleanup, circuit-breaker behavior, and background task reporting.
  • Expand agent/skill loading tests across project, user, and bundled sources.

Development

Targeted validation examples:

python -m py_compile packages/kollabor-agent/src/kollabor_agent/*.py
python -m pytest tests/unit/mcp tests/unit/test_auto_grant_mcp_tools.py -q

Dependencies

  • kollabor-events
  • kollabor-config
  • kollabor-ai
  • pyyaml

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kollabor_agent-0.5.8.tar.gz (115.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

kollabor_agent-0.5.8-py3-none-any.whl (132.1 kB view details)

Uploaded Python 3

File details

Details for the file kollabor_agent-0.5.8.tar.gz.

File metadata

  • Download URL: kollabor_agent-0.5.8.tar.gz
  • Upload date:
  • Size: 115.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for kollabor_agent-0.5.8.tar.gz
Algorithm Hash digest
SHA256 1525b01ce34d4475d867bbddfc6cdcda2df8ea8dbb041dcb385cac377bff8508
MD5 456f7a45aa350343348068e62b7ec797
BLAKE2b-256 b8ab594d4b983168af1ac0727d210621e972b5d432e38cd8ed200507933fbc11

See more details on using hashes here.

File details

Details for the file kollabor_agent-0.5.8-py3-none-any.whl.

File metadata

  • Download URL: kollabor_agent-0.5.8-py3-none-any.whl
  • Upload date:
  • Size: 132.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for kollabor_agent-0.5.8-py3-none-any.whl
Algorithm Hash digest
SHA256 67f7e5e4f14c449d881e271c86ddf0b1f74dc46985312d1c25f9bbf1bbeb1907
MD5 16c97782ef81d195fb09f35f8289fde2
BLAKE2b-256 8b6b6022ef80c3ac934477d561b064723c97f3f0d6e28041aa065b147cfbc699

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page