Skip to main content

Agent execution toolkit - tool execution, MCP integration, file operations, and shell commands

Project description

kollabor-agent

kollabor-agent is the tool and agent execution runtime for Kollabor.

It owns tool dispatch, MCP server connections, file operations, shell execution, agent/skill loading, background tasks, process management, and permission-aware execution. The CLI and engine both build on this package when a model wants to act on the local environment.

Current Role

  • Execute built-in tools such as terminal commands, file operations, and MCP tool calls.
  • Load agent definitions and skills from project, user, and bundled locations.
  • Manage MCP stdio servers and expose their tools to model providers.
  • Provide permission risk assessment and approval flow primitives.
  • Run background processes/tasks and provide process lifecycle helpers.
  • Bridge native/XML tool definitions into the unified tool registry.

Package Structure (~25K lines)

kollabor_agent/
├── agent_manager.py              (1744 lines) Agent/skill discovery and loading
├── runtime.py                    (651 lines)  AgentRuntime identity and lifecycle
├── queue_processor.py            (1286 lines) LLM turn execution and tool queue
├── tool_executor.py              (1121 lines) Unified tool dispatch engine
├── tool_definition.py            (173 lines)  ToolDefinition/ToolParameter dataclasses
├── tool_registry.py              (144 lines)  Singleton tool registry
├── tool_definitions/             (10 files)   Individual tool metadata modules
│   ├── file_ops.py               (542 lines)  read/edit/create/delete/move/copy/append
│   ├── hub.py                    (1039 lines) hub_msg, hub_spawn, hub_capture, etc.
│   ├── terminal.py               (188 lines)  terminal/shell execution
│   ├── scratchpad.py             (109 lines)  scratchpad get/append/clear
│   ├── task.py                   (148 lines)  task_checkpoint/complete/approve/reject
│   ├── context.py                (125 lines)  context_query, curate, evict
│   ├── git.py                    (52 lines)   git operations
│   └── wait.py                   (66 lines)   wait_for_user
├── tool_generators/              (3 files)    Schema/doc generators
│   ├── markdown.py               (154 lines)  ToolDefinition -> markdown docs
│   ├── xml_regex.py              (71 lines)   ToolDefinition -> XML regex pattern
│   └── native_json.py            (70 lines)   ToolDefinition -> JSON schema
├── file_operations_executor.py   (1471 lines) Safe file ops with backups/validation
├── shell_executor.py             (365 lines)  Async shell command execution
├── shell_command_service.py      (409 lines)  !-prefix shell command service
├── shell_utils.py                (230 lines)  Shell alias detection for prompt context
├── mcp_integration.py            (1221 lines) MCP JSON-RPC protocol over stdio
├── mcp_manager.py                (381 lines)  MCP config file management (no UI deps)
├── mentiko_adapter.py            (203 lines)  Mentiko platform spawn adapter
├── native_tools_handler.py       (214 lines)  Native API function-calling routing
├── process_manager.py            (782 lines)  Agent subprocess lifecycle management
├── background_task_manager.py    (507 lines)  Background task tracking with circuit breaker
├── permissions/                  (3 files)    Permission system
│   ├── manager.py                (734 lines)  Central permission manager
│   ├── risk_assessor.py          (109 lines)  Tool risk level assessment
│   └── response_handler.py       (106 lines)  User confirmation response handling

Architecture

Tool System

The tool system has four layers:

  1. ToolDefinition (tool_definition.py) — Single source of truth for one tool's metadata: name, description, parameters, XML form, risk level, examples, safety features, key rules. Everything else is generated from this.

  2. ToolRegistry (tool_registry.py) — Singleton holding all ToolDefinitions. Modules in tool_definitions/ auto-register on import via register_all(). Supports lookup by name, native_name (underscore form), or XML tag.

  3. ToolGenerators (tool_generators/) — Generate three artifacts from each ToolDefinition:

    • markdown.py — Markdown docs injected into agent system prompts
    • xml_regex.py — Regex pattern for parsing XML tool tags in LLM responses
    • native_json.py — JSON schema for native API function calling
  4. ToolExecutor (tool_executor.py) — Unified dispatch engine. Routes tool calls to the correct handler:

    • Terminal commands -> ShellExecutor
    • File operations -> FileOperationsExecutor
    • MCP tool calls -> MCPIntegration
    • Plugin handlers -> registered via register_plugin_handler()
    • Returns ToolExecutionResult (success/error, output, execution time, metadata)

Agent Loading

AgentManager (agent_manager.py) discovers and loads agents from three locations in priority order:

  1. Local project: .kollab/agents/
  2. Global user: ~/.kollab/agents/
  3. Bundled defaults: bundles/agents/

Each agent is a directory containing:

  • system_prompt.md (required)
  • agent.json (optional metadata: description, profile, tools, skills)
  • Skill .md files (optional)
  • sections/ directory (optional prompt fragments)

Skills follow the Agent Skills standard (agentskills.io): directories with SKILL.md containing YAML frontmatter + markdown instructions. Assigned to agents via the skills field in agent.json. Progressive disclosure: metadata at startup, full content on activation.

Agent Runtime

AgentRuntime (runtime.py) is the canonical runtime representation of an agent. It merges static definition (from disk) with live state (process, hub, vault). Lifecycle states: BOOTING -> READY -> WORKING -> THINKING -> BLOCKED -> DREAMING -> SUSPENDED -> DYING -> DEAD.

Queue Processing

QueueProcessor (queue_processor.py) manages the LLM turn loop:

  • Message queue with overflow strategies (drop_newest, drop_oldest, block)
  • Batch message processing
  • Conversation continuation (agentic multi-turn)
  • Deduped LLM turn execution
  • Tool result ingestion into context ledger for large outputs (>=8KB)

File Operations

FileOperationsExecutor (file_operations_executor.py) provides 14 safe file operations with:

  • Automatic .bak backups before destructive operations
  • Protected path checking (kollabor/, main.py, .git/, venv/)
  • Path traversal prevention
  • Binary file detection and rejection
  • Optional Python syntax validation with automatic rollback on errors
  • File size limits (10MB edit, 5MB create)
  • Three path access modes: PROJECT_ONLY, KOLLAB_CONFIG, ANYWHERE

Operations: read, edit, create, create_overwrite, delete, move, copy, copy_overwrite, append, insert_after, insert_before, mkdir, rmdir, grep.

Shell Execution

Three shell-related modules:

  • ShellExecutor — Low-level async subprocess execution with cancellation support, timeout handling, and per-instance state isolation.
  • ShellCommandService — High-level !-prefix command service for user input. Validates against dangerous patterns, blocks interactive commands (vim, ssh), handles cd warnings, strips ANSI, emits pre/post/cancel/error events.
  • ShellUtils — Detects user shell aliases (fd, rg, eza, etc.) and formats syntax hints for injection into AI system prompts.

MCP Integration

Two modules handle Model Context Protocol:

  • MCPIntegration (mcp_integration.py) — Live connection management. Implements MCP JSON-RPC 2.0 over stdio: server initialization handshake, tools/list for discovery, tools/call for execution. Manages MCPServerConnection instances (subprocess with piped stdio).

  • MCPManager (mcp_manager.py) — Pure business logic for config file management. Load/save mcp_settings.json, enable/disable servers, configure API keys, list servers/tools with status. No UI dependencies.

Permission System

Three modules under permissions/:

  • PermissionManager — Central permission manager. Integrates with event bus to intercept tool execution. Manages approval modes (auto_approve, ask, etc.), session-scoped and project-scoped approvals, pending confirmations via asyncio.Event, and approval statistics.

  • RiskAssessor — Evaluates tool risk levels based on configurable rules: blocked tools, trusted tools, high/medium risk command patterns (regex), and per-tool-type default risk levels.

  • ResponseHandler — Processes user responses to permission dialogs: deny, approve-once, approve-session, approve-project, approve-always, approve-tool-always. Records approvals in the appropriate scope.

Process Management

ProcessManager (process_manager.py) manages agent subprocess lifecycles:

  • Strategy pattern for spawn backends (default: SubprocessStrategy)
  • Circuit breaker for crash-loop prevention (3 failures in 120s opens circuit)
  • Ring buffer for stdout capture (thread-safe, default 2000 lines)
  • Resource tracking (RSS, uptime, restart count)
  • Pluggable strategies: SpawnStrategy ABC with spawn/kill/is_alive/stdio methods. Future: DockerStrategy, SSHStrategy.

Background Tasks

BackgroundTaskManager (background_task_manager.py) manages async background tasks:

  • Overflow strategies: drop_newest, drop_oldest, block (with configurable timeout)
  • Circuit breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED state machine
  • Configurable retry logic (attempts, delay)
  • Task monitoring and periodic cleanup
  • Per-task metrics (optional)

Other Modules

  • MentikoAdapter (mentiko_adapter.py) — Adapter for the Mentiko platform. Spawns kollabor agents via subprocess with --detached flag. Provides both Python API (spawn_agent()) and CLI entrypoint.

  • NativeToolsHandler (native_tools_handler.py) — Bridges native API function calling (OpenAI/Anthropic style) with the tool executor. Handles malformed tool names from LLM confusion, routes file/terminal/plugin/MCP tools to correct handlers.

Usage

from kollabor_agent import MCPIntegration, ToolExecutor
from kollabor_events import EventBus

bus = EventBus()
mcp = MCPIntegration(event_bus=bus)
executor = ToolExecutor(mcp_integration=mcp, event_bus=bus)

result = await executor.execute_tool({
    "id": "tool_1",
    "type": "terminal",
    "command": "pwd",
})

print(result.success, result.output)

Tool Definition Registration

Tool definitions live in tool_definitions/ as Python modules. Each module defines ToolDefinition instances and a register_all() function that auto-registers on import. The registry loads all modules at startup via ToolRegistry.get_global().

To add a new tool:

  1. Create a new module in tool_definitions/
  2. Define a ToolDefinition with name, description, parameters, XML form, examples, and metadata
  3. Add a register_all() function that calls registry.register(tool_def)
  4. Import the module in tool_registry.py _load_definitions()

The three generators (markdown, XML regex, native JSON) automatically pick up new tools. Agent system prompts include only tools listed in the agent's agent.json tools array.

Known Gaps

  • Tool execution depends on caller-provided context for workspace/cwd behavior; service callers must wire project boundaries explicitly until the runtime has a stronger workspace object.
  • MCP session connect behavior is mostly implemented through internal helpers; a smaller public connection API would reduce route-level coupling.
  • Permission scope, bundle scope, and tool registry behavior are powerful but spread across several modules; more contract tests would make changes safer.
  • Some diagnostics and legacy compatibility paths still live inside the runtime and should be made quieter or moved behind debug flags.
  • Tool definitions are hardcoded in Python modules — no user-overridable path from ~/.kollab exists yet.

Roadmap

Phase 1: Execution boundaries

  • Add a first-class workspace/project execution context used by shell and file operations.
  • Expose a public MCP connect/disconnect/list-tools API for service callers.
  • Tighten cancellation behavior across shell, MCP, background, and plugin tools.

Phase 2: Tool contract stabilization

  • Keep the unified tool registry as the canonical source for schemas, permissions, bundle scope, and prompt rendering.
  • Add regression tests for native JSON, XML, and markdown tool generation.
  • Document exact tool result metadata expected by context-service and display layers.

Phase 3: Agent runtime maturity

  • Clarify which runtime pieces are reusable library APIs versus CLI orchestration internals.
  • Harden process cleanup, circuit-breaker behavior, and background task reporting.
  • Expand agent/skill loading tests across project, user, and bundled sources.

Development

Targeted validation examples:

python -m py_compile packages/kollabor-agent/src/kollabor_agent/*.py
python -m pytest tests/unit/mcp tests/unit/test_auto_grant_mcp_tools.py -q

Dependencies

  • kollabor-events
  • kollabor-config
  • kollabor-ai
  • pyyaml

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kollabor_agent-1.0.0.tar.gz (114.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

kollabor_agent-1.0.0-py3-none-any.whl (131.6 kB view details)

Uploaded Python 3

File details

Details for the file kollabor_agent-1.0.0.tar.gz.

File metadata

  • Download URL: kollabor_agent-1.0.0.tar.gz
  • Upload date:
  • Size: 114.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for kollabor_agent-1.0.0.tar.gz
Algorithm Hash digest
SHA256 74e5c5c407ae2d559799b0e72812e29ee731485d4d6dd9d6e9d4f733d41cb2b8
MD5 80495f95508789df681d208cdc54005a
BLAKE2b-256 070eaf897c66d779eb7e30c1d72aba8b9ab3e90d9a980fbc2d62d6b637a07d24

See more details on using hashes here.

File details

Details for the file kollabor_agent-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: kollabor_agent-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 131.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for kollabor_agent-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 ed65b1c88567e019f756109cf5cc3584f924ea2d4460451dc03b45ac251de368
MD5 38cc074f99efe92ffcf822c92b27073d
BLAKE2b-256 a69bdee52e8dc2af981759672376dc05744417321ceb7f5a076b1f64592fbd2a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page