Agent execution toolkit - tool execution, MCP integration, file operations, and shell commands

kollabor-agent

kollabor-agent is the tool and agent execution runtime for Kollabor.

It owns tool dispatch, MCP server connections, file operations, shell execution, agent/skill loading, background tasks, process management, and permission-aware execution. The CLI and engine both build on this package when a model wants to act on the local environment.

Current Role

  • Execute built-in tools such as terminal commands, file operations, and MCP tool calls.
  • Load agent definitions and skills from project, user, and bundled locations.
  • Manage MCP stdio servers and expose their tools to model providers.
  • Provide permission risk assessment and approval flow primitives.
  • Run background processes/tasks and provide process lifecycle helpers.
  • Bridge native/XML tool definitions into the unified tool registry.

Package Structure (~25K lines)

kollabor_agent/
├── agent_manager.py              (1744 lines) Agent/skill discovery and loading
├── runtime.py                    (651 lines)  AgentRuntime identity and lifecycle
├── queue_processor.py            (1286 lines) LLM turn execution and tool queue
├── tool_executor.py              (1121 lines) Unified tool dispatch engine
├── tool_definition.py            (173 lines)  ToolDefinition/ToolParameter dataclasses
├── tool_registry.py              (144 lines)  Singleton tool registry
├── tool_definitions/             (10 files)   Individual tool metadata modules
│   ├── file_ops.py               (542 lines)  read/edit/create/delete/move/copy/append
│   ├── hub.py                    (1039 lines) hub_msg, hub_spawn, hub_capture, etc.
│   ├── terminal.py               (188 lines)  terminal/shell execution
│   ├── scratchpad.py             (109 lines)  scratchpad get/append/clear
│   ├── task.py                   (148 lines)  task_checkpoint/complete/approve/reject
│   ├── context.py                (125 lines)  context_query, curate, evict
│   ├── git.py                    (52 lines)   git operations
│   └── wait.py                   (66 lines)   wait_for_user
├── tool_generators/              (3 files)    Schema/doc generators
│   ├── markdown.py               (154 lines)  ToolDefinition -> markdown docs
│   ├── xml_regex.py              (71 lines)   ToolDefinition -> XML regex pattern
│   └── native_json.py            (70 lines)   ToolDefinition -> JSON schema
├── file_operations_executor.py   (1471 lines) Safe file ops with backups/validation
├── shell_executor.py             (365 lines)  Async shell command execution
├── shell_command_service.py      (409 lines)  !-prefix shell command service
├── shell_utils.py                (230 lines)  Shell alias detection for prompt context
├── mcp_integration.py            (1221 lines) MCP JSON-RPC protocol over stdio
├── mcp_manager.py                (381 lines)  MCP config file management (no UI deps)
├── mentiko_adapter.py            (203 lines)  Mentiko platform spawn adapter
├── native_tools_handler.py       (214 lines)  Native API function-calling routing
├── process_manager.py            (782 lines)  Agent subprocess lifecycle management
├── background_task_manager.py    (507 lines)  Background task tracking with circuit breaker
├── permissions/                  (3 files)    Permission system
│   ├── manager.py                (734 lines)  Central permission manager
│   ├── risk_assessor.py          (109 lines)  Tool risk level assessment
│   └── response_handler.py       (106 lines)  User confirmation response handling

Architecture

Tool System

The tool system has four layers:

  1. ToolDefinition (tool_definition.py) — Single source of truth for one tool's metadata: name, description, parameters, XML form, risk level, examples, safety features, key rules. Everything else is generated from this.

  2. ToolRegistry (tool_registry.py) — Singleton holding all ToolDefinitions. Modules in tool_definitions/ auto-register on import via register_all(). Supports lookup by name, native_name (underscore form), or XML tag.

  3. ToolGenerators (tool_generators/) — Generate three artifacts from each ToolDefinition:

    • markdown.py — Markdown docs injected into agent system prompts
    • xml_regex.py — Regex pattern for parsing XML tool tags in LLM responses
    • native_json.py — JSON schema for native API function calling
  4. ToolExecutor (tool_executor.py) — Unified dispatch engine. Routes tool calls to the correct handler:

    • Terminal commands -> ShellExecutor
    • File operations -> FileOperationsExecutor
    • MCP tool calls -> MCPIntegration
    • Plugin handlers -> registered via register_plugin_handler()
    • Every route returns a ToolExecutionResult (success/error, output, execution time, metadata)
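
To make the "everything is generated from the definition" idea concrete, here is a minimal sketch of one definition driving both a native JSON schema and an XML regex. The ToolParam and ToolDef dataclasses and the two generator functions are hypothetical stand-ins written for this example; they are not the package's actual ToolDefinition API or generator output format.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ToolParam:  # hypothetical stand-in for ToolParameter
    name: str
    type: str
    description: str
    required: bool = True


@dataclass
class ToolDef:  # hypothetical stand-in for ToolDefinition
    name: str
    description: str
    xml_tag: str
    params: list[ToolParam] = field(default_factory=list)


def to_native_json(tool: ToolDef) -> dict:
    """Build a JSON-schema style function description from the definition."""
    return {
        "name": tool.name.replace("-", "_"),
        "description": tool.description,
        "parameters": {
            "type": "object",
            "properties": {
                p.name: {"type": p.type, "description": p.description}
                for p in tool.params
            },
            "required": [p.name for p in tool.params if p.required],
        },
    }


def to_xml_regex(tool: ToolDef) -> re.Pattern:
    """Build a regex that captures the body of the tool's XML tag."""
    tag = re.escape(tool.xml_tag)
    return re.compile(rf"<{tag}>(.*?)</{tag}>", re.DOTALL)


terminal = ToolDef(
    name="terminal",
    description="Run a shell command",
    xml_tag="terminal",
    params=[ToolParam("command", "string", "Command to run")],
)

schema = to_native_json(terminal)
match = to_xml_regex(terminal).search("Checking: <terminal>pwd</terminal>")
print(schema["parameters"]["required"], match.group(1))
```

Because both artifacts derive from the same object, renaming a parameter in the definition changes the schema and the parser together.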

Agent Loading

AgentManager (agent_manager.py) discovers and loads agents from three locations in priority order:

  1. Local project: .kollab/agents/
  2. Global user: ~/.kollab/agents/
  3. Bundled defaults: bundles/agents/

Each agent is a directory containing:

  • system_prompt.md (required)
  • agent.json (optional metadata: description, profile, tools, skills)
  • Skill .md files (optional)
  • sections/ directory (optional prompt fragments)

Skills follow the Agent Skills standard (agentskills.io): each skill is a directory with a SKILL.md file containing YAML frontmatter plus markdown instructions. Skills are assigned to agents via the skills field in agent.json and use progressive disclosure: only metadata is loaded at startup, and the full content is loaded on activation.
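
The priority-ordered lookup described above can be sketched as follows. This is an illustration only: discover_agents is a hypothetical helper, temporary directories stand in for the three real locations, and it assumes that an agent found in an earlier location shadows a same-named agent in a later one.

```python
import tempfile
from pathlib import Path


def discover_agents(search_dirs: list[Path]) -> dict[str, Path]:
    """Map agent name -> directory; earlier search dirs take priority."""
    found: dict[str, Path] = {}
    for base in search_dirs:
        if not base.is_dir():
            continue
        for agent_dir in sorted(base.iterdir()):
            # An agent is a directory containing the required system_prompt.md.
            if not (agent_dir / "system_prompt.md").is_file():
                continue
            # setdefault keeps the first (highest-priority) hit for each name.
            found.setdefault(agent_dir.name, agent_dir)
    return found


# Demo: temporary directories stand in for project, user, and bundled locations.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    project, user, bundled = root / "project", root / "user", root / "bundled"
    layout = [(project, ["coder"]), (user, ["coder", "writer"]), (bundled, ["default"])]
    for base, names in layout:
        for name in names:
            (base / name).mkdir(parents=True)
            (base / name / "system_prompt.md").write_text("You are an agent.")
    agents = discover_agents([project, user, bundled])
    resolved = {name: path.parent.name for name, path in agents.items()}

print(resolved)
```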

Agent Runtime

AgentRuntime (runtime.py) is the canonical runtime representation of an agent. It merges static definition (from disk) with live state (process, hub, vault). Lifecycle states: BOOTING -> READY -> WORKING -> THINKING -> BLOCKED -> DREAMING -> SUSPENDED -> DYING -> DEAD.

Queue Processing

QueueProcessor (queue_processor.py) manages the LLM turn loop:

  • Message queue with overflow strategies (drop_newest, drop_oldest, block)
  • Batch message processing
  • Conversation continuation (agentic multi-turn)
  • Deduped LLM turn execution
  • Tool result ingestion into context ledger for large outputs (>=8KB)
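
The three overflow strategies differ only in what happens when a message arrives at a full queue. The sketch below shows that difference with a hypothetical BoundedQueue class; it is not the QueueProcessor implementation, and the names and signatures are assumptions made for the example.

```python
import asyncio
from collections import deque


class BoundedQueue:
    """Hypothetical bounded queue with drop_newest / drop_oldest / block."""

    def __init__(self, maxsize: int, overflow: str = "drop_oldest") -> None:
        if overflow not in {"drop_newest", "drop_oldest", "block"}:
            raise ValueError(f"unknown overflow strategy: {overflow}")
        self._items: deque = deque()
        self._maxsize = maxsize
        self._overflow = overflow
        self._not_full = asyncio.Condition()

    async def put(self, item) -> bool:
        """Return True if the item was enqueued, False if it was dropped."""
        async with self._not_full:
            if len(self._items) >= self._maxsize:
                if self._overflow == "drop_newest":
                    return False  # reject the incoming item
                if self._overflow == "drop_oldest":
                    self._items.popleft()  # evict the head to make room
                else:
                    # block: wait until a consumer frees a slot
                    await self._not_full.wait_for(
                        lambda: len(self._items) < self._maxsize
                    )
            self._items.append(item)
            return True

    async def get(self):
        """Remove and return the oldest item (raises IndexError if empty)."""
        async with self._not_full:
            item = self._items.popleft()
            self._not_full.notify()
            return item

    def snapshot(self) -> list:
        return list(self._items)


async def demo() -> tuple[list, list]:
    oldest = BoundedQueue(2, "drop_oldest")
    newest = BoundedQueue(2, "drop_newest")
    for msg in ("a", "b", "c"):
        await oldest.put(msg)
        await newest.put(msg)
    return oldest.snapshot(), newest.snapshot()


kept_oldest, kept_newest = asyncio.run(demo())
print(kept_oldest, kept_newest)
```

With drop_oldest the queue always holds the most recent messages; with drop_newest it preserves the earliest ones and signals the drop to the caller.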

File Operations

FileOperationsExecutor (file_operations_executor.py) provides 14 safe file operations with:

  • Automatic .bak backups before destructive operations
  • Protected path checking (kollabor/, main.py, .git/, venv/)
  • Path traversal prevention
  • Binary file detection and rejection
  • Optional Python syntax validation with automatic rollback on errors
  • File size limits (10MB edit, 5MB create)
  • Three path access modes: PROJECT_ONLY, KOLLAB_CONFIG, ANYWHERE

Operations: read, edit, create, create_overwrite, delete, move, copy, copy_overwrite, append, insert_after, insert_before, mkdir, rmdir, grep.
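
Two of the safeguards above, path traversal prevention and .bak backups, can be illustrated in a few lines. This is a sketch under assumptions: resolve_in_project and edit_with_backup are hypothetical helpers, the PROTECTED tuple is an illustrative subset, and the real executor's checks and backup naming may differ.

```python
import shutil
import tempfile
from pathlib import Path

PROTECTED = (".git", "venv")  # illustrative subset of the protected paths


def resolve_in_project(project_root: Path, user_path: str) -> Path:
    """Resolve user_path and reject anything outside project_root."""
    root = project_root.resolve()
    target = (root / user_path).resolve()
    if not target.is_relative_to(root):  # requires Python 3.9+
        raise PermissionError(f"path escapes project: {user_path}")
    if any(part in PROTECTED for part in target.relative_to(root).parts):
        raise PermissionError(f"protected path: {user_path}")
    return target


def edit_with_backup(target: Path, new_text: str) -> Path:
    """Copy target to <name>.bak, then overwrite it with new_text."""
    backup = target.with_name(target.name + ".bak")
    shutil.copy2(target, backup)
    target.write_text(new_text)
    return backup


def is_blocked(root: Path, user_path: str) -> bool:
    try:
        resolve_in_project(root, user_path)
    except PermissionError:
        return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.txt").write_text("old")
    target = resolve_in_project(root, "notes.txt")
    backup = edit_with_backup(target, "new")
    backup_text, current_text = backup.read_text(), target.read_text()
    traversal_blocked = is_blocked(root, "../outside.txt")
    git_blocked = is_blocked(root, ".git/config")

print(backup_text, current_text, traversal_blocked, git_blocked)
```

Resolving the path before comparing it to the root is what defeats `..` segments and symlinks that point outside the project.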

Shell Execution

Three shell-related modules:

  • ShellExecutor — Low-level async subprocess execution with cancellation support, timeout handling, and per-instance state isolation.
  • ShellCommandService — High-level !-prefix command service for user input. Validates against dangerous patterns, blocks interactive commands (vim, ssh), handles cd warnings, strips ANSI, emits pre/post/cancel/error events.
  • ShellUtils — Detects user shell aliases (fd, rg, eza, etc.) and formats syntax hints for injection into AI system prompts.
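
For readers unfamiliar with async subprocess handling, the following sketch shows the general pattern of running a command with a timeout and killing it when the deadline passes. It uses only the standard asyncio API; run_command is a hypothetical function and does not reflect ShellExecutor's actual interface or its cancellation hooks.

```python
import asyncio


async def run_command(command: str, timeout: float = 5.0) -> tuple[int, str]:
    """Run a shell command; kill it and return code -1 on timeout."""
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,  # merge stderr into stdout
    )
    try:
        out, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()
        await proc.wait()  # reap the process so it does not linger
        return -1, ""
    return proc.returncode, out.decode(errors="replace")


code, output = asyncio.run(run_command("echo hello"))
print(code, output.strip())
```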

MCP Integration

Two modules handle Model Context Protocol:

  • MCPIntegration (mcp_integration.py) — Live connection management. Implements MCP JSON-RPC 2.0 over stdio: server initialization handshake, tools/list for discovery, tools/call for execution. Manages MCPServerConnection instances (subprocess with piped stdio).

  • MCPManager (mcp_manager.py) — Pure business logic for config file management. Load/save mcp_settings.json, enable/disable servers, configure API keys, list servers/tools with status. No UI dependencies.
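
The wire format behind tools/list and tools/call is plain JSON-RPC 2.0, written as one JSON message per line on the server's stdin. The sketch below builds and parses such messages without starting a server. The rpc_request and parse_response helpers, the "echo" tool, and the canned reply are all hypothetical; a real session also begins with the initialization handshake, which is omitted here.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # monotonically increasing request ids


def rpc_request(method: str, params: Optional[dict] = None) -> str:
    """Encode one JSON-RPC 2.0 request as a newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message) + "\n"


def parse_response(line: str) -> dict:
    """Return the result of a JSON-RPC response, or raise on an error reply."""
    message = json.loads(line)
    if "error" in message:
        raise RuntimeError(message["error"].get("message", "unknown error"))
    return message["result"]


list_req = rpc_request("tools/list")
call_req = rpc_request(
    "tools/call", {"name": "echo", "arguments": {"text": "hi"}}
)

# A canned reply standing in for what a server might write to stdout.
reply = '{"jsonrpc": "2.0", "id": 1, "result": {"tools": [{"name": "echo"}]}}'
tool_names = [tool["name"] for tool in parse_response(reply)["tools"]]
print(list_req.strip(), tool_names)
```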

Permission System

Three modules under permissions/:

  • PermissionManager — Central permission manager. Integrates with event bus to intercept tool execution. Manages approval modes (auto_approve, ask, etc.), session-scoped and project-scoped approvals, pending confirmations via asyncio.Event, and approval statistics.

  • RiskAssessor — Evaluates tool risk levels based on configurable rules: blocked tools, trusted tools, high/medium risk command patterns (regex), and per-tool-type default risk levels.

  • ResponseHandler — Processes user responses to permission dialogs: deny, approve-once, approve-session, approve-project, approve-always, approve-tool-always. Records approvals in the appropriate scope.
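
The rule order described for RiskAssessor (blocked tools, trusted tools, command patterns, then per-type defaults) can be sketched like this. The Risk enum, the regex patterns, the default table, and the assess function are illustrative assumptions; the real rules come from configuration.

```python
import re
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    BLOCKED = "blocked"


# Illustrative rules only; none of these values are taken from the package.
HIGH_RISK_PATTERNS = [r"\brm\s+-rf\b", r"\bsudo\b"]
MEDIUM_RISK_PATTERNS = [r"\bgit\s+push\b", r"\bpip\s+install\b"]
DEFAULT_BY_TYPE = {"file_delete": Risk.HIGH, "terminal": Risk.LOW}


def assess(
    tool_type: str,
    command: str = "",
    blocked: frozenset = frozenset(),
    trusted: frozenset = frozenset(),
) -> Risk:
    """Apply rules from most to least specific and return the first match."""
    if tool_type in blocked:
        return Risk.BLOCKED
    if tool_type in trusted:
        return Risk.LOW
    if any(re.search(p, command) for p in HIGH_RISK_PATTERNS):
        return Risk.HIGH
    if any(re.search(p, command) for p in MEDIUM_RISK_PATTERNS):
        return Risk.MEDIUM
    return DEFAULT_BY_TYPE.get(tool_type, Risk.LOW)


print(assess("terminal", "rm -rf build"))
print(assess("terminal", "git push origin main"))
print(assess("file_delete"))
```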

Process Management

ProcessManager (process_manager.py) manages agent subprocess lifecycles:

  • Strategy pattern for spawn backends (default: SubprocessStrategy)
  • Circuit breaker for crash-loop prevention (3 failures in 120s opens circuit)
  • Ring buffer for stdout capture (thread-safe, default 2000 lines)
  • Resource tracking (RSS, uptime, restart count)
  • Pluggable strategies: SpawnStrategy ABC with spawn/kill/is_alive/stdio methods. Future: DockerStrategy, SSHStrategy.
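
Two of the mechanisms listed above are small enough to sketch directly: a thread-safe ring buffer for captured output, and a sliding-window check for "3 failures in 120s". Both classes below are hypothetical illustrations that reuse the documented defaults; they are not ProcessManager's actual classes.

```python
import threading
import time
from collections import deque
from typing import Optional


class RingBuffer:
    """Thread-safe buffer that keeps only the most recent max_lines lines."""

    def __init__(self, max_lines: int = 2000) -> None:
        self._lines: deque = deque(maxlen=max_lines)  # old lines fall off
        self._lock = threading.Lock()

    def append(self, line: str) -> None:
        with self._lock:
            self._lines.append(line)

    def tail(self, n: int) -> list:
        with self._lock:
            return list(self._lines)[-n:] if n > 0 else []


class CrashLoopBreaker:
    """Trips once `threshold` failures fall inside `window` seconds."""

    def __init__(self, threshold: int = 3, window: float = 120.0) -> None:
        self._threshold = threshold
        self._window = window
        self._failures: deque = deque()

    def record_failure(self, now: Optional[float] = None) -> bool:
        """Record a failure; return True if the circuit should open."""
        now = time.monotonic() if now is None else now
        self._failures.append(now)
        # Discard failures that have aged out of the window.
        while self._failures and now - self._failures[0] > self._window:
            self._failures.popleft()
        return len(self._failures) >= self._threshold


buf = RingBuffer(max_lines=3)
for i in range(5):
    buf.append(f"line {i}")
last_two = buf.tail(2)

fast = CrashLoopBreaker()
fast_trips = [fast.record_failure(t) for t in (0.0, 50.0, 100.0)]
slow = CrashLoopBreaker()
slow_trips = [slow.record_failure(t) for t in (0.0, 100.0, 200.0)]
print(last_two, fast_trips, slow_trips)
```

Passing explicit timestamps in the demo keeps the example deterministic; in normal use the monotonic clock is read on each failure.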

Background Tasks

BackgroundTaskManager (background_task_manager.py) manages async background tasks:

  • Overflow strategies: drop_newest, drop_oldest, block (with configurable timeout)
  • Circuit breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED state machine
  • Configurable retry logic (attempts, delay)
  • Task monitoring and periodic cleanup
  • Per-task metrics (optional)
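
The CLOSED -> OPEN -> HALF_OPEN -> CLOSED cycle can be sketched as a small state machine. The CircuitBreaker class, its parameter names, and the injectable clock are assumptions made for this example, not BackgroundTaskManager's API.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitBreaker:
    """Hypothetical breaker: opens after repeated failures, probes after a delay."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.state = State.CLOSED
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = 0.0

    def allow(self) -> bool:
        """Return True if a task may run now."""
        expired = self._clock() - self._opened_at >= self._reset_timeout
        if self.state is State.OPEN and expired:
            self.state = State.HALF_OPEN  # let one probe task through
        return self.state is not State.OPEN

    def record_success(self) -> None:
        self._failures = 0
        self.state = State.CLOSED

    def record_failure(self) -> None:
        self._failures += 1
        # A failed probe reopens immediately; otherwise wait for the threshold.
        if self.state is State.HALF_OPEN or self._failures >= self._threshold:
            self.state = State.OPEN
            self._opened_at = self._clock()


now = [0.0]  # fake clock so the demo does not sleep
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])
seen = [breaker.state.name]
breaker.record_failure()
breaker.record_failure()
seen.append(breaker.state.name)
allowed_while_open = breaker.allow()
now[0] = 10.0
allowed_after_timeout = breaker.allow()
seen.append(breaker.state.name)
breaker.record_success()
seen.append(breaker.state.name)
print(seen)
```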

Other Modules

  • MentikoAdapter (mentiko_adapter.py) — Adapter for the Mentiko platform. Spawns kollabor agents via subprocess with --detached flag. Provides both Python API (spawn_agent()) and CLI entrypoint.

  • NativeToolsHandler (native_tools_handler.py) — Bridges native API function calling (OpenAI/Anthropic style) with the tool executor. Handles malformed tool names from LLM confusion, routes file/terminal/plugin/MCP tools to correct handlers.

Usage

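import asyncio

from kollabor_agent import MCPIntegration, ToolExecutor
from kollabor_events import EventBus


async def main() -> None:
    bus = EventBus()
    mcp = MCPIntegration(event_bus=bus)
    executor = ToolExecutor(mcp_integration=mcp, event_bus=bus)

    # Dispatch a terminal tool call; the executor routes it to ShellExecutor.
    result = await executor.execute_tool({
        "id": "tool_1",
        "type": "terminal",
        "command": "pwd",
    })
    print(result.success, result.output)


# execute_tool is a coroutine, so it must run inside an event loop.
asyncio.run(main())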

Tool Definition Registration

Tool definitions live in tool_definitions/ as Python modules. Each module defines ToolDefinition instances and a register_all() function that auto-registers on import. The registry loads all modules at startup via ToolRegistry.get_global().

To add a new tool:

  1. Create a new module in tool_definitions/
  2. Define a ToolDefinition with name, description, parameters, XML form, examples, and metadata
  3. Add a register_all() function that calls registry.register(tool_def)
  4. Import the new module inside _load_definitions() in tool_registry.py

The three generators (markdown, XML regex, native JSON) automatically pick up new tools. Agent system prompts include only tools listed in the agent's agent.json tools array.
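
As a rough illustration of steps 1 to 3, the sketch below shows what a new definition module and its register_all() function might look like. ToolDef and Registry are simplified stand-ins defined inline so the example runs on its own, the word-count tool is invented, and passing the registry as an argument is an assumption; the real ToolDefinition and ToolRegistry signatures may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolDef:  # simplified stand-in for ToolDefinition
    name: str
    description: str
    xml_tag: str

    @property
    def native_name(self) -> str:
        # Underscore form used for native function calling.
        return self.name.replace("-", "_")


class Registry:  # simplified stand-in for ToolRegistry
    _global: Optional["Registry"] = None

    def __init__(self) -> None:
        self._by_key: dict = {}

    @classmethod
    def get_global(cls) -> "Registry":
        """Return the process-wide singleton, creating it on first use."""
        if cls._global is None:
            cls._global = cls()
        return cls._global

    def register(self, tool: ToolDef) -> None:
        # Index by name, native name, and XML tag so any form resolves.
        for key in (tool.name, tool.native_name, tool.xml_tag):
            self._by_key[key] = tool

    def get(self, key: str) -> Optional[ToolDef]:
        return self._by_key.get(key)


# What a new module in tool_definitions/ might contain (invented tool).
WORD_COUNT = ToolDef(
    name="word-count",
    description="Count words in a file",
    xml_tag="wordcount",
)


def register_all(registry: Registry) -> None:
    registry.register(WORD_COUNT)


register_all(Registry.get_global())
print(Registry.get_global().get("word_count").description)
```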

Known Gaps

  • Tool execution depends on caller-provided context for workspace/cwd behavior; service callers must wire project boundaries explicitly until the runtime has a stronger workspace object.
  • MCP session connect behavior is mostly implemented through internal helpers; a smaller public connection API would reduce route-level coupling.
  • Permission scope, bundle scope, and tool registry behavior are powerful but spread across several modules; more contract tests would make changes safer.
  • Some diagnostics and legacy compatibility paths still live inside the runtime and should be made quieter or moved behind debug flags.
  • Tool definitions are hardcoded in Python modules — no user-overridable path from ~/.kollab exists yet.

Roadmap

Phase 1: Execution boundaries

  • Add a first-class workspace/project execution context used by shell and file operations.
  • Expose a public MCP connect/disconnect/list-tools API for service callers.
  • Tighten cancellation behavior across shell, MCP, background, and plugin tools.

Phase 2: Tool contract stabilization

  • Keep the unified tool registry as the canonical source for schemas, permissions, bundle scope, and prompt rendering.
  • Add regression tests for native JSON, XML, and markdown tool generation.
  • Document exact tool result metadata expected by context-service and display layers.

Phase 3: Agent runtime maturity

  • Clarify which runtime pieces are reusable library APIs versus CLI orchestration internals.
  • Harden process cleanup, circuit-breaker behavior, and background task reporting.
  • Expand agent/skill loading tests across project, user, and bundled sources.

Development

Targeted validation examples:

python -m py_compile packages/kollabor-agent/src/kollabor_agent/*.py
python -m pytest tests/unit/mcp tests/unit/test_auto_grant_mcp_tools.py -q

Dependencies

  • kollabor-events
  • kollabor-config
  • kollabor-ai
  • pyyaml

License

MIT
