Collection of Harbor agents for AI evaluation

Harbor Agents

A collection of custom agents for Harbor, the AI agent evaluation framework.

Installation

pip install harbor-agents

Available Agents

ClaudeCodeWithSkills

Extends Claude Code with support for custom skills. Pre-configured skills are loaded into the container for evaluations.

# Load all skills from a directory
harbor run -p ./my-task \
    --agent-import-path harbor_agent.skilled_claude:ClaudeCodeWithSkills \
    -m anthropic/claude-sonnet-4-20250514 \
    --ak skill_dir=./skills

# Load specific skills only
harbor run -p ./my-task \
    --agent-import-path harbor_agent.skilled_claude:ClaudeCodeWithSkills \
    -m anthropic/claude-sonnet-4-20250514 \
    --ak skill_dir=./skills \
    --ak skills=my-skill,another-skill

# Baseline (no skills)
harbor run -p ./my-task \
    --agent-import-path harbor_agent.skilled_claude:ClaudeCodeWithSkills \
    -m anthropic/claude-sonnet-4-20250514 \
    --ak skill_dir=./skills \
    --ak 'skills='

Options:

Option     Description
---------  -----------
skill_dir  Path to the directory containing skill folders
skills     Filter: omit to load all skills, pass a comma-separated list (skill-a,skill-b) to load specific skills, or pass an empty string to load none

Skill Directory Structure:

skills/
├── my-skill/
│   ├── SKILL.md          # Required
│   └── references/       # Optional
└── another-skill/
    └── SKILL.md
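
The filter semantics of the skills option and the directory layout above can be sketched in a few lines of Python. This is only an illustration, not the package's implementation: select_skills is a hypothetical helper, and raising an error for an unknown skill name is an assumption about how a mismatch might be handled.

```python
from pathlib import Path
from typing import Optional


def select_skills(skill_dir: Path, skills: Optional[str] = None) -> list[str]:
    """Return the names of the skill folders to load.

    Hypothetical helper for illustration; not part of harbor-agents.
    """
    # A folder counts as a skill only if it contains the required SKILL.md.
    available = sorted(
        entry.name
        for entry in skill_dir.iterdir()
        if entry.is_dir() and (entry / "SKILL.md").is_file()
    )
    if skills is None:
        # Option omitted: load every available skill.
        return available

    # An empty string produces an empty request list (the baseline run).
    requested = [name.strip() for name in skills.split(",") if name.strip()]

    # Assumption: an unknown skill name is treated as an error.
    missing = [name for name in requested if name not in available]
    if missing:
        raise ValueError(f"Unknown skills: {missing}")
    return requested
```

With the layout above, select_skills(Path("./skills")) would return both skills, select_skills(Path("./skills"), "my-skill") only the named one, and select_skills(Path("./skills"), "") an empty list.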

MultiTurnAgent

A composite agent for testing multi-turn conversations. Combines a simulated user (which generates prompts) with an inner agent (which processes them).

harbor run -p ./my-task \
    --agent-import-path harbor_agent.multi_turn:MultiTurnAgent \
    -m anthropic/claude-sonnet-4-20250514 \
    --ak simulated_user=my_module:MySimulatedUser \
    --ak agent=harbor.agents.installed.claude_code:ClaudeCode \
    --ak 'agent_kwargs={"model_name": "anthropic/claude-sonnet-4-20250514"}' \
    --ak max_turns=10

Options:

Option                 Description
---------------------  -----------
simulated_user         Import path to SimulatedUser subclass (module.path:ClassName)
agent                  Import path to inner agent (module.path:ClassName)
simulated_user_kwargs  JSON string or dict of kwargs for simulated user
agent_kwargs           JSON string or dict of kwargs for inner agent
max_turns              Maximum conversation turns (default: 50)
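
To make the two value formats in the options above concrete, here is a minimal sketch of how a module.path:ClassName import path and a "JSON string or dict" kwargs value could be resolved. The helpers load_class and parse_kwargs are hypothetical names introduced for illustration; the package may resolve these options differently.

```python
import importlib
import json
from typing import Any, Union


def load_class(import_path: str) -> type:
    """Resolve a 'module.path:ClassName' string to a class (hypothetical helper)."""
    module_name, sep, class_name = import_path.partition(":")
    if not sep or not module_name or not class_name:
        raise ValueError(f"Expected 'module.path:ClassName', got {import_path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def parse_kwargs(value: Union[str, dict[str, Any], None]) -> dict[str, Any]:
    """Accept kwargs as a JSON string or a dict (hypothetical helper)."""
    if value is None or value == "":
        return {}
    if isinstance(value, dict):
        # Copy so the caller's dict is never mutated.
        return dict(value)
    parsed = json.loads(value)
    if not isinstance(parsed, dict):
        raise ValueError("kwargs must decode to a JSON object")
    return parsed
```

Under these assumptions, parse_kwargs('{"model_name": "anthropic/claude-sonnet-4-20250514"}') yields a dict that could be passed to the class returned by load_class as keyword arguments.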

Creating a Simulated User:

from harbor_agent.multi_turn import SimulatedUser, SimulatedUserDone, ConversationMessage

class MySimulatedUser(SimulatedUser):
    """Sends three scripted prompts, then ends the conversation."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self._turn = 0  # Number of prompts requested so far

    async def next_message(self, conversation: list[ConversationMessage]) -> str:
        self._turn += 1
        if self._turn > 3:
            # Raising SimulatedUserDone stops the conversation
            raise SimulatedUserDone("Task complete")
        return f"Please do step {self._turn}"

Conversation Flow:

  1. next_message() is called with the conversation history
  2. It returns a prompt string, which is sent to the inner agent
  3. The inner agent responds, and the response is added to the history
  4. Steps 1-3 repeat until SimulatedUserDone is raised or max_turns is reached
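
The flow above can be sketched as a small self-contained loop. Everything here is a stand-in so the example runs without Harbor installed: SimulatedUserDone and Message mimic the package's SimulatedUserDone and ConversationMessage (the real fields may differ), echo_agent is a hypothetical inner agent, and recording the user prompt in the history alongside the agent's reply is an assumption.

```python
import asyncio
from dataclasses import dataclass


class SimulatedUserDone(Exception):
    """Stand-in for harbor_agent.multi_turn.SimulatedUserDone."""


@dataclass
class Message:
    """Stand-in for ConversationMessage; the real fields may differ."""
    role: str
    content: str


class ThreeStepUser:
    """Same logic as MySimulatedUser, without the Harbor base class."""

    def __init__(self) -> None:
        self._turn = 0

    async def next_message(self, conversation: list[Message]) -> str:
        self._turn += 1
        if self._turn > 3:
            raise SimulatedUserDone("Task complete")
        return f"Please do step {self._turn}"


async def echo_agent(prompt: str) -> str:
    """Hypothetical inner agent that just acknowledges the prompt."""
    return f"Done: {prompt}"


async def run_conversation(user, agent, max_turns: int = 50) -> list[Message]:
    history: list[Message] = []
    for _ in range(max_turns):
        try:
            # Step 1: ask the simulated user for the next prompt.
            prompt = await user.next_message(history)
        except SimulatedUserDone:
            # Step 4: the simulated user ended the conversation.
            break
        # Assumption: the prompt itself is also recorded in the history.
        history.append(Message("user", prompt))
        # Steps 2-3: send the prompt to the inner agent and record the reply.
        reply = await agent(prompt)
        history.append(Message("assistant", reply))
    return history


if __name__ == "__main__":
    for message in asyncio.run(run_conversation(ThreeStepUser(), echo_agent)):
        print(f"{message.role}: {message.content}")
```

In this sketch the loop stops after three exchanges because the simulated user raises SimulatedUserDone on its fourth call; a lower max_turns would cut the conversation short instead.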

Output: Saves trajectory.json in ATIF format with all conversation turns.

Development

uv sync --all-extras    # Install
uv run pytest -v        # Test
uv run ruff check src   # Lint
uv run mypy src         # Type check

License

Apache 2.0
