Collection of Harbor agents for AI evaluation
Project description
Harbor Agents
A collection of custom agents for Harbor - the AI agent evaluation framework.
Installation
pip install harbor-agents
Available Agents
ClaudeCodeWithSkills
Extends Claude Code with custom skills support. Load pre-configured skills into the container for evaluations.
# Load all skills from a directory
harbor run -p ./my-task \
--agent-import-path harbor_agent.skilled_claude:ClaudeCodeWithSkills \
-m anthropic/claude-sonnet-4-20250514 \
--ak skill_dir=./skills
# Load specific skills only
harbor run -p ./my-task \
--agent-import-path harbor_agent.skilled_claude:ClaudeCodeWithSkills \
-m anthropic/claude-sonnet-4-20250514 \
--ak skill_dir=./skills \
--ak skills=my-skill,another-skill
# Baseline (no skills)
harbor run -p ./my-task \
--agent-import-path harbor_agent.skilled_claude:ClaudeCodeWithSkills \
-m anthropic/claude-sonnet-4-20250514 \
--ak skill_dir=./skills \
--ak 'skills='
Options:
| Option | Description |
|---|---|
skill_dir |
Path to directory containing skill folders |
skills |
Filter: omit for all, skill-a,skill-b for specific, empty string for none |
Skill Directory Structure:
skills/
├── my-skill/
│ ├── SKILL.md # Required
│ └── references/ # Optional
└── another-skill/
└── SKILL.md
MultiTurnAgent
A composite agent for testing multi-turn conversations. Combines a simulated user (which generates prompts) with an inner agent (which processes them).
harbor run -p ./my-task \
--agent-import-path harbor_agent.multi_turn:MultiTurnAgent \
-m anthropic/claude-sonnet-4-20250514 \
--ak simulated_user=my_module:MySimulatedUser \
--ak agent=harbor.agents.installed.claude_code:ClaudeCode \
--ak 'agent_kwargs={"model_name": "anthropic/claude-sonnet-4-20250514"}' \
--ak max_turns=10
Options:
| Option | Description |
|---|---|
simulated_user |
Import path to SimulatedUser subclass (module.path:ClassName) |
agent |
Import path to inner agent (module.path:ClassName) |
simulated_user_kwargs |
JSON string or dict of kwargs for simulated user |
agent_kwargs |
JSON string or dict of kwargs for inner agent |
max_turns |
Maximum conversation turns (default: 50) |
Creating a Simulated User:
from harbor_agent.multi_turn import SimulatedUser, SimulatedUserDone, ConversationMessage
class MySimulatedUser(SimulatedUser):
def __init__(self, **kwargs):
super().__init__(**kwargs)
self._turn = 0
async def next_message(self, conversation: list[ConversationMessage]) -> str:
self._turn += 1
if self._turn > 3:
raise SimulatedUserDone("Task complete")
return f"Please do step {self._turn}"
Conversation Flow:
next_message()is called with conversation history- Returns a prompt string → sent to inner agent
- Inner agent responds → added to history
- Repeat until
SimulatedUserDoneis raised ormax_turnsreached
Output: Saves trajectory.json in ATIF format with all conversation turns.
Development
uv sync --all-extras # Install
uv run pytest -v # Test
uv run ruff check src # Lint
uv run mypy src # Type check
License
Apache 2.0
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file harbor_agents-0.2.0.tar.gz.
File metadata
- Download URL: harbor_agents-0.2.0.tar.gz
- Upload date:
- Size: 218.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
3ebd61c957f8f7ce0e939f77d72d90bc2e2b23f2716344eabf0ebe6b6f7e45ff
|
|
| MD5 |
982564ae7289eb88abadca43e0e73f65
|
|
| BLAKE2b-256 |
5b4a2599e3b70810e7fd07950b45d5a052f646c859544784a075b147ec402c50
|
Provenance
The following attestation bundles were made for harbor_agents-0.2.0.tar.gz:
Publisher:
publish.yml on rotemtam/harbor-agents
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
harbor_agents-0.2.0.tar.gz -
Subject digest:
3ebd61c957f8f7ce0e939f77d72d90bc2e2b23f2716344eabf0ebe6b6f7e45ff - Sigstore transparency entry: 920182383
- Sigstore integration time:
-
Permalink:
rotemtam/harbor-agents@f9d0f5a332fd3ae52de973d26cbf062a24789b8e -
Branch / Tag:
refs/tags/v0.2.0 - Owner: https://github.com/rotemtam
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@f9d0f5a332fd3ae52de973d26cbf062a24789b8e -
Trigger Event:
push
-
Statement type:
File details
Details for the file harbor_agents-0.2.0-py3-none-any.whl.
File metadata
- Download URL: harbor_agents-0.2.0-py3-none-any.whl
- Upload date:
- Size: 15.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
491024188d7b2b4a9b5e68ffcea88256c2eeaeffcec29e8fc22f35bd65ce5f4a
|
|
| MD5 |
d9b3811d75765faaf5700645c44a33e5
|
|
| BLAKE2b-256 |
eb544bbfb26045b27ebb6323dce762bc5d1f4c3aafe81b60915af83be5b3db5d
|
Provenance
The following attestation bundles were made for harbor_agents-0.2.0-py3-none-any.whl:
Publisher:
publish.yml on rotemtam/harbor-agents
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
harbor_agents-0.2.0-py3-none-any.whl -
Subject digest:
491024188d7b2b4a9b5e68ffcea88256c2eeaeffcec29e8fc22f35bd65ce5f4a - Sigstore transparency entry: 920182417
- Sigstore integration time:
-
Permalink:
rotemtam/harbor-agents@f9d0f5a332fd3ae52de973d26cbf062a24789b8e -
Branch / Tag:
refs/tags/v0.2.0 - Owner: https://github.com/rotemtam
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@f9d0f5a332fd3ae52de973d26cbf062a24789b8e -
Trigger Event:
push
-
Statement type: