A framework for evaluating and optimizing agents and models using sandboxed environments.
Marina
Marina is a fork of Harbor, a framework for evaluating and optimizing AI agents and language models. You can use Marina to:
- Evaluate arbitrary agents like Claude Code, OpenHands, Codex CLI, and more.
- Build and share your own benchmarks and environments.
- Conduct experiments in thousands of environments in parallel through providers like Daytona and Modal.
- Generate rollouts for RL optimization.
marina run -p examples/tasks/hello-screenshot --agent openhands-sdk --model openrouter/openai/gpt-5.5 --ae LLM_SUPPORTS_VISION=true
Changes
We track upstream Harbor closely, with a few additions of our own.
Agent kwargs (--ak key=value)
- disable_builtin_tools: drop the default terminal/file-editor/task-tracker tools
- disable_stuck_detection: turn off the SDK's stuck-agent detection
Environment variables (--ae NAME=value)
- LLM_SUPPORTS_VISION: force vision support for models LiteLLM misclassifies
- OPENROUTER_REASONING_ENABLED / _EXCLUDE: control OpenRouter reasoning (auto-on for Opus 4.7)
- OPENROUTER_VERBOSITY: set OpenRouter verbosity
- LLM_THINKING_DISPLAY: thinking display mode (default summarized)
- SYSTEM_MESSAGE_SUFFIX / USER_MESSAGE_SUFFIX: append text to the OpenHands SDK system / user prompts
- AWS_BEARER_TOKEN_BEDROCK: bearer-token auth for Bedrock
Vision support also writes MCP image observations to /logs/agent/trajectory-images/, referenced from the ATIF trajectory.
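When a run needs several of these options, assembling the command in a script can be less error-prone than typing it by hand. The following is a minimal sketch, not part of Marina itself: build_run_command is a hypothetical helper, and the value true for disable_stuck_detection is an assumption about what that kwarg accepts. Only flags that appear in this README are used.

```python
import shlex


def build_run_command(task_path, agent, model, agent_kwargs=None, agent_env=None):
    """Assemble a `marina run` argument list from plain dictionaries.

    Hypothetical helper for illustration; it only formats flags shown in
    this README (-p, --agent, --model, --ak, --ae).
    """
    cmd = ["marina", "run", "-p", task_path, "--agent", agent, "--model", model]
    # Each agent kwarg becomes its own `--ak key=value` pair.
    for key, value in (agent_kwargs or {}).items():
        cmd += ["--ak", f"{key}={value}"]
    # Each agent environment variable becomes its own `--ae NAME=value` pair.
    for name, value in (agent_env or {}).items():
        cmd += ["--ae", f"{name}={value}"]
    return cmd


cmd = build_run_command(
    "examples/tasks/hello-screenshot",
    agent="openhands-sdk",
    model="openrouter/openai/gpt-5.5",
    # Assumption: the kwarg takes a boolean-like string value.
    agent_kwargs={"disable_stuck_detection": "true"},
    agent_env={"LLM_SUPPORTS_VISION": "true", "LLM_THINKING_DISPLAY": "summarized"},
)
# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run directly, which avoids shell quoting problems when a value such as SYSTEM_MESSAGE_SUFFIX contains spaces.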
Installation
Marina is published to PyPI as chakra-marina.
To install the CLI and put marina on your PATH:
uv tool install chakra-marina
Or with pip:
pip install chakra-marina
Add extras for cloud providers: chakra-marina[daytona] for a single provider,
chakra-marina[cloud] for all providers, or chakra-marina[all] for everything.
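To confirm which version ended up in a given Python environment, the standard library can report it. This is a small sketch using importlib.metadata; installed_version is a hypothetical helper name. It only sees the environment it runs in, so a CLI installed through uv tool install lives in its own isolated environment and may not show up here.

```python
from importlib import metadata


def installed_version(dist_name="chakra-marina"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print("chakra-marina:", version or "not installed in this environment")
```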
Example: Running a task
Run the bundled hello-screenshot task locally with Docker. It's a vision smoke
test: the agent calls an MCP take_screenshot tool, sees a solid-color image, and
reports the color:
export LLM_API_KEY=<YOUR-KEY>
marina run -p examples/tasks/hello-screenshot \
--agent openhands-sdk \
--model openrouter/openai/gpt-5.5 \
--ae LLM_SUPPORTS_VISION=true \
--ae SYSTEM_MESSAGE_SUFFIX="You run fully autonomously — never ask for confirmation."
--ae LLM_SUPPORTS_VISION=true enables vision (required for this task), and
--ae SYSTEM_MESSAGE_SUFFIX="..." appends text to the OpenHands SDK system prompt.
Both apply only to the openhands-sdk agent (see Changes).
To run on a cloud provider (like Daytona) instead of local Docker, pass the --env flag:
export LLM_API_KEY=<YOUR-KEY>
export DAYTONA_API_KEY=<YOUR-KEY>
marina run -p examples/tasks/hello-screenshot \
--agent openhands-sdk \
--model openrouter/openai/gpt-5.5 \
--ae LLM_SUPPORTS_VISION=true \
--env daytona
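Both examples above depend on API keys being exported first. A pre-flight check along these lines can catch a missing key before any environment is started. It is a sketch under the assumption that LLM_API_KEY is always needed and DAYTONA_API_KEY only for --env daytona, as in the examples; missing_keys is a hypothetical helper, and other providers may require different variables.

```python
import os


def missing_keys(use_daytona=False, environ=None):
    """Return the required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    required = ["LLM_API_KEY"]
    if use_daytona:
        # Needed only when the run is launched with --env daytona.
        required.append("DAYTONA_API_KEY")
    return [name for name in required if not environ.get(name)]


missing = missing_keys(use_daytona=True)
if missing:
    print("Set these before running marina:", ", ".join(missing))
else:
    print("All required keys are set.")
```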
When running a whole benchmark (many tasks), raise --n-concurrent to fan out across
hundreds or thousands of environments in parallel.
To see all supported agents and other options, run:
marina run --help
File details
Details for the file chakra_marina-0.2.0.tar.gz.
File metadata
- Download URL: chakra_marina-0.2.0.tar.gz
- Upload date:
- Size: 1.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aef530d9f58ba299bbfe2b1b2b6de99bd8e6ad4408a13607878b784b64905d15 |
| MD5 | 1f5e21389adfeb48894ea5fc088aba49 |
| BLAKE2b-256 | e37a82477b9deb8429edd235e2c7b75b2d3e7b1ac2837d5cdd05beca17bc0262 |
File details
Details for the file chakra_marina-0.2.0-py3-none-any.whl.
File metadata
- Download URL: chakra_marina-0.2.0-py3-none-any.whl
- Upload date:
- Size: 1.3 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d270239b439c5a72e8333041a06ce83c08bca0e0672e07ea99c9661a58727f01 |
| MD5 | 29ba9ee7611dab9380511f9e1c2c104f |
| BLAKE2b-256 | 8ef45bedbc81c5dd8371abc76216ed89543cff17233e70ae3633d0c049fc8a13 |
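The SHA256 digests in the file details can be checked against a manually downloaded file. The sketch below uses Python's hashlib; sha256_of and matches_published are hypothetical helper names, and the local file path is whatever location the file was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published("chakra_marina-0.2.0.tar.gz", "aef530d9f58ba299bbfe2b1b2b6de99bd8e6ad4408a13607878b784b64905d15") should return True for an intact copy of the source distribution saved in the current directory.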