A framework for evaluating and optimizing agents and models using sandboxed environments.

Marina

Marina is a fork of Harbor, a framework for evaluating and optimizing AI agents and language models. You can use Marina to:

  • Evaluate arbitrary agents like Claude Code, OpenHands, Codex CLI, and more.
  • Build and share your own benchmarks and environments.
  • Conduct experiments in thousands of environments in parallel through providers like Daytona and Modal.
  • Generate rollouts for RL optimization.

marina run -p examples/tasks/hello-screenshot --agent openhands-sdk --model openrouter/openai/gpt-5.5 --ae LLM_SUPPORTS_VISION=true

Changes

We track upstream Harbor closely, with a few additions of our own.

Agent kwargs (--ak key=value)

  • disable_builtin_tools — drop the default terminal/file-editor/task-tracker tools
  • disable_stuck_detection — turn off the SDK's stuck-agent detection

Environment variables (--ae NAME=value)

  • LLM_SUPPORTS_VISION — force vision support for models LiteLLM misclassifies
  • OPENROUTER_REASONING_ENABLED / _EXCLUDE — control OpenRouter reasoning (auto-on for Opus 4.7)
  • OPENROUTER_VERBOSITY — set OpenRouter verbosity
  • LLM_THINKING_DISPLAY — thinking display mode (default summarized)
  • SYSTEM_MESSAGE_SUFFIX / USER_MESSAGE_SUFFIX — append text to the OpenHands SDK system / user prompts
  • AWS_BEARER_TOKEN_BEDROCK — bearer-token auth for Bedrock

With vision enabled, MCP image observations are also written to /logs/agent/trajectory-images/ and referenced from the ATIF trajectory.
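Each variable is passed with its own --ae flag. The sketch below shows one way to avoid repeating the flag by hand; the helper name run_with_agent_env is hypothetical, and the task path, agent, and model are taken from the example elsewhere in this README.

```shell
# Hypothetical helper: pass every NAME=value argument as a separate --ae flag.
run_with_agent_env() {
  n=$#
  # Rotate through the original arguments, re-appending each one behind --ae.
  while [ "$n" -gt 0 ]; do
    pair=$1
    shift
    set -- "$@" --ae "$pair"
    n=$((n - 1))
  done
  marina run -p examples/tasks/hello-screenshot \
    --agent openhands-sdk \
    --model openrouter/openai/gpt-5.5 \
    "$@"
}
```

For example, run_with_agent_env LLM_SUPPORTS_VISION=true LLM_THINKING_DISPLAY=summarized expands to two --ae flags. Values containing spaces survive intact as long as each NAME=value pair is quoted as a single argument.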

Installation

Marina is published to PyPI as chakra-marina.

Install the CLI with uv, which puts marina on your PATH:

uv tool install chakra-marina

Or with pip:

pip install chakra-marina

Add extras for cloud providers: chakra-marina[daytona] for a single provider, chakra-marina[cloud] for all providers, or chakra-marina[all] for everything.
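A small sketch of installing with an extra. The helper name install_marina is hypothetical; the extra names are the ones listed above. The package spec is quoted because some shells, zsh among them, otherwise treat the square brackets as a glob pattern.

```shell
# Hypothetical helper: install chakra-marina, optionally with one extra
# (daytona, cloud, or all).
install_marina() {
  extra="${1:-}"
  if [ -n "$extra" ]; then
    spec="chakra-marina[$extra]"
  else
    spec="chakra-marina"
  fi
  # Quote the spec so the brackets reach uv literally.
  uv tool install "$spec"
}
```

install_marina daytona installs the Daytona extra, and install_marina with no argument installs the base package. The same quoted spec works with pip install.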

Example: Running a task

Run the bundled hello-screenshot task locally with Docker. It's a vision smoke test: the agent calls an MCP take_screenshot tool, sees a solid-color image, and reports the color.

export LLM_API_KEY=<YOUR-KEY>
marina run -p examples/tasks/hello-screenshot \
   --agent openhands-sdk \
   --model openrouter/openai/gpt-5.5 \
   --ae LLM_SUPPORTS_VISION=true \
   --ae SYSTEM_MESSAGE_SUFFIX="You run fully autonomously — never ask for confirmation."

--ae LLM_SUPPORTS_VISION=true enables vision (required for this task), and --ae SYSTEM_MESSAGE_SUFFIX="..." appends text to the OpenHands SDK system prompt. Both apply only to the openhands-sdk agent (see Changes).

To run on a cloud provider (like Daytona) instead of local Docker, pass the --env flag:

export LLM_API_KEY=<YOUR-KEY>
export DAYTONA_API_KEY=<YOUR-KEY>
marina run -p examples/tasks/hello-screenshot \
   --agent openhands-sdk \
   --model openrouter/openai/gpt-5.5 \
   --ae LLM_SUPPORTS_VISION=true \
   --env daytona

When running a whole benchmark (many tasks), raise --n-concurrent to fan out across hundreds or thousands of environments in parallel.
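A hedged sketch of a concurrent run on Daytona. The helper name run_benchmark and the default of 4 are hypothetical, and it assumes that -p accepts a directory containing many tasks and that --n-concurrent takes an integer; check marina run --help for the exact forms.

```shell
# Hypothetical helper: run every task under a path with a chosen concurrency.
# Assumptions: -p accepts a directory of tasks; --n-concurrent takes an integer.
run_benchmark() {
  tasks_path=${1:?usage: run_benchmark TASKS_PATH [N_CONCURRENT]}
  concurrency=${2:-4}
  marina run -p "$tasks_path" \
    --agent openhands-sdk \
    --model openrouter/openai/gpt-5.5 \
    --env daytona \
    --n-concurrent "$concurrency"
}
```

For example, run_benchmark examples/tasks 32 would request up to 32 environments at once. LLM_API_KEY and DAYTONA_API_KEY must be exported first, as in the commands above.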

To see all supported agents and other options, run:

marina run --help


