A CLI interface for the METR task standard

Project description

Overview

poetry build
pip install ./dist/metr_cli-0.0.1-py3-none-any.whl --force-reinstall
metr
metr task run ./examples/my_task/my_task.py addition
python3.10 ./src/metr/metr_cli.py run ./examples/my_task/ addition

A CLI interface for the METR task standard. The METR task standard is a way

This project was generated with cookiecutter.

TODO: Make a METR package with all the proper types so you can import it without the full directory

Setup

Requirements

  • Python 3.11+

Installation

Install it directly into an activated virtual environment:

$ pip install metr-cli

or add it to your Poetry project:

$ poetry add metr-cli
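
As a quick sanity check after installing, the following minimal sketch tests the interpreter against the stated Python 3.11+ requirement and looks up the installed version of the metr-cli distribution. The helper names `python_ok` and `installed_version` are hypothetical and not part of metr-cli; only standard-library calls are used.

```python
import sys
from importlib import metadata

# Minimum interpreter version, taken from the Requirements section.
MIN_PYTHON = (3, 11)


def python_ok(version_info=sys.version_info) -> bool:
    """Return True if the given version tuple satisfies Python 3.11+."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(dist_name: str = "metr-cli"):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("metr-cli version:", installed_version())
```

If `installed_version()` returns None, the package is not visible to the active interpreter, which usually means the virtual environment is not activated.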

Usage

All functionality is exposed as subcommands of the metr command.

1 Tasks

metr task create <path> --name test_task --type test

  • This uses cookiecutter to create a new task in the current directory

metr task run <task_path>

  • Runs the current task using Docker

metr task validate <task_path>

  • Runs various tests to confirm that the project is ready for publishing
    • Tests if QA is set up well
    • Tests if

Mapping of npm functions to CLI functions

  1. Create a task environment

    • npm: npm run task -- "taskFamilyDirectory" "taskName"
    • CLI: metr task run <task_family_directory> <task_name>
  2. Run an agent inside a task environment

    • npm: npm run agent -- "[docker container name]" "path/to/agent[:path/in/VM]" "command to start agent"
    • CLI: metr task agent <container_name> <agent_path> <start_command>
  3. Score a task environment

    • npm: npm run score -- [docker container name]
    • CLI: metr task score <container_name>
  4. Export files from a task environment

    • npm: npm run export -- [docker container name] [file1] [file2] ...
    • CLI: metr task export <container_name> <file1> <file2> ...
  5. Run tests in a task environment

    • npm:
      • All tests: npm run test -- "taskFamilyDirectory" "taskName" "testFileName"
      • Single test: npm run test -- "taskFamilyDirectory" "taskName" "testFileName::testName"
    • CLI:
      • All tests: metr task test <task_family_directory> <task_name> <test_file>
      • Single test: metr task test <task_family_directory> <task_name> <test_file> --test-name <test_name>
  6. Destroy a task environment

    • npm: npm run destroy -- "taskEnvironmentIdentifier"
    • CLI: metr task destroy <task_environment_identifier>

Note: The metr task create command doesn't directly map to an npm function. It's a custom command for creating new task definitions in your project structure.
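
The mapping above can be sketched as a small translation helper. This is an illustration only, not code from metr-cli: the function `npm_to_metr` and the table `NPM_TO_METR` are hypothetical names, and the sketch assumes the CLI takes its arguments in the same order as the npm scripts, as the list above suggests.

```python
import shlex

# npm script name -> metr task subcommand, as listed in the mapping above.
NPM_TO_METR = {
    "task": "run",
    "agent": "agent",
    "score": "score",
    "export": "export",
    "test": "test",
    "destroy": "destroy",
}


def npm_to_metr(npm_command: str) -> list[str]:
    """Translate an `npm run <script> -- <args...>` line into metr CLI argv."""
    tokens = shlex.split(npm_command)
    if len(tokens) < 3 or tokens[:2] != ["npm", "run"]:
        raise ValueError(f"not an npm run command: {npm_command!r}")
    script, args = tokens[2], tokens[3:]
    if script not in NPM_TO_METR:
        raise ValueError(f"no CLI equivalent listed for npm script {script!r}")
    # Drop the `--` separator that npm uses before script arguments.
    if args and args[0] == "--":
        args = args[1:]
    # A single test is written as "testFileName::testName" for npm and as
    # "<test_file> --test-name <test_name>" for the CLI.
    if script == "test" and args and "::" in args[-1]:
        test_file, test_name = args[-1].split("::", 1)
        args[-1:] = [test_file, "--test-name", test_name]
    return ["metr", "task", NPM_TO_METR[script], *args]


if __name__ == "__main__":
    print(npm_to_metr('npm run test -- "my_family" "addition" "test_file.py::test_add"'))
```

Returning an argv list rather than a single string keeps the result safe to pass to `subprocess.run` without shell quoting.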

Download files

Source Distribution

metr_cli-0.0.1.tar.gz (3.2 MB view details)

Uploaded Source

Built Distribution

metr_cli-0.0.1-py3-none-any.whl (6.5 MB view details)

Uploaded Python 3

File details

Details for the file metr_cli-0.0.1.tar.gz.

File metadata

  • Download URL: metr_cli-0.0.1.tar.gz
  • Upload date:
  • Size: 3.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.1

File hashes

Hashes for metr_cli-0.0.1.tar.gz
Algorithm Hash digest
SHA256 c4198bdb4bc11e43dd25646b6a9ae6d28b8604ed24adc83e8502cee0aae021ce
MD5 4c8474ac6278b28492e53de764f89d42
BLAKE2b-256 e62be6e12426e8c8bc8dbc40ea64dc8f83997109eb872f587117bf9c3fe196ea

File details

Details for the file metr_cli-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: metr_cli-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 6.5 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.1

File hashes

Hashes for metr_cli-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 e4ad0614010e3a78c403dceea0bb0905c93d6da5e09e8d29e9e9a00b9525a767
MD5 0e78f22fe54c37d3929007090e9d0cef
BLAKE2b-256 b9fc7f01f3e7ab50ee5783923d8d643ee8c43fb4e9ac7d24bee491f634271153
