arbiterx

Installation

pip install arbiterx

Directory structure of a test suite

submission
├── input
│   ├── input1.txt
│   └── input2.txt
│   ...
├── output
│   ├── output1.txt
│   └── output2.txt
│   ...
└── solution.py // The main file to be executed. Name can be anything.

See the data/submission directory for an example.
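
The layout above pairs each input file with the output file of the same number. As a rough illustration of that pairing, the sketch below lists the test cases found in a submission directory. It is not arbiterx's own discovery logic, which may differ; the helper name `list_test_cases` is hypothetical, and the `inputN.txt` / `outputN.txt` naming is assumed from the example tree only.

```python
import re
from pathlib import Path


def list_test_cases(submission_dir):
    """Return (number, input_path, output_path) tuples sorted by test number.

    Hypothetical helper: assumes files are named inputN.txt / outputN.txt,
    as in the example tree above.
    """
    root = Path(submission_dir)
    cases = []
    for input_path in (root / "input").glob("input*.txt"):
        match = re.fullmatch(r"input(\d+)\.txt", input_path.name)
        if match is None:
            continue  # ignore files that do not follow the naming pattern
        number = int(match.group(1))
        output_path = root / "output" / f"output{number}.txt"
        if output_path.is_file():
            # Keep only inputs that have a matching expected output.
            cases.append((number, input_path, output_path))
    return sorted(cases)
```

Sorting by the parsed integer keeps `input10.txt` after `input2.txt`, which a plain string sort would not.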

Python Executor Example

import json
import os

from rich import print_json # not necessary, just for pretty printing

from arbiterx import CodeExecutor, Constraints


class PythonCodeExecutor(CodeExecutor):
    def get_compile_command(self, src: str) -> str:
        return ""

    def get_run_command(self, src: str) -> str:
        return f"python3 {src}/solution.py"


if __name__ == "__main__":
    constraints: Constraints = {
        "time_limit": 2,
        "memory_limit": 10,
        "memory_swap_limit": 0,  # No swap
        # cpu quota and period are in microseconds
        "cpu_quota": 1000000,
        "cpu_period": 1000000,
    }
    WORK_DIR = "<submission_directory>"  # replace with the path to the submission directory
    with PythonCodeExecutor(
            user="sandbox", # Default is "nobody"
            docker_image="python312:v1",
            src=WORK_DIR,
            constraints=constraints,
            disable_compile=True,
    ) as executor:
        for result in executor.run():
            print_json(json.dumps(result), indent=4)
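
The `cpu_quota` and `cpu_period` values are easiest to read as a ratio. Assuming they map onto the usual cgroup CPU bandwidth settings, the process may use `cpu_quota` microseconds of CPU time in every `cpu_period` microseconds. The helper `cpu_share` below is hypothetical and not part of arbiterx; it only shows the arithmetic.

```python
def cpu_share(cpu_quota, cpu_period):
    """Return how many CPUs' worth of time is allowed per period."""
    if cpu_period <= 0:
        raise ValueError("cpu_period must be positive")
    return cpu_quota / cpu_period


print(cpu_share(1_000_000, 1_000_000))  # quota equals period: one full CPU
print(cpu_share(500_000, 1_000_000))    # half the period: half a CPU
```

With the constraints used in the example above, the ratio is 1.0, i.e. one full CPU.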

C++ Executor Example

import json

from rich import print_json
from arbiterx import CodeExecutor, Constraints


class CPPCodeExecutor(CodeExecutor):
    def get_compile_command(self, src: str) -> str:
        return f"g++ -o {src}/a.out {src}/main.cpp"

    def get_run_command(self, src: str) -> str:
        return f"{src}/a.out"


if __name__ == "__main__":
    constraints: Constraints = {
        "time_limit": 2,
        "memory_limit": 10,
        "memory_swap_limit": 0,  # No swap
        "cpu_quota": 1000000,
        "cpu_period": 1000000,
    }
    WORK_DIR = "<submission_directory>"  # replace with the path to the submission directory
    with CPPCodeExecutor(
            user="sandbox",
            docker_image="cpp11:v1",
            src=WORK_DIR,
            constraints=constraints,
    ) as executor:
        for result in executor.run(shuffle=True):
            print_json(json.dumps(result), indent=4)

Output

{
    "test_case": 1,
    "exit_code": 0,
    "stats": {
        "memory_peak": 5332992,
        "memory_events": {
            "low": 0,
            "high": 0,
            "max": 0,
            "oom": 0,
            "oom_kill": 0,
            "oom_group_kill": 0
        },
        "cpu_stat": {
            "usage_usec": 19407,
            "user_usec": 8733,
            "system_usec": 10674,
            "nr_periods": 0,
            "nr_throttled": 0,
            "throttled_usec": 0,
            "nr_bursts": 0,
            "burst_usec": 0
        },
        "pids_peak": 4
    },
    "verdict": "AC",
    "verdict_label": "Accepted",
    "verdict_details": "The program ran successfully and produced the correct output.",
    "input": "3\n1\n2\n3\n",
    "actual_output": "YES\nNO\nYES\n",
    "expected_output": "YES\nNO\nYES\n"
}
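
Since each result is a plain dictionary, a run can be summarized with a few lines of ordinary Python. The sketch below aggregates verdict counts and resource figures using only the keys visible in the sample output above. The function name `summarize` is illustrative, and it assumes every result carries the same `stats` fields, which may not hold for every verdict.

```python
from collections import Counter


def summarize(results):
    """Aggregate verdict counts and resource usage over result dicts."""
    results = list(results)
    return {
        "verdicts": dict(Counter(r["verdict"] for r in results)),
        "max_memory_peak": max(
            (r["stats"]["memory_peak"] for r in results), default=0
        ),
        "total_cpu_usec": sum(
            r["stats"]["cpu_stat"]["usage_usec"] for r in results
        ),
    }


# Trimmed-down result in the shape shown above.
sample = {
    "test_case": 1,
    "verdict": "AC",
    "stats": {"memory_peak": 5332992, "cpu_stat": {"usage_usec": 19407}},
}
print(summarize([sample]))
```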

Sometimes a custom checker is needed, because not every problem has a single predefined output: the expected output may be non-deterministic or not unique, and the acceptance criteria can differ from problem to problem. For example:

  • Criterion 1: The output should be case-insensitive, e.g., YES and yes should be considered the same.
  • Criterion 2: Order of the output should not matter, e.g., 1 2 3 and 3 2 1 should be considered the same.
  • Criterion 3: Output may not be unique, e.g., for a problem where the output is a path from source to destination, there can be multiple paths.

In such cases, some middleware is needed to normalize the output into a common format that complies with the problem constraints while keeping the original output intact.

To do this, pass the path of a custom checker script to the executor. Currently, arbiterx supports only Python scripts as custom checkers.

An example demonstrating the use of a custom checker


The custom checker is invoked with 3 arguments:

  • Argument 1: input file path
  • Argument 2: actual output file path
  • Argument 3: expected output file path

The checker should exit with status code 0 if the output is correct, and with status code 1 otherwise.

Create a custom checker script custom_checker.py

#!/usr/bin/python3

import sys

input_file = sys.argv[1]
output_file = sys.argv[2]
expected_output_file = sys.argv[3]

with open(output_file, "r") as f:
    output = f.read().strip()

with open(expected_output_file, "r") as f:
    expected_output = f.read().strip()

if output.upper() == expected_output.upper():  # case-insensitive comparison
    sys.exit(0)
else:
    sys.exit(1)

Make sure the shebang is the first line of the script, then mark the script as executable:

chmod +x custom_checker.py
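
Before wiring the checker into an executor, it can be tried out locally by invoking it as described above: three file paths as arguments, with the exit status as the result. The helper `run_checker` below is a hypothetical convenience, not an arbiterx function. It runs the script with the current Python interpreter rather than relying on the shebang.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_checker(checker, input_text, actual, expected):
    """Run a checker script on temporary files; True means exit status 0."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for name, text in (("input.txt", input_text),
                           ("actual.txt", actual),
                           ("expected.txt", expected)):
            path = Path(tmp) / name
            path.write_text(text)
            paths.append(str(path))
        # Argument order: input, actual output, expected output.
        completed = subprocess.run([sys.executable, str(checker), *paths])
        return completed.returncode == 0
```

For instance, `run_checker("custom_checker.py", "3\n1\n2\n3\n", "yes\nno\nyes\n", "YES\nNO\nYES\n")` exercises the case-insensitive comparison from the script above.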

Now let's use the custom checker in the executor

...
    WORK_DIR = "<submission_directory>"  # replace with the path to the submission directory
    with PythonCodeExecutor(
            user="sandbox",
            docker_image="python312:v1",
            src=WORK_DIR,
            constraints=constraints,
            disable_compile=True,
            constraints=constraints,
            disable_compile=True,
    ) as executor:
        for result in executor.run(checker=os.path.join(WORK_DIR, "custom_checker.py")):
            print_json(json.dumps(result), indent=4)
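
Criterion 2 can be handled the same way. The sketch below is one possible checker that treats both outputs as whitespace-separated tokens and ignores their order. Comparing tokens as a multiset, as done here, is an assumption that suits some problems and not others, and the function name `same_tokens` is illustrative.

```python
#!/usr/bin/python3

import sys
from collections import Counter


def same_tokens(actual, expected):
    """Compare whitespace-separated tokens as multisets, ignoring order."""
    return Counter(actual.split()) == Counter(expected.split())


def main(argv):
    # argv[1] is the input file path (unused here), argv[2] the actual
    # output file path, argv[3] the expected output file path.
    with open(argv[2], "r") as f:
        actual = f.read()
    with open(argv[3], "r") as f:
        expected = f.read()
    return 0 if same_tokens(actual, expected) else 1


# Run only when invoked as a script with the three expected arguments.
if __name__ == "__main__" and len(sys.argv) == 4:
    sys.exit(main(sys.argv))
```

Because duplicates are counted, `1 2` and `1 2 2` are still treated as different outputs.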

For more detailed examples, refer to the examples directory.

Set log level

export LOG_LEVEL=DEBUG

This prints the logs to the console.

Possible verdicts

Verdict | Label                   | Description
--------|-------------------------|------------
AC      | Accepted                | The program ran successfully and produced the correct output.
WA      | Wrong Answer            | The program ran successfully but produced incorrect output.
TLE     | Time Limit Exceeded     | The program took longer than the allowed execution time.
MLE     | Memory Limit Exceeded   | The program used more memory than the allowed limit.
RE      | Runtime Error           | The program terminated abnormally with a non-zero exit code.
OLE     | Output Limit Exceeded   | The program produced more output than the allowed limit.
CE      | Compilation Error       | The program failed to compile successfully.
ILE     | Idleness Limit Exceeded | The program did not produce any output for too long, often indicating an infinite loop that does not consume CPU time.
JE      | Judgement Error         | The judgement process failed to produce a verdict.

See the arbiterx/verdicts.py file for more details.
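
When post-processing results, it can be convenient to have the verdict codes from the table available as a Python enumeration. The `Verdict` class below is an illustrative stand-in built from the table, not the contents of arbiterx/verdicts.py; `label_for` is likewise a hypothetical helper.

```python
from enum import Enum


class Verdict(Enum):
    """Verdict codes and labels copied from the table above."""
    AC = "Accepted"
    WA = "Wrong Answer"
    TLE = "Time Limit Exceeded"
    MLE = "Memory Limit Exceeded"
    RE = "Runtime Error"
    OLE = "Output Limit Exceeded"
    CE = "Compilation Error"
    ILE = "Idleness Limit Exceeded"
    JE = "Judgement Error"


def label_for(code):
    """Look up the human-readable label for a verdict code such as "WA"."""
    return Verdict[code].value


print(label_for("TLE"))  # prints "Time Limit Exceeded"
```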

Exceptions

Exception                      | Description
-------------------------------|------------
CMDError                       | Exception raised when there is an error in running a command.
DockerDaemonError              | Exception raised when the Docker daemon is not running.
ContainerCreateError           | Exception raised when there is an error in creating the container.
ContainerCleanupError          | Exception raised when there is an error in cleaning up the container.
CgroupMountError               | Exception raised when the cgroup is not mounted.
CgroupCreateError              | Exception raised when there is an error in creating the cgroup.
CgroupCleanupError             | Exception raised when there is an error in cleaning up the cgroup.
CgroupControllerReadError      | Exception raised when there is an error in reading the cgroup.controllers file.
CgroupControllerError          | Exception raised when required controllers are not allowed in the cgroup (e.g., cpu and memory controllers are missing in cgroup.controllers).
CgroupSubtreeControlError      | Exception raised when required controllers are not set in the cgroup.subtree_control file.
CgroupSubtreeControlReadError  | Exception raised when there is an error in reading the cgroup.subtree_control file.
CgroupSubtreeControlWriteError | Exception raised when there is an error in writing the cgroup.subtree_control file.
CgroupSetLimitsError           | Exception raised when there is an error in setting the limits for the cgroup (e.g., writing memory.max, memory.swap.max, etc.).
CompileError                   | Exception raised when there is an error in compiling the code.
RunError                       | Exception raised when there is an error in running the code.
TestQueueInitializationError   | Exception raised when there is an error initializing the test queue.
MemoryPeakReadError            | Exception raised when there is an error in reading peak memory usage.
MemoryEventsReadError          | Exception raised when there is an error in reading memory events.
CPUStatReadError               | Exception raised when there is an error in reading CPU statistics.
PIDSPeakReadError              | Exception raised when there is an error in reading the peak number of PIDs.
EarlyExitError                 | Exception raised when the program exits earlier than expected.
ActualOutputCleanupError       | Exception raised when there is an error in cleaning up the actual output.

See the arbiterx/exceptions.py file for more details.

Some useful parameters

  • disable_compile: Disable compilation of the code. Useful when the code is already compiled or is written in an interpreted language.

  • dry_run: Pretty-print the commands that will be executed.

    with PythonCodeExecutor(
            ...
            dry_run=True,
    ) as executor:
        for result in executor.run():
            ...

  • shuffle: Randomly shuffle the test cases before running them.

    with CPPCodeExecutor(...) as executor:
        for result in executor.run(shuffle=True):
            ...

  • working_dir_in_container: The working directory in the container. Default is /app.

    with CPPCodeExecutor(
            ...
            working_dir_in_container="/sandbox",
    ) as executor:
        for result in executor.run():
            ...

  • early_exit: Exit the loop as soon as a verdict other than AC is produced.

    with CPPCodeExecutor(
            ...
    ) as executor:
        for result in executor.run(early_exit=True):
            ...

  • lazy_container: Create the container on the first run. Default is False, in which case the container is created when the context manager is created.

    with CPPCodeExecutor(
            ...
            lazy_container=True,
    ) as executor:
        for result in executor.run():
            ...

  • cgroup_mount_path: The path where the cgroup is mounted on the host. Default is /sys/fs/cgroup. This is used when bind-mounting the cgroup into the container.

    with CPPCodeExecutor(
            ...
            cgroup_mount_path="/some/custom/path",
    ) as executor:
        for result in executor.run():
            ...
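
To make the effect of early_exit concrete, the generator below mimics the described behavior over any iterable of result dictionaries. It is a sketch of the semantics as documented above, written without arbiterx. Whether the first non-AC result is itself yielded before the loop stops is an assumption made here, and the name `stop_after_first_failure` is hypothetical.

```python
def stop_after_first_failure(results):
    """Yield results up to and including the first non-AC verdict."""
    for result in results:
        yield result
        if result["verdict"] != "AC":
            break


verdicts = [{"verdict": "AC"}, {"verdict": "WA"}, {"verdict": "AC"}]
print([r["verdict"] for r in stop_after_first_failure(verdicts)])
# prints ['AC', 'WA']
```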
