Python-controlled command submission runtime for local and SSH runners

Parallux

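Parallux is a command submission and scheduling runtime driven by Python configuration files. It supports local execution, remote execution over SSH, global parallelism control, per-Runner parallelism control, and NUMA core binding.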

Overview

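A Parallux configuration file must be executed inside the controlled runtime. The file obtains the runtime object with from parallux import goal and uses ordinary Python syntax to describe Runners, environment variables, command registration, and the scheduling flow.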

parallux config.py -- key=value

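A configuration file can be edited like any other Python file, but it should not be executed directly with python3 config.py. Parallux binds goal before the file runs, which ensures that the configuration always executes in the controlled environment.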

from parallux import goal


local_a = goal.local(name="local-a", env={"RUNNER_TAG": "A"}, max_jobs=1)
local_b = goal.local(name="local-b", env={"RUNNER_TAG": "B"}, max_jobs=1)

goal.setRunner([local_a, local_b])
goal.setParallel(2)

for index in range(10):
    goal.schd(
        "echo runner=$PARALLUX_RUNNER tag=$RUNNER_TAG",
        name=f"run_{index}",
    )

goal.issue().sync()

Installation

Install in development mode:

python3 -m pip install -e .

After installation, the parallux command is available:

parallux --help

Without installing, use the module entry point from the source directory:

python3 -m parallux examples/basic.py

Running a configuration file

Run a configuration file from the command line:

parallux examples/basic.py

Pass arguments to the configuration file:

parallux examples/basic.py -- case=hello

Inside the configuration file, the arguments can be read through the following objects:

goal.argv
goal.args
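The exact shape of goal.argv and goal.args is not spelled out here. As a rough mental model only, assuming goal.argv is the raw token list after -- and goal.args is a mapping built from key=value tokens (both are assumptions, and parse_key_values is a hypothetical helper, not part of Parallux), the split could look like this:

```python
def parse_key_values(argv):
    """Split key=value tokens into a dict; keep everything else as positionals.

    Hypothetical helper: it only illustrates one plausible relationship
    between a raw argument list and a parsed mapping.
    """
    args, positional = {}, []
    for token in argv:
        key, sep, value = token.partition("=")
        if sep and key:
            args[key] = value          # "case=hello" -> {"case": "hello"}
        else:
            positional.append(token)   # no "=" or empty key: left untouched
    return args, positional
```

Only the first = splits a token, so a value may itself contain = characters.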

Generate the execution plan only, without actually running any commands:

parallux examples/basic.py --dry-run

A configuration file can be made executable by starting it with a shebang line:

#!/usr/bin/env parallux

After adding execute permission, it can be run directly:

./examples/basic.py

Runner

Local Runner:

local = goal.local(
    name="local",
    workspace="workspace/runners/local",
    max_jobs=2,
)
goal.setRunner(local)

SSH Runner:

server = goal.ssh(
    name="server-a",
    host="runner.example.com",
    user="user",
    workspace="workspace/runners/server-a",
    max_jobs=4,
)

goal.setRunner([local, server])

workspace is the output root directory on the Runner side. When workspace is not specified explicitly, the default is:

~/parallux

When work_relpath is specified, each command's working directory is:

<runner.workspace>/<work-relpath>/

When work_relpath is not specified, Parallux uses a default path:

<runner.workspace>/<auto-work-relpath>/

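The default working path is generated from an increasing counter at submission time and is guaranteed to be unique within a single run. The counter is an internal scheduling detail: a configuration file that needs the output location should read work_relpath, work_dir, stdout_path, or stderr_path from the handle or the execution result.

Global parallelism is controlled by goal.setParallel(n), and the parallelism of an individual Runner is controlled by max_jobs.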
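The two limits apply together: a command can start only when the global count is below the setParallel value and the chosen Runner is below its max_jobs. The following is a minimal sketch of that check, not Parallux's scheduler. RunnerSlots and pick_runner are hypothetical names, and the least-loaded placement policy is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunnerSlots:
    """Hypothetical bookkeeping record for one Runner."""
    name: str
    max_jobs: int
    active: int = 0


def pick_runner(runners, global_limit):
    """Return a Runner that may start one more job, or None if a limit is hit."""
    # Global cap: total running jobs across all Runners.
    if sum(r.active for r in runners) >= global_limit:
        return None
    # Per-Runner cap: only Runners with a free slot are candidates.
    candidates = [r for r in runners if r.active < r.max_jobs]
    if not candidates:
        return None
    # Assumed policy: least-loaded Runner; ties go to the first one listed.
    return min(candidates, key=lambda r: r.active)
```

With two Runners at max_jobs=1 and a global limit of 2, as in the overview example, at most one command runs on each Runner at a time.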

Scheduling API

goal.schd() registers a command. Registration does not start the command:

goal.schd("make test", threads=1)

goal.issue() submits the commands registered so far and returns a task-group handle:

handle = goal.issue()
handle.sync()

goal.run() submits a single command immediately without blocking and returns a command handle:

handle = goal.run("echo immediate")
handle.sync()

To pin a command to a particular Runner, pass runner explicitly to goal.schd():

local = goal.local(name="local", workspace="workspace/local")
goal.setRunner(local)

goal.schd("echo scheduled on local", runner=local)
goal.issue().sync()

Runner instances also provide run(), which submits a command immediately to that Runner:

local.run("echo immediate on local").sync()

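If registered but unsubmitted commands remain when the configuration file exits, Parallux submits them automatically and waits for them to finish before terminating.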
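The register, issue, and flush-on-exit semantics described in this section can be modelled with a small toy class. This is only a sketch of the pattern, not Parallux's implementation. ToyQueue and its close() method are hypothetical, and nothing is actually executed.

```python
class ToyQueue:
    """Toy model of deferred registration: schd() records, issue() submits."""

    def __init__(self):
        self.pending = []     # registered, not yet submitted
        self.submitted = []   # handed over for execution

    def schd(self, command):
        # Registration only: the command does not start here.
        self.pending.append(command)

    def issue(self):
        # Submit everything registered so far and clear the pending list.
        batch, self.pending = self.pending, []
        self.submitted.extend(batch)
        return batch

    def close(self):
        # Models the exit behaviour: leftovers are submitted automatically.
        return self.issue() if self.pending else []
```

A command registered after the last issue() call is therefore not lost; in this model it is picked up by close().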

Handle

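goal.run() and a Runner instance's run() return a single-command handle; goal.issue() returns a task-group handle. Once its command has been scheduled, a single-command handle records the assignment: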

handle = goal.run("echo runner=$PARALLUX_RUNNER")

result = handle.sync()
print(handle.runner_name)
print(handle.work_dir)
print(result.stdout_path)

Single-command handles and execution results carry the following information:

command
runner / runner_name
work_relpath
work_dir
command_path
stdout_path
stderr_path
cores
numa_node
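Because stdout_path is exposed on the result, a configuration file can read a command's output without knowing the internal directory numbering. The sketch below assumes stdout_path is a path readable from the machine running the configuration file, which may not hold for SSH Runners. read_stdout is a hypothetical helper that accepts any object with a stdout_path attribute.

```python
from pathlib import Path


def read_stdout(result, encoding="utf-8"):
    """Return the captured standard output referenced by result.stdout_path.

    Assumes the path is readable locally; for a remote Runner the file
    may live on the remote host instead.
    """
    return Path(result.stdout_path).read_text(encoding=encoding)
```

In a configuration file this would be called as read_stdout(handle.sync()).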

Runner status

Runners expose a runtime status query interface that a configuration file can use to make scheduling decisions:

status = local.status()

print(status.active_jobs)
print(status.available_jobs)
print(status.logical_core_count)
print(status.configured_cores)
print(status.available_cores)
print(status.available_core_count)

Common shortcuts:

local.active_jobs()
local.available_jobs()
local.logical_core_count()
local.configured_cores()
local.configured_core_count()
local.available_cores()
local.available_core_count()

goal.runner_status() can also be used to query the status of all Runners or of one specific Runner:

all_status = goal.runner_status()
one_status = goal.runner_status(local)

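logical_core_count is the total number of logical threads known to the Runner. configured_cores is the set of cores managed by Parallux and available for core binding. available_cores is the subset of configured_cores not currently leased to a Parallux task; it does not reflect CPU idleness at the operating-system level. When core_pool and numa_nodes are not declared, configured_cores and available_cores are empty lists.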
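The relationship between the two core lists is plain set subtraction. The sketch below illustrates it with a hypothetical helper, free_cores; it says nothing about how Parallux tracks leases internally.

```python
def free_cores(configured_cores, leased_cores):
    """Return the configured cores that are not currently leased.

    Order follows configured_cores. Cores busy with non-Parallux work
    still count as free here, matching the lease-based definition.
    """
    leased = set(leased_cores)
    return [core for core in configured_cores if core not in leased]
```

With no core pool configured, configured_cores is empty and the result is an empty list as well.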

Command options

goal.schd(), goal.run(), and a Runner instance's run() accept the same command options:

goal.schd(
    "echo hello",
    name="hello",
    threads=1,
    numa_node=0,
    cores=[0],
    env={"KEY": "value"},
    cwd=None,
    work_relpath="suite/case",
    check=True,
    timeout=60,
)

The following environment variables are injected when a command runs:

PARALLUX_RUNNER
PARALLUX_WORK_RELPATH
PARALLUX_WORK_DIR
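A command's environment therefore combines several sources: the Runner's env, the per-command env, and the injected variables. The precedence between them is not documented here. The sketch below assumes that command-level values override Runner-level ones and that the injected variables are applied last. build_env is a hypothetical helper.

```python
import os


def build_env(runner_name, work_relpath, work_dir,
              runner_env=None, command_env=None, base=None):
    """Merge environment sources for one command (assumed precedence)."""
    env = dict(os.environ if base is None else base)
    env.update(runner_env or {})    # Runner-level env, e.g. RUNNER_TAG
    env.update(command_env or {})   # per-command env (assumed to win)
    env.update({                    # injected last in this sketch
        "PARALLUX_RUNNER": runner_name,
        "PARALLUX_WORK_RELPATH": work_relpath,
        "PARALLUX_WORK_DIR": work_dir,
    })
    return env
```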

Output files

Each command's output files are written to that command's working directory:

<runner.workspace>/<work-relpath>/
  command.txt
  stdout.txt
  stderr.txt

When work_relpath is not specified explicitly, the output files are written to the default working directory:

<runner.workspace>/<auto-work-relpath>/
  command.txt
  stdout.txt
  stderr.txt
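To make the layout concrete, here is a minimal local sketch that produces the same three files with the standard library. It is not how Parallux runs commands. run_into_work_dir is a hypothetical helper, and the content written to command.txt (the shell-quoted command line) is an assumption.

```python
import shlex
import subprocess
from pathlib import Path


def run_into_work_dir(workspace, work_relpath, argv):
    """Run argv with its output captured under <workspace>/<work_relpath>/."""
    work_dir = Path(workspace).expanduser() / work_relpath
    work_dir.mkdir(parents=True, exist_ok=True)
    # Assumed content: the command line as it would be typed in a shell.
    (work_dir / "command.txt").write_text(shlex.join(argv) + "\n")
    with open(work_dir / "stdout.txt", "wb") as out, \
         open(work_dir / "stderr.txt", "wb") as err:
        proc = subprocess.run(argv, cwd=work_dir, stdout=out, stderr=err)
    return work_dir, proc.returncode
```

Keeping one directory per command means that two commands never write to the same stdout.txt, provided their work_relpath values differ.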

NUMA

A Runner can declare a core pool and NUMA nodes:

local = goal.local(
    name="local",
    core_pool=range(0, 8),
    numa_nodes={0: range(0, 4), 1: range(4, 8)},
)
goal.setRunner(local)

for index in range(2):
    goal.schd(
        "echo runner=$PARALLUX_RUNNER",
        runner=local,
        threads=1,
        numa_node=0,
        work_relpath=f"numa/{index}",
    )

goal.issue().sync()

When a Runner has a core pool configured, a command's threads setting requests cores from that pool, and the command is wrapped with numactl.
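A sketch of what leasing cores and wrapping with numactl might involve is shown below. lease_cores and numactl_prefix are hypothetical helpers. --physcpubind and --membind are standard numactl options, but which flags Parallux actually passes is not stated here, so treat the exact command line as an assumption.

```python
def lease_cores(free_cores, node_cores, threads):
    """Take `threads` cores of one NUMA node out of the free set.

    Returns the leased cores, or None if the node cannot satisfy the
    request right now. `free_cores` is a set and is modified in place.
    """
    candidates = [core for core in node_cores if core in free_cores]
    if len(candidates) < threads:
        return None
    chosen = candidates[:threads]
    free_cores.difference_update(chosen)
    return chosen


def numactl_prefix(cores, numa_node=None):
    """Build an argv prefix that pins a command to the given cores."""
    argv = ["numactl", "--physcpubind=" + ",".join(str(c) for c in cores)]
    if numa_node is not None:
        argv.append(f"--membind={numa_node}")  # keep memory on the same node
    return argv
```

The full command line would then be numactl_prefix(cores, node) followed by the original command's argv.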

Workload utilities

workloads() derives stable working paths from input paths and is intended for batch workload scenarios.

from parallux import goal, workloads


for workload in workloads(
    "inputs/spec/*/*/*.gz",
    levels=3,
    work_prefix="smt-perf-test",
    strip_suffix=True,
):
    goal.schd(
        f"./run-one {workload.input_path}",
        name=f"smt_perf_{workload.name}",
        work_relpath=workload.work_relpath,
    )

goal.issue().sync()

For example, with the input path a/b/c/d and levels=3, workload.relpath is b/c/d.
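The levels rule keeps the last N components of the input path. The following sketch reproduces the a/b/c/d example with the standard library. tail_relpath is a hypothetical helper, and the handling of strip_suffix (drop only the final suffix) and work_prefix (prepend as a leading directory) is assumed, since their exact behaviour is not described here.

```python
from pathlib import PurePosixPath


def tail_relpath(input_path, levels, strip_suffix=False, work_prefix=None):
    """Keep the last `levels` path components of input_path."""
    if levels < 1:
        raise ValueError("levels must be at least 1")
    path = PurePosixPath(input_path)
    if strip_suffix:
        path = path.with_suffix("")       # assumed: "z.gz" -> "z"
    rel = PurePosixPath(*path.parts[-levels:])
    if work_prefix:
        rel = PurePosixPath(work_prefix) / rel
    return str(rel)


print(tail_relpath("a/b/c/d", levels=3))  # b/c/d
```

Because the result depends only on the input path, rerunning the same batch yields the same working paths.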

File layout

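  • parallux/__init__.py: public API, controlled-runtime entry point, scheduler, local and SSH execution logic, and NUMA allocation logic
  • parallux/__main__.py: entry point for python3 -m parallux
  • parallux/_core.py: core configuration model and configuration loading logic
  • pyproject.toml: package metadata and the parallux command entry point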
