Python-controlled command submission runtime for local and SSH runners

Parallux

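Parallux is a command submission and scheduling runtime driven by Python configuration files. It supports local execution, remote execution over SSH, a global parallelism limit, per-Runner parallelism limits, and NUMA core binding.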

Overview

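A Parallux configuration file must be executed inside the controlled runtime. The file obtains the runtime object with from parallux import goal and uses ordinary Python syntax to describe Runners, environment variables, command registration, and the scheduling flow.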

parallux config.py -- key=value

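A configuration file can be edited like any other Python file, but it should not be executed directly with python3 config.py. Parallux binds goal before the file runs, which ensures that the configuration always executes in the controlled environment.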

from parallux import goal


local_a = goal.local(name="local-a", env={"RUNNER_TAG": "A"}, max_jobs=1)
local_b = goal.local(name="local-b", env={"RUNNER_TAG": "B"}, max_jobs=1)

goal.setRunner([local_a, local_b])
goal.setParallel(2)

for index in range(10):
    goal.schd(
        "echo runner=$PARALLUX_RUNNER tag=$RUNNER_TAG",
        name=f"run_{index}",
    )

goal.issue().sync()
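
The idea of binding a runtime object before a configuration file runs can be illustrated with a small, self-contained sketch. This is not Parallux's implementation: the module name fake_runtime, the StubGoal class, and the run_config helper are all hypothetical. The launcher publishes an object under an importable module name, executes the configuration source, and removes the binding afterwards, so the same import would fail outside the launcher.

```python
import sys
import types


class StubGoal:
    """Hypothetical stand-in for a runtime object; it only records commands."""

    def __init__(self):
        self.registered = []

    def schd(self, command, name=None):
        self.registered.append((name, command))


def run_config(source, module_name="fake_runtime"):
    """Execute config source with `goal` importable from `module_name`."""
    runtime = types.ModuleType(module_name)
    runtime.goal = StubGoal()
    sys.modules[module_name] = runtime  # bind before the config runs
    try:
        exec(compile(source, "<config>", "exec"), {"__name__": "__config__"})
    finally:
        del sys.modules[module_name]  # unbind so a later import fails
    return runtime.goal


CONFIG = '''
from fake_runtime import goal

for index in range(3):
    goal.schd("echo hello", name=f"run_{index}")
'''

goal = run_config(CONFIG)
print([name for name, _ in goal.registered])
```

Running the same source without the launcher raises ModuleNotFoundError for fake_runtime, which mirrors why a configuration file is meant to be started through the parallux command.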

Installation

Install in development mode:

python3 -m pip install -e .

After installation, the parallux command is available:

parallux --help

Without installing, the module entry point can be used from the source directory:

python3 -m parallux examples/basic.py

Running a configuration file

Run a configuration file from the command line:

parallux examples/basic.py

Pass arguments to the configuration file:

parallux examples/basic.py -- case=hello

Inside the configuration file, the arguments can be read through the following objects:

goal.argv
goal.args
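
The text above does not state the shape of goal.argv or goal.args. As a rough sketch, assuming goal.argv holds the raw strings that follow -- and goal.args is a mapping built from key=value pairs, the parsing step could look like this. The helper name parse_key_values is hypothetical and is not part of Parallux.

```python
def parse_key_values(argv):
    """Turn ["case=hello", "repeat=3"] into {"case": "hello", "repeat": "3"}."""
    parsed = {}
    for item in argv:
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        parsed[key] = value  # values stay strings; callers convert as needed
    return parsed


argv = ["case=hello", "repeat=3"]  # what might follow `--` on the command line
args = parse_key_values(argv)
print(args["case"], int(args["repeat"]))
```

Keeping values as strings leaves type conversion to the configuration file, which is the simplest choice when the runtime knows nothing about the expected parameters.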

Generate the execution plan without actually running any commands:

parallux examples/basic.py --dry-run

A configuration file can be declared executable with a shebang line:

#!/usr/bin/env parallux

After adding execute permission, it can be run directly:

./examples/basic.py

Runner

Local Runner:

local = goal.local(
    name="local",
    workspace="workspace/runners/local",
    max_jobs=2,
)
goal.setRunner(local)

SSH Runner:

server = goal.ssh(
    name="server-a",
    host="runner.example.com",
    user="user",
    workspace="workspace/runners/server-a",
    max_jobs=4,
)

goal.setRunner([local, server])

workspace is the output root directory on the Runner side. When workspace is not specified explicitly, the default is:

~/parallux

When work_relpath is specified, the working directory of each command is:

<runner.workspace>/<work-relpath>/

When work_relpath is not specified, Parallux uses a default path:

<runner.workspace>/<auto-work-relpath>/

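The default work path is generated from an incrementing number at submission time and is guaranteed to be unique within a single run. This number is an internal scheduling detail; a configuration file that needs to read output locations should use work_relpath, work_dir, stdout_path, and stderr_path from the handle or the execution result.

Global parallelism is controlled by goal.setParallel(n), and the parallelism of an individual Runner is controlled by max_jobs.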
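
How a global limit and a per-Runner limit interact can be sketched with plain counters. This is an illustrative model only, not Parallux's scheduler: the TwoLevelLimiter class and its method names are hypothetical. A command may start only when both the global count and the chosen Runner's count are below their limits.

```python
class TwoLevelLimiter:
    """Hypothetical model: a job needs a global slot and a runner slot."""

    def __init__(self, global_limit, runner_limits):
        self.global_limit = global_limit          # plays the role of setParallel(n)
        self.runner_limits = dict(runner_limits)  # plays the role of max_jobs
        self.active = {name: 0 for name in self.runner_limits}

    def total_active(self):
        return sum(self.active.values())

    def try_acquire(self, runner):
        if self.total_active() >= self.global_limit:
            return False  # global limit reached
        if self.active[runner] >= self.runner_limits[runner]:
            return False  # this runner is full
        self.active[runner] += 1
        return True

    def release(self, runner):
        if self.active[runner] == 0:
            raise RuntimeError(f"no active job on {runner}")
        self.active[runner] -= 1


limiter = TwoLevelLimiter(global_limit=2, runner_limits={"local-a": 1, "server-a": 4})
started = [
    limiter.try_acquire("local-a"),   # first job on local-a
    limiter.try_acquire("local-a"),   # local-a already at max_jobs=1
    limiter.try_acquire("server-a"),  # second job overall
    limiter.try_acquire("server-a"),  # global limit of 2 reached
]
print(started)
```

In this model the effective concurrency is the smaller of the global limit and the sum of the per-Runner limits, so raising max_jobs alone has no effect once the global limit is the bottleneck.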

Scheduling interface

goal.schd() registers a command. Registration does not start the command immediately:

goal.schd("make test", threads=1)

goal.issue() submits the commands registered so far and returns a task group handle:

handle = goal.issue()
handle.sync()

goal.run() submits a single non-blocking command immediately and returns a command handle:

handle = goal.run("echo immediate")
handle.sync()

goal.schd() and goal.run() always let the global scheduler choose the Runner. To pin a command to a specific Runner, call run() on the Runner instance directly:

local = goal.local(name="local", workspace="workspace/local")
goal.setRunner(local)

local.run("echo immediate on local").sync()

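If registered commands have still not been submitted when the configuration file exits, Parallux submits them automatically and waits for them to finish before terminating.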
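
The register, submit, and flush-at-exit behavior described in this section can be modeled with a small queue. The sketch below is a simplified illustration under stated assumptions, not the real scheduler: MiniQueue and its close() method are hypothetical names, and commands are only recorded rather than executed.

```python
class MiniQueue:
    """Hypothetical model of register-then-submit semantics."""

    def __init__(self):
        self.pending = []    # registered but not yet submitted
        self.submitted = []  # handed over for execution

    def schd(self, command):
        # Registration only records the command; nothing starts here.
        self.pending.append(command)

    def issue(self):
        # Submit everything registered so far and return that batch.
        batch, self.pending = self.pending, []
        self.submitted.extend(batch)
        return batch

    def close(self):
        # Model of the exit-time behavior: flush whatever is still pending.
        return self.issue() if self.pending else []


queue = MiniQueue()
queue.schd("make test")
queue.schd("make lint")
first = queue.issue()

queue.schd("make docs")  # registered after the last issue()
flushed = queue.close()
print(first, flushed, queue.pending)
```

Because each issue() call drains the pending list, a command belongs to exactly one batch, and the exit-time flush only ever sees commands registered after the last explicit issue().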

Handle

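goal.run() and the run() method of a Runner instance return a single-command handle; goal.issue() returns a task group handle. Once the command has been scheduled, a single-command handle records the assignment: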

handle = goal.run("echo runner=$PARALLUX_RUNNER")

result = handle.sync()
print(handle.runner_name)
print(handle.work_dir)
print(result.stdout_path)

Single-command handles and execution results carry the following information:

command
runner / runner_name
work_relpath
work_dir
command_path
stdout_path
stderr_path
cores
numa_node

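Runner status

A Runner provides an interface for querying its runtime status, which can be used for scheduling decisions inside the configuration file: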

status = local.status()

print(status.active_jobs)
print(status.available_jobs)
print(status.logical_core_count)
print(status.configured_cores)
print(status.available_cores)
print(status.available_core_count)

Commonly used shortcuts:

local.active_jobs()
local.available_jobs()
local.logical_core_count()
local.configured_cores()
local.configured_core_count()
local.available_cores()
local.available_core_count()

goal.runner_status() can also be used to query the status of all Runners or of one specific Runner:

all_status = goal.runner_status()
one_status = goal.runner_status(local)

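logical_core_count is the total number of logical threads known to the Runner. configured_cores is the set of cores managed by Parallux and available for core binding. available_cores is the subset of configured_cores not currently leased to a Parallux task; it does not reflect CPU idleness at the operating-system level. When neither core_pool nor numa_nodes is declared, configured_cores and available_cores are empty lists.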
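
The distinction between configured and available cores is a matter of lease bookkeeping rather than CPU load. A minimal sketch of that bookkeeping follows; the CoreLeases class and its methods are hypothetical and do not claim to match Parallux's internals.

```python
class CoreLeases:
    """Hypothetical lease table: available = configured minus leased."""

    def __init__(self, configured_cores=()):
        self.configured = list(configured_cores)
        self.leased = set()

    def available(self):
        return [core for core in self.configured if core not in self.leased]

    def lease(self, count):
        free = self.available()
        if count > len(free):
            return None  # not enough unleased cores right now
        chosen = free[:count]
        self.leased.update(chosen)
        return chosen

    def release(self, cores):
        self.leased.difference_update(cores)


pool = CoreLeases(range(0, 4))
first = pool.lease(2)
print(first, pool.available())

too_many = pool.lease(3)  # only two cores remain unleased
pool.release(first)
print(too_many, pool.available())

unmanaged = CoreLeases()  # nothing declared, so nothing to lease
print(unmanaged.available())
```

A core that is busy with unrelated processes still counts as available in this model, which is the sense in which the value says nothing about operating-system CPU idleness.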

Command options

goal.schd(), goal.run(), and the run() method of a Runner instance accept the same command options:

goal.schd(
    "echo hello",
    name="hello",
    threads=1,
    numa_node=0,
    cores=[0],
    env={"KEY": "value"},
    cwd=None,
    work_relpath="suite/case",
    check=True,
    timeout=60,
)

The following environment variables are injected when a command runs:

PARALLUX_RUNNER
PARALLUX_WORK_RELPATH
PARALLUX_WORK_DIR
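
One plausible way to assemble a command's environment is to layer the Runner's env and the command's env over a base environment and then add the injected variables. The sketch below assumes that precedence order (command env over Runner env, injected variables last) and that PARALLUX_WORK_DIR is the workspace joined with the relative work path; neither detail is confirmed by the text above, and build_env is a hypothetical helper.

```python
import os
import posixpath


def build_env(runner_name, workspace, work_relpath,
              runner_env=None, command_env=None, base_env=None):
    """Return the environment a command might see (assumed precedence)."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(runner_env or {})   # Runner-level variables
    env.update(command_env or {})  # per-command variables win over Runner ones
    # Injected last so user-supplied values cannot shadow them (assumption).
    env["PARALLUX_RUNNER"] = runner_name
    env["PARALLUX_WORK_RELPATH"] = work_relpath
    env["PARALLUX_WORK_DIR"] = posixpath.join(workspace, work_relpath)
    return env


env = build_env(
    "local-a",
    "workspace/runners/local",
    "suite/case",
    runner_env={"RUNNER_TAG": "A"},
    command_env={"KEY": "value"},
    base_env={},  # empty base keeps the example deterministic
)
for key in sorted(env):
    print(f"{key}={env[key]}")
```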

Output files

Each command's output files are written to that command's working directory:

<runner.workspace>/<work-relpath>/
  command.txt
  stdout.txt
  stderr.txt

When work_relpath is not specified explicitly, the output files are written to the default working directory:

<runner.workspace>/<auto-work-relpath>/
  command.txt
  stdout.txt
  stderr.txt
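
The layout above can be expressed as a small path helper. This is a sketch for orientation only: output_paths is a hypothetical function, and the mapping of command_path, stdout_path, and stderr_path to command.txt, stdout.txt, and stderr.txt is inferred from the field names and file names rather than stated by the documentation.

```python
from pathlib import PurePosixPath


def output_paths(workspace, work_relpath):
    """Compute where a command's files would land under the layout above."""
    work_dir = PurePosixPath(workspace) / work_relpath
    return {
        "work_dir": str(work_dir),
        "command_path": str(work_dir / "command.txt"),
        "stdout_path": str(work_dir / "stdout.txt"),
        "stderr_path": str(work_dir / "stderr.txt"),
    }


paths = output_paths("workspace/runners/local", "suite/case")
for key, value in paths.items():
    print(f"{key}: {value}")
```

In a real configuration file these locations should be read from the handle or the execution result instead of being recomputed, since the default relative path is an internal detail.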

NUMA

A Runner can declare a core pool and NUMA nodes:

local = goal.local(
    name="local",
    core_pool=range(0, 8),
    numa_nodes={0: range(0, 4), 1: range(4, 8)},
)
goal.setRunner(local)

for index in range(2):
    goal.schd(
        "echo runner=$PARALLUX_RUNNER",
        threads=1,
        numa_node=0,
        work_relpath=f"numa/{index}",
    )

goal.issue().sync()

When a Runner is configured with a core pool, the threads option leases cores from the core pool, and the command is wrapped with numactl.
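
To make the wrapping step concrete, the sketch below builds a numactl argument list from a set of leased cores and an optional NUMA node. The exact flags Parallux passes are not documented here, so this is an assumption: it uses numactl's --physcpubind and --membind options and runs the command through sh -c. The numactl_argv helper is hypothetical.

```python
import shlex


def numactl_argv(command, cores, numa_node=None):
    """Build an argv that pins `command` to `cores` (assumed flag choice)."""
    cpu_list = ",".join(str(core) for core in cores)
    argv = ["numactl", f"--physcpubind={cpu_list}"]
    if numa_node is not None:
        # Also allocate memory from the same node (assumption, not confirmed).
        argv.append(f"--membind={numa_node}")
    argv += ["sh", "-c", command]
    return argv


argv = numactl_argv("echo runner=$PARALLUX_RUNNER", cores=[0, 1], numa_node=0)
print(shlex.join(argv))  # shell-quoted form, convenient for logs or dry runs
```

Passing the command through sh -c keeps shell features such as variable expansion working inside the pinned process.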

Workload utility

workloads() derives stable work paths from input paths and is intended for batch workload scenarios.

from parallux import goal, workloads


for workload in workloads(
    "inputs/spec/*/*/*.gz",
    levels=3,
    work_prefix="smt-perf-test",
    strip_suffix=True,
):
    goal.schd(
        f"./run-one {workload.input_path}",
        name=f"smt_perf_{workload.name}",
        work_relpath=workload.work_relpath,
    )

goal.issue().sync()

For example, with the input path a/b/c/d and levels=3, workload.relpath is b/c/d.
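
The rule in this example, keeping the last levels components of the input path, can be sketched in a few lines. tail_relpath is a hypothetical helper, not the workloads() implementation, and its handling of strip_suffix (dropping only the final extension) is an assumption.

```python
from pathlib import PurePosixPath


def tail_relpath(input_path, levels, strip_suffix=False):
    """Keep the last `levels` path components, optionally without extension."""
    if levels < 1:
        raise ValueError("levels must be at least 1")
    path = PurePosixPath(input_path)
    if strip_suffix:
        path = path.with_suffix("")  # removes only the last suffix, e.g. ".gz"
    return "/".join(path.parts[-levels:])


print(tail_relpath("a/b/c/d", levels=3))
print(tail_relpath("inputs/spec/int/gcc/ref.gz", levels=3, strip_suffix=True))
```

Because the result depends only on the input path, rerunning the same batch maps each input to the same work directory, which is what makes the paths stable across runs.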

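File structure

  • parallux/__init__.py: public API, controlled runtime entry point, scheduler, local/SSH execution logic, and NUMA allocation logic
  • parallux/__main__.py: entry point for python3 -m parallux
  • parallux/_core.py: core configuration model and configuration loading logic
  • pyproject.toml: package metadata and the parallux command entry point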
