
FlowHive User Server Agent


User Server Agent (compute server side)

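Responsibilities

Registers with the Control Server and maintains a heartbeat; listens for downstream commands, schedules tasks, and executes scripts; reports GPU metrics, task logs, and results in real time.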

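Directory structure

  • core/: Agent core logic (task model, asynchronous executor, task manager, GPU monitor)
  • cli/: CLI tools (main entry point flowhive.py)
  • config/: reserved directory for configuration files
  • scripts/: helper scripts (for example, demo_task_manager.py for local testing)
  • tests/: Pytest test cases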

Local development

Create an environment and install the dependencies:

cd agent
py -m venv .venv
.\.venv\Scripts\activate
pip install -r requirements.txt

Or install the package directly (if it has been published to PyPI):

pip install flowhive

Manually testing TaskManager

To try out the task execution path quickly, run the example script directly:

cd agent
python scripts/demo_task_manager.py

By default the script submits a simple Python command. You can override it by passing arguments:

python scripts/demo_task_manager.py "python -c \"print('hi')\"" --timeout 10

After the run finishes, stdout/stderr can be found in the agent_logs/ directory.
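As a rough illustration of what such a demo run involves, the sketch below executes a command string with a timeout and writes its output streams to files in a log directory. It is not the code of demo_task_manager.py; the function name run_with_logs and the file names stdout.log and stderr.log are assumptions made for this example.

```python
import subprocess
from pathlib import Path


def run_with_logs(command, log_dir, timeout=10.0):
    """Run `command` through the shell, saving stdout/stderr under `log_dir`.

    Returns the exit code, or None if the command exceeded `timeout` seconds.
    The file names below are hypothetical, not FlowHive's actual layout.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    with open(log_dir / "stdout.log", "w") as out, \
            open(log_dir / "stderr.log", "w") as err:
        try:
            proc = subprocess.run(
                command, shell=True, stdout=out, stderr=err, timeout=timeout
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before re-raising the timeout.
            return None
    return proc.returncode
```

Calling run_with_logs("python -c \"print('hi')\"", "agent_logs", timeout=10) mirrors the command line shown above, with the output landing in agent_logs/stdout.log.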

Connecting to the Control Server

The Agent now communicates with the Control Server over WebSocket. Install the dependency (pip install websockets), then run:

cd agent
# 1. Set the account information
python cli/flowhive.py config user.username "test"
python cli/flowhive.py config user.email "user@example.com"
python cli/flowhive.py config user.password "your-password"
python cli/flowhive.py config control_base_url "http://127.0.0.1:8001"
python cli/flowhive.py config label "your-label"

# 2. Verify the configuration
python cli/flowhive.py config

# 3. Start the Agent
python cli/flowhive.py run

The script converts the scheme of control_base_url automatically:

  • http:// → ws:// (plaintext WebSocket)
  • https:// → wss:// (encrypted WebSocket, recommended for production)
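The scheme mapping can be sketched with the standard library as follows. This is a minimal illustration, not the Agent's actual implementation; the function name to_websocket_url is an assumption, and any endpoint path the Agent appends to the base URL is not shown because the README does not state it.

```python
from urllib.parse import urlsplit, urlunsplit

# Mapping described above: plaintext HTTP -> ws, HTTPS -> wss.
_SCHEME_MAP = {"http": "ws", "https": "wss"}


def to_websocket_url(control_base_url: str) -> str:
    """Return `control_base_url` with its scheme swapped for the WebSocket one."""
    parts = urlsplit(control_base_url)
    try:
        ws_scheme = _SCHEME_MAP[parts.scheme]
    except KeyError:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}") from None
    # Host, port, path, and query are carried over unchanged.
    return urlunsplit(parts._replace(scheme=ws_scheme))
```

For example, to_websocket_url("http://127.0.0.1:8001") yields "ws://127.0.0.1:8001", while any scheme other than http or https raises ValueError.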

Production configuration example

# Use HTTPS/WSS (recommended)
python cli/flowhive.py config control_base_url "https://your-control-server.com"

The Agent then connects to the Control Server over wss:// automatically, keeping the communication encrypted.

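Key modules

  • GPU monitor & Service: GPUMonitor collects GPU memory, utilization, and per-process memory usage via NVML. GPUService is a singleton facade reused by the TaskManager and the control plane, so the task manager only injects the monitor reference and does not query metrics itself.
  • FlowHive Scheduler: reuses the existing capabilities for GPU memory thresholds, priorities, maximum concurrency, and task retries.
  • Task executor: subprocess / torchrun / bash, with support for environment variable injection and container launch.
  • OOM self-healing & retry: following the task state machine, failed tasks are re-queued or queued at a lower priority.
  • Log/metric reporting: a multiplexed channel streams stdout/stderr to the Control Server.
  • Heartbeat service: reports GPU topology, processes, and the Agent version every 1 s to 5 s.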
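To make the scheduling ideas in this list concrete (memory threshold, priority, maximum concurrency, retry), here is a small self-contained sketch. It is not the FlowHive Scheduler: the class names Task and Scheduler, the fields, and the rule that a larger priority value runs first are all assumptions chosen for illustration.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass
class Task:
    name: str
    priority: int          # assumption: larger value = scheduled earlier
    required_mb: int       # GPU memory the task is expected to need
    retries_left: int = 1  # how many times a failure may be re-queued


class Scheduler:
    def __init__(self, max_concurrent: int):
        self.max_concurrent = max_concurrent
        self.running = []
        self._heap = []
        self._seq = count()  # tie-breaker: FIFO order within one priority

    def submit(self, task):
        heapq.heappush(self._heap, (-task.priority, next(self._seq), task))

    def dispatch(self, free_mb):
        """Start queued tasks that fit into `free_mb`, highest priority first."""
        started, deferred = [], []
        while self._heap and len(self.running) < self.max_concurrent:
            entry = heapq.heappop(self._heap)
            task = entry[2]
            if task.required_mb <= free_mb:
                free_mb -= task.required_mb
                self.running.append(task)
                started.append(task)
            else:
                deferred.append(entry)  # does not fit right now; stays queued
        for entry in deferred:
            heapq.heappush(self._heap, entry)
        return started

    def on_failure(self, task):
        """Re-queue a failed task if it has retries left; return True if so."""
        self.running.remove(task)
        if task.retries_left > 0:
            task.retries_left -= 1
            self.submit(task)
            return True
        return False
```

In a real agent the free_mb argument would come from the GPU monitor rather than being passed by hand, and an OOM failure might lower the task's priority before it is re-queued; both are left out here to keep the sketch short.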

License

This Agent component is licensed under the MIT License. See LICENSE for details.

Note: This is only the Agent component. The Control Server and Web Client are proprietary software and not covered by this license.
