FlowHive User Server Agent

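User Server Agent (compute server side)

Responsibilities

Registers with the Control Server and maintains a heartbeat; listens for downstream commands, schedules tasks, and executes scripts; reports GPU metrics, task logs, and results in real time.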

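Directory structure

  • core/: Agent core logic (task model, async executor, task manager, GPU monitor)
  • cli/: CLI tools (main entry point flowhive.py)
  • config/: reserved directory for configuration files
  • scripts/: helper scripts (e.g. demo_task_manager.py for local testing)
  • tests/: Pytest test cases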

Local development

Create a virtual environment and install the dependencies:

cd agent
py -m venv .venv
.\.venv\Scripts\activate
pip install -r requirements.txt

Or install the package directly (if it has been published to PyPI):

pip install flowhive-agent -i https://pypi.org/simple/

Testing TaskManager manually

To try out the task execution path quickly, run the example script directly:

cd agent
python scripts/demo_task_manager.py

By default the script submits a simple Python command. You can override it by passing arguments:

python scripts/demo_task_manager.py "python -c \"print('hi')\"" --timeout 10

When the run finishes, stdout/stderr can be found in the agent_logs/ directory.
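
To illustrate the kind of flow the demo exercises (run a command under a timeout and capture stdout/stderr in a log directory), here is a minimal sketch using only the standard library. It is not the TaskManager implementation: the function name run_with_logs, the log file names, and the -1 return value on timeout are all assumptions made for this example.

```python
import subprocess
import sys
from pathlib import Path


def run_with_logs(command: list[str], log_dir: Path, timeout: float = 10.0) -> int:
    """Run command, write its stdout/stderr into log_dir, return the exit code.

    Returns -1 if the command exceeds the timeout (an assumption of this sketch).
    """
    log_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical file names; the real agent may name its logs differently.
    out_path = log_dir / "task.stdout.log"
    err_path = log_dir / "task.stderr.log"
    with out_path.open("wb") as out, err_path.open("wb") as err:
        try:
            # subprocess.run kills the child and raises TimeoutExpired on timeout.
            proc = subprocess.run(command, stdout=out, stderr=err, timeout=timeout)
        except subprocess.TimeoutExpired:
            return -1
    return proc.returncode
```

For example, run_with_logs([sys.executable, "-c", "print('hi')"], Path("agent_logs")) would leave the command's output in agent_logs/task.stdout.log under these assumed names.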

Connecting to the Control Server

The Agent now communicates with the Control Server over WebSocket. Install the dependency (pip install websockets), then run:

cd agent
# 1. Set the account information
python flowhive_agent/cli/flowhive.py config user.username "test"
python flowhive_agent/cli/flowhive.py config user.email "user@example.com"
python flowhive_agent/cli/flowhive.py config user.password "your-password"
python flowhive_agent/cli/flowhive.py config control_base_url "http://127.0.0.1:8001"
python flowhive_agent/cli/flowhive.py config label "your-label"

# 2. Verify the configuration
python flowhive_agent/cli/flowhive.py config

# 3. Start the Agent
python flowhive_agent/cli/flowhive.py run

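The script automatically maps the scheme of control_base_url to the matching WebSocket scheme:

  • http:// → ws:// (plaintext WebSocket)
  • https:// → wss:// (encrypted WebSocket, recommended for production)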
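
The scheme mapping can be sketched as follows. This is an illustration, not the Agent's actual code: the function name to_websocket_url is hypothetical, and rejecting any scheme other than http, https, ws, or wss is an assumption of this example.

```python
from urllib.parse import urlsplit, urlunsplit

# Mapping from HTTP(S) schemes to their WebSocket equivalents.
_SCHEME_MAP = {"http": "ws", "https": "wss"}


def to_websocket_url(control_base_url: str) -> str:
    """Return control_base_url with http/https swapped for ws/wss."""
    parts = urlsplit(control_base_url)
    scheme = parts.scheme.lower()
    if scheme in ("ws", "wss"):
        return control_base_url  # already a WebSocket URL, leave untouched
    if scheme not in _SCHEME_MAP:
        # Assumption of this sketch: unknown schemes are rejected outright.
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    return urlunsplit(parts._replace(scheme=_SCHEME_MAP[scheme]))
```

Host, port, and path are preserved; only the scheme changes, so the development URL used above would map to ws://127.0.0.1:8001.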

Production configuration example

# Use HTTPS/WSS (recommended)
python flowhive_agent/cli/flowhive.py config control_base_url "https://your-control-server.com"

The Agent then connects to the Control Server over wss://, so the traffic is encrypted.

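Key modules

  • GPU monitor & service: GPUMonitor collects GPU memory, utilization, and per-process memory usage through NVML. GPUService is a singleton facade reused by TaskManager and the control plane, so the task manager only injects the monitor reference and does not query metrics itself.
  • FlowHive Scheduler: reuses the existing GPU memory threshold, priority, maximum concurrency, and task retry capabilities.
  • Task executor: subprocess / torchrun / bash, with support for environment variable injection and container launch.
  • OOM self-healing & retry: following the task state machine, failed tasks are re-enqueued or queued at a lower priority.
  • Log/metric reporting: a multiplexed channel streams stdout/stderr to the Control Server.
  • Heartbeat service: reports GPU topology, processes, and the Agent version every 1 s to 5 s.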
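
To make the scheduling ideas in this list concrete (GPU memory threshold, priority, maximum concurrency, and re-queuing failed tasks at a lower priority), here is a small self-contained sketch. It is not the FlowHive Scheduler: every class, method, and field name below is hypothetical, and the policy of demoting a failed task by one priority level is an assumption chosen for illustration.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count
from typing import Optional


@dataclass(order=True)
class Task:
    priority: int                                  # lower value is scheduled first
    seq: int                                       # tie-breaker: FIFO within a priority
    name: str = field(compare=False)
    required_mb: int = field(compare=False)        # GPU memory the task needs
    retries_left: int = field(compare=False, default=1)


class Scheduler:
    """Hypothetical scheduler sketch, not the real FlowHive implementation."""

    def __init__(self, max_concurrent: int) -> None:
        self.max_concurrent = max_concurrent
        self.running: list[Task] = []
        self._queue: list[Task] = []
        self._seq = count()

    def submit(self, name: str, required_mb: int, priority: int = 0, retries: int = 1) -> None:
        heapq.heappush(
            self._queue, Task(priority, next(self._seq), name, required_mb, retries)
        )

    def next_task(self, free_mb: int) -> Optional[Task]:
        """Pop the best-priority task that fits in free_mb, if a slot is open."""
        if len(self.running) >= self.max_concurrent:
            return None
        skipped, chosen = [], None
        while self._queue:
            task = heapq.heappop(self._queue)
            if task.required_mb <= free_mb:
                chosen = task
                break
            skipped.append(task)                   # does not fit right now
        for task in skipped:                       # put unfit tasks back in the queue
            heapq.heappush(self._queue, task)
        if chosen is not None:
            self.running.append(chosen)
        return chosen

    def on_failure(self, task: Task) -> bool:
        """Re-queue a failed task one priority level lower; False when out of retries."""
        self.running.remove(task)
        if task.retries_left <= 0:
            return False
        heapq.heappush(
            self._queue,
            Task(task.priority + 1, next(self._seq), task.name,
                 task.required_mb, task.retries_left - 1),
        )
        return True
```

In this sketch the caller supplies free_mb, which mirrors the separation described above: the scheduler receives a memory reading rather than querying the GPU itself.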

License

This Agent component is licensed under the MIT License. See LICENSE for details.

Note: This is only the Agent component. The Control Server and Web Client are proprietary software and not covered by this license.
