Standardized interface and data structures for infrastructure monitoring plugins in the cjm-plugin-system ecosystem.

cjm-infra-plugin-system

Install

pip install cjm_infra_plugin_system

Project Structure

nbs/
├── core.ipynb             # Core data structures for infrastructure monitoring
└── plugin_interface.ipynb # Domain-specific plugin interface for system monitoring

Total: 2 notebooks

Module Dependencies

graph LR
    core[core<br/>core]
    plugin_interface[plugin_interface<br/>plugin_interface]

    plugin_interface --> core

1 cross-module dependency detected

CLI Reference

No CLI commands found in this project.

Module Overview

Detailed documentation for each module in the project:

core (core.ipynb)

Core data structures for infrastructure monitoring

Import

from cjm_infra_plugin_system.core import (
    SystemStats,
    ProcessStats
)

Classes

@dataclass
class SystemStats:
    "Standardized snapshot of system resources."
    
    cpu_percent: float = 0.0  # Overall CPU utilization percentage
    memory_used_mb: float = 0.0  # Currently used system RAM in MB
    memory_total_mb: float = 0.0  # Total system RAM in MB
    memory_available_mb: float = 0.0  # Available system RAM in MB
    gpu_type: str = 'None'  # GPU vendor: 'NVIDIA', 'AMD', 'Intel', 'None'
    gpu_free_memory_mb: float = 0.0  # Free GPU memory in MB (sum of all visible devices)
    gpu_total_memory_mb: float = 0.0  # Total GPU memory in MB
    gpu_used_memory_mb: float = 0.0  # Used GPU memory in MB
    gpu_load_percent: float = 0.0  # GPU compute utilization percentage
    details: Dict[str, Any] = field(...)  # Per-device stats, temperatures, etc.
    
    def to_dict(self) -> Dict[str, Any]:  # Dictionary representation for JSON serialization
        "Convert to dictionary for JSON serialization."

@dataclass
class ProcessStats:
    """
    Per-process resource usage snapshot reported by `MonitorPlugin.list_processes`.
    
    CR-3 introduced this as the typed replacement for `SystemStats.details['processes']`.
    Monitor plugins that can enumerate per-process GPU usage (e.g. NVIDIA via `nvitop`)
    return a list of these; monitors without per-process visibility return `[]` from
    the default `MonitorPlugin.list_processes()` implementation.
    """
    
    pid: int = 0  # OS process ID
    gpu_index: int = -1  # GPU index (0-based); -1 if not GPU-bound or unknown
    gpu_memory_mb: float = 0.0  # GPU memory usage attributable to this process, in MB
    command: str = ''  # Process command line (or short name)
    
    def to_dict(self) -> Dict[str, Any]:  # Dictionary representation for JSON serialization
        "Convert to dictionary for JSON serialization."
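
As a rough illustration of how these snapshots might be built and serialized, the sketch below uses stand-in dataclasses that mirror a subset of the documented fields instead of importing the package. Two things here are assumptions: the `default_factory=dict` default for `details` (the listing above elides the real `field(...)` arguments) and the `asdict`-based `to_dict()` body (the listing shows only the signature).

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class SystemStats:
    """Stand-in mirroring a subset of the documented fields (not the real class)."""
    cpu_percent: float = 0.0
    memory_used_mb: float = 0.0
    memory_total_mb: float = 0.0
    gpu_type: str = 'None'
    details: Dict[str, Any] = field(default_factory=dict)  # assumed default

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)  # assumed implementation


@dataclass
class ProcessStats:
    """Stand-in with the four documented fields and their documented defaults."""
    pid: int = 0
    gpu_index: int = -1
    gpu_memory_mb: float = 0.0
    command: str = ''

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)  # assumed implementation


# Build one system snapshot and one per-process record with made-up values.
stats = SystemStats(cpu_percent=12.5, memory_used_mb=2048.0, memory_total_mb=8192.0)
proc = ProcessStats(pid=4321, gpu_index=0, gpu_memory_mb=512.0, command="python train.py")

# Both objects reduce to plain dicts, so they can go straight through json.
payload = json.dumps({"system": stats.to_dict(), "processes": [proc.to_dict()]})
restored = json.loads(payload)
```

With the real package installed, the same calls should work against `cjm_infra_plugin_system.core` directly, provided `to_dict()` returns JSON-compatible values as its docstring suggests.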

plugin_interface (plugin_interface.ipynb)

Domain-specific plugin interface for system monitoring

Import

from cjm_infra_plugin_system.plugin_interface import (
    MonitorPlugin
)

Classes

class MonitorPlugin(PluginInterface):
    """
    Abstract base class for hardware monitoring plugins.
    
    CR-3 shifted MonitorPlugin from dispatcher-style `execute(command=...)` to a
    typed surface: subclasses override `get_system_status()` returning `SystemStats`
    and optionally `list_processes()` returning `List[ProcessStats]`. The legacy
    `execute(command=...)` dispatcher is kept as a backward-compat shim so monitors
    that predate CR-3 keep working until the SG-47 migration cascade.
    
    Subclasses MUST override at least one of `execute()` or `get_system_status()` —
    the `__init_subclass__` guard enforces this at class-definition time to prevent
    the recursion trap where both defaults call each other.
    """
    
    def execute(
            self,
            command: str = "get_system_status",  # REMOVE-AFTER-OVERHAUL: rename to `action=` via SG-47 + SG-42 cascade
            **kwargs: Any,
        ) -> Any:
        """Backward-compat dispatcher (REMOVE-AFTER-OVERHAUL).

        Bridges pre-CR-3 callers (substrate's `_get_global_stats` + job-monitor's
        `services/monitor.py`) to typed methods. New MonitorPlugin subclasses
        override `get_system_status()` directly and inherit this dispatcher; old
        subclasses override this dispatcher with their own dict-returning logic
        and rely on the default `get_system_status()` to wrap the result.

        After SG-47 cascade migrates consumers off the dispatcher and SG-48 sweep
        runs, this default body drops; `execute()` either becomes abstract again
        (with `command=` renamed to `action=` per SG-42) or is removed from
        MonitorPlugin entirely if all monitors override typed methods.
        """
    
    def get_system_status(self) -> SystemStats:  # Current system telemetry
        """Gather current system statistics as a typed `SystemStats` snapshot.

        The default body (REMOVE-AFTER-OVERHAUL) delegates to
        `self.execute("get_system_status")` and wraps the returned dict so that
        monitor plugins predating CR-3 keep working. New monitors override this
        method directly; the `__init_subclass__` guard ensures at least one of
        `execute()` or `get_system_status()` is overridden by every concrete
        subclass.

        Unknown fields in the dispatcher's return dict are filtered out (rather
        than raising `TypeError`) so monitors that emit extra debug fields don't
        crash the wrapping.
        """
        raw = self.execute("get_system_status")
        if isinstance(raw, SystemStats)
    
    def list_processes(self) -> List[ProcessStats]:  # Per-process resource usage
        """List per-process resource usage. Default returns `[]`.

        Monitors with per-process GPU visibility (NVIDIA via `nvitop`/`pynvml`)
        override this. CPU-only monitors and AMD pre-ROCm inherit the empty
        default since they cannot enumerate per-GPU-process attribution.

        `list_devices()` is deliberately omitted per audit's Q-CR3-1=(c) YAGNI
        disposition; add when multi-GPU support surfaces a concrete consumer
        need.
        """
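
The override rule and the dict-wrapping behaviour described in the docstrings above can be sketched as follows. This is a minimal stand-in, not the package's code: the real `MonitorPlugin` derives from `PluginInterface`, and its guard, dispatcher, and wrapping bodies are not shown in this document. The identity-based override check, the `ValueError` for unknown commands, the `return raw` branch, and the `dataclasses.fields` filter are all assumptions chosen to match the documented behaviour. `LegacyMonitor`, `TypedMonitor`, and `BrokenMonitor` are hypothetical names.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List


@dataclass
class SystemStats:
    """Stand-in with a subset of the documented fields."""
    cpu_percent: float = 0.0
    memory_used_mb: float = 0.0
    details: Dict[str, Any] = field(default_factory=dict)  # assumed default


class MonitorPlugin:
    """Stand-in base class (the real one derives from PluginInterface)."""

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # Assumed guard: reject subclasses that inherit both defaults,
        # because the two default bodies would call each other forever.
        overrides_execute = cls.execute is not MonitorPlugin.execute
        overrides_status = cls.get_system_status is not MonitorPlugin.get_system_status
        if not (overrides_execute or overrides_status):
            raise TypeError(
                f"{cls.__name__} must override execute() or get_system_status()"
            )

    def execute(self, command: str = "get_system_status", **kwargs: Any) -> Any:
        # Assumed dispatcher body: route the legacy command to the typed method.
        if command == "get_system_status":
            return self.get_system_status()
        raise ValueError(f"Unknown command: {command}")

    def get_system_status(self) -> SystemStats:
        raw = self.execute("get_system_status")
        if isinstance(raw, SystemStats):
            return raw
        # Drop keys SystemStats does not declare instead of raising TypeError.
        known = {f.name for f in fields(SystemStats)}
        return SystemStats(**{k: v for k, v in raw.items() if k in known})

    def list_processes(self) -> List[Any]:
        return []  # documented default: no per-process visibility


class LegacyMonitor(MonitorPlugin):
    """Pre-CR-3 style: overrides the dispatcher and returns a dict."""
    def execute(self, command: str = "get_system_status", **kwargs: Any) -> Any:
        return {"cpu_percent": 40.0, "debug_note": "ignored"}


class TypedMonitor(MonitorPlugin):
    """CR-3 style: overrides the typed method and inherits the dispatcher."""
    def get_system_status(self) -> SystemStats:
        return SystemStats(cpu_percent=7.5)


legacy_stats = LegacyMonitor().get_system_status()       # dict wrapped, extra key dropped
typed_result = TypedMonitor().execute("get_system_status")  # dispatcher reaches typed method

try:
    class BrokenMonitor(MonitorPlugin):  # overrides neither method
        pass
    guard_fired = False
except TypeError:
    guard_fired = True
```

Checking at class-definition time means a monitor that overrides neither method fails on import rather than recursing on the first status call, which is the trap the class docstring describes.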

Download files

Source Distribution

cjm_infra_plugin_system-0.0.15.tar.gz (11.2 kB)

Built Distribution

cjm_infra_plugin_system-0.0.15-py3-none-any.whl (13.2 kB)

File details

Details for the file cjm_infra_plugin_system-0.0.15.tar.gz.

File metadata

  • Size: 11.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for cjm_infra_plugin_system-0.0.15.tar.gz
Algorithm Hash digest
SHA256 e0aee5775d5d98625457685b3de86d8bdd4706177fe5713e9f3ad0c9220a6834
MD5 f08e3af61a5a18a6d8c1e9449b2366e4
BLAKE2b-256 0b7f7acf73b443003c0d4bd1774e6b9c4b2f999b7c4b0ac1f567698194bd164b

File details

Details for the file cjm_infra_plugin_system-0.0.15-py3-none-any.whl.

File hashes

Hashes for cjm_infra_plugin_system-0.0.15-py3-none-any.whl
Algorithm Hash digest
SHA256 7fac5a24b85933b2776ed4bf87955d13a49ddb18de8396998ff5e82de3304342
MD5 1717eab32b381cdf7da2e0138af67ca1
BLAKE2b-256 ac183c7fd24a5d4995442ab23c8e61c83a4fad10931586e231de2a76d1db481c
