rocm-smi-exporter

Export rocm-smi metrics as Prometheus metrics.
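
As a rough illustration of what exporting in the Prometheus format involves, the sketch below renders a few readings in the Prometheus text exposition format using only the standard library. It is not the exporter's actual code: the function name `render_prometheus`, the metric names, and the `gpu` label are all hypothetical.

```python
def render_prometheus(samples):
    """Render (name, labels, value) samples as Prometheus text exposition lines.

    Assumes label values contain no quotes, backslashes, or newlines,
    so no escaping is performed.
    """
    lines = []
    for name, labels, value in samples:
        if labels:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical readings; the metric and label names are illustrative only.
samples = [
    ("rocm_smi_temperature_celsius", {"gpu": "0"}, 45.0),
    ("rocm_smi_gpu_use_percent", {"gpu": "0"}, 12.0),
]
print(render_prometheus(samples), end="")
```

A Prometheus server scrapes text in this shape over HTTP, one sample per line.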

Deployment

The rocm-smi-exporter is built and uploaded to PyPI. It is then deployed on the host as a systemd service.

Build and upload PyPI package

cd deployment

# Create virtual env
python -m venv .venv
source .venv/bin/activate

pip install -r requirements.txt
python -m build ..

# You'll need to enter your PyPI API token
python -m twine upload --repository pypi ../dist/*

# Deactivate virtual env
deactivate

Create systemd service

Create a systemd service that runs the pip module built above. The host must have systemd installed.

# Install the module as root so that systemd can pick it up.
sudo pip install lamini-rocm-smi-exporter

# Copy systemd service definition file.
sudo cp lamini-rocm-smi-exporter.service /etc/systemd/system/

# Always reload the systemd configuration after changing unit files, see:
# https://unix.stackexchange.com/a/740098
sudo systemctl daemon-reload

# Enable the service so that it starts automatically after a system (re)boot.
sudo systemctl enable lamini-rocm-smi-exporter.service
sudo systemctl status lamini-rocm-smi-exporter.service

# Start the service and verify that it is running.

sudo systemctl start lamini-rocm-smi-exporter.service
sudo systemctl status lamini-rocm-smi-exporter.service
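
Beyond `systemctl status`, one way to confirm that the exporter is actually serving data is to fetch its metrics endpoint and list the metric names it returns. The sketch below is a minimal check under assumptions: the helper names are hypothetical, and the URL (host, port, and path) is whatever the exporter was started with, which this README does not specify.

```python
import urllib.request


def metric_names(text):
    """Return the set of metric names found in Prometheus text exposition output."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The name ends at the first "{" (labels) or at whitespace (value).
        names.add(line.split("{", 1)[0].split()[0])
    return names


def scrape_metric_names(url, timeout=5.0):
    """Fetch a metrics endpoint and return the metric names it exposes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return metric_names(resp.read().decode("utf-8"))
```

Calling `scrape_metric_names` with the exporter's URL should return a non-empty set when the service is healthy; a connection error suggests the service is not listening on that port.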

Stop and remove systemd service

# Stop the service
sudo systemctl stop lamini-rocm-smi-exporter.service

# Verify that the service is stopped
sudo systemctl status lamini-rocm-smi-exporter.service

# Disable the service
sudo systemctl disable lamini-rocm-smi-exporter.service

# Verify that the service is disabled
sudo systemctl status lamini-rocm-smi-exporter.service

# Remove service definition file
sudo rm /etc/systemd/system/lamini-rocm-smi-exporter.service

Pants build

Pants uses explicit BUILD files to track the dependencies of source files and how they are built.

Pants is hermetic, meaning that the entire build environment is specified in pants.toml, which is copied from example-python.

Extra

  • Add args to systemd service
    • The Python code accepts --port and other arguments
    • If needed, set their values when launching the systemd service
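
In a systemd unit file, command-line arguments are appended to the command on the `ExecStart=` line, so a flag such as `--port` would be set there. The sketch below shows how a Python entry point might receive it with `argparse`. Only the `--port` flag comes from this README; the function name, the default value, and the help text are assumptions, not the exporter's actual code.

```python
import argparse


def build_parser():
    """Build a command-line parser for a hypothetical exporter entry point."""
    parser = argparse.ArgumentParser(description="Sketch of an exporter command line")
    # --port is mentioned in the README; the default of 9000 is an assumption.
    parser.add_argument("--port", type=int, default=9000,
                        help="TCP port to serve metrics on")
    return parser


# Simulate the arguments systemd would pass from the ExecStart= line.
args = build_parser().parse_args(["--port", "9200"])
print(f"would serve metrics on port {args.port}")
```

After editing the unit file, run `sudo systemctl daemon-reload` and restart the service for the new arguments to take effect.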
