Monitor batch pipelines via API and email alerts — install and deploy on Linux servers from PyPI

inferyx-monitoring

Version: 1.0.39 · PyPI: inferyx-monitoring · CLI: inferyx-monitoring


Contents

  1. Overview
  2. What's new
  3. Installation
  4. Upgrade
  5. Properties
  6. Batch CSV format

1. Overview

inferyx-monitoring watches batch jobs listed in a CSV file, queries the Inferyx API for each batch, and sends email alerts (and optional Teams / Google Chat messages) when problems are detected.

Alert | When it fires
failed | Batch execution failed
running | Still running past expected end + grace
missed | Did not start within schedule window + grace
no_data | No API record (email to DevOps)
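
The alert table can be read as a small decision procedure. The sketch below is illustrative only and is not the package's implementation: the function name, the shape of the API record (a dict whose `status` is `NOT_STARTED`, `RUNNING`, `FAILED`, or `COMPLETED`), and the single shared grace value are all assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional


def classify(now: datetime, expected_start: datetime, avg_duration: timedelta,
             grace: timedelta, record: Optional[dict]) -> Optional[str]:
    """Return the alert type for one batch, or None when nothing is wrong.

    `record` is a hypothetical API result; None means the API has no record.
    """
    if record is None:
        return "no_data"
    status = record["status"]
    if status == "FAILED":
        return "failed"
    expected_end = expected_start + avg_duration
    # Still running after the expected end plus the grace period.
    if status == "RUNNING" and now > expected_end + grace:
        return "running"
    # Not started although the start time plus the grace period has passed.
    if status == "NOT_STARTED" and now > expected_start + grace:
        return "missed"
    return None


start = datetime(2024, 1, 1, 9, 0)
late = start + timedelta(minutes=20)
print(classify(late, start, timedelta(minutes=10), timedelta(minutes=5),
               {"status": "RUNNING"}))  # prints: running
```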

Two parts on the server

Part | What it is | Required?
Monitor | Polls batches, sends alerts | Yes
Admin portal | Web UI + backend API (OAuth, edit .env / CSV) | Optional

Admin portal = UI + backend together. Always install and upgrade them as one unit. No Node.js on the server — UI is pre-built static files under /var/www/.

Browser  →  https://<host>/monitoring/admin/  →  nginx  →  /var/www/pipeline-monitor-admin/  (static HTML/JS)
Browser  →  https://<host>/monitoring/api/    →  nginx  →  127.0.0.1:8090  (inferyx-monitoring-admin)

Paths

Item | Path
Monitor install | /opt/pipeline-monitor
Monitor venv | /opt/pipeline-monitor/.venv
Monitor config | /opt/pipeline-monitor/.env
Batch CSV | /opt/pipeline-monitor/batch_file.csv
systemd service | inferyx-monitoring.service (monitor only, or monitor + admin API with --with-admin)
Admin API (internal) | 127.0.0.1:8090 when --with-admin is set
Admin static files | /var/www/pipeline-monitor-admin/
Admin auth policy | /etc/pipeline-monitor/auth.policy
Public UI URL | https://<host>/monitoring/admin/
Public API URL | https://<host>/monitoring/api/
Config examples (pip) | /opt/pipeline-monitor/.venv/share/inferyx-monitoring/config/
Upgrade script (pip) | /opt/pipeline-monitor/.venv/share/inferyx-monitoring/scripts/upgrade_inferyx_monitoring.sh

Commands below use the full venv bin path: /opt/pipeline-monitor/.venv/bin


2. What's new

1.0.39

  • Fix inferyx-monitoring-service child process lookup (no longer uses /usr/bin/inferyx-monitoring).
  • Simpler upgrade flow: pip install --upgrade then run bundled upgrade script.

1.0.38

  • One systemd service for monitor + admin API (--with-admin). No separate admin unit.

1.0.35

  • Admin URLs under /monitoring/admin/ and /monitoring/api/ (update nginx + auth.policy + OAuth redirect).

1.0.27

  • Admin web UI, OAuth via /etc/pipeline-monitor/auth.policy.

1.0.25

  • Teams and Google Chat webhook alerts.

1.0.20

  • Auto .env migration on start.

1.0.16

  • PyPI install and systemd deployment from inferyx-monitoring package.

3. Installation

Choose A (monitor only) or A then B (monitor + admin portal).


A. Install monitor (required)

Step | Action
1 | Create user, venv, install package
2 | Create .env and batch_file.csv
3 | Edit .env and CSV
4 | Test once
5 | Enable systemd service

Step 1 — User, venv, package

sudo apt update && sudo apt install -y python3 python3-venv python3-pip
id inferyx || sudo useradd --system --home-dir /opt/pipeline-monitor --shell /usr/sbin/nologin inferyx
sudo mkdir -p /opt/pipeline-monitor && sudo chown inferyx:inferyx /opt/pipeline-monitor
sudo -u inferyx python3 -m venv /opt/pipeline-monitor/.venv
sudo -u inferyx /opt/pipeline-monitor/.venv/bin/pip install --upgrade pip inferyx-monitoring

Step 2 — Config files

sudo -u inferyx /opt/pipeline-monitor/.venv/bin/inferyx-monitoring --init-config --work-dir /opt/pipeline-monitor

Step 3 — Edit config (see properties and CSV)

sudo -u inferyx vi /opt/pipeline-monitor/.env
sudo -u inferyx vi /opt/pipeline-monitor/batch_file.csv
sudo chmod 600 /opt/pipeline-monitor/.env

Step 4 — Test

sudo -u inferyx /opt/pipeline-monitor/.venv/bin/inferyx-monitoring --once \
  --work-dir /opt/pipeline-monitor \
  --env-file /opt/pipeline-monitor/.env \
  --csv-file /opt/pipeline-monitor/batch_file.csv

Step 5 — systemd (one service file)

sudo tee /etc/systemd/system/inferyx-monitoring.service <<'EOF'
[Unit]
Description=Inferyx Pipeline Monitor
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=inferyx
Group=inferyx
WorkingDirectory=/opt/pipeline-monitor
ExecStart=/opt/pipeline-monitor/.venv/bin/inferyx-monitoring-service \
  --work-dir /opt/pipeline-monitor \
  --env-file /opt/pipeline-monitor/.env \
  --csv-file /opt/pipeline-monitor/batch_file.csv
Restart=always
RestartSec=10
Environment=PYTHONUNBUFFERED=1

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now inferyx-monitoring.service
sudo systemctl status inferyx-monitoring.service

Monitor install done.


B. Install admin portal (optional — UI + backend together)

Do this after section A. The admin portal has 3 pieces — install all of them:

# | Piece | Where it runs
1 | Backend API | pip [admin] + --with-admin on the same inferyx-monitoring.service → 127.0.0.1:8090
2 | UI static pages | /var/www/pipeline-monitor-admin/ (copied from pip package)
3 | Nginx | Public URLs /monitoring/admin/ + /monitoring/api/

Plus one config file: /etc/pipeline-monitor/auth.policy (OAuth).

Step 1 — Backend (pip)

sudo -u inferyx /opt/pipeline-monitor/.venv/bin/pip install 'inferyx-monitoring[admin]'

Step 2 — Auth policy (manual — edit OAuth and URLs)

sudo mkdir -p /etc/pipeline-monitor
sudo cp /opt/pipeline-monitor/.venv/share/inferyx-monitoring/config/auth.policy.example \
  /etc/pipeline-monitor/auth.policy
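# The service runs as inferyx, so that user must be able to read the policy.
sudo chown inferyx:inferyx /etc/pipeline-monitor/auth.policy
sudo chmod 600 /etc/pipeline-monitor/auth.policy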
sudo vi /etc/pipeline-monitor/auth.policy

Minimum settings:

  • ui.enabled: true
  • ui.ui_base_path: /monitoring/admin/
  • auth.google.redirect_uri: https://<your-host>/monitoring/api/auth/callback/google
  • Register that redirect URI in Google / AWS console
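
A quick way to confirm the minimum settings before restarting the service is to load the policy and check them. This is a sketch, not a tool shipped with the package: it assumes auth.policy is JSON (as section 5.2 states) and that the dotted names map to nested objects, and the helper name `missing_minimum_settings` is hypothetical.

```python
import json


def missing_minimum_settings(policy_text: str) -> list:
    """Return the dotted names of minimum settings that are absent or unset."""
    policy = json.loads(policy_text)
    problems = []
    ui = policy.get("ui", {})
    if ui.get("enabled") is not True:
        problems.append("ui.enabled")
    if not ui.get("ui_base_path"):
        problems.append("ui.ui_base_path")
    google = policy.get("auth", {}).get("google", {})
    if not google.get("redirect_uri"):
        problems.append("auth.google.redirect_uri")
    return problems


# Assumed layout: nested objects that mirror the dotted property names.
example = """
{
  "ui": {"enabled": true, "ui_base_path": "/monitoring/admin/"},
  "auth": {"google": {"redirect_uri":
    "https://monitor.example.com/monitoring/api/auth/callback/google"}}
}
"""
print(missing_minimum_settings(example))  # prints: []
```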

See section 5.2 (auth.policy) for all properties.

Step 3 — UI static files → /var/www/

sudo /opt/pipeline-monitor/.venv/bin/inferyx-monitoring-admin-install-ui \
  --target /var/www/pipeline-monitor-admin

Step 4 — Update systemd (add admin API to the same service)

If you already have inferyx-monitoring.service from section A, replace it. Remove any old inferyx-monitoring-admin.service:

sudo systemctl disable --now inferyx-monitoring-admin.service 2>/dev/null || true
sudo rm -f /etc/systemd/system/inferyx-monitoring-admin.service

sudo tee /etc/systemd/system/inferyx-monitoring.service <<'EOF'
[Unit]
Description=Inferyx Pipeline Monitor
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=inferyx
Group=inferyx
WorkingDirectory=/opt/pipeline-monitor
ExecStart=/opt/pipeline-monitor/.venv/bin/inferyx-monitoring-service \
  --work-dir /opt/pipeline-monitor \
  --env-file /opt/pipeline-monitor/.env \
  --csv-file /opt/pipeline-monitor/batch_file.csv \
  --with-admin
Restart=always
RestartSec=10
Environment=PYTHONUNBUFFERED=1
Environment=PIPELINE_MONITOR_ROOT=/opt/pipeline-monitor
Environment=PIPELINE_ADMIN_POLICY=/etc/pipeline-monitor/auth.policy

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl restart inferyx-monitoring.service
sudo journalctl -u inferyx-monitoring.service -n 30

Or copy the example: /opt/pipeline-monitor/.venv/share/inferyx-monitoring/config/inferyx-monitoring.service.example

Step 5 — Nginx (manual — one-time setup)

sudo cp /opt/pipeline-monitor/.venv/share/inferyx-monitoring/config/nginx-pipeline-monitor-admin.conf.example \
  /etc/nginx/sites-available/pipeline-monitor-admin
sudo vi /etc/nginx/sites-available/pipeline-monitor-admin   # server_name, SSL certs
sudo ln -sf /etc/nginx/sites-available/pipeline-monitor-admin /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx

Step 6 — Verify

Check | URL
UI | https://<host>/monitoring/admin/
API health | https://<host>/monitoring/api/health

Admin portal install done.


C. Install script (optional shortcut)

sudo bash install_inferyx_monitoring.sh                      # monitor only
sudo bash install_inferyx_monitoring.sh --admin --admin-ui   # monitor + admin portal

After pip install, scripts are at: /opt/pipeline-monitor/.venv/share/inferyx-monitoring/scripts/

Nginx is still configured manually (step B.5 above).


4. Upgrade

Always two steps: (1) download latest from PyPI, (2) run the upgrade script.

Script path (after pip install): /opt/pipeline-monitor/.venv/share/inferyx-monitoring/scripts/upgrade_inferyx_monitoring.sh

4.1 Monitor only (no admin portal)

sudo -u inferyx /opt/pipeline-monitor/.venv/bin/pip install --upgrade inferyx-monitoring
sudo bash /opt/pipeline-monitor/.venv/share/inferyx-monitoring/scripts/upgrade_inferyx_monitoring.sh --mode monitor

4.2 Monitor + admin portal (most servers)

sudo -u inferyx /opt/pipeline-monitor/.venv/bin/pip install --upgrade 'inferyx-monitoring[admin]'
sudo bash /opt/pipeline-monitor/.venv/share/inferyx-monitoring/scripts/upgrade_inferyx_monitoring.sh --migrate-service

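In the default full mode the script re-runs the pip upgrade if needed, copies the UI to /var/www/pipeline-monitor-admin/, and restarts inferyx-monitoring.service. With --migrate-service it also merges a legacy inferyx-monitoring-admin.service into the single unit; the flag is only needed on the first upgrade from two systemd units to one.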

Check logs:

sudo systemctl status inferyx-monitoring.service
sudo journalctl -u inferyx-monitoring.service -n 50

If the service fails with "No such file or directory: /usr/bin/inferyx-monitoring", upgrade to 1.0.39 or later (see above). As an interim fix, add this line to the [Service] section of the unit file: Environment=PIPELINE_VENV_PYTHON=/opt/pipeline-monitor/.venv/bin/python

4.3 Upgrade script options

--mode | What it does
full | Default: pip [admin] + UI copy + restart
monitor | Pip monitor only + restart (no UI copy)
ui | UI copy only + restart

Flag | When to use
--migrate-service | First upgrade from two systemd units to one
--skip-pip | You already ran pip install --upgrade
--legacy-migrate | Rename jfl_batch.csv → batch_file.csv

4.4 Manual files (edit yourself — not changed by upgrade script)

What | Example template (in pip package) | Live file on server
Nginx | share/inferyx-monitoring/config/nginx-pipeline-monitor-admin.conf.example | /etc/nginx/sites-available/pipeline-monitor-admin
Auth / OAuth | share/inferyx-monitoring/config/auth.policy.example | /etc/pipeline-monitor/auth.policy
systemd unit | share/inferyx-monitoring/config/inferyx-monitoring.service.example | /etc/systemd/system/inferyx-monitoring.service
Monitor secrets | inferyx-monitoring --init-config adds keys only | /opt/pipeline-monitor/.env
Batch list | | /opt/pipeline-monitor/batch_file.csv

After nginx or SSL changes:

sudo nginx -t && sudo systemctl reload nginx

4.5 Auto vs manual on restart

Automatic on service start | You edit manually
Missing .env keys added | SMTP password, API token
Legacy mail body keys retired | auth.policy, OAuth console
CSV Status column added if missing | Nginx, SSL certs
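
To make "Missing .env keys added" concrete, here is a minimal sketch of one way such a merge could work. It is not the package's migration code: it assumes plain KEY=value lines, and the function name `add_missing_keys` is hypothetical. Existing values are never overwritten.

```python
def add_missing_keys(env_text: str, defaults: dict) -> str:
    """Append KEY=value lines for defaults that are not already present."""
    present = set()
    for line in env_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            present.add(stripped.split("=", 1)[0].strip())
    additions = [f"{key}={value}" for key, value in defaults.items()
                 if key not in present]
    if not additions:
        return env_text
    # Make sure the appended block starts on its own line.
    separator = "" if env_text.endswith("\n") or not env_text else "\n"
    return env_text + separator + "\n".join(additions) + "\n"


current = "PIPELINE_SMTP_PORT=587\n"
defaults = {"PIPELINE_SMTP_PORT": "25", "PIPELINE_API_RETRY_COUNT": "3"}
print(add_missing_keys(current, defaults), end="")
# prints:
# PIPELINE_SMTP_PORT=587
# PIPELINE_API_RETRY_COUNT=3
```

Running the merge a second time on its own output changes nothing, which is the property an on-start migration needs.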

5. Properties

5.1 .env file (/opt/pipeline-monitor/.env)

Required (set manually on first install)

Property | Description
PIPELINE_SMTP_HOST | SMTP server hostname
PIPELINE_SMTP_PORT | SMTP port (e.g. 587)
PIPELINE_SMTP_USERNAME | SMTP username
PIPELINE_SMTP_PASSWORD | SMTP password
PIPELINE_FROM_NAME | Sender display name
PIPELINE_MAIL_TO | Alert email recipients (comma-separated)
PIPELINE_API_BASE_URL | Inferyx API URL; no name= in URL
PIPELINE_API_TOKEN | API authentication token
PIPELINE_API_TOKEN_HEADER | Header name (e.g. token or Authorization)
PIPELINE_DEVOPS_EMAIL | Recipient for no_data alerts
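
Before the first test run it can help to confirm that every required key has a value. The sketch below is one way to do that; it assumes plain KEY=value lines (the package's own parser may differ), and the helper names are hypothetical.

```python
REQUIRED_KEYS = [
    "PIPELINE_SMTP_HOST", "PIPELINE_SMTP_PORT", "PIPELINE_SMTP_USERNAME",
    "PIPELINE_SMTP_PASSWORD", "PIPELINE_FROM_NAME", "PIPELINE_MAIL_TO",
    "PIPELINE_API_BASE_URL", "PIPELINE_API_TOKEN",
    "PIPELINE_API_TOKEN_HEADER", "PIPELINE_DEVOPS_EMAIL",
]


def parse_env(text: str) -> dict:
    """Parse KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_required(text: str) -> list:
    """Return required keys that are absent or have an empty value."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]


sample = "PIPELINE_SMTP_HOST=smtp.example.com\nPIPELINE_SMTP_PORT=587\n"
print(missing_required(sample)[:2])
# prints: ['PIPELINE_SMTP_USERNAME', 'PIPELINE_SMTP_PASSWORD']
```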

Optional — scheduling and API

Property | Default | Description | Since
PIPELINE_CHECK_MODE | schedule_windows | schedule_windows (recommended) or full_window | 1.0.15
PIPELINE_CHECK_WINDOW_MINUTES | 10 | Minutes around start/end to poll API | 1.0.15
PIPELINE_CHECK_INTERVAL | 60 | Seconds between checks (legacy mode) | original
PIPELINE_SCHEDULE_GRACE_MINUTES | 5 | Grace after expected start/end | original
PIPELINE_POST_RUN_GRACE_MINUTES | 60 | Grace after run completes | original
PIPELINE_API_FILTER_BY_SCHEDULE_DATE | false | Add schedule date to API query | original
PIPELINE_API_RETRY_COUNT | 3 | API retry attempts | original
PIPELINE_API_RETRY_BACKOFF_SEC | 5 | Seconds between API retries | original
PIPELINE_ALERT_COOLDOWN_MINUTES | 60 | Minutes before repeat alert | original
PIPELINE_FAILED_ALERT_ONCE_PER_DAY | true | Limit failed alerts per day | original
PIPELINE_ALERT_ONCE_PER_DAY_ALL_SCENARIOS | true | Limit all alert types per day | original
PIPELINE_MAIL_CC | (empty) | CC recipients | original
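
The two retry settings describe a fixed-delay retry loop. The sketch below shows that pattern under stated assumptions: it treats PIPELINE_API_RETRY_COUNT as the number of retries after the first call (the package may count total attempts instead), and `call_with_retries` is a hypothetical helper, not part of the package.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(fetch: Callable[[], T], retry_count: int = 3,
                      backoff_sec: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fetch(); on failure wait backoff_sec and retry up to retry_count times."""
    if retry_count < 0:
        raise ValueError("retry_count must be >= 0")
    last_error = None
    for attempt in range(retry_count + 1):
        try:
            return fetch()
        except Exception as error:  # a real client would catch narrower errors
            last_error = error
            if attempt < retry_count:
                sleep(backoff_sec)
    raise last_error


# Demo: a call that fails twice, then succeeds. Waits are recorded, not slept.
state = {"calls": 0}
waits = []


def flaky() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = call_with_retries(flaky, retry_count=3, backoff_sec=5, sleep=waits.append)
print(result, waits)  # prints: ok [5, 5]
```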

Optional — Teams / Google Chat

Property | Default | Description | Since
PIPELINE_TEAMS_ENABLED | false | Enable Teams webhooks | 1.0.25
PIPELINE_TEAMS_WEBHOOK_URL | (empty) | Teams incoming webhook URL | 1.0.25
PIPELINE_GCHAT_ENABLED | false | Enable Google Chat webhooks | 1.0.25
PIPELINE_GCHAT_WEBHOOK_URL | (empty) | Google Chat webhook URL | 1.0.25
PIPELINE_CHAT_ALERT_NO_DATA | false | Also send no_data to chat (default: email only) | 1.0.25
PIPELINE_CHAT_GREETING | (built-in) | Chat message greeting | 1.0.25
PIPELINE_CHAT_SIGNATURE | (built-in) | Chat message signature | 1.0.25

Test chat: sudo -u inferyx /opt/pipeline-monitor/.venv/bin/inferyx-monitoring --test-chat-alerts --work-dir /opt/pipeline-monitor
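
For readers who want to check a webhook URL independently of the monitor, the sketch below builds a simple text message as an HTTP POST using only the standard library. It is not the package's sender. Google Chat incoming webhooks accept a JSON body with a `text` field; a Teams webhook may need a different payload (for example an Adaptive Card) depending on how it was created. The function names and example URL are hypothetical.

```python
import json
import urllib.request


def build_chat_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build a POST request that carries a plain-text chat message as JSON."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json; charset=UTF-8"},
        method="POST",
    )


def send_chat_alert(webhook_url: str, text: str, timeout: float = 10.0) -> int:
    """Send the message and return the HTTP status code."""
    request = build_chat_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Building the request does not touch the network; sending does.
request = build_chat_request("https://chat.example.invalid/hook",
                             "daily_report: FAILED")
print(request.get_method(), request.data.decode("utf-8"))
```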

Optional — email subjects

Property | Description | Since
PIPELINE_MAIL_SUBJECT_DEFAULT | Default subject template | original
PIPELINE_MAIL_SUBJECT_NO_DATA | Subject for no_data | original
PIPELINE_MAIL_SUBJECT_FAILED | Subject for failed | original
PIPELINE_MAIL_SUBJECT_RUNNING | Subject for running | original
PIPELINE_MAIL_SUBJECT_MISSED | Subject for missed | original

Placeholders: {batch_name}, {issue_type_upper}, etc.
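
As an illustration of how such placeholders expand, the sketch below fills a subject template with Python's `str.format`. Whether the package uses `str.format` internally is an assumption, and the template text is a made-up example rather than a shipped default. Only the two placeholders named above are supplied.

```python
def render_subject(template: str, batch_name: str, issue_type: str) -> str:
    """Fill the {batch_name} and {issue_type_upper} placeholders."""
    return template.format(batch_name=batch_name,
                           issue_type_upper=issue_type.upper())


# Hypothetical template value for PIPELINE_MAIL_SUBJECT_DEFAULT.
subject = render_subject("[{issue_type_upper}] Batch {batch_name}",
                         "daily_report", "failed")
print(subject)  # prints: [FAILED] Batch daily_report
```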

Optional — email layout (structured HTML)

Property | Description | Since
PIPELINE_MAIL_GREETING | Email greeting line | 1.0.17
PIPELINE_MAIL_ENVIRONMENT | Environment label in email | original
PIPELINE_MAIL_HEADING_BATCH_DETAILS | Section heading | 1.0.19
PIPELINE_MAIL_HEADING_ADDITIONAL_INFO | Section heading | 1.0.19
PIPELINE_MAIL_HEADING_ALERT_SUMMARY | Section heading | 1.0.19
PIPELINE_MAIL_HEADING_ACTION_REQUIRED | Section heading | 1.0.19
PIPELINE_MAIL_HEADING_CURRENT_STATUS | Section heading | 1.0.19
PIPELINE_MAIL_INTRO_RUNNING | Body section (running) | 1.0.19
PIPELINE_MAIL_SUMMARY_RUNNING | Body section (running) | 1.0.19
PIPELINE_MAIL_ACTION_RUNNING | Body section (running) | 1.0.19
PIPELINE_MAIL_STATUS_RUNNING | Body section (running) | 1.0.19
PIPELINE_MAIL_SIGNATURE | Email signature HTML | original
PIPELINE_MAIL_FOOTER_NOTE | Footer note HTML | original

Similar PIPELINE_MAIL_INTRO_*, SUMMARY_*, ACTION_*, STATUS_* exist for failed, missed, no_data (built-in defaults if omitted).

Retired (do not add — removed on upgrade)

Property | Replaced by | Since retired
PIPELINE_MAIL_BODY_DEFAULT | Structured sections above | 1.0.20
PIPELINE_MAIL_BODY_NO_DATA | Structured sections above | 1.0.20
PIPELINE_MAIL_BODY_FAILED | Structured sections above | 1.0.20
PIPELINE_MAIL_BODY_RUNNING | Structured sections above | 1.0.20
PIPELINE_MAIL_BODY_MISSED | Structured sections above | 1.0.20

5.2 auth.policy (/etc/pipeline-monitor/auth.policy)

Admin UI only. Not updated by pip. JSON format. Copy from share/inferyx-monitoring/config/auth.policy.example.

ui section

Property | Example | Description | Since
enabled | true | Turn Admin UI on/off | 1.0.27
session_timeout_minutes | 60 | Login session length | 1.0.27
public_base_url | https://monitor.example.com | Public site URL (no trailing path) | 1.0.27
ui_base_path | /monitoring/admin/ | UI path under public URL | 1.0.35 (was /admin/)

Browser UI URL = public_base_url + ui_base_path → https://monitor.example.com/monitoring/admin/
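
The concatenation is simple, but a stray or missing slash in either setting is an easy mistake. The sketch below shows one tolerant way to join the two values; the helper name `ui_url` is hypothetical and the package may join them differently.

```python
def ui_url(public_base_url: str, ui_base_path: str) -> str:
    """Join the base URL and UI path with exactly one slash between them."""
    return public_base_url.rstrip("/") + "/" + ui_base_path.strip("/") + "/"


print(ui_url("https://monitor.example.com", "/monitoring/admin/"))
# prints: https://monitor.example.com/monitoring/admin/
```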

auth section (OAuth)

Property | Description | Since
auth.google.enabled | Enable Google login | 1.0.27
auth.google.client_id | Google OAuth client ID | 1.0.27
auth.google.client_secret_path | File containing client secret | 1.0.27
auth.google.redirect_uri | https://<host>/monitoring/api/auth/callback/google | 1.0.35 path
auth.google.allowed_domains | Allowed email domains | 1.0.27
auth.aws_idc.* | AWS IAM Identity Center OAuth (same pattern) | 1.0.27

Register redirect_uri in Google / AWS OAuth console.

paths section

Property | Default | Description | Since
work_dir | /opt/pipeline-monitor | Monitor install directory | 1.0.27
env_file | /opt/pipeline-monitor/.env | Path to .env | 1.0.27
csv_file | /opt/pipeline-monitor/batch_file.csv | Path to batch CSV | 1.0.27
audit_log | /var/log/pipeline-monitor/admin-audit.log | Admin audit log | 1.0.27
session_secret_path | /etc/pipeline-monitor/session_secret | Session signing secret file | 1.0.27

branding section

Property | Description | Since
app_name | UI title | 1.0.27
logo_url | Logo image URL | 1.0.27
footer_text | Footer text | 1.0.27
primary_color | Theme color (hex) | 1.0.27
favicon_url | Favicon URL | 1.0.27

security section

Property | Default | Description | Since
restart_service_on_save | true | Restart monitor after config save from UI | 1.0.27
service_name | inferyx-monitoring | systemd service to restart | 1.0.27
rate_limit_per_minute | 120 | API rate limit per IP | 1.0.27

6. Batch CSV format

6.1 Current format (batch_file.csv)

File: /opt/pipeline-monitor/batch_file.csv (since 1.0.15; was jfl_batch.csv)

Example:

Name,Frequency,ExpectedStartTime,AvgExecutionTime,ExpectedDayOfMonth,Status
daily_report,Daily,9:00:00,"10 mins",,Active
monthly_close,Monthly,6:00:00,"45 mins",1,Active
adhoc_job,Once,14:30:00,"5 mins",,Suspended

Use 24-hour times. Quote values that contain commas.
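
To show how a row's schedule fields turn into a time window, the sketch below parses ExpectedStartTime and AvgExecutionTime for a given day. It is not the package's parser: the list of accepted unit spellings beyond "mins" and "hour" is an assumption, and the function names are hypothetical.

```python
import re
from datetime import date, datetime, time, timedelta

# Minutes per unit. Only "mins" and "hour" appear in this document;
# the other spellings are assumed for convenience.
_UNIT_MINUTES = {"min": 1, "mins": 1, "minute": 1, "minutes": 1,
                 "hour": 60, "hours": 60}


def parse_duration(text: str) -> timedelta:
    """Parse values such as '10 mins' or '1 hour' into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if not match or match.group(2).lower() not in _UNIT_MINUTES:
        raise ValueError(f"unrecognised duration: {text!r}")
    return timedelta(minutes=int(match.group(1)) * _UNIT_MINUTES[match.group(2).lower()])


def expected_window(day: date, start_text: str, duration_text: str):
    """Return (expected_start, expected_end) for one batch on the given day."""
    hours, minutes, seconds = (int(part) for part in start_text.split(":"))
    start = datetime.combine(day, time(hours, minutes, seconds))
    return start, start + parse_duration(duration_text)


start, end = expected_window(date(2024, 1, 1), "9:00:00", "10 mins")
print(start.time(), end.time())  # prints: 09:00:00 09:10:00
```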

6.2 Column reference

Column | Required | Description | Since
Name | Yes | Batch name as known to Inferyx API | original
Frequency | Yes | Daily, Weekly, Monthly, or Once | original
ExpectedStartTime | Yes | Expected start time (HH:MM:SS) | original
AvgExecutionTime | Yes | Typical duration (e.g. "10 mins", "1 hour") | original
ExpectedDayOfMonth | No | Day of month for monthly jobs (1–31); empty for Daily/Weekly | original
Status | Yes* | Active or Suspended; only Active rows are monitored | 1.0.15

*If Status column is missing, upgrade/migration adds it and sets existing rows to Active.
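
The Status rule can be sketched with the standard `csv` module: keep rows whose Status is Active, and treat a file with no Status column as all Active, matching the footnote. This is an illustration rather than the package's loader, and `active_batches` is a hypothetical name.

```python
import csv
import io


def active_batches(csv_text: str) -> list:
    """Return the rows that would be monitored, as dicts keyed by column name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    has_status = "Status" in (reader.fieldnames or [])
    rows = []
    for row in reader:
        # A file without a Status column is treated as all Active.
        status = (row["Status"] or "") if has_status else "Active"
        if status.strip().lower() == "active":
            rows.append(row)
    return rows


SAMPLE = (
    "Name,Frequency,ExpectedStartTime,AvgExecutionTime,ExpectedDayOfMonth,Status\n"
    'daily_report,Daily,9:00:00,"10 mins",,Active\n'
    'monthly_close,Monthly,6:00:00,"45 mins",1,Active\n'
    'adhoc_job,Once,14:30:00,"5 mins",,Suspended\n'
)
print([row["Name"] for row in active_batches(SAMPLE)])
# prints: ['daily_report', 'monthly_close']
```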

6.3 Legacy format (before 1.0.15)

Item | Old | Current
Filename | jfl_batch.csv | batch_file.csv
Columns | Same except no Status column | Includes Status
systemd --csv-file | May point to jfl_batch.csv | Use batch_file.csv

Migration on start copies jfl_batch.csv → batch_file.csv (original kept) and adds Status if missing.
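
The migration described above can be pictured as a copy that appends a Status column. The sketch below is an illustration under assumptions, not the package's code: skipping the copy when batch_file.csv already exists is assumed, and `migrate_legacy_csv` is a hypothetical name.

```python
import csv
from pathlib import Path


def migrate_legacy_csv(work_dir: Path) -> bool:
    """Copy jfl_batch.csv to batch_file.csv, adding Status=Active if missing.

    Returns True when a new batch_file.csv was written. The legacy file is
    left in place.
    """
    legacy = work_dir / "jfl_batch.csv"
    current = work_dir / "batch_file.csv"
    if current.exists() or not legacy.exists():
        return False
    with legacy.open(newline="") as source:
        rows = [row for row in csv.reader(source) if row]  # drop blank lines
    if rows and "Status" not in rows[0]:
        width = len(rows[0])
        rows[0].append("Status")
        for row in rows[1:]:
            # Pad short rows so Status lands in its own column.
            row.extend([""] * (width - len(row)))
            row.append("Active")
    with current.open("w", newline="") as target:
        csv.writer(target).writerows(rows)
    return True
```

Calling `migrate_legacy_csv(Path("/opt/pipeline-monitor"))` a second time returns False and leaves both files untouched.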

6.4 Frequency values

Value | Meaning
Daily | Runs every day at ExpectedStartTime
Weekly | Runs weekly at ExpectedStartTime
Monthly | Runs on ExpectedDayOfMonth at ExpectedStartTime
Once | Single scheduled run

6.5 Status values

Value | Meaning
Active | Monitored
Suspended | Ignored (not polled)

Also accepted by Admin UI: Inactive, Disabled (treated as not active).
