Monitor batch pipelines via API and email alerts — install and deploy on Linux servers from PyPI

inferyx-monitoring

Version: 1.0.43 · PyPI: inferyx-monitoring · CLI: inferyx-monitoring

Three commands (plus a local test)

| Step | Who | Command |
| --- | --- | --- |
| Build | Developer (repo) | ./scripts/release.sh |
| Install | New server | sudo bash server/install_inferyx_monitoring.sh |
| Upgrade | Existing server | sudo bash /opt/pipeline-monitor/.venv/share/inferyx-monitoring/upgrade_inferyx_monitoring.sh |
| Test local | Developer (repo) | ./scripts/run_local.sh |

| Your situation | Section |
| --- | --- |
| New server | §3 Install |
| Server already has /opt/pipeline-monitor | §4 Upgrade |
| Edit secrets, OAuth, nginx | §5 Manual config |

Contents

  1. Overview
  2. What's new
  3. Install — new server
  4. Upgrade — existing server
  5. Manual config
  6. Properties
  7. Batch CSV format

1. Overview

inferyx-monitoring watches batch jobs listed in a CSV file, queries the Inferyx API for each batch, and sends email alerts (and optional Teams / Google Chat messages) when problems are detected.

| Alert | When it fires |
| --- | --- |
| failed | Batch execution failed |
| running | Still running past expected end + grace |
| missed | Did not start within schedule window + grace |
| no_data | No API record (email to DevOps) |
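
The four alert rules above can be read as one small decision function. The following is a minimal sketch, not the package's actual logic: the `record` dictionary, its `status` field, and the status strings `FAILED`, `RUNNING`, and `NOT_STARTED` are hypothetical, since the Inferyx API response format is not described here.

```python
from datetime import datetime, timedelta
from typing import Optional


def classify_alert(
    now: datetime,
    expected_start: datetime,
    avg_duration: timedelta,
    grace: timedelta,
    record: Optional[dict],
) -> Optional[str]:
    """Return the alert name from the table, or None if nothing is wrong.

    `record` stands in for whatever the API returned for the batch
    (hypothetical shape: {"status": "..."}); None means no record at all.
    """
    if record is None:
        return "no_data"
    status = record.get("status")
    if status == "FAILED":
        return "failed"
    expected_end = expected_start + avg_duration
    if status == "RUNNING" and now > expected_end + grace:
        return "running"
    if status == "NOT_STARTED" and now > expected_start + grace:
        return "missed"
    return None
```

With a 9:00 start, a 10 minute average duration, and a 5 minute grace period, a batch that is still running at 9:20 would be classified as `running`, while the same batch at 9:12 would not raise an alert.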

One product (always installed together)

| Layer | What it does |
| --- | --- |
| Monitor | Polls batches, sends email/chat alerts |
| Admin API | OAuth, edit .env / CSV from browser |
| Admin UI | Static web pages under /var/www/ |

Install and upgrade always deploy all three. No Node.js on the server — UI is pre-built in the pip wheel.

UI on/off is controlled only in auth.policy — not separate install modes:

| ui.enabled | Monitor | Admin API process | UI files in /var/www/ | Browser login / config |
| --- | --- | --- | --- | --- |
| true | runs | runs (--with-admin) | deployed | works |
| false | runs | runs (--with-admin) | deployed | disabled (503); /api/health still works |

"ui": { "enabled": false }

Set in /etc/pipeline-monitor/auth.policy, then sudo systemctl restart inferyx-monitoring.service.
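
To check the flag before restarting the service, the policy file can be read as ordinary JSON. This is a minimal sketch; the helper name `ui_enabled` is hypothetical, and treating a missing `ui.enabled` key as `True` is an assumption, not documented behavior.

```python
import json


def ui_enabled(policy_text: str) -> bool:
    """Read ui.enabled from auth.policy content (JSON).

    Assumption: a missing "ui" section or "enabled" key counts as enabled.
    """
    policy = json.loads(policy_text)
    return bool(policy.get("ui", {}).get("enabled", True))


if __name__ == "__main__":
    sample = '{"ui": {"enabled": false}}'
    print(ui_enabled(sample))
```

On a server, the text would come from /etc/pipeline-monitor/auth.policy, which usually needs root to read.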

Browser  →  https://<host>/monitoring/admin/  →  nginx  →  /var/www/pipeline-monitor-admin/
Browser  →  https://<host>/monitoring/api/    →  nginx  →  127.0.0.1:8090
Monitor    →  inferyx-monitoring.service (one systemd unit, always --with-admin)

Server paths (one reference — use everywhere)

Set these once per shell session:

IM_HOME=/opt/pipeline-monitor
IM_BIN=$IM_HOME/.venv/bin
IM_PKG=$IM_HOME/.venv/share/inferyx-monitoring

| What | Template ($IM_PKG/) | Live file on server |
| --- | --- | --- |
| Install script | install_inferyx_monitoring.sh | run once (fresh deploy) |
| Upgrade script | upgrade_inferyx_monitoring.sh | run on each upgrade |
| systemd unit | inferyx-monitoring.service.example | /etc/systemd/system/inferyx-monitoring.service |
| Nginx | nginx-pipeline-monitor-admin.conf.example | /etc/nginx/sites-available/pipeline-monitor-admin |
| OAuth policy | auth.policy.example | /etc/pipeline-monitor/auth.policy |
| Monitor config | | $IM_HOME/.env |
| Batch CSV | | $IM_HOME/batch_file.csv |
| Admin UI static | none (pip command copies files) | /var/www/pipeline-monitor-admin/ |

| Runtime | Path |
| --- | --- |
| systemd service | inferyx-monitoring.service |
| Admin API (internal) | 127.0.0.1:8090 when --with-admin |
| Public UI | https://<host>/monitoring/admin/ |
| Public API | https://<host>/monitoring/api/ |

Repo (developers): all deploy scripts and templates live in server/ — packaged to $IM_PKG/ on pip install.

Build (developer machine)

./scripts/release.sh              # build wheel + UI
./scripts/release.sh --publish    # upload to PyPI

Test local (developer machine)

cp .env.example .env
cp inferyx_pipeline_monitor/data/batch_file.csv.example batch_file.csv
./scripts/run_local.sh --once     # test monitor
./scripts/run_local.sh            # monitor + admin API on :8090

2. What's new

1.0.43

  • Release publishes to PyPI only (no SCP / deploy tarball in release.sh).
  • Fix twine upload: uses venv python -m twine (avoids urllib3 conflict).

1.0.42

  • Step-by-step install/upgrade docs; upgrade always deploys UI to /var/www/pipeline-monitor-admin/.
  • Upgrade merges missing .env keys automatically.
  • ui.enabled: false — monitor + admin API process run; browser UI disabled.

1.0.41

  • One full package — monitor + admin API + UI always installed; disable UI via auth.policy.
  • Three commands: build (./scripts/release.sh), install, upgrade.

1.0.40

  • One folder for all deploy files: $IM_PKG/.

1.0.39

  • Fix inferyx-monitoring-service child process lookup (no longer uses /usr/bin/inferyx-monitoring).
  • Simpler upgrade flow: pip install --upgrade then run bundled upgrade script.

1.0.38

  • One systemd service for monitor + admin API (--with-admin). No separate admin unit.

1.0.35

  • Admin URLs under /monitoring/admin/ and /monitoring/api/ (update nginx + auth.policy + OAuth redirect).

1.0.27

  • Admin web UI, OAuth via /etc/pipeline-monitor/auth.policy.

1.0.25

  • Teams and Google Chat webhook alerts.

1.0.20

  • Auto .env migration on start.

1.0.16

  • PyPI install and systemd deployment from inferyx-monitoring package.

3. Install

New server only. Existing server → §4 Upgrade.

One command

sudo bash server/install_inferyx_monitoring.sh

(From repo checkout. On server after pip: sudo bash $IM_PKG/install_inferyx_monitoring.sh)

What the script does (steps 1–6 — automatic)

| Step | Action |
| --- | --- |
| 1 | apt, user inferyx, venv at /opt/pipeline-monitor/.venv |
| 2 | pip install inferyx-monitoring (monitor + admin API + UI wheel) |
| 3 | Create $IM_HOME/.env and batch_file.csv (missing keys only) |
| 4 | Create /etc/pipeline-monitor/auth.policy from template |
| 5 | Deploy UI → /var/www/pipeline-monitor-admin/ |
| 6 | Enable inferyx-monitoring.service (monitor + admin API, --with-admin) |

Step 7 — Manual config (you edit)

IM_HOME=/opt/pipeline-monitor
IM_PKG=$IM_HOME/.venv/share/inferyx-monitoring

sudo -u inferyx vi $IM_HOME/.env
sudo -u inferyx vi $IM_HOME/batch_file.csv
sudo chmod 600 $IM_HOME/.env
sudo vi /etc/pipeline-monitor/auth.policy
sudo cp $IM_PKG/nginx-pipeline-monitor-admin.conf.example /etc/nginx/sites-available/pipeline-monitor-admin
sudo vi /etc/nginx/sites-available/pipeline-monitor-admin
sudo ln -sf /etc/nginx/sites-available/pipeline-monitor-admin /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx

| File | You set |
| --- | --- |
| $IM_HOME/.env | SMTP, API token, alert emails |
| $IM_HOME/batch_file.csv | Batch names and schedules |
| /etc/pipeline-monitor/auth.policy | OAuth, ui.enabled (true / false) |
| /etc/nginx/.../pipeline-monitor-admin | server_name, SSL certs |

Step 8 — Verify

sudo -u inferyx $IM_HOME/.venv/bin/inferyx-monitoring --once --work-dir $IM_HOME
sudo systemctl status inferyx-monitoring.service
curl -s http://127.0.0.1:8090/api/health

4. Upgrade

Existing server only. Do not re-run the install script.

One command

IM_PKG=/opt/pipeline-monitor/.venv/share/inferyx-monitoring
sudo bash $IM_PKG/upgrade_inferyx_monitoring.sh

What the script does (steps 1–5 — automatic)

| Step | Action |
| --- | --- |
| 1 | pip install --upgrade inferyx-monitoring |
| 2 | Merge missing .env keys (secrets unchanged) |
| 3 | Deploy UI → /var/www/pipeline-monitor-admin/ (every upgrade) |
| 4 | Update systemd unit (monitor + admin API) |
| 5 | Restart inferyx-monitoring.service |
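
Step 2 adds keys that are missing from .env without touching values that are already set. The following is a sketch of that idea only; the function names are hypothetical and the bundled upgrade script may work differently.

```python
def _env_key(line: str):
    """Return the key of a KEY=value line, or None for blanks and comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#") or "=" not in stripped:
        return None
    return stripped.split("=", 1)[0].strip()


def merge_missing_keys(existing_text: str, template_text: str) -> str:
    """Append template lines whose keys are absent from the existing .env.

    Existing lines, including secrets, are kept byte for byte.
    """
    present = {k for k in map(_env_key, existing_text.splitlines()) if k}
    additions = []
    for line in template_text.splitlines():
        key = _env_key(line)
        if key and key not in present:
            additions.append(line.strip())
            present.add(key)
    if not additions:
        return existing_text
    if existing_text and not existing_text.endswith("\n"):
        existing_text += "\n"
    return existing_text + "\n".join(additions) + "\n"
```

Appending rather than rewriting the file keeps comments and ordering in the live .env intact, which makes the result easy to review after an upgrade.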

After upgrade — review manual config

| File | Auto on upgrade? | You review when |
| --- | --- | --- |
| $IM_HOME/.env | Missing keys added | Secrets, SMTP, API token |
| $IM_HOME/batch_file.csv | Status column if missing | Batch changes |
| /etc/pipeline-monitor/auth.policy | Not changed | OAuth, ui.enabled |
| /etc/nginx/.../pipeline-monitor-admin | Not changed | SSL, server_name |

sudo systemctl status inferyx-monitoring.service
sudo journalctl -u inferyx-monitoring.service -n 50
curl -s http://127.0.0.1:8090/api/health
sudo nginx -t && sudo systemctl reload nginx   # only if you changed nginx

5. Manual config

Templates live in $IM_PKG/. Scripts never overwrite your secrets or auth.policy.

Disable browser UI only

Edit /etc/pipeline-monitor/auth.policy:

"ui": { "enabled": false }

Then:

sudo systemctl restart inferyx-monitoring.service

| Still runs | Stopped |
| --- | --- |
| Monitor (batch alerts) | Browser login |
| Admin API process | Browser config pages |
| /api/health | Other /api/* (503) |
| UI files in /var/www/ (unchanged) | |

To re-enable UI: set "enabled": true and restart the service.

Google OAuth client secret

sudo mkdir -p /etc/pipeline-monitor
echo 'YOUR_CLIENT_SECRET' | sudo tee /etc/pipeline-monitor/google_client_secret > /dev/null   # do not echo the secret to the terminal
sudo chmod 600 /etc/pipeline-monitor/google_client_secret

Point auth.google.client_secret_path to that file in auth.policy.


6. Properties

6.1 .env file (/opt/pipeline-monitor/.env)

Required (set manually on first install)

| Property | Description |
| --- | --- |
| PIPELINE_SMTP_HOST | SMTP server hostname |
| PIPELINE_SMTP_PORT | SMTP port (e.g. 587) |
| PIPELINE_SMTP_USERNAME | SMTP username |
| PIPELINE_SMTP_PASSWORD | SMTP password |
| PIPELINE_FROM_NAME | Sender display name |
| PIPELINE_MAIL_TO | Alert email recipients (comma-separated) |
| PIPELINE_API_BASE_URL | Inferyx API URL (no name= in URL) |
| PIPELINE_API_TOKEN | API authentication token |
| PIPELINE_API_TOKEN_HEADER | Header name (e.g. token or Authorization) |
| PIPELINE_DEVOPS_EMAIL | Recipient for no_data alerts |
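
Before the first start it can help to confirm that every required key has a value. This is a minimal sketch that assumes plain KEY=value lines; the helper names are hypothetical and the parser is deliberately simpler than whatever loader the package uses.

```python
REQUIRED_KEYS = [
    "PIPELINE_SMTP_HOST",
    "PIPELINE_SMTP_PORT",
    "PIPELINE_SMTP_USERNAME",
    "PIPELINE_SMTP_PASSWORD",
    "PIPELINE_FROM_NAME",
    "PIPELINE_MAIL_TO",
    "PIPELINE_API_BASE_URL",
    "PIPELINE_API_TOKEN",
    "PIPELINE_API_TOKEN_HEADER",
    "PIPELINE_DEVOPS_EMAIL",
]


def parse_env(text: str) -> dict:
    """Parse KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(text: str) -> list:
    """Return required keys that are absent or set to an empty value."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

An empty list from `missing_required` means all ten required properties from the table are set to a non-empty value.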

Optional — scheduling and API

| Property | Default | Description | Since |
| --- | --- | --- | --- |
| PIPELINE_CHECK_MODE | schedule_windows | schedule_windows (recommended) or full_window | 1.0.15 |
| PIPELINE_CHECK_WINDOW_MINUTES | 10 | Minutes around start/end to poll API | 1.0.15 |
| PIPELINE_CHECK_INTERVAL | 60 | Seconds between checks (legacy mode) | original |
| PIPELINE_SCHEDULE_GRACE_MINUTES | 5 | Grace after expected start/end | original |
| PIPELINE_POST_RUN_GRACE_MINUTES | 60 | Grace after run completes | original |
| PIPELINE_API_FILTER_BY_SCHEDULE_DATE | false | Add schedule date to API query | original |
| PIPELINE_API_RETRY_COUNT | 3 | API retry attempts | original |
| PIPELINE_API_RETRY_BACKOFF_SEC | 5 | Seconds between API retries | original |
| PIPELINE_ALERT_COOLDOWN_MINUTES | 60 | Minutes before repeat alert | original |
| PIPELINE_FAILED_ALERT_ONCE_PER_DAY | true | Limit failed alerts per day | original |
| PIPELINE_ALERT_ONCE_PER_DAY_ALL_SCENARIOS | true | Limit all alert types per day | original |
| PIPELINE_MAIL_CC | (empty) | CC recipients | original |
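
The cooldown and once-per-day settings both limit repeat alerts. How the package combines them is not documented here, so the sketch below is one plausible reading, with a hypothetical function name: the once-per-day rule blocks a second alert on the same calendar day, and the cooldown applies otherwise.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_send_alert(
    now: datetime,
    last_sent: Optional[datetime],
    cooldown_minutes: int = 60,
    once_per_day: bool = True,
) -> bool:
    """Decide whether a repeat alert for the same batch and issue may go out.

    Assumed semantics (not confirmed by the package): no earlier alert always
    sends; once_per_day suppresses a repeat on the same date; otherwise the
    cooldown must have fully elapsed.
    """
    if last_sent is None:
        return True
    if once_per_day and last_sent.date() == now.date():
        return False
    return now - last_sent >= timedelta(minutes=cooldown_minutes)
```

The defaults mirror PIPELINE_ALERT_COOLDOWN_MINUTES=60 and the two once-per-day flags being true.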

Optional — Teams / Google Chat

| Property | Default | Description | Since |
| --- | --- | --- | --- |
| PIPELINE_TEAMS_ENABLED | false | Enable Teams webhooks | 1.0.25 |
| PIPELINE_TEAMS_WEBHOOK_URL | (empty) | Teams incoming webhook URL | 1.0.25 |
| PIPELINE_GCHAT_ENABLED | false | Enable Google Chat webhooks | 1.0.25 |
| PIPELINE_GCHAT_WEBHOOK_URL | (empty) | Google Chat webhook URL | 1.0.25 |
| PIPELINE_CHAT_ALERT_NO_DATA | false | Also send no_data to chat (default: email only) | 1.0.25 |
| PIPELINE_CHAT_GREETING | (built-in) | Chat message greeting | 1.0.25 |
| PIPELINE_CHAT_SIGNATURE | (built-in) | Chat message signature | 1.0.25 |

Test chat: inferyx-monitoring --test-chat-alerts --work-dir /opt/pipeline-monitor
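
For orientation, an incoming webhook is an HTTPS POST with a JSON body. The sketch below builds such a request with a plain `{"text": ...}` payload, which simple Teams and Google Chat incoming webhooks accept; the message format the package actually sends is not shown here, and the function name and URL are hypothetical.

```python
import json
import urllib.request


def build_chat_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST for an incoming chat webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json; charset=UTF-8"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL; a real one would come from PIPELINE_GCHAT_WEBHOOK_URL
    # or PIPELINE_TEAMS_WEBHOOK_URL.
    req = build_chat_request("https://chat.example.invalid/hook", "daily_report failed")
    print(req.get_method(), req.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(req, timeout=10)`; it is left out so the sketch has no network side effects.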

Optional — email subjects

| Property | Description | Since |
| --- | --- | --- |
| PIPELINE_MAIL_SUBJECT_DEFAULT | Default subject template | original |
| PIPELINE_MAIL_SUBJECT_NO_DATA | Subject for no_data | original |
| PIPELINE_MAIL_SUBJECT_FAILED | Subject for failed | original |
| PIPELINE_MAIL_SUBJECT_RUNNING | Subject for running | original |
| PIPELINE_MAIL_SUBJECT_MISSED | Subject for missed | original |

Placeholders: {batch_name}, {issue_type_upper}, etc.
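
The placeholder syntax matches Python's `str.format` fields, so a subject template can be previewed as below. This is a sketch under that assumption: the example template is made up, and only the two placeholders named above are filled in.

```python
def render_subject(template: str, batch_name: str, issue_type: str) -> str:
    """Fill {batch_name} and {issue_type_upper} in a subject template.

    Any other placeholder in the template raises KeyError here, because the
    full placeholder list is not documented in this section.
    """
    return template.format(
        batch_name=batch_name,
        issue_type_upper=issue_type.upper(),
    )


if __name__ == "__main__":
    # Hypothetical template value for PIPELINE_MAIL_SUBJECT_FAILED.
    print(render_subject("[{issue_type_upper}] Batch {batch_name}", "daily_report", "failed"))
    # prints: [FAILED] Batch daily_report
```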

Optional — email layout (structured HTML)

| Property | Description | Since |
| --- | --- | --- |
| PIPELINE_MAIL_GREETING | Email greeting line | 1.0.17 |
| PIPELINE_MAIL_ENVIRONMENT | Environment label in email | original |
| PIPELINE_MAIL_HEADING_BATCH_DETAILS | Section heading | 1.0.19 |
| PIPELINE_MAIL_HEADING_ADDITIONAL_INFO | Section heading | 1.0.19 |
| PIPELINE_MAIL_HEADING_ALERT_SUMMARY | Section heading | 1.0.19 |
| PIPELINE_MAIL_HEADING_ACTION_REQUIRED | Section heading | 1.0.19 |
| PIPELINE_MAIL_HEADING_CURRENT_STATUS | Section heading | 1.0.19 |
| PIPELINE_MAIL_INTRO_RUNNING | Body section (running) | 1.0.19 |
| PIPELINE_MAIL_SUMMARY_RUNNING | Body section (running) | 1.0.19 |
| PIPELINE_MAIL_ACTION_RUNNING | Body section (running) | 1.0.19 |
| PIPELINE_MAIL_STATUS_RUNNING | Body section (running) | 1.0.19 |
| PIPELINE_MAIL_SIGNATURE | Email signature HTML | original |
| PIPELINE_MAIL_FOOTER_NOTE | Footer note HTML | original |

Similar PIPELINE_MAIL_INTRO_*, SUMMARY_*, ACTION_*, STATUS_* exist for failed, missed, no_data (built-in defaults if omitted).

Retired (do not add — removed on upgrade)

| Property | Replaced by | Since retired |
| --- | --- | --- |
| PIPELINE_MAIL_BODY_DEFAULT | Structured sections above | 1.0.20 |
| PIPELINE_MAIL_BODY_NO_DATA | Structured sections above | 1.0.20 |
| PIPELINE_MAIL_BODY_FAILED | Structured sections above | 1.0.20 |
| PIPELINE_MAIL_BODY_RUNNING | Structured sections above | 1.0.20 |
| PIPELINE_MAIL_BODY_MISSED | Structured sections above | 1.0.20 |

6.2 auth.policy (/etc/pipeline-monitor/auth.policy)

Admin UI only. Not updated by pip. JSON format. Copy from $IM_PKG/auth.policy.example.

ui section

| Property | Example | Description | Since |
| --- | --- | --- | --- |
| enabled | true | Turn Admin UI on/off | 1.0.27 |
| session_timeout_minutes | 60 | Login session length | 1.0.27 |
| public_base_url | https://monitor.example.com | Public site URL (no trailing path) | 1.0.27 |
| ui_base_path | /monitoring/admin/ | UI path under public URL | 1.0.35 (was /admin/) |

Browser UI URL = public_base_url + ui_base_path → https://monitor.example.com/monitoring/admin/
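
The concatenation is simple, but stray or missing slashes are an easy mistake. A small sketch (hypothetical helper name) that normalizes both parts before joining:

```python
def browser_ui_url(public_base_url: str, ui_base_path: str) -> str:
    """Join the two auth.policy values with exactly one slash between them."""
    return public_base_url.rstrip("/") + "/" + ui_base_path.strip("/") + "/"


if __name__ == "__main__":
    print(browser_ui_url("https://monitor.example.com", "/monitoring/admin/"))
    # prints: https://monitor.example.com/monitoring/admin/
```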

auth section (OAuth)

| Property | Description | Since |
| --- | --- | --- |
| auth.google.enabled | Enable Google login | 1.0.27 |
| auth.google.client_id | Google OAuth client ID | 1.0.27 |
| auth.google.client_secret_path | File containing client secret | 1.0.27 |
| auth.google.redirect_uri | https://<host>/monitoring/api/auth/callback/google | 1.0.35 path |
| auth.google.allowed_domains | Allowed email domains | 1.0.27 |
| auth.aws_idc.* | AWS IAM Identity Center OAuth (same pattern) | 1.0.27 |

Register redirect_uri in Google / AWS OAuth console.
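
As an illustration of what allowed_domains restricts, the sketch below checks the domain part of a login email against a list. The function name and the case-insensitive exact match are assumptions; the Admin API may apply different rules (for example, subdomain handling).

```python
def email_allowed(email: str, allowed_domains: list) -> bool:
    """Return True if the email's domain is in allowed_domains (exact match)."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        return False
    return domain.lower() in {d.lower() for d in allowed_domains}


if __name__ == "__main__":
    print(email_allowed("ops@example.com", ["example.com"]))
    print(email_allowed("ops@other.org", ["example.com"]))
```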

paths section

| Property | Default | Description | Since |
| --- | --- | --- | --- |
| work_dir | /opt/pipeline-monitor | Monitor install directory | 1.0.27 |
| env_file | /opt/pipeline-monitor/.env | Path to .env | 1.0.27 |
| csv_file | /opt/pipeline-monitor/batch_file.csv | Path to batch CSV | 1.0.27 |
| audit_log | /var/log/pipeline-monitor/admin-audit.log | Admin audit log | 1.0.27 |
| session_secret_path | /etc/pipeline-monitor/session_secret | Session signing secret file | 1.0.27 |

branding section

| Property | Description | Since |
| --- | --- | --- |
| app_name | UI title | 1.0.27 |
| logo_url | Logo image URL | 1.0.27 |
| footer_text | Footer text | 1.0.27 |
| primary_color | Theme color (hex) | 1.0.27 |
| favicon_url | Favicon URL | 1.0.27 |

security section

| Property | Default | Description | Since |
| --- | --- | --- | --- |
| restart_service_on_save | true | Restart monitor after config save from UI | 1.0.27 |
| service_name | inferyx-monitoring | systemd service to restart | 1.0.27 |
| rate_limit_per_minute | 120 | API rate limit per IP | 1.0.27 |

7. Batch CSV format

7.1 Current format (batch_file.csv)

File: /opt/pipeline-monitor/batch_file.csv (since 1.0.15; was jfl_batch.csv)

Example:

Name,Frequency,ExpectedStartTime,AvgExecutionTime,ExpectedDayOfMonth,Status
daily_report,Daily,9:00:00,"10 mins",,Active
monthly_close,Monthly,6:00:00,"45 mins",1,Active
adhoc_job,Once,14:30:00,"5 mins",,Suspended

Use 24-hour times. Quote values that contain commas.
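
The file is standard CSV with a header row, so it can be inspected with Python's csv module. A minimal sketch (hypothetical function name) that lists the rows the monitor would consider, assuming only rows whose Status is Active are monitored:

```python
import csv
import io


def active_batches(csv_text: str) -> list:
    """Return rows (as dicts keyed by header name) whose Status is Active."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if (row.get("Status") or "").strip().lower() == "active"
    ]


if __name__ == "__main__":
    sample = (
        "Name,Frequency,ExpectedStartTime,AvgExecutionTime,ExpectedDayOfMonth,Status\n"
        'daily_report,Daily,9:00:00,"10 mins",,Active\n'
        'monthly_close,Monthly,6:00:00,"45 mins",1,Active\n'
        'adhoc_job,Once,14:30:00,"5 mins",,Suspended\n'
    )
    for row in active_batches(sample):
        print(row["Name"], row["Frequency"], row["ExpectedStartTime"])
```

For the example file above this prints the daily_report and monthly_close rows and skips adhoc_job.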

7.2 Column reference

| Column | Required | Description | Since |
| --- | --- | --- | --- |
| Name | Yes | Batch name as known to Inferyx API | original |
| Frequency | Yes | Daily, Weekly, Monthly, or Once | original |
| ExpectedStartTime | Yes | Expected start time (HH:MM:SS) | original |
| AvgExecutionTime | Yes | Typical duration (e.g. "10 mins", "1 hour") | original |
| ExpectedDayOfMonth | No | Day of month for monthly jobs (1–31); empty for Daily/Weekly | original |
| Status | Yes* | Active or Suspended; only Active rows are monitored | 1.0.15 |

*If Status column is missing, upgrade/migration adds it and sets existing rows to Active.
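
ExpectedStartTime and AvgExecutionTime together give the expected end time that the running alert is measured against. The sketch below parses both columns; the set of accepted duration units is an assumption based on the two examples ("10 mins", "1 hour"), and the function names are hypothetical.

```python
import re
from datetime import date, datetime, timedelta

# Assumed unit spellings; the package may accept more or fewer.
_UNIT_SECONDS = {
    "min": 60, "mins": 60, "minute": 60, "minutes": 60,
    "hour": 3600, "hours": 3600, "hr": 3600, "hrs": 3600,
}


def parse_duration(text: str) -> timedelta:
    """Parse values such as "10 mins" or "1 hour" into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if not match or match.group(2).lower() not in _UNIT_SECONDS:
        raise ValueError(f"unrecognized duration: {text!r}")
    return timedelta(seconds=int(match.group(1)) * _UNIT_SECONDS[match.group(2).lower()])


def expected_end(start_text: str, duration_text: str, day: date) -> datetime:
    """Combine a schedule date, HH:MM:SS start time, and duration."""
    start_time = datetime.strptime(start_text, "%H:%M:%S").time()
    return datetime.combine(day, start_time) + parse_duration(duration_text)
```

For the daily_report row (start 9:00:00, "10 mins"), `expected_end` gives 09:10:00 on the schedule date; the grace period from PIPELINE_SCHEDULE_GRACE_MINUTES would be added on top of that.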

7.3 Legacy format (before 1.0.15)

| Item | Old | Current |
| --- | --- | --- |
| Filename | jfl_batch.csv | batch_file.csv |
| Columns | Same except no Status column | Includes Status |
| systemd --csv-file | May point to jfl_batch.csv | Use batch_file.csv |

Migration on start copies jfl_batch.csv to batch_file.csv (original kept) and adds Status if missing.

7.4 Frequency values

| Value | Meaning |
| --- | --- |
| Daily | Runs every day at ExpectedStartTime |
| Weekly | Runs weekly at ExpectedStartTime |
| Monthly | Runs on ExpectedDayOfMonth at ExpectedStartTime |
| Once | Single scheduled run |

7.5 Status values

| Value | Meaning |
| --- | --- |
| Active | Monitored |
| Suspended | Ignored (not polled) |

Also accepted by Admin UI: Inactive, Disabled (treated as not active).
