Monitor batch pipelines via API and email alerts — install and deploy on Linux servers from PyPI

inferyx-monitoring — Server deployment

Monitor batch jobs from CSV, poll status via API, and send email alerts for failures, missed runs, long-running jobs, and missing API data.

Package: inferyx-monitoring
CLI: inferyx-monitoring


What's new in 1.0.15

  • PIPELINE_CHECK_MODE=schedule_windows — Option B: check each batch only in two short windows per schedule slot (start + end), not all day.
  • PIPELINE_CHECK_WINDOW_MINUTES — how long each check window stays open (default 10 minutes).
  • Status column: Active monitors the batch; Suspended skips it.
  • Timezone fix — API timestamps with timezone no longer cause compare errors.
  • CSV repair — auto-splits glued rows like Activebatch_other or Statusbatch_other.
  • API URL fix — always sends correct name= per batch; empty API body treated as no records.
  • PIPELINE_API_FILTER_BY_SCHEDULE_DATE=false — recommended for Inferyx API (default).
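
The timezone fix relates to a general Python pitfall: comparing a timezone-aware datetime with a naive one raises TypeError. The sketch below shows one way to normalize both sides before comparing. It is an illustration only, not the package's code; the helper name as_utc, the sample timestamps, and the choice of IST for the naive value are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def as_utc(value: datetime, assume_tz: timezone = timezone.utc) -> datetime:
    """Return an aware UTC datetime; naive values are assumed to be in assume_tz."""
    if value.tzinfo is None:
        value = value.replace(tzinfo=assume_tz)
    return value.astimezone(timezone.utc)


# Hypothetical values: an API timestamp with an offset and a naive schedule time.
api_start = datetime.fromisoformat("2024-01-15T07:02:00+05:30")
expected = datetime(2024, 1, 15, 7, 0)  # naive, taken to be IST in this example
ist = timezone(timedelta(hours=5, minutes=30))

# Comparing api_start with expected directly would raise TypeError (aware vs naive).
late_by = as_utc(api_start) - as_utc(expected, ist)
print(late_by)  # 0:02:00
```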

Check modes

| PIPELINE_CHECK_MODE | Behavior |
| --- | --- |
| schedule_windows | Recommended. START window at ExpectedStartTime + grace; END window at ExpectedStartTime + AvgExecutionTime + grace. No API calls outside those windows. |
| full_window | Legacy: poll from expected start until end of day. |

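Example: a Daily batch with ExpectedStartTime 7:00, AvgExecutionTime "10 mins", Status Active, PIPELINE_SCHEDULE_GRACE_MINUTES=5, and PIPELINE_CHECK_WINDOW_MINUTES=10: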

| Window | Time | Alerts |
| --- | --- | --- |
| START | 7:05 – 7:15 | missed, no_data, failed |
| END | 7:15 – 7:25 | long-running, failed |
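
The window arithmetic in this example can be reproduced with a few lines of Python. This is a minimal sketch of the formulas stated under "Check modes", not the package's implementation; the function name check_windows and the date used are hypothetical.

```python
from datetime import datetime, timedelta


def check_windows(expected_start, avg_minutes, grace_minutes=5, window_minutes=10):
    """Return ((start_open, start_close), (end_open, end_close)) for one schedule slot."""
    grace = timedelta(minutes=grace_minutes)
    window = timedelta(minutes=window_minutes)
    start_open = expected_start + grace  # ExpectedStartTime + grace
    end_open = expected_start + timedelta(minutes=avg_minutes) + grace  # + AvgExecutionTime
    return (start_open, start_open + window), (end_open, end_open + window)


slot = datetime(2024, 1, 15, 7, 0)  # arbitrary date, 7:00 expected start
start_win, end_win = check_windows(slot, avg_minutes=10)
for label, (opens, closes) in (("START", start_win), ("END", end_win)):
    print(label, opens.strftime("%H:%M"), "-", closes.strftime("%H:%M"))
# START 07:05 - 07:15
# END 07:15 - 07:25
```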

Paths

| Item | Path |
| --- | --- |
| Install directory | /opt/pipeline-monitor |
| Config | /opt/pipeline-monitor/.env |
| Batch list | /opt/pipeline-monitor/jfl_batch.csv |
| Log file | /opt/pipeline-monitor/pipeline_script.log |
| Python venv | /opt/pipeline-monitor/.venv |
| Service user | inferyx |
| systemd unit | inferyx-monitoring.service |

1. Install

sudo apt update
sudo apt install -y python3 python3-venv python3-pip

Service user — check first; create only if missing:

id inferyx

If the command reports "no such user", create the user:

sudo useradd --system --home-dir /opt/pipeline-monitor --shell /usr/sbin/nologin inferyx

If the user already exists, skip the command above and continue.

sudo mkdir -p /opt/pipeline-monitor
sudo chown inferyx:inferyx /opt/pipeline-monitor

sudo -u inferyx python3 -m venv /opt/pipeline-monitor/.venv
sudo -u inferyx /opt/pipeline-monitor/.venv/bin/pip install --upgrade pip inferyx-monitoring

2. Create config files (first time)

cd /opt/pipeline-monitor
sudo -u inferyx /opt/pipeline-monitor/.venv/bin/inferyx-monitoring --init-config --work-dir /opt/pipeline-monitor

Creates .env and jfl_batch.csv only if missing.
pip install --upgrade never overwrites your live .env or jfl_batch.csv.


3. Configure .env

sudo -u inferyx vi /opt/pipeline-monitor/.env
sudo chmod 600 /opt/pipeline-monitor/.env

Required

| Variable | Description |
| --- | --- |
| PIPELINE_SMTP_HOST | SMTP server (e.g. smtp.office365.com) |
| PIPELINE_SMTP_PORT | SMTP port (usually 587) |
| PIPELINE_SMTP_USERNAME | SMTP login / sender email |
| PIPELINE_FROM_NAME | Display name in From header |
| PIPELINE_SMTP_PASSWORD | SMTP password |
| PIPELINE_MAIL_TO | Primary alert recipients (comma-separated) |
| PIPELINE_API_BASE_URL | API path only; no name= in URL |
| PIPELINE_API_TOKEN | API authentication token |
| PIPELINE_API_TOKEN_HEADER | Header name for token (usually token) |
| PIPELINE_DEVOPS_EMAIL | Recipient for no_data and script-failure alerts |
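
Before starting the service, it can help to confirm that every required variable has a value. The following is a rough sketch of such a check, assuming the .env file uses plain KEY=VALUE lines; the helper names parse_env and missing_required are hypothetical and not part of inferyx-monitoring.

```python
REQUIRED = [
    "PIPELINE_SMTP_HOST", "PIPELINE_SMTP_PORT", "PIPELINE_SMTP_USERNAME",
    "PIPELINE_FROM_NAME", "PIPELINE_SMTP_PASSWORD", "PIPELINE_MAIL_TO",
    "PIPELINE_API_BASE_URL", "PIPELINE_API_TOKEN", "PIPELINE_API_TOKEN_HEADER",
    "PIPELINE_DEVOPS_EMAIL",
]


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(text):
    """Return the required variable names that are absent or empty."""
    values = parse_env(text)
    return [name for name in REQUIRED if not values.get(name)]


sample = "PIPELINE_SMTP_HOST=smtp.office365.com\nPIPELINE_SMTP_PORT=587\nPIPELINE_MAIL_TO=\n"
print(missing_required(sample))
```

To check the real file, pass the contents of /opt/pipeline-monitor/.env to missing_required while running as a user that can read it (the file is mode 600).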

API example

PIPELINE_API_BASE_URL=http://your-host:8080/framework/metadata/getBaseEntityStatusByCriteria
PIPELINE_API_TOKEN=your_token_here
PIPELINE_API_TOKEN_HEADER=token
PIPELINE_API_FILTER_BY_SCHEDULE_DATE=false
  • Use the base path only. The monitor adds name=<batch> for each row in jfl_batch.csv.
  • Keep PIPELINE_API_FILTER_BY_SCHEDULE_DATE=false unless your API supports startDate / endDate.
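
To illustrate why the base URL must not contain name=, here is a sketch of how a per-batch request could be assembled from these settings, together with the "empty body means no records" rule from the release notes. It uses only the Python standard library; the function names build_request and parse_records are hypothetical, the JSON response shape is an assumption, and the real monitor may add further query parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_request(base_url, batch_name, token, token_header="token"):
    """Append name=<batch> to the base URL and attach the token header."""
    separator = "&" if "?" in base_url else "?"
    url = f"{base_url}{separator}{urlencode({'name': batch_name})}"
    return Request(url, headers={token_header: token})


def parse_records(body):
    """Treat an empty response body as 'no records' instead of a JSON error."""
    return json.loads(body) if body.strip() else []


req = build_request(
    "http://your-host:8080/framework/metadata/getBaseEntityStatusByCriteria",
    "batch_appsflyer",
    "your_token_here",
)
print(req.full_url)
print(parse_records(""))  # []
```

If the base URL already carried a fixed name=, every batch would be queried with that same name, which matches the "Wrong batch in API URL" symptom under Troubleshooting.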

Email templates (optional)

| Variable | Purpose |
| --- | --- |
| PIPELINE_MAIL_CC | CC recipients |
| PIPELINE_MAIL_SUBJECT_* | Subject per alert: DEFAULT, NO_DATA, FAILED, RUNNING, MISSED |
| PIPELINE_MAIL_BODY_* | HTML body per alert (same keys as subjects) |
| PIPELINE_MAIL_SIGNATURE | HTML footer on every alert |

Placeholders: {batch_name}, {issue_type}, {issue_type_upper}, {status}, {error}, {error_line}, {expected_start_time}, {expected_end_time}, {frequency}, {avg_time}, {current_time}, {signature}
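
As a rough illustration of how such placeholders behave, the sketch below fills a subject template using Python's brace-style formatting. Whether the package uses this exact mechanism is an assumption; the render helper and the KeepMissing class are hypothetical, and the sample values are made up.

```python
class KeepMissing(dict):
    """Leave unknown placeholders untouched instead of raising KeyError."""

    def __missing__(self, key):
        return "{" + key + "}"


def render(template, **values):
    """Substitute {placeholder} fields from values, keeping unknown ones as-is."""
    return template.format_map(KeepMissing(values))


subject = "[{issue_type_upper}] {batch_name} expected at {expected_start_time}"
print(render(subject, issue_type_upper="MISSED",
             batch_name="batch_appsflyer", expected_start_time="0:30:00"))
# [MISSED] batch_appsflyer expected at 0:30:00

print(render("{batch_name}: {error}", batch_name="batch_appsflyer"))
# batch_appsflyer: {error}
```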

Scheduling (optional)

| Variable | Default | Meaning |
| --- | --- | --- |
| PIPELINE_CHECK_MODE | full_window | schedule_windows = check only at start/end windows; full_window = poll all day |
| PIPELINE_CHECK_WINDOW_MINUTES | 10 | Minutes each start/end check window stays open |
| PIPELINE_CHECK_INTERVAL | 60 | Seconds between service poll cycles |
| PIPELINE_SCHEDULE_GRACE_MINUTES | 5 | Minutes after expected start/end before alerts |
| PIPELINE_POST_RUN_GRACE_MINUTES | 60 | Used only when CHECK_MODE=full_window |
| PIPELINE_ALERT_COOLDOWN_MINUTES | 60 | Cooldown between repeat alerts |
| PIPELINE_FAILED_ALERT_ONCE_PER_DAY | true | One failed alert per day per batch |
| PIPELINE_ALERT_ONCE_PER_DAY_ALL_SCENARIOS | true | One alert per issue type per day |
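
One plausible reading of how the cooldown and once-per-day settings interact is sketched below. The package's actual decision logic is not documented here, so treat this purely as an assumption; the function should_send and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta


def should_send(last_sent, now, cooldown_minutes=60, once_per_day=True):
    """Decide whether a repeat alert for the same batch and issue type may go out."""
    if last_sent is None:
        return True  # nothing sent yet for this batch/issue
    if once_per_day and last_sent.date() == now.date():
        return False  # already alerted today
    return now - last_sent >= timedelta(minutes=cooldown_minutes)


first = datetime(2024, 1, 15, 7, 20)
print(should_send(None, first))                                             # True
print(should_send(first, first + timedelta(hours=2)))                       # False
print(should_send(first, first + timedelta(hours=2), once_per_day=False))   # True
print(should_send(first, first + timedelta(minutes=30), once_per_day=False))  # False
```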

Recommended production settings:

PIPELINE_CHECK_MODE=schedule_windows
PIPELINE_CHECK_WINDOW_MINUTES=10
PIPELINE_SCHEDULE_GRACE_MINUTES=5

4. Configure jfl_batch.csv

sudo -u inferyx vi /opt/pipeline-monitor/jfl_batch.csv

Rules:

  • One batch per line: each row ends with its Status value (Active or Suspended), and the next batch starts on a new line.
  • Do not glue rows (bad: Activebatch_other or Statusbatch_other).
  • Quote values with commas or spaces: "09:00,10:00", "3 mins".
  • Use 24-hour time: 0:30:00, 16:30:00.

| Column | Required | Description |
| --- | --- | --- |
| Name | Yes | Batch name as in the API |
| Frequency | Yes | Daily, Hourly, Monthly, weekday name, etc. |
| ExpectedStartTime | Scheduled | 9:30:00 or "09:00,10:00,11:00" |
| AvgExecutionTime | Recommended | "3 mins", "10 mins", "1 Hr" |
| ExpectedDayOfMonth | Monthly | Day 1–31 |
| Status | Optional | Active or Suspended (default: Active) |
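
Glued rows usually appear when a line break is lost after the Status value. The sketch below shows one way to detect and split them with a regular expression. It is a simplified illustration and not the package's CSV repair code; the function name split_glued_rows is hypothetical, and the pattern assumes no other column value begins with Active, Suspended, or Status.

```python
import re

# A Status value (or the Status header) directly followed by more text marks a glued row.
GLUED = re.compile(r",(Active|Suspended|Status)(?=\S)")


def split_glued_rows(text):
    """Insert the missing line break after Active/Suspended/Status."""
    return GLUED.sub(r",\1\n", text)


broken = (
    "Name,Frequency,ExpectedStartTime,AvgExecutionTime,ExpectedDayOfMonth,Status"
    'batch_a,Daily,7:00:00,"10 mins",,Active'
    'batch_b,Daily,8:00:00,"3 mins",,Suspended\n'
)
print(split_glued_rows(broken))
```

Running the same function on a file that is already well-formed leaves it unchanged, because a Status value followed by a line break does not match the pattern.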

Example

Name,Frequency,ExpectedStartTime,AvgExecutionTime,ExpectedDayOfMonth,Status
batch_appdb_events,Daily,"09:00,10:00,11:00,12:00","3 mins",,Active
batch_appsflyer,Daily,0:30:00,"10 mins",,Active
batch_appdb_jiosense_bronze_silver,Daily,"01:00,08:00,15:00,21:00","12 mins",,Active
dashboard_batch_my_finances,Daily,5:45:00,"10 mins",,Active
dashboard_batch_ppl_rewards,Daily,6:50:00,"4 mins",,Active
batch_ignosis_part1,Daily,11:00:00,"80 mins",,Active
dashboard_batch_agentic,Daily,8:40:00,"10 mins",,Suspended
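
To see how these rules play out, the sketch below reads a few of the example rows with Python's csv module, splits quoted multi-slot start times, converts AvgExecutionTime to minutes, and skips Suspended batches. It only illustrates the file format; the helper names to_minutes and load_active are hypothetical, and the monitor's own parser may accept more variants.

```python
import csv
import io
import re

SAMPLE = '''Name,Frequency,ExpectedStartTime,AvgExecutionTime,ExpectedDayOfMonth,Status
batch_appdb_events,Daily,"09:00,10:00,11:00,12:00","3 mins",,Active
batch_ignosis_part1,Daily,11:00:00,"80 mins",,Active
dashboard_batch_agentic,Daily,8:40:00,"10 mins",,Suspended
'''


def to_minutes(text):
    """Convert values like '3 mins' or '1 Hr' to minutes; None if unrecognized."""
    match = re.fullmatch(r"\s*(\d+)\s*(mins?|hrs?|hours?)\s*", text, re.IGNORECASE)
    if not match:
        return None
    amount, unit = int(match.group(1)), match.group(2).lower()
    return amount * 60 if unit.startswith("h") else amount


def load_active(text):
    """Return (name, start_slots, avg_minutes) for every batch that is not Suspended."""
    batches = []
    for row in csv.DictReader(io.StringIO(text)):
        status = (row.get("Status") or "Active").strip()  # Status defaults to Active
        if status.lower() != "active":
            continue
        slots = [s.strip() for s in row["ExpectedStartTime"].split(",") if s.strip()]
        batches.append((row["Name"], slots, to_minutes(row["AvgExecutionTime"])))
    return batches


for batch in load_active(SAMPLE):
    print(batch)
# ('batch_appdb_events', ['09:00', '10:00', '11:00', '12:00'], 3)
# ('batch_ignosis_part1', ['11:00:00'], 80)
```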

5. Test

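Run the command as one line. Avoid \ line continuations: if anything (such as a trailing space) follows the backslash, the continuation breaks and the CLI reports unrecognized arguments errors: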

sudo -u inferyx /opt/pipeline-monitor/.venv/bin/inferyx-monitoring --once --work-dir /opt/pipeline-monitor --env-file /opt/pipeline-monitor/.env --csv-file /opt/pipeline-monitor/jfl_batch.csv
tail -100 /opt/pipeline-monitor/pipeline_script.log

6. Start service (systemd)

sudo tee /etc/systemd/system/inferyx-monitoring.service <<'EOF'
[Unit]
Description=Inferyx Pipeline Batch Monitor
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=inferyx
Group=inferyx
WorkingDirectory=/opt/pipeline-monitor
ExecStart=/opt/pipeline-monitor/.venv/bin/inferyx-monitoring --work-dir /opt/pipeline-monitor --env-file /opt/pipeline-monitor/.env --csv-file /opt/pipeline-monitor/jfl_batch.csv
Restart=always
RestartSec=10
Environment=PYTHONUNBUFFERED=1
Environment=PIPELINE_ENV_FILE=/opt/pipeline-monitor/.env
Environment=PIPELINE_LOG_FILE=/opt/pipeline-monitor/pipeline_script.log
Environment=PIPELINE_CSV_FILE=/opt/pipeline-monitor/jfl_batch.csv

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now inferyx-monitoring.service
sudo systemctl status inferyx-monitoring.service

Logs:

sudo journalctl -u inferyx-monitoring.service -f

7. Upgrade package

When a newer version is available on PyPI:

sudo -u inferyx /opt/pipeline-monitor/.venv/bin/pip install --upgrade inferyx-monitoring
sudo systemctl restart inferyx-monitoring.service

Your .env and jfl_batch.csv are not changed by upgrade.


Troubleshooting

| Symptom | Fix |
| --- | --- |
| unrecognized arguments: | Run the test command as one line, with no \ at the end of lines |
| API error / empty response | PIPELINE_API_FILTER_BY_SCHEDULE_DATE=false; remove name= from PIPELINE_API_BASE_URL |
| Wrong batch in API URL | Base URL must not contain a fixed batch name |
| CSV errors / wrong batches | One batch per line; fix glued rows like Activebatch_... |
| No email | Check SMTP settings in .env |
| Skip a batch | Status=Suspended in CSV |
