Monitor batch pipelines via API and email alerts — install and deploy on Linux servers from PyPI

inferyx-monitoring — Server deployment

Monitor batch jobs from CSV, poll status via API, and send email alerts for failures, missed runs, long-running jobs, and missing API data.

Package: inferyx-monitoring
CLI: inferyx-monitoring


Order of steps

  1. Build machine: publish a new version to PyPI (see docs/PYPI_PUBLISHING.md).
  2. Target server: install with pip install, or update with pip install --upgrade, from PyPI.
  3. Target server: configure /opt/pipeline-monitor/.env and jfl_batch.csv, test, then enable the systemd service.

pip install --upgrade updates code only. It never overwrites your .env or jfl_batch.csv.


Paths

Item               Path
Install directory  /opt/pipeline-monitor
Config             /opt/pipeline-monitor/.env
Batch list         /opt/pipeline-monitor/jfl_batch.csv
Log file           /opt/pipeline-monitor/pipeline_script.log
Python venv        /opt/pipeline-monitor/venv
Service user       inferyx
systemd unit       inferyx-monitoring.service

1. Install on server (from PyPI)

sudo apt update
sudo apt install -y python3 python3-venv python3-pip

sudo useradd --system --home-dir /opt/pipeline-monitor --shell /usr/sbin/nologin inferyx || true
sudo mkdir -p /opt/pipeline-monitor
sudo chown inferyx:inferyx /opt/pipeline-monitor

sudo -u inferyx python3 -m venv /opt/pipeline-monitor/venv
sudo -u inferyx /opt/pipeline-monitor/venv/bin/pip install --upgrade pip inferyx-monitoring

2. Create config files (first time)

cd /opt/pipeline-monitor
sudo -u inferyx /opt/pipeline-monitor/venv/bin/inferyx-monitoring \
  --init-config \
  --work-dir /opt/pipeline-monitor

This creates .env and jfl_batch.csv only if they are missing; existing files are left untouched. Reference templates: .env.example, jfl_batch.csv.example.


3. Configure .env

sudo -u inferyx vi /opt/pipeline-monitor/.env
sudo chmod 600 /opt/pipeline-monitor/.env

Required

Variable                   Description
PIPELINE_SMTP_HOST         SMTP server (e.g. smtp.office365.com)
PIPELINE_SMTP_PORT         SMTP port (usually 587)
PIPELINE_SMTP_USERNAME     SMTP login / sender email
PIPELINE_FROM_NAME         Display name in From header
PIPELINE_SMTP_PASSWORD     SMTP password
PIPELINE_MAIL_TO           Primary alert recipients (comma-separated)
PIPELINE_API_BASE_URL      API base path only; no name= in the URL (see example below)
PIPELINE_API_TOKEN         API authentication token
PIPELINE_API_TOKEN_HEADER  Header name for token (usually token)
PIPELINE_DEVOPS_EMAIL      Recipient for no_data and script-failure alerts
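
Before the first test run it can help to confirm that every required variable is set. The following is a minimal sketch, not part of the package: the helper names parse_env_text and missing_required are hypothetical, and the parser assumes plain KEY=VALUE lines with optional quotes, which may differ from how the monitor itself reads .env.

```python
from __future__ import annotations

# Required variable names, copied from the "Required" table in this README.
REQUIRED_VARS = [
    "PIPELINE_SMTP_HOST",
    "PIPELINE_SMTP_PORT",
    "PIPELINE_SMTP_USERNAME",
    "PIPELINE_FROM_NAME",
    "PIPELINE_SMTP_PASSWORD",
    "PIPELINE_MAIL_TO",
    "PIPELINE_API_BASE_URL",
    "PIPELINE_API_TOKEN",
    "PIPELINE_API_TOKEN_HEADER",
    "PIPELINE_DEVOPS_EMAIL",
]


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines (assumed format); skip blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, if any.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(text: str) -> list[str]:
    """Return the required names that are absent or set to an empty value."""
    values = parse_env_text(text)
    return [name for name in REQUIRED_VARS if not values.get(name)]
```

To use it, read /opt/pipeline-monitor/.env as the inferyx user (the file is mode 600) and pass its text to missing_required; an empty list means every required name has a value.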

API (important)

PIPELINE_API_BASE_URL=http://your-host:8080/framework/metadata/getBaseEntityStatusByCriteria
PIPELINE_API_TOKEN=your_token_here
PIPELINE_API_TOKEN_HEADER=token
PIPELINE_API_FILTER_BY_SCHEDULE_DATE=false

  • Use the base path only. The monitor adds name=<batch> per row in jfl_batch.csv.
  • Keep PIPELINE_API_FILTER_BY_SCHEDULE_DATE=false unless your API supports startDate / endDate filters.
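
To illustrate why the base URL must not carry its own name= parameter, here is a sketch of how a per-batch request could be assembled. It is an assumption about the general shape, not the monitor's actual code: build_status_url and build_headers are hypothetical helpers, and the real query string may contain further parameters.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def build_status_url(base_url: str, batch_name: str) -> str:
    """Append name=<batch> to the base URL; reject a base URL that already has name=."""
    parts = urlsplit(base_url)
    query = parse_qs(parts.query)
    if "name" in query:
        # A fixed name= in the base URL would query the same batch for every CSV row.
        raise ValueError("PIPELINE_API_BASE_URL must not contain name=")
    pairs = [(key, value) for key, values in query.items() for value in values]
    pairs.append(("name", batch_name))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


def build_headers(token_header: str, token: str) -> dict:
    """Build the auth header from PIPELINE_API_TOKEN_HEADER and PIPELINE_API_TOKEN."""
    return {token_header: token}
```

With the example values above, build_status_url(base, "batch_appsflyer") ends in getBaseEntityStatusByCriteria?name=batch_appsflyer, and build_headers("token", "your_token_here") gives a single token header.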

Email templates (optional)

Variable                       Purpose
PIPELINE_MAIL_CC               CC recipients
PIPELINE_MAIL_SUBJECT_DEFAULT  Default subject template
PIPELINE_MAIL_SUBJECT_NO_DATA  No-data alert subject
PIPELINE_MAIL_SUBJECT_FAILED   Failed alert subject
PIPELINE_MAIL_SUBJECT_RUNNING  Long-running alert subject
PIPELINE_MAIL_SUBJECT_MISSED   Missed schedule subject
PIPELINE_MAIL_BODY_*           HTML body per alert type (same keys as subjects)
PIPELINE_MAIL_SIGNATURE        HTML footer appended to every alert

Placeholders: {batch_name}, {issue_type}, {issue_type_upper}, {status}, {error}, {error_line}, {expected_start_time}, {expected_end_time}, {frequency}, {avg_time}, {current_time}, {signature}
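
The placeholders use brace syntax, so a template can be previewed with Python's own string formatting. This sketch assumes the monitor substitutes them in str.format style, which this README does not confirm; render_template is a hypothetical helper, and under this assumption any literal braces in an HTML body would need to be doubled.

```python
class _KeepMissing(dict):
    """Leave unknown placeholders in the output instead of raising KeyError."""

    def __missing__(self, key):
        return "{" + key + "}"


def render_template(template: str, **values: str) -> str:
    """Fill {placeholder} fields; placeholders without a value stay visible."""
    return template.format_map(_KeepMissing(values))


# Example subject template using placeholder names from the list above.
subject = render_template(
    "[{issue_type_upper}] {batch_name}: {status}",
    issue_type_upper="FAILED",
    batch_name="batch_appsflyer",
    status="Failed",
)
```

Leaving unknown placeholders in place makes a typo in a template name easy to spot in the preview.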

Scheduling / alerts (optional)

Variable                                   Default  Meaning
PIPELINE_CHECK_INTERVAL                    60       Seconds between poll cycles
PIPELINE_SCHEDULE_GRACE_MINUTES            5        Minutes after expected start before missed/no-data alerts
PIPELINE_POST_RUN_GRACE_MINUTES            60       Minutes after expected end to keep polling
PIPELINE_ALERT_COOLDOWN_MINUTES            60       Cooldown between repeat alerts
PIPELINE_FAILED_ALERT_ONCE_PER_DAY         true     One failed alert per day per batch
PIPELINE_ALERT_ONCE_PER_DAY_ALL_SCENARIOS  true     One alert per issue type per day
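
The grace and cooldown settings can be read as simple time comparisons. The sketch below is one interpretation of the table, using its default values; is_start_overdue and cooldown_elapsed are hypothetical names, and the monitor's real decision logic may include more conditions (for example the once-per-day flags).

```python
from datetime import datetime, timedelta
from typing import Optional


def is_start_overdue(expected_start: datetime, now: datetime,
                     grace_minutes: int = 5) -> bool:
    """True once 'now' is past the expected start plus the schedule grace period."""
    return now > expected_start + timedelta(minutes=grace_minutes)


def cooldown_elapsed(last_alert: Optional[datetime], now: datetime,
                     cooldown_minutes: int = 60) -> bool:
    """True if no alert was sent yet, or the cooldown since the last one has passed."""
    if last_alert is None:
        return True
    return now - last_alert >= timedelta(minutes=cooldown_minutes)
```

Under this reading, a batch expected at 09:00 with the default 5-minute grace becomes eligible for a missed or no-data alert only after 09:05.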

4. Configure jfl_batch.csv

sudo -u inferyx vi /opt/pipeline-monitor/jfl_batch.csv

Rules:

  • One batch per line; a row must not wrap, and its last field is the Status value (Active or Suspended).
  • Quote values that contain commas or spaces (multiple start times, durations such as "3 mins").
  • Use 24-hour time (0:30:00, 16:30:00).

Columns

Column              Required      Description
Name                Yes           Batch name exactly as in the API
Frequency           Yes           Daily, Hourly, Monthly, Monday, Thursday, etc.
ExpectedStartTime   Scheduled     Single or multiple times: 9:30:00 or "09:00,10:00,11:00"
AvgExecutionTime    Recommended   e.g. "3 mins", "10 mins", "1 Hr"
ExpectedDayOfMonth  Monthly only  Day 1–31
Status              Optional      Active = monitor, Suspended = skip
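
The time and duration formats above can be checked before a row goes into the CSV. This is a sketch under stated assumptions: parse_start_times and parse_duration are hypothetical helpers, and the accepted unit spellings are a guess based on the examples in the table ("mins", "Hr"), not the monitor's actual parser.

```python
import re
from datetime import time, timedelta

# Assumed unit spellings, mapped to minutes.
_UNIT_MINUTES = {
    "min": 1, "mins": 1, "minute": 1, "minutes": 1,
    "hr": 60, "hrs": 60, "hour": 60, "hours": 60,
}


def parse_duration(text: str) -> timedelta:
    """Parse values such as '3 mins' or '1 Hr' into a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if not match or match.group(2).lower() not in _UNIT_MINUTES:
        raise ValueError(f"unrecognised duration: {text!r}")
    amount = int(match.group(1))
    return timedelta(minutes=amount * _UNIT_MINUTES[match.group(2).lower()])


def parse_start_times(text: str) -> list:
    """Parse '9:30:00' or '09:00,10:00,11:00' into a list of datetime.time values."""
    times = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        fields = [int(piece) for piece in part.split(":")]
        fields += [0] * (3 - len(fields))  # allow HH:MM without seconds
        times.append(time(*fields[:3]))
    return times
```

A value that raises ValueError here is worth rechecking against the column rules before the monitor reads it.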

Example

Name,Frequency,ExpectedStartTime,AvgExecutionTime,ExpectedDayOfMonth,Status
batch_appdb_events,Daily,"09:00,10:00,11:00,12:00","3 mins",,Active
batch_appsflyer,Daily,0:30:00,"10 mins",,Active
batch_appdb_jiosense_bronze_silver,Daily,"01:00,08:00,15:00,21:00","12 mins",,Active
dashboard_batch_my_finances,Daily,5:45:00,"10 mins",,Active
dashboard_batch_ppl_rewards,Daily,6:50:00,"4 mins",,Active
batch_ignosis_part1,Daily,11:00:00,"80 mins",,Active
dashboard_batch_agentic,Daily,8:40:00,"10 mins",,Suspended
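
To see how the quoting rules affect parsing, the example can be read with Python's csv module. This is an illustrative sketch, not the monitor's loader: load_active_batches is a hypothetical helper, and treating a missing Status as Active is an assumption based on the column being optional.

```python
import csv
import io


def load_active_batches(csv_text: str) -> list:
    """Return rows (as dicts) whose Status is not Suspended."""
    reader = csv.DictReader(io.StringIO(csv_text))
    active = []
    for row in reader:
        # Assumption: an empty or missing Status means the batch is monitored.
        status = (row.get("Status") or "Active").strip()
        if status.lower() == "suspended":
            continue
        active.append(row)
    return active


sample = '''Name,Frequency,ExpectedStartTime,AvgExecutionTime,ExpectedDayOfMonth,Status
batch_appdb_events,Daily,"09:00,10:00,11:00,12:00","3 mins",,Active
batch_appsflyer,Daily,0:30:00,"10 mins",,Active
dashboard_batch_agentic,Daily,8:40:00,"10 mins",,Suspended
'''
rows = load_active_batches(sample)
```

Because the multi-time value is quoted, it arrives as one ExpectedStartTime field; without the quotes each time would spill into the following columns.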

5. Test

sudo -u inferyx /opt/pipeline-monitor/venv/bin/inferyx-monitoring \
  --once \
  --work-dir /opt/pipeline-monitor \
  --env-file /opt/pipeline-monitor/.env \
  --csv-file /opt/pipeline-monitor/jfl_batch.csv

tail -100 /opt/pipeline-monitor/pipeline_script.log

6. systemd service

sudo tee /etc/systemd/system/inferyx-monitoring.service <<'EOF'
[Unit]
Description=Inferyx Pipeline Batch Monitor
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=inferyx
Group=inferyx
WorkingDirectory=/opt/pipeline-monitor
ExecStart=/opt/pipeline-monitor/venv/bin/inferyx-monitoring --work-dir /opt/pipeline-monitor --env-file /opt/pipeline-monitor/.env --csv-file /opt/pipeline-monitor/jfl_batch.csv
Restart=always
RestartSec=10
Environment=PYTHONUNBUFFERED=1
Environment=PIPELINE_ENV_FILE=/opt/pipeline-monitor/.env
Environment=PIPELINE_LOG_FILE=/opt/pipeline-monitor/pipeline_script.log
Environment=PIPELINE_CSV_FILE=/opt/pipeline-monitor/jfl_batch.csv

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now inferyx-monitoring.service
sudo systemctl status inferyx-monitoring.service

Logs:

sudo journalctl -u inferyx-monitoring.service -f

7. Upgrade (after new PyPI release)

On the build machine — publish first:

make pip-build
twine upload dist/inferyx_monitoring-*

On the target server:

sudo -u inferyx /opt/pipeline-monitor/venv/bin/pip install --upgrade inferyx-monitoring
sudo systemctl restart inferyx-monitoring.service
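
After an upgrade, the installed version can be confirmed from inside the venv with the standard library. A small sketch follows; installed_version is a hypothetical helper name, while importlib.metadata is part of Python 3.8 and later.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "inferyx-monitoring") -> Optional[str]:
    """Return the installed version string, or None if the package is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None
```

Run it with /opt/pipeline-monitor/venv/bin/python so that it inspects the venv rather than the system interpreter.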

Troubleshooting

Symptom                     Fix
API error / empty response  Set PIPELINE_API_FILTER_BY_SCHEDULE_DATE=false; remove name= from PIPELINE_API_BASE_URL
Wrong batch in API URL      Base URL must not contain a fixed batch name
CSV parse errors            One batch per line; quote "3 mins" and "09:00,10:00"
No email                    Check SMTP settings in .env
Skip a batch                Set Status=Suspended in CSV
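
For the "No email" case, the SMTP settings can be tested on their own, outside the monitor. The sketch below assumes the server expects STARTTLS followed by a login, which is typical for port 587 but is not confirmed by this README; check_smtp_login is a hypothetical helper and sends no message.

```python
import smtplib
import ssl
from typing import Optional


def check_smtp_login(host: str, port: int, username: str, password: str,
                     timeout: float = 15.0) -> Optional[str]:
    """Try connect + STARTTLS + login; return None on success or an error description."""
    try:
        with smtplib.SMTP(host, port, timeout=timeout) as server:
            # Assumption: the server supports STARTTLS (common on port 587).
            server.starttls(context=ssl.create_default_context())
            server.login(username, password)
    except (smtplib.SMTPException, OSError) as exc:
        return f"{type(exc).__name__}: {exc}"
    return None
```

Call it with the PIPELINE_SMTP_HOST, PIPELINE_SMTP_PORT, PIPELINE_SMTP_USERNAME and PIPELINE_SMTP_PASSWORD values from .env; a returned string names the failing step, such as a refused connection or rejected credentials.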
