Telegram Log Service — receive ML training logs via HTTP and send real-time alerts through a Telegram bot.

telegram-log-service

telegram-log-service is a server that receives ML training logs over HTTP and sends real-time alerts through a Telegram bot. It is designed to work with messenger-logger-callback.

Architecture

Training Script                      Telegram Log Service                 Telegram
┌──────────────────┐   HTTP POST    ┌─────────────────────┐             ┌──────────┐
│ MessengerLogger  │ ─────────────> │ /api/logs handler   │             │          │
│ or Callback      │   /api/logs    │   ↓                 │   Bot API   │ Telegram │
│ + heartbeat      │                │ global_state        │ ──────────> │ Users    │
└──────────────────┘                │   ↓                 │             │          │
                                    │ alerting → bot      │             └──────────┘
                                    │ staleness_checker   │
                                    └─────────────────────┘

Flow:

1. Training scripts send JSON events (logs, status updates, heartbeats) to POST /api/logs.
2. The web handler updates the in-memory run state and triggers alerts when appropriate.
3. The Telegram bot sends alerts to subscribed users and responds to commands.
4. A background staleness checker detects crashed or stalled runs.
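
The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the service's actual code: `RunState`, `handle_event`, and the alert strings are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class RunState:
    """Hypothetical in-memory record for one training run."""
    started: bool = False
    finished: bool = False
    last_event_type: str = ""
    last_timestamp: str = ""


def handle_event(state: Dict[Tuple[str, str], RunState], event: dict) -> List[str]:
    """Update run state for one event and return the alerts it triggers."""
    key = (event["project_name"], event["run_id"])
    run = state.setdefault(key, RunState())
    alerts: List[str] = []

    event_type = event["event_type"]
    if event_type == "training_started" and not run.started:
        # Only the first training_started event of a run raises an alert.
        run.started = True
        alerts.append(f"Training Started: {key[0]}/{key[1]}")
    elif event_type == "training_finished":
        run.finished = True
        alerts.append(f"Training Finished: {key[0]}/{key[1]}")

    # Every event, including heartbeats, refreshes the "last seen" fields.
    run.last_event_type = event_type
    run.last_timestamp = event["timestamp"]
    return alerts
```

The returned alert list is what a bot layer would forward to subscribed users; stall detection is left to a separate background check.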

Prerequisites

Python 3.8+
A Telegram bot token (create one via @BotFather)

Installation

From source (pip)

git clone https://github.com/Riko0/telegram_log_service.git
cd telegram_log_service
pip install .

Configure

cp .env.example .env
# Edit .env and fill in your TELEGRAM_BOT_TOKEN and ADMIN_TELEGRAM_NAME

Run

After installing, the telegram-log-service command is available system-wide:

telegram-log-service

Or using the Python module:

python -m telegram_log_service

Docker

# From the telegram_log_service directory:
chmod +x deploy/docker/build_docker.sh deploy/scripts/startup.sh
./deploy/docker/build_docker.sh

The Docker image installs the package via pip install . and runs telegram-log-service as the entry point. Pass your .env file via --env-file.

Configuration

All settings are via environment variables (or .env file). See .env.example for a complete template.

Variable	Required	Default	Description
`TELEGRAM_BOT_TOKEN`	Yes	—	Telegram bot token from BotFather
`WEB_SERVER_HOST`	No	`0.0.0.0`	Bind address for the HTTP server
`WEB_SERVER_PORT`	No	`5000`	Port for the HTTP server
`WEB_AUTH_TOKEN`	No	—	If set, `/api/logs` requires `Authorization: Bearer <token>`
`STALL_ALERT_THRESHOLD_SECONDS`	No	`1800`	Seconds without logs before a run is considered stalled
`STALLED_RUN_AUTO_REMOVE_THRESHOLD_SECONDS`	No	`3600`	Seconds before a stalled run is auto-removed
`HEARTBEAT_STALL_THRESHOLD_SECONDS`	No	`300`	Stall threshold for runs sending heartbeats (shorter)
`BEST_METRIC_ALERT_COOLDOWN_SECONDS`	No	`300`	Minimum seconds between best-metric alerts per run
`ADMIN_TELEGRAM_NAME`	No	—	Telegram username (without @) for admin commands
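
As an illustration of how the variables and defaults in this table fit together, here is a minimal sketch of a settings loader. The `Settings` class and `load_settings` function are hypothetical and are not taken from the package; only the variable names and default values come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    telegram_bot_token: str
    web_server_host: str
    web_server_port: int
    web_auth_token: Optional[str]
    stall_alert_threshold_seconds: int
    stalled_run_auto_remove_threshold_seconds: int
    heartbeat_stall_threshold_seconds: int
    best_metric_alert_cooldown_seconds: int
    admin_telegram_name: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping of environment variables."""
    token = env.get("TELEGRAM_BOT_TOKEN")
    if not token:
        # The only required variable in the table.
        raise ValueError("TELEGRAM_BOT_TOKEN is required")
    return Settings(
        telegram_bot_token=token,
        web_server_host=env.get("WEB_SERVER_HOST", "0.0.0.0"),
        web_server_port=int(env.get("WEB_SERVER_PORT", "5000")),
        web_auth_token=env.get("WEB_AUTH_TOKEN") or None,
        stall_alert_threshold_seconds=int(
            env.get("STALL_ALERT_THRESHOLD_SECONDS", "1800")),
        stalled_run_auto_remove_threshold_seconds=int(
            env.get("STALLED_RUN_AUTO_REMOVE_THRESHOLD_SECONDS", "3600")),
        heartbeat_stall_threshold_seconds=int(
            env.get("HEARTBEAT_STALL_THRESHOLD_SECONDS", "300")),
        best_metric_alert_cooldown_seconds=int(
            env.get("BEST_METRIC_ALERT_COOLDOWN_SECONDS", "300")),
        admin_telegram_name=env.get("ADMIN_TELEGRAM_NAME") or None,
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests.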

API

`POST /api/logs`

Receives training events. Requires Authorization: Bearer <token> header if WEB_AUTH_TOKEN is set.

Required fields:

Field	Type	Description
`project_name`	string	Project identifier
`run_id`	string	Unique run identifier
`event_type`	string	One of: `training_started`, `trainer_log`, `epoch_ended`, `training_finished`, `custom_log`, `heartbeat`
`timestamp`	string	ISO 8601 timestamp

Optional fields:

Field	Type	Description
`author_username`	string	Who started the run
`trainer_state`	object	Training state (`global_step`, `epoch`, `is_training`, `best_metric`, etc.)
`logs`	object	Metric key-value pairs (for `trainer_log`)
`custom_data`	object	Arbitrary data (for `custom_log`)
`clearml_link`	string	URL to ClearML dashboard for this run

Any other top-level keys are stored as run metadata.
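
For scripts that do not use messenger-logger-callback, an event can be posted with the standard library alone. The sketch below assumes the server accepts a JSON body with a `Content-Type: application/json` header; `build_log_request` and `send` are hypothetical helper names, and the URL in the usage note is a placeholder.

```python
import json
import urllib.request
from datetime import datetime, timezone
from typing import Any, Optional


def build_log_request(base_url: str, project_name: str, run_id: str,
                      event_type: str, token: Optional[str] = None,
                      **extra: Any) -> urllib.request.Request:
    """Build a POST /api/logs request carrying the four required fields."""
    payload = {
        "project_name": project_name,
        "run_id": run_id,
        "event_type": event_type,
        # ISO 8601 timestamp in UTC.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # Optional fields such as logs, trainer_state or custom_data.
    payload.update(extra)

    headers = {"Content-Type": "application/json"}
    if token:
        # Only needed when WEB_AUTH_TOKEN is set on the server.
        headers["Authorization"] = f"Bearer {token}"

    return urllib.request.Request(
        base_url.rstrip("/") + "/api/logs",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A call such as `send(build_log_request("http://localhost:5000", "my-project", "run-1", "trainer_log", logs={"loss": 0.42}))` would then deliver one `trainer_log` event to a locally running service.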

`GET /health`

Returns server status:

{"status": "ok", "active_runs": 3}
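
A monitoring script could poll this endpoint as follows. This is a small sketch using only the standard library; `parse_health` and `check_health` are hypothetical helper names, and the response shape is assumed to match the example above.

```python
import json
import urllib.request


def parse_health(body: bytes) -> int:
    """Return the number of active runs, or raise if the status is not ok."""
    data = json.loads(body)
    if data.get("status") != "ok":
        raise RuntimeError(f"unexpected health status: {data.get('status')!r}")
    return int(data["active_runs"])


def check_health(base_url: str, timeout: float = 5.0) -> int:
    """Fetch GET /health from the service and return the active run count."""
    url = base_url.rstrip("/") + "/health"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_health(response.read())
```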

Bot Commands

User Commands

Command	Description
`/start`	Register with the bot, auto-subscribe to all runs
`/help`	Show available commands
`/status`	List all active training runs
`/status <project> <run_id>`	Get status of a specific run
`/full_status`	Detailed status for all runs
`/full_status <project> <run_id>`	Detailed status for a specific run
`/subscribe`	Subscribe to all current and future runs
`/subscribe <project> <run_id>`	Subscribe to a specific run
`/unsubscribe`	Unsubscribe from all alerts
`/unsubscribe <project> <run_id>`	Unsubscribe from a specific run
`/list_subscriptions`	List your current subscriptions
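
Most of these commands take either no arguments or a `<project> <run_id>` pair. A minimal sketch of parsing that shape from a message text is shown below; `RunCommand` and `parse_run_command` are hypothetical names, and the real bot may parse commands differently.

```python
from typing import NamedTuple, Optional


class RunCommand(NamedTuple):
    name: str
    project: Optional[str]
    run_id: Optional[str]


def parse_run_command(text: str) -> RunCommand:
    """Parse '/name' or '/name <project> <run_id>' into its parts."""
    parts = text.strip().split()
    if not parts or not parts[0].startswith("/"):
        raise ValueError("not a bot command")

    # In group chats Telegram may append the bot name: /status@MyBot
    name = parts[0][1:].split("@", 1)[0]
    args = parts[1:]

    if len(args) == 0:
        return RunCommand(name, None, None)
    if len(args) == 2:
        return RunCommand(name, args[0], args[1])
    raise ValueError(f"usage: /{name} or /{name} <project> <run_id>")
```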

Admin Commands

Command	Description
`/add_user <username>`	Add a user to the whitelist
`/remove_user <username>`	Remove a user from the whitelist
`/list_users`	List all whitelisted users
`/remove_run <project> <run_id>`	Manually remove a training run

Alerts

The bot sends alerts to subscribed users when:

Alert	When
Training Started	A new run sends its first `training_started` event
Training Finished	A run sends `training_finished`
Training Stalled	No logs/heartbeats received beyond the threshold
Training Resumed	A stalled run starts sending logs again
Best Metric Changed	`best_metric` improves (with cooldown to avoid spam)
Run Removed	A stalled run is auto-removed after prolonged inactivity

If ClearML is detected, alerts include a direct link to the ClearML dashboard.
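
The cooldown on Best Metric Changed alerts can be pictured as a per-run timer. The following sketch assumes one cooldown window per (project, run) pair and a default of 300 seconds, matching BEST_METRIC_ALERT_COOLDOWN_SECONDS; the `BestMetricCooldown` class is hypothetical and is not the service's implementation.

```python
from typing import Dict, Tuple


class BestMetricCooldown:
    """Allow at most one best-metric alert per run within the cooldown window."""

    def __init__(self, cooldown_seconds: float = 300.0) -> None:
        self.cooldown_seconds = cooldown_seconds
        self._last_alert: Dict[Tuple[str, str], float] = {}

    def should_alert(self, project: str, run_id: str, now: float) -> bool:
        """Return True and record the alert time if the cooldown has elapsed."""
        key = (project, run_id)
        last = self._last_alert.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False
        self._last_alert[key] = now
        return True
```

Passing `now` explicitly (for example from `time.monotonic()`) keeps the logic deterministic and easy to test.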

Heartbeat

When the client library sends heartbeat events (every ~60 seconds by default), the server uses a shorter stall threshold (HEARTBEAT_STALL_THRESHOLD_SECONDS, default 300s) for faster crash detection. Runs without heartbeats use the standard STALL_ALERT_THRESHOLD_SECONDS (default 1800s). This is fully backwards-compatible: old clients work the same as before.
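
The threshold selection described above can be sketched as follows. The function names are hypothetical and the comparison is an assumption (a run counts as stalled once the silence strictly exceeds the threshold); the default values are the documented ones.

```python
def stall_threshold(uses_heartbeat: bool,
                    heartbeat_threshold: float = 300.0,
                    standard_threshold: float = 1800.0) -> float:
    """Pick the stall threshold in seconds for a run."""
    return heartbeat_threshold if uses_heartbeat else standard_threshold


def is_stalled(last_seen: float, now: float, uses_heartbeat: bool) -> bool:
    """Return True if no event has arrived for longer than the threshold."""
    return now - last_seen > stall_threshold(uses_heartbeat)
```

With the defaults, a heartbeat-sending run that has been silent for 301 seconds is reported as stalled, while a run from an older client is only reported after more than 1800 seconds of silence.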

Data Persistence

The whitelist, subscribers, and user info are saved to JSON files and survive restarts.
Training run data is saved to training_data.json on every meaningful event (not heartbeats) and restored on startup.
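
A minimal sketch of this save-and-restore cycle is shown below. The helper names are hypothetical, and writing through a temporary file followed by `os.replace` is a design choice of this example (so a crash mid-write cannot leave a truncated file), not something the service is documented to do.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Union

PathLike = Union[str, Path]


def should_persist(event_type: str) -> bool:
    """Heartbeats only refresh liveness, so they do not trigger a save."""
    return event_type != "heartbeat"


def save_runs(path: PathLike, runs: dict) -> None:
    """Write run data as JSON, replacing the old file in one step."""
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(runs, handle)
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_runs(path: PathLike) -> dict:
    """Restore run data on startup; a missing file means no saved runs."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}
```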

Related Projects

messenger-logger-callback: the client library that sends training logs to this service. Install it with pip install messenger-logger-callback.

License

MIT

Release history

0.1.1 (this version): Mar 13, 2026
0.1.0: Mar 12, 2026

Download files

Source Distribution

telegram_log_service-0.1.1.tar.gz (20.9 kB), uploaded Mar 13, 2026, Source

Built Distribution

telegram_log_service-0.1.1-py3-none-any.whl (22.5 kB), uploaded Mar 13, 2026, Python 3

File details

Details for the file telegram_log_service-0.1.1.tar.gz.

File metadata

Download URL: telegram_log_service-0.1.1.tar.gz
Upload date: Mar 13, 2026
Size: 20.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for telegram_log_service-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`534ff25a57b436719b7e7a41040be1eadd4b6372d26acc559ad249e6722c729a`
MD5	`300fd2a8511ce626926af7ca1843fa90`
BLAKE2b-256	`a2c8eb04728df05d1a9770e92925ff2c6777586ee37451272779a07f4b511681`

File details

Details for the file telegram_log_service-0.1.1-py3-none-any.whl.

File metadata

Download URL: telegram_log_service-0.1.1-py3-none-any.whl
Upload date: Mar 13, 2026
Size: 22.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for telegram_log_service-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1482d0c7dc96611547fb9c63af93777014b3c2da30fbbbca647d37cde48b9cdb`
MD5	`2315eb66b037fc6139259de945653d25`
BLAKE2b-256	`23732087c871000ee7329ce24141dfea1b748043c6e85f9a537f47bc37b04890`
