Telegram preview page parser
Telegram Channels Monitor
A monitoring tool for public Telegram channels that can be viewed through the web preview. It extracts messages, their metadata, and media files, and stores everything in a database. No tokens or bots are required: launch the app and it collects data into the database non-stop.
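Telegram serves the web preview of a public channel at `https://t.me/s/<username>`. As a minimal sketch of how such a URL can be built from a username (the helper name `preview_url` is hypothetical and is not part of telegram-pm):

```python
def preview_url(username: str) -> str:
    """Build the public web-preview URL for a channel username."""
    # Accept "name", "@name", or a value with surrounding whitespace.
    cleaned = username.strip().lstrip("@")
    if not cleaned:
        raise ValueError("username must not be empty")
    return f"https://t.me/s/{cleaned}"


print(preview_url("@channel1"))
```

Opening the resulting URL in a browser is a quick way to check that a channel is public before adding it to the monitoring list.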
🌟 Features
- Parsing recent messages from public Telegram channels
- Extracting metadata and media attachments
- Storing data in SQLite database
- Support for forwarded messages and replies
- Configurable data collection parameters
🛠 Installation
- Ensure Python 3.12+ is installed (recommended)
- Clone the repository
git clone 'https://github.com/aIligat0r/tpm.git'
or
pip install telegram-pm
⚙️ Configuration
Configuration is set in the .env file or in telegram_pm/config.py.
Parsing configurations:
- TELEGRAM_PARSE_REPEAT_COUNT - Number of requests (default 5). Each request returns 20 messages (1 iteration = the last 20 messages).
- TELEGRAM_SLEEP_TIME_SECONDS - Number of seconds to wait before the next round of data collection from the channels begins (default 60 seconds).
- TELEGRAM_SLEEP_AFTER_ERROR_REQUEST - Waiting time after a failed request (default 30).
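Since each request returns 20 messages, the repeat count sets an upper bound on how many recent messages are read per channel in one pass. A small sketch of that arithmetic, reading the variable from the environment; the function name `messages_per_cycle` is an illustration and not part of the package:

```python
import os

MESSAGES_PER_REQUEST = 20  # stated in the option description above


def messages_per_cycle(repeat_count: int) -> int:
    """Upper bound on messages read per channel in one collection pass."""
    return repeat_count * MESSAGES_PER_REQUEST


# Fall back to the documented default of 5 when the variable is not set.
repeat_count = int(os.environ.get("TELEGRAM_PARSE_REPEAT_COUNT", "5"))
print(messages_per_cycle(repeat_count))
```

With the default of 5 requests this gives at most 100 messages per channel per pass.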
HTTP configurations:
- HTTP_RETRIES - Number of repeated request attempts (default 3)
- HTTP_BACKOFF - Delay between attempts for failed requests (default 3 seconds)
- HTTP_TIMEOUT - Waiting for a response (default 30 seconds)
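To illustrate how a retry count and a delay between attempts usually interact, here is a minimal sketch. It assumes a fixed delay and treats the retry count as the number of extra attempts after the first one; the function `call_with_retries` is hypothetical and is not how telegram-pm is necessarily implemented:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    retries: int = 3,        # mirrors HTTP_RETRIES
    backoff: float = 3.0,    # mirrors HTTP_BACKOFF, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func; on failure wait `backoff` seconds and try again.

    Gives up and re-raises after `retries` repeated attempts.
    """
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            if attempt >= retries:
                raise
            attempt += 1
            sleep(backoff)
```

The `sleep` parameter is injectable only so that the waiting can be replaced in tests; in normal use the default `time.sleep` applies.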
🚀 Usage
1. Build application:
Build docker image:
docker build -t tpm .
Create poetry env:
- Install poetry:
pip install poetry
- Create poetry env and install packages:
poetry install
2. Launching the app
| Options | Description | Required |
|---|---|---|
| --db-path | Path to db file (if sqlite). Else path to dir (if csv) | ❌ required |
| --channels-filepath/--chf | File of channel usernames (one Telegram username per line) | ❌ required (or usernames --channel/--ch) |
| --channel/--ch | List of usernames that are passed by the parameter | ❌ required (or file of channels --channels-filepath/--chf) |
| --verbose/--v | Verbose mode | ➖ |
| --format/--f | Data saving format (csv, sqlite) | ➖ |
| --help/--h | Help information | ➖ |
Poetry:
poetry run tpm --ch freegaza --ch BREAKINGNewsTG --db-path .\tg.db --v
or
poetry run tpm --channels-filepath /path/to/monitoring_usernames.txt --db-path .\tg.db
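The channels file is plain text with one username per line. A minimal sketch of generating such a file from a list; the helper `write_channels_file` is hypothetical, and writing bare usernames without a leading `@` is an assumption based on the command-line examples above:

```python
from pathlib import Path


def write_channels_file(path: str, usernames: list[str]) -> Path:
    """Write one channel username per line and return the file path."""
    # Drop blank entries and a leading "@" (assumed not to be needed).
    lines = [u.strip().lstrip("@") for u in usernames if u.strip()]
    target = Path(path)
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


write_channels_file("monitoring_usernames.txt", ["channel1", "@channel2"])
```

The resulting file can then be passed with `--channels-filepath`.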
Docker:
docker run -it --rm tpm --ch freegaza --db-path test_tg.db --v
or, to pass the usernames in a file, mount the paths into the container:
$ mkdir ~/tpm_data_dir/ # create a folder for data
$ cp /path/to/channel/usernames.txt ~/tpm_data_dir/usernames.txt # copy the file with the usernames to the previously created folder
$ touch ~/tpm_data_dir/telegram_messages.sqlite # create the database file so that it exists before chmod and mounting
$ chmod 666 ~/tpm_data_dir/telegram_messages.sqlite && chmod 666 ~/tpm_data_dir/usernames.txt # grant access to these files from the container
docker run -it --rm \
-v ~/tpm_data_dir/telegram_messages.sqlite:/data/telegram_messages.sqlite \
-v ~/tpm_data_dir/usernames.txt:/data/usernames.txt \
tpm --db-path /data/telegram_messages.sqlite --chf /data/usernames.txt
Python:
from telegram_pm.run import run_tpm
run_tpm(
    db_path="tg.db",                          # Path to db file (if sqlite). Else path to dir (if csv)
    channels=["channel1", "channel2"],        # Channels list
    verbose=True,                             # Verbose mode

    # Configuration (optional)
    format="sqlite",                          # Data saving format (csv, sqlite)
    tg_iteration_in_preview_count=5,          # Number of requests (default 5). 20 messages per request. (1 iter - last 20 messages)
    tg_sleep_time_seconds=60,                 # Number of seconds after which the next process of receiving data from channels will begin (default 60 seconds)
    tg_sleep_after_error_request=30,          # Waiting after a failed request (default 30)
    http_retries=3,                           # Number of repeated request attempts (default 3)
    http_backoff=3,                           # Delay between attempts for failed requests (default 3 seconds)
    http_timeout=60,                          # Waiting for a response (default 30 seconds)
    http_headers={                            # HTTP headers
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"
    }
)
🗃️ Database Structure
Tables are named after channel usernames: one table is created for each username passed in the launch parameters.
| Field | Type | Description |
|---|---|---|
| id | INTEGER | Channel ID |
| url | TEXT | Message URL |
| username | TEXT | Channel username |
| date | TEXT (ISO 8601) | Message date |
| text | TEXT | Message text |
| replied_post_url | TEXT | Replied message URL |
| urls | JSON | URLs from text |
| photo_urls | JSON | Photo URLs |
| video_urls | JSON | Video URLs |
| created_at | CURRENT_DATETIME (ISO 8601) | Record creation time |
| url_preview | TEXT | Text from preview URL |
| round_video_url | TEXT | URL to round video message |
| files | JSON | List of file names and their description |
| tags | JSON | List of tags from a message body |
| forwarded_from_url | TEXT | URL of the channel from which the message was forwarded |
| forwarded_from_name | TEXT | Name of the channel from which the message was forwarded |
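Because each channel has its own table, reading the collected data means querying the table named after the username. The sketch below uses Python's standard `sqlite3` module on an in-memory database with a reduced copy of the schema, so it runs without a real `tg.db`. It assumes the JSON columns are stored as JSON text and that dates share one ISO 8601 format, so sorting them as text is chronological; the function `latest_messages` and the sample rows are illustrative only:

```python
import json
import sqlite3


def latest_messages(conn: sqlite3.Connection, table: str, limit: int = 5) -> list[dict]:
    """Return the newest messages from a channel table, newest first."""
    # Table names cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    rows = conn.execute(
        f"SELECT url, date, text, photo_urls FROM {quoted} "
        "ORDER BY date DESC LIMIT ?",
        (limit,),
    ).fetchall()
    return [
        {
            "url": url,
            "date": date,
            "text": text,
            # Decode the JSON column; treat NULL as an empty list.
            "photo_urls": json.loads(photos) if photos else [],
        }
        for url, date, text, photos in rows
    ]


# Demo data: a reduced table for a channel called "channel1".
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "channel1" '
    "(id INTEGER, url TEXT, date TEXT, text TEXT, photo_urls JSON)"
)
conn.executemany(
    'INSERT INTO "channel1" VALUES (?, ?, ?, ?, ?)',
    [
        (1, "https://t.me/channel1/1", "2025-01-01T10:00:00+00:00", "first",
         json.dumps(["https://example.com/a.jpg"])),
        (2, "https://t.me/channel1/2", "2025-01-02T10:00:00+00:00", "second", None),
    ],
)

for message in latest_messages(conn, "channel1"):
    print(message["date"], message["text"], message["photo_urls"])
```

To run the same query against collected data, replace the in-memory connection with `sqlite3.connect("tg.db")` and pass a monitored username as the table name.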
⚠️ Limitations
Works only with public channels
📜 License
MIT License