SDK Router Tools — collection of utility tools for automation pipelines (telegram, logging, html cleaner, etc.)
sdkrouter-tools
SDK Router Tools — collection of utility tools for automation pipelines.
Installation
pip install sdkrouter-tools
Tools Included
- logging — Rich-powered logger with file persistence
- telegram — Rate-limited Telegram sender with priority queue
- html — HTML cleaner optimized for LLM pipelines
1. Logging (Rich-powered)
Universal Python logger with Rich console output and file persistence.
from sdkrouter_tools import get_logger
log = get_logger(__name__)
log.info("Hello world")
log.error("Something failed", exc_info=True)
# With custom level
log = get_logger(__name__, level="DEBUG")
data = {"step": 1}  # Example payload
log.debug("Debug details: %s", data)
Features
- Rich console output with colors and formatting
- Automatic file logging (daily rotation)
- Auto-detects project root for log directory
- Rich tracebacks with local variables
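The file-logging features above can be approximated with the standard library alone. The following is a minimal sketch, not the package's implementation: `find_project_root`, `make_daily_logger`, the marker file names, and the `logs/` directory layout are all assumptions made for illustration.

```python
import logging
from logging.handlers import TimedRotatingFileHandler
from pathlib import Path

# Hypothetical marker files; the package's real detection rules are not documented here.
ROOT_MARKERS = ("pyproject.toml", "setup.py", ".git")


def find_project_root(start: Path) -> Path:
    """Walk upwards from `start` until a directory containing a marker is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in ROOT_MARKERS):
            return candidate
    return start  # Fall back to the starting directory.


def make_daily_logger(name: str, root: Path, level: str = "INFO") -> logging.Logger:
    """Return a logger that writes to <root>/logs/<name>.log, rotated at midnight."""
    log_dir = root / "logs"
    log_dir.mkdir(parents=True, exist_ok=True)
    handler = TimedRotatingFileHandler(
        log_dir / f"{name}.log", when="midnight", backupCount=7, encoding="utf-8"
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.addHandler(handler)
    return logger
```

Calling `make_daily_logger("myapp", find_project_root(Path.cwd()))` would then write to a `logs/myapp.log` file under the detected root and keep seven rotated files.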
Convenience Functions
from sdkrouter_tools.logging import debug, info, warning, error, critical
info("Processing started")
warning("Low memory")
error("Failed to connect")
Configuration
from sdkrouter_tools import setup_logging
setup_logging(
    level="DEBUG",          # Log level
    log_to_file=True,       # Write to file
    log_to_console=True,    # Output to console
    app_name="myapp",       # App name for log file
    rich_tracebacks=True,   # Rich exception formatting
)
2. Telegram Sender
Rate-limited Telegram message sender with priority queue support.
from sdkrouter_tools import TelegramSender, ParseMode
sender = TelegramSender(
    bot_token="YOUR_BOT_TOKEN",
    chat_id="YOUR_CHAT_ID",
)
sender.send_message("Hello from sdkrouter-tools!")
sender.send_message("<b>Bold</b> message", parse_mode=ParseMode.HTML)
Convenience Functions
from sdkrouter_tools.telegram import (
    send_error, send_success, send_warning,
    send_info, send_stats, send_alert,
)
send_error("Something went wrong!", {"details": "error info"})
send_success("Task completed!", {"items_processed": 100})
send_warning("Disk space low", {"available": "10GB"})
send_alert("Critical: Server down!", {"server": "prod-1"})
Environment Variables
export TELEGRAM_BOT_TOKEN="your_bot_token"
export TELEGRAM_CHAT_ID="your_chat_id"
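How the package consumes these variables is not spelled out here; presumably the convenience functions read them at call time. A minimal sketch of validating them up front, where `load_telegram_credentials` is a hypothetical helper and not part of sdkrouter-tools:

```python
import os


def load_telegram_credentials(env=os.environ) -> tuple[str, str]:
    """Read the bot token and chat ID, raising early if either is missing."""
    required = ("TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["TELEGRAM_BOT_TOKEN"], env["TELEGRAM_CHAT_ID"]
```

Failing at startup with a clear message is usually easier to debug than a rejected API call later in the pipeline.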
Priority Queue
Messages are queued and sent in priority order, rate-limited to 20 messages per second:
from sdkrouter_tools import MessagePriority
# CRITICAL (1), HIGH (2), NORMAL (3), LOW (4)
sender.send_message("Important!", priority=MessagePriority.HIGH)
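To illustrate how a priority queue and a rate limit can work together, here is a stdlib-only sketch. It is not the package's internal queue: `Priority` is a stand-in that mirrors the values listed above, and `RateLimitedQueue` with its `clock` and `sleep` parameters is hypothetical.

```python
import heapq
import itertools
import time
from enum import IntEnum


class Priority(IntEnum):
    # Stand-in for MessagePriority, using the values from the comment above.
    CRITICAL = 1
    HIGH = 2
    NORMAL = 3
    LOW = 4


class RateLimitedQueue:
    """Pops items in priority order, spacing pops at least 1/rate seconds apart."""

    def __init__(self, rate_per_sec: float = 20.0, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / rate_per_sec
        self._heap = []
        self._counter = itertools.count()  # Tie-breaker keeps FIFO order within a priority.
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def put(self, text: str, priority: Priority = Priority.NORMAL) -> None:
        heapq.heappush(self._heap, (int(priority), next(self._counter), text))

    def pop(self) -> str:
        """Wait until the rate limit allows another send, then return the next message."""
        wait = self._next_allowed - self._clock()
        if wait > 0:
            self._sleep(wait)
        _, _, text = heapq.heappop(self._heap)
        self._next_allowed = self._clock() + self._interval
        return text

    def __len__(self) -> int:
        return len(self._heap)
```

At 20 messages per second the minimum gap between sends is 50 ms. Lower numeric values are popped first, so a `CRITICAL` message overtakes anything already waiting.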
Sending Files
sender.send_photo("/path/to/image.jpg", caption="Check this out!")
sender.send_document("/path/to/file.pdf", caption="Report attached")
Queue Management
from sdkrouter_tools import telegram_queue
stats = telegram_queue.get_stats()
telegram_queue.flush(timeout=10.0) # Wait before script exit
3. HTML Cleaner
HTML cleaner optimized for LLM pipelines. It combines aggressive DOM cleaning, SSR hydration extraction, CSS class filtering, semantic chunking, and multiple output formats.
from sdkrouter_tools import HTMLCleaner, CleanerConfig, OutputFormat
cleaner = HTMLCleaner()
result = cleaner.clean(html)
print(result.output)
print(f"Reduction: {result.stats.reduction_percent}%")
print(f"Tokens: {result.stats.original_tokens} -> {result.stats.cleaned_tokens}")
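The statistics printed above relate the token counts before and after cleaning. The formula behind `reduction_percent` is not given here; assuming it is the relative drop in token count, it could be computed as in this sketch (the standalone function below is hypothetical):

```python
def reduction_percent(original_tokens: int, cleaned_tokens: int) -> float:
    """Relative token reduction in percent, rounded to one decimal place."""
    if original_tokens <= 0:
        return 0.0  # Avoid dividing by zero for empty input.
    return round((1 - cleaned_tokens / original_tokens) * 100, 1)
```

Under that assumption, a page that shrinks from 12000 to 3000 tokens has a reduction of 75.0 percent.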
Quick Functions
from sdkrouter_tools import clean, clean_to_json
# Quick clean
result = clean(html, max_tokens=5000, output_format="markdown")
# Get JSON if SSR data available, otherwise cleaned HTML
data = clean_to_json(html)
Configuration
from sdkrouter_tools import CleanerConfig, OutputFormat
config = CleanerConfig(
    max_tokens=5000,
    output_format=OutputFormat.MARKDOWN,  # HTML, MARKDOWN, AOM, XTREE
    filter_classes=True,
    class_threshold=0.5,
    try_hydration=True,
)
cleaner = HTMLCleaner(config)
result = cleaner.clean(html)
SSR Hydration Extraction
Extract structured data from server-side rendered pages:
from sdkrouter_tools.html import extract_hydration, detect_framework
framework = detect_framework(html) # NEXTJS_APP, NUXT3, etc.
data = extract_hydration(html)
if data.has_data:
    products = data.page_props.get("products", [])
Supported frameworks: Next.js, Nuxt 2/3, SvelteKit, Remix, Gatsby, Qwik, Astro.
CSS Class Filtering
from sdkrouter_tools.html import score_class, filter_classes, detect_css_framework
# Score classes by semantic relevance
result = score_class("product-card") # High score
result = score_class("css-abc123") # Low score (hash)
# Filter list of classes
classes = ["product-card", "css-abc123", "flex", "MuiButton-root"]
kept = filter_classes(classes, threshold=0.5) # ["product-card"]
# Detect CSS framework
framework = detect_css_framework(html) # "tailwind", "bootstrap", etc.
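The idea behind class scoring can be illustrated with a few regular expressions. This is a deliberately simple heuristic, not the package's scoring logic: the patterns, the score values, and the `*_sketch` function names are all assumptions.

```python
import re

# Hypothetical patterns for generated or purely presentational class names.
HASHED = re.compile(r"^(css|sc|jss|emotion)-[a-z0-9]{4,}$", re.IGNORECASE)
UTILITY = {"flex", "grid", "block", "hidden", "container", "row", "col"}
UI_LIBRARY = re.compile(r"^(Mui|ant-|chakra-)")


def score_class_sketch(name: str) -> float:
    """Return a relevance score in [0, 1]; higher means more semantic."""
    if HASHED.match(name):
        return 0.0
    if name in UTILITY or UI_LIBRARY.match(name):
        return 0.2
    # Hyphenated or underscored word-like names tend to describe content.
    if re.fullmatch(r"[a-z]+([-_][a-z]+)+", name):
        return 0.9
    return 0.5


def filter_classes_sketch(names: list[str], threshold: float = 0.5) -> list[str]:
    """Keep class names whose score is at or above the threshold."""
    return [n for n in names if score_class_sketch(n) >= threshold]
```

Applied to the list from the example above, this heuristic also keeps only `product-card`, since the hashed, utility, and UI-library names all score below 0.5.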
Output Formats
from sdkrouter_tools.html import to_markdown, to_aom_yaml, to_xtree
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, "lxml")
# Markdown
md = to_markdown(soup)
# AOM YAML (Playwright-style aria snapshot)
aom_yaml = to_aom_yaml(soup)  # Named aom_yaml to avoid shadowing the yaml module
# - navigation:
#   - link "Home"
#   - link "Products"
# XTree (hierarchical tree)
tree = to_xtree(soup)
# ROOT
# ├─ nav#main-nav
# │ └─ a.nav-link → "Home"
# └─ main
Pipeline API
from sdkrouter_tools import clean_html, clean_for_llm
result = clean_html(html, max_tokens=5000, output_format="markdown")
output = clean_for_llm(html) # Returns dict (SSR) or str (cleaned HTML)
Advanced Features
from sdkrouter_tools.html import (
    # Shadow DOM
    flatten_shadow_dom,
    # Downsampling
    downsample_html, estimate_tokens,
    # Semantic Chunking
    SemanticChunker, ChunkConfig,
    # Context Extraction
    extract_context, generate_selector,
    # Helpers
    json_to_toon, html_to_text, extract_links, extract_images,
)
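Token estimation and chunking are the two ideas here that most affect how content fits into an LLM context window. The sketch below shows the general technique with the standard library only. It is not the package's `estimate_tokens` or `SemanticChunker`: it uses a rough four-characters-per-token heuristic instead of a real tokenizer, and it splits on blank lines rather than on DOM structure. Both function names are hypothetical.

```python
import re


def estimate_tokens_sketch(text: str) -> int:
    """Very rough estimate: about four characters per token for English text."""
    return max(1, len(text) // 4) if text else 0


def chunk_by_budget(text: str, max_tokens: int) -> list[str]:
    """Greedily pack paragraphs into chunks that stay within max_tokens each.

    A single paragraph larger than the budget becomes its own oversized chunk.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    used = 0
    for para in paragraphs:
        cost = estimate_tokens_sketch(para)
        if current and used + cost > max_tokens:
            # Budget exceeded: close the current chunk and start a new one.
            chunks.append("\n\n".join(current))
            current, used = [], 0
        current.append(para)
        used += cost
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Splitting on paragraph boundaries keeps each chunk readable on its own, which is the same motivation behind chunking along semantic boundaries in HTML.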
Requirements
- Python >= 3.10
- rich >= 13.0
- pyTelegramBotAPI >= 4.14
- beautifulsoup4 >= 4.12
- lxml >= 5.3
- pydantic >= 2.10
- markdownify >= 0.14
- tiktoken >= 0.8
License
MIT