TRNS — Video Transcription & Summarization
Transcribe YouTube, Twitter/X, and local video files. Automatic translation to Russian, LLM summaries via OpenRouter. Works as a CLI tool or Telegram bot.
Tech Stack
| Component | Technology |
|---|---|
| Speech-to-text | faster-whisper |
| Video download | yt-dlp (YouTube, Twitter/X, 1000+ sites) |
| Subtitles | youtube-transcript-api |
| Translation | deep-translator (Google Translate) |
| LLM processing | OpenRouter.ai via OpenAI client |
| Telegram bot | Pyrogram (MTProto, up to 2 GB file downloads) |
| Webhook server | FastAPI + Uvicorn |
Quick Start
Install
pip install trns
# Also need FFmpeg:
# macOS: brew install ffmpeg
# Linux: sudo apt install ffmpeg
CLI
trns https://www.youtube.com/watch?v=VIDEO_ID
trns https://twitter.com/user/status/1234567890
trns /path/to/video.mp4
# With options
trns https://youtu.be/abc --whisper-model medium --debug
Telegram Bot
export BOT_TOKEN=... # from @BotFather
export TELEGRAM_API_ID=... # from my.telegram.org
export TELEGRAM_API_HASH=... # from my.telegram.org
export AUTH_KEY=secret123 # users authenticate with this once
python -m trns.bot.server
Then set up a webhook pointing to https://your-domain/webhook. See Setup Guide.
Configuration
TRNS uses a JSON config file (config.json). On first run, a default is created automatically. You can also pass everything via CLI flags — explicit CLI flags always win over config values, which in turn override parser defaults. If you don't pass a flag, the config value is used; if the config doesn't set it either, the built-in default applies.
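The precedence rule (CLI flag > config.json > built-in default) can be sketched as a simple merge. This is an illustrative reconstruction, not the actual trns code; the key names are real config keys, but resolve_settings and DEFAULTS are hypothetical:

```python
import argparse
import json

# Built-in defaults (a subset, for illustration).
DEFAULTS = {"interval": 30, "language": "en", "whisper_model": "auto"}

def resolve_settings(config: dict, cli_args: argparse.Namespace) -> dict:
    """CLI flags win over config.json values, which win over built-in defaults."""
    settings = dict(DEFAULTS)
    # Config file overrides built-in defaults.
    settings.update({k: v for k, v in config.items() if k in DEFAULTS})
    # If argparse options use default=None, only flags the user actually
    # passed are non-None, so only those override the config.
    settings.update({k: v for k, v in vars(cli_args).items()
                     if k in DEFAULTS and v is not None})
    return settings
```

The `default=None` trick is what lets the tool distinguish "flag not passed" from "flag passed with the default value".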
Configuration Reference
Every key in config.json maps to a CLI flag. Here's what each one does:
| Key | CLI Flag | Type | Default | Description |
|---|---|---|---|---|
| url | positional arg | string | "" | Video URL or local file path. Empty = must provide on command line. |
| method | --method | "auto" \| "subtitles" \| "whisper" | "auto" | auto: try YouTube captions first, fall back to Whisper. subtitles: captions only (fails if unavailable). whisper: always use speech-to-text. |
| interval | --interval | integer (seconds) | 30 | Chunk duration for live/chunked processing. Each chunk is this many seconds of audio. |
| language | --language | string (ISO 639-1) | "en" | Expected language of the video. Used for subtitle extraction and as a hint for Whisper. |
| whisper_model | --whisper-model | "auto" \| "tiny" \| "base" \| "small" \| "medium" \| "large" | "auto" | Whisper model size. auto (default) picks tiny for English, small for other languages. Explicit values override auto-selection. Larger = more accurate but slower and uses more RAM. |
| use_faster_whisper | --use-faster-whisper / --no-faster-whisper | boolean | true | Use the faster-whisper library (CTranslate2 backend). Use --no-faster-whisper to fall back to openai-whisper. |
| translation_output | --translation-output | "russian-only" \| "both" \| "original-only" | "russian-only" | What to print for transcription output. russian-only: only the Russian translation. both: original + Russian. original-only: no translation. |
| save_transcript | --save-transcript | string (file path) | null | If set, appends all output to this file. Relative paths resolve against TRNS_HOME / CWD. |
| overlap | --overlap | integer (seconds) | 2 | Overlap between audio chunks. Prevents words from being cut at chunk boundaries. |
| process_mode | --process-mode | "auto" \| "chunked" \| "full" | "auto" | auto: full for regular videos, chunked for live streams. full: download entire video, transcribe with progress bar. chunked: process in interval-second pieces (required for live). |
| lm_window_seconds | --lm-window-seconds | integer (seconds) | 120 | How much transcription context the LLM sees. It gets the last ceil(window_seconds / interval) chunks. |
| lm_interval | --lm-interval | integer (seconds) | 30 | How often the LLM processes accumulated text. Can differ from interval. |
| lm_output_mode | --lm-output-mode | "both" \| "transcriptions-only" \| "lm-only" | "both" | both: print transcriptions AND LLM summaries. transcriptions-only: skip LLM entirely. lm-only: only show LLM output. |
| lm_api_key_file | --lm-api-key-file | string (file path) | "api_key.txt" | File containing OpenRouter API key. |
| lm_prompt_file | --lm-prompt-file | string (file path) | "prompt.md" | Prompt template for Russian-language LLM processing. |
| lm_model | --lm-model | string | "google/gemma-3-27b-it:free" | OpenRouter model identifier. See openrouter.ai/models for options. Free models have the :free suffix. |
| debug | --debug | boolean | false | false (production): logs go to logs.txt, stdout shows only transcription/LLM output. true: verbose logs go to stderr, useful for troubleshooting. |
| context | --context | string | "" | Additional context passed to the LLM (e.g. "This is a Sberbank earnings call"). Helps the model produce better summaries. |
| allowed_user_ids | — | array of integers | [] | Telegram user IDs allowed to use the bot. Users can also authenticate at runtime via AUTH_KEY. |
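The lm_window_seconds rule above (the LLM sees the last ceil(window_seconds / interval) chunks) works out like this. A minimal sketch; lm_context_chunks is an illustrative name, not a trns function:

```python
import math

def lm_context_chunks(lm_window_seconds: int, interval: int) -> int:
    # The LLM receives the last ceil(window / interval) transcription chunks.
    return math.ceil(lm_window_seconds / interval)

# With the defaults (window 120 s, interval 30 s) the LLM sees the last 4 chunks.
```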
Example config.json
{
"url": "",
"method": "auto",
"interval": 30,
"language": "en",
"whisper_model": "medium",
"use_faster_whisper": true,
"translation_output": "russian-only",
"save_transcript": null,
"overlap": 2,
"process_mode": "full",
"lm_window_seconds": 120,
"lm_interval": 30,
"lm_output_mode": "both",
"lm_api_key_file": "api_key.txt",
"lm_prompt_file": "prompt.md",
"lm_model": "google/gemma-3-27b-it:free",
"debug": false,
"context": "",
"allowed_user_ids": []
}
Environment Variables
These are primarily for the Telegram bot server, not the CLI:
| Variable | Purpose | Fallback file / default |
|---|---|---|
| BOT_TOKEN | Telegram bot token | bot_key.txt |
| TELEGRAM_API_ID | Pyrogram MTProto API ID (from my.telegram.org) | — |
| TELEGRAM_API_HASH | Pyrogram MTProto API hash | — |
| AUTH_KEY | One-time auth key for new bot users | key.txt |
| OPENROUTER_API_KEY | OpenRouter API key (alternative to api_key.txt) | api_key.txt |
| HOST | Server bind address | 0.0.0.0 |
| PORT | Server port | 8000 |
| CONFIG_PATH | Path to config.json | config.json |
| METADATA_PATH | Path to metadata.json | metadata.json |
| TRNS_HOME | Base directory for resolving all relative paths | CWD |
File Layout
When you run TRNS, it expects these files relative to TRNS_HOME (defaults to your current working directory):
your-project/
├── config.json # Main configuration (auto-created on first run)
├── metadata.json # Localization strings + daily capacity counter
├── api_key.txt # OpenRouter API key(s), one per line
├── prompt.md # LLM prompt template (Russian output)
├── prompt_original.md # LLM prompt template (original language output)
├── bot_key.txt # Telegram bot token (alternative to env var)
├── key.txt # Auth key (alternative to env var)
└── logs.txt # Production logs (auto-created)
How It Works
Pipeline Flow
Video URL or file
│
├─ 1. Try YouTube auto-captions (if method=auto and it's YouTube)
│ └─ Success? Skip Whisper, go to step 3
│
├─ 2. Download audio → Whisper speech-to-text
│ ├─ Language auto-detection
│ ├─ Chunk overlap to prevent word loss
│ └─ Progress bar (full mode) or streaming (chunked mode)
│
├─ 3. Translate to Russian (if source ≠ Russian)
│ └─ Google Translate via deep-translator
│
└─ 4. LLM summarization (if lm_output_mode ≠ transcriptions-only)
├─ Sends last N seconds of transcription (lm_window_seconds)
├─ Bilingual mode: separate prompts for original + Russian
└─ Output: structured summary per prompt template
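Steps 1 and 2 of the flow above (try captions, fall back to Whisper) can be sketched as a small selection function. This is illustrative pseudostructure, not the actual pipeline code; the caption and transcription backends are passed in as callables so the logic stays self-contained:

```python
from typing import Callable, Optional

def get_transcript(
    url: str,
    method: str,
    is_youtube: bool,
    fetch_captions: Callable[[str], Optional[str]],
    whisper_transcribe: Callable[[str], str],
) -> str:
    """Step 1: try YouTube captions; step 2: fall back to speech-to-text."""
    if is_youtube and method in ("auto", "subtitles"):
        captions = fetch_captions(url)
        if captions is not None:
            return captions  # captions found: skip Whisper entirely
        if method == "subtitles":
            raise RuntimeError("No captions available and method=subtitles")
    elif method == "subtitles":
        raise RuntimeError("method=subtitles needs a YouTube video with captions")
    return whisper_transcribe(url)  # method=whisper, or auto fallback
```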
Processing Modes
| Mode | When | How |
|---|---|---|
| full | Regular videos | Downloads entire video first, transcribes with progress bar. Best quality. |
| chunked | Live streams | Processes audio in interval-second chunks. Real-time output. |
| auto | Default | Picks full for regular videos, chunked for live streams. |
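The auto row of the table resolves to one of the two concrete modes. A one-function sketch (pick_process_mode is an illustrative name):

```python
def pick_process_mode(configured: str, is_live: bool) -> str:
    """auto resolves to chunked for live streams and full for regular videos;
    explicit values are kept as-is."""
    if configured == "auto":
        return "chunked" if is_live else "full"
    return configured
```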
Whisper Models
| Model | Size | Speed | Quality | RAM |
|---|---|---|---|---|
| tiny | 39M | ~32x realtime | Basic | ~1 GB |
| base | 74M | ~16x realtime | OK | ~1 GB |
| small | 244M | ~6x realtime | Good | ~2 GB |
| medium | 769M | ~2x realtime | Very good | ~5 GB |
| large | 1550M | ~1x realtime | Best | ~10 GB |
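The whisper_model auto rule from the configuration reference (tiny for English, small for other languages) picks from this table. A sketch under that stated rule; the function name is illustrative:

```python
def pick_whisper_model(configured: str, language: str) -> str:
    """auto selects tiny for English and small for everything else;
    an explicit model name overrides auto-selection."""
    if configured == "auto":
        return "tiny" if language == "en" else "small"
    return configured
```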
API Key Setup
Put your OpenRouter API key in api_key.txt. The system tracks daily usage capacity in metadata.json and resets it at UTC midnight.
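A UTC-midnight reset amounts to comparing the stored date with today's UTC date. This is a hedged sketch of the idea only; the metadata keys ("date", "used") and the function are assumptions, not the actual metadata.json schema:

```python
from datetime import datetime, timezone

def capacity_remaining(metadata: dict, daily_limit: int) -> int:
    """Zero the usage counter when the UTC date has rolled over,
    then report the remaining capacity for today."""
    today = datetime.now(timezone.utc).date().isoformat()
    if metadata.get("date") != today:
        metadata["date"] = today
        metadata["used"] = 0  # new UTC day: reset the counter
    return daily_limit - metadata["used"]
```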
Architecture
┌─────────────────────────────────────────────────────┐
│ User Interface │
├────────────────────┬────────────────────────────────┤
│ CLI (trns cmd) │ Telegram Bot (Pyrogram+FastAPI)│
└─────────┬──────────┴──────────────┬─────────────────┘
└────────────┬────────────┘
│
┌─────────────▼──────────────┐
│ TranscriptionPipeline │
│ (orchestration + threads)│
└─────────────┬──────────────┘
│
┌─────────────────┼─────────────────┐
│ │ │
┌────▼─────┐ ┌──────▼──────┐ ┌──────▼──────┐
│ yt-dlp │ │ faster- │ │ OpenRouter │
│ audio │ │ whisper │ │ LLM │
│ download │ │ STT │ │ summaries │
└──────────┘ └─────────────┘ └─────────────┘
Threading (Telegram Bot)
The bot uses a queue-based architecture for thread-safe output:
- Webhook arrives → FastAPI handler → spawns background thread
- Background thread runs TranscriptionPipeline with an output_callback
- output_callback puts text into a queue.Queue
- Async loop drains the queue and sends messages to Telegram
No global state mutation — each pipeline instance is independent.
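The callback-to-queue handoff can be sketched as below. A minimal illustration of the pattern, not the bot's actual code: the real consumer is an async loop sending Telegram messages, and pipeline_fn stands in for TranscriptionPipeline:

```python
import queue
import threading

def run_pipeline_with_queue(pipeline_fn):
    """Run pipeline_fn in a background thread; its output_callback pushes
    text into a thread-safe queue that the caller drains."""
    q: "queue.Queue[str | None]" = queue.Queue()

    def worker():
        pipeline_fn(output_callback=q.put)
        q.put(None)  # sentinel: pipeline finished

    threading.Thread(target=worker, daemon=True).start()
    while (item := q.get()) is not None:
        yield item  # in the bot, this is where each message goes to Telegram
```

Because queue.Queue is thread-safe, the producer thread and the consumer never share mutable state directly, which is what keeps each pipeline instance independent.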
Development
git clone https://github.com/kakoyvostorg/trns.git
cd trns
pip install -e ".[dev]"
pytest # 117 tests
ruff format . # code formatting
Docker
docker build -f docker/Dockerfile -t trns .
docker-compose -f docker/docker-compose.yml up
Further Reading
- Setup Guide — installation, FFmpeg, webhook config
- Deployment Guide — production deployment (Yandex Cloud, VMs, Docker)
- Architecture — detailed system internals
- Руководство пользователя — Telegram bot user guide (Russian)
License
MIT — see LICENSE.