Convert YouTube educational videos to crisp PDF notes
Glimpsify (ytvideo2pdf)
Glimpsify extracts slide-like frames from educational videos and builds a PDF of the key visuals (diagrams, formulas, charts). It is optimized for lecture-style videos where text appears on screen over time.
Try now (without setup)
Try it out here: https://colab.research.google.com/drive/1xz6uHeY0QAzMTR8DbXJY8BSvNmKhI24Q?usp=sharing
Quick start
- Install the OCR engine (required for text detection)
  - Windows: install Tesseract OCR and make sure tesseract is on PATH.
  - macOS: brew install tesseract
  - Debian/Ubuntu: sudo apt-get install tesseract-ocr
- Install the package
pip install ytvideo2pdf
- Run the CLI
ytvideo2pdf --input=youtube --url="https://youtu.be/Z_MLrbI1s2E"
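Before the first run, it can help to confirm that the tesseract binary is discoverable on PATH. The following is a minimal sketch using only the Python standard library. The helper name find_tesseract is made up for illustration and is not part of ytvideo2pdf.

```python
import shutil


def find_tesseract(binary="tesseract"):
    """Return the full path to the Tesseract executable, or None if it is not on PATH.

    Hypothetical helper for illustration; ytvideo2pdf does not ship it.
    """
    return shutil.which(binary)


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract not found on PATH; install it before running ytvideo2pdf")
    else:
        print(f"tesseract found at {path}")
```

If the check reports that the binary is missing on Windows, the usual cause is that the Tesseract install directory was never added to PATH.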
Common usage
Extract from a local folder (expects a single video file in the directory):
ytvideo2pdf --input=local --dir="C:\path\to\video_dir"
Run with a specific extraction strategy:
ytvideo2pdf --input=youtube --url="https://youtu.be/Z_MLrbI1s2E" --extraction=prominent_peaks
Extract a fixed number of frames:
ytvideo2pdf --input=youtube --url="https://youtu.be/Z_MLrbI1s2E" --k=10
Extract frames at explicit timestamps (seconds):
ytvideo2pdf --input=youtube --url="https://youtu.be/Z_MLrbI1s2E" --extraction=timestamps --timestamps="30, 95.5, 120"
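When the timestamps come from a script rather than being typed by hand, the command can be assembled programmatically. The sketch below only uses flags shown in the examples above. The function name build_timestamp_command is an assumption made for illustration, and the block calls the CLI rather than any Python API of the package.

```python
import shlex


def build_timestamp_command(url, timestamps):
    """Build the argument list for a timestamps extraction run.

    Hypothetical helper; it only reproduces the CLI flags shown in the README.
    """
    # Join the seconds with ", " to match the format of the README example.
    joined = ", ".join(str(t) for t in timestamps)
    return [
        "ytvideo2pdf",
        "--input=youtube",
        f"--url={url}",
        "--extraction=timestamps",
        f"--timestamps={joined}",
    ]


cmd = build_timestamp_command("https://youtu.be/Z_MLrbI1s2E", [30, 95.5, 120])

# Show a shell-safe rendering of the command for copy and paste.
print(shlex.join(cmd))

# To actually run it (requires ytvideo2pdf to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with the spaces inside the timestamps value.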
What you get
- A PDF file in output/ with the extracted frames.
- A JSON metadata file alongside the PDF (same name, .json).
- Intermediate folders (unless --no-cleanup) for extracted frames and cached objects.
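Because each PDF has a JSON file with the same base name, the two can be paired with a few lines of Python. This is a sketch under the assumption that the PDFs sit directly inside output/. The function name pair_outputs is hypothetical, and the metadata schema is not described here, so the code loads the JSON without interpreting it.

```python
import json
from pathlib import Path


def pair_outputs(output_dir="output"):
    """Yield (pdf_path, metadata) for every PDF that has a sibling .json file."""
    for pdf in sorted(Path(output_dir).glob("*.pdf")):
        meta_path = pdf.with_suffix(".json")
        if meta_path.exists():
            with meta_path.open(encoding="utf-8") as fh:
                yield pdf, json.load(fh)


if __name__ == "__main__":
    for pdf, meta in pair_outputs():
        # The schema is undocumented here, so only report the top-level JSON type.
        print(pdf.name, type(meta).__name__)
```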
Key features
- Multiple extraction strategies to pick the most informative frames.
- OCR-based signal processing (Tesseract by default).
- Optional caching of processed frames for reuse.
- Optional plots of the OCR signal (for debugging and tuning).
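To give an intuition for what picking frames from an OCR signal could look like, here is a toy sketch. It is not Glimpsify's algorithm, since the README does not describe how any extraction strategy works internally. The assumption is that each sampled frame is reduced to a count of recognized characters, and that a frame is worth keeping when the count peaks just before a sharp drop (for example, when a slide is wiped). The names local_peaks, min_drop, and char_counts are invented for this example.

```python
def local_peaks(signal, min_drop=1):
    """Return indices where the signal peaks and then falls by at least min_drop.

    Illustrative only; not the implementation used by any ytvideo2pdf strategy.
    """
    peaks = []
    for i in range(len(signal) - 1):
        rising_or_flat = i == 0 or signal[i] >= signal[i - 1]
        drops_after = signal[i] - signal[i + 1] >= min_drop
        if rising_or_flat and drops_after:
            peaks.append(i)
    # Keep the final frame when the signal ends on a rise or plateau,
    # because no later frame exists to show a drop.
    if signal and (len(signal) == 1 or signal[-1] >= signal[-2]):
        peaks.append(len(signal) - 1)
    return peaks


# Hypothetical per-frame character counts from an OCR pass.
char_counts = [10, 40, 80, 80, 5, 30, 60, 12, 50, 90]
print(local_peaks(char_counts, min_drop=20))  # prints [3, 6, 9]
```

In this toy signal, frames 3, 6, and 9 are the fullest frames before the text on screen is replaced, which is the kind of frame a notes PDF would want to keep.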
CLI options (summary)
- --input: youtube | local | pickle
- --url: YouTube video or playlist URL (for youtube input)
- --dir: local directory path (for local or pickle input)
- --ocr: tesseract | easy_ocr | paddleocr
- --ocr_approval: phash | pixel_comparison | approve_all | reject_all
- --extraction: prominent_peaks | k_transactions | key_moments | timestamps | rate_change_threshold
- --k: number of frames to extract, or auto
- --timestamps: comma-separated seconds (for timestamps extraction)
- --threshold: integer threshold for rate_change_threshold
- --cache-frames / --no-cache-frames
- --skip-plot / --no-skip-plot
- --cleanup / --no-cleanup
For Python API usage, see LIBRARY.md.
File details
Details for the file ytvideo2pdf-0.1.0.tar.gz.
File metadata
- Download URL: ytvideo2pdf-0.1.0.tar.gz
- Upload date:
- Size: 390.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6575fc463404547191e8cbc85e73e3abb5c40f43f7ca7092cb946347da9f24be |
| MD5 | 87fb6bd1c8a7afa907a730374913438f |
| BLAKE2b-256 | 0f5f454f275836b3ab063b4f87891c94800e02d6c486b0013409f29718c0ec84 |
File details
Details for the file ytvideo2pdf-0.1.0-py3-none-any.whl.
File metadata
- Download URL: ytvideo2pdf-0.1.0-py3-none-any.whl
- Upload date:
- Size: 38.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 98221f4eaa1b96e977381f27ded3da798f6e067aa0f04a687f88ebc1802da26c |
| MD5 | 860148cd2c51ac9e8d4edd422d4dc308 |
| BLAKE2b-256 | 35c82b449f01927c68ddc6bb8430160ebd1737b3c0c676a5b3d8ce7d7733dad5 |