Extract, refine, and analyze YouTube video segments with precision
SegScript
A command-line tool for managing, enhancing, and interacting with YouTube transcripts.
- Overview
- Features
- Installation
- Dependencies
- Usage
- File Structure
- Examples
- Next TODOs
- Contributing
- License
- Acknowledgments
Overview
SegScript lets you download, view, and query YouTube video transcripts directly from your terminal. It provides a clean interface for working with transcripts, including extracting specific time ranges and viewing enhanced transcript content. For enhancement, I use the langchain-google-genai package with Google's Gemini 2.0 Flash model, which has produced very good results in my use.
Features
- Download transcripts from any YouTube video using its ID
- List all downloaded transcripts stored in your local collection
- View full transcripts or segments based on time ranges
- Interactive mode for browsing and working with your transcript collection
- Rich text formatting for improved readability in the terminal
Installation
pip install segscript
To install from source for testing:
# Clone the repository
git clone https://github.com/keshavsharma25/segscript.git
cd segscript
# Install the package and its dependencies in editable mode
# (pip's -r flag expects a requirements file, not pyproject.toml)
pip install -e .
Dependencies
- youtube-transcript-api: Fetch YouTube transcripts with ease
- click: Command-line interface creation kit
- rich: Terminal formatting and styling
- python-dotenv: Load GOOGLE_API_KEY from the command line environment
- pathlib: Object-oriented filesystem paths
- langchain-google-genai: Synthesize the transcript into a well-structured format
Usage
Basic Commands
# List all downloaded transcripts
segscript list
# Download a transcript for a YouTube video
segscript download VIDEO_ID
# Get a transcript (downloads if not already available)
segscript get VIDEO_ID
# Get a transcript for a specific time range
segscript get VIDEO_ID --time-range "10:00;20:00"
# Start interactive mode
segscript prompt
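The --time-range value shown above pairs a start and an end timestamp separated by a semicolon. As a rough illustration of how such a string could be turned into seconds, here is a minimal Python sketch. The function names parse_timestamp and parse_time_range are hypothetical and are not taken from SegScript's source, and the support for an HH:MM:SS form is an assumption.

```python
def parse_timestamp(stamp: str) -> int:
    """Convert "SS", "MM:SS", or "HH:MM:SS" into a number of seconds."""
    parts = [int(p) for p in stamp.strip().split(":")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unsupported timestamp: {stamp!r}")
    seconds = 0
    for part in parts:
        # Each further field is 60 times finer than the one before it.
        seconds = seconds * 60 + part
    return seconds


def parse_time_range(value: str) -> tuple[int, int]:
    """Split "START;END" and return both ends in seconds."""
    start_text, sep, end_text = value.partition(";")
    if not sep:
        raise ValueError(f"expected 'START;END', got {value!r}")
    start, end = parse_timestamp(start_text), parse_timestamp(end_text)
    if end <= start:
        raise ValueError("end must come after start")
    return start, end


print(parse_time_range("10:00;20:00"))  # (600, 1200)
```

Rejecting a range whose end is not after its start is a design choice of this sketch; the real tool may handle that case differently.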
Interactive Mode
Interactive mode provides a user-friendly interface for:
- Browsing your transcript collection
- Selecting a transcript to work with
- Viewing full transcripts or specific segments
- Querying transcripts by time range
File Structure
Transcripts are stored in the ~/.segscript/ directory with the following structure:
~/.segscript/
├── .env # Environment variables file
├── VIDEO_ID_1/
│ ├── VIDEO_ID_1.json # Raw transcript data
│ └── metadata.json # Video metadata
├── VIDEO_ID_2/
│ ├── VIDEO_ID_2.json
│ └── metadata.json
└── ...
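Because the layout above is plain directories and JSON files, it can be inspected with a few lines of Python. The sketch below is only an illustration under that assumption: list_video_ids is a hypothetical helper, not part of SegScript, and it checks only that the two files shown in the tree exist without reading their contents.

```python
from pathlib import Path


def list_video_ids(root: Path) -> list[str]:
    """Return names of subdirectories that hold both expected files."""
    if not root.is_dir():
        return []
    found = []
    for entry in sorted(root.iterdir()):
        if not entry.is_dir():
            continue  # skips plain files such as .env
        transcript = entry / f"{entry.name}.json"
        metadata = entry / "metadata.json"
        if transcript.is_file() and metadata.is_file():
            found.append(entry.name)
    return found


if __name__ == "__main__":
    # Prints an empty list if ~/.segscript does not exist yet.
    print(list_video_ids(Path.home() / ".segscript"))
```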
Examples
Download a transcript
segscript download dQw4w9WgXcQ
View a transcript for a specific section of a video
segscript get dQw4w9WgXcQ --time-range "1:30;2:45"
Interactive browsing
segscript prompt
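To make the time-range example more concrete, the following Python sketch shows one way a segment could be selected from transcript entries. It assumes each entry is a dict with text, start, and duration fields measured in seconds, which is the shape youtube-transcript-api has commonly returned. SegScript's actual storage format and selection logic are not documented here, so select_segments and the sample data are hypothetical.

```python
def select_segments(entries, start, end):
    """Keep entries whose time span overlaps the window [start, end)."""
    selected = []
    for entry in entries:
        seg_start = entry["start"]
        seg_end = seg_start + entry["duration"]
        # Two intervals overlap when each one begins before the other ends.
        if seg_start < end and seg_end > start:
            selected.append(entry)
    return selected


# Made-up sample data for illustration only.
sample = [
    {"text": "intro", "start": 80.0, "duration": 5.0},
    {"text": "first point", "start": 88.0, "duration": 6.0},
    {"text": "second point", "start": 120.0, "duration": 10.0},
    {"text": "outro", "start": 165.0, "duration": 4.0},
]

# "1:30;2:45" corresponds to the window from 90 to 165 seconds.
for entry in select_segments(sample, 90, 165):
    print(f'{entry["start"]:.1f}s  {entry["text"]}')
```

With this sample, "first point" is kept even though it starts before 1:30, because it is still being spoken when the window opens. Whether a partially overlapping line should be included is a design choice of this sketch.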
Next TODOs
- Add transcript summary support.
- Add a prompt that puts each sentence on its own line for better readability.
- In prompt, improve the UX by clearing the screen before running the command (like after download in prompt).
- Improve the copy of the segscript prompt for better understanding.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- A huge thanks to YouTube Transcript API for making transcript retrieval so easy and accessible.
- Also kudos to Langchain Google for the langchain-google-genai package.
- Built with Rich for beautiful terminal output.
- Uses Click for command-line interface.
Note: SegScript is not affiliated with YouTube or Google.