Extract and process YouTube video transcripts with AI

GetOutVideo API

Transform YouTube videos into professional documents with AI

GetOutVideo is a Python API that converts YouTube videos into structured, readable documents. Simply provide a YouTube URL, and it extracts transcripts and transforms them into professional-quality materials using OpenAI's GPT models.

What it does

Turn any YouTube video or playlist into:

  • Summaries - Quick overviews and key points
  • Educational materials - Structured lessons and tutorials
  • Documentation - Technical guides and how-tos
  • Study notes - Q&A format and bullet points
  • Research content - Comprehensive analysis

Perfect for students, researchers, content creators, and professionals who want to convert video content into text-based learning materials.

Features

  • YouTube Integration: Extract transcripts from individual videos or entire playlists
  • AI Processing: Transform raw transcripts using OpenAI's GPT models
  • Multiple Styles: Generate summaries, educational content, Q&A, key points, and more
  • Flexible Configuration: Customize processing parameters, languages, and output formats
  • Fallback Transcription: Uses OpenAI's audio transcription when YouTube transcripts aren't available
  • Batch Processing: Handle multiple videos efficiently
  • Clean API: Simple interface for both basic and advanced use cases

Installation

pip install getoutvideo

System Requirements

  • Python 3.8 or higher
  • FFmpeg (required for audio processing fallback)

Installing FFmpeg

Windows:

# Using chocolatey
choco install ffmpeg

# Or download from https://ffmpeg.org/download.html

macOS:

# Using homebrew
brew install ffmpeg

Linux:

# Ubuntu/Debian
sudo apt update && sudo apt install ffmpeg

# CentOS/RHEL
sudo yum install ffmpeg
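
After installing, it can be useful to confirm that FFmpeg is actually reachable before relying on the audio fallback. The following is a minimal sketch using only the Python standard library; `find_ffmpeg` is a hypothetical helper name and not part of GetOutVideo.

```python
import shutil
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the `ffmpeg` executable on PATH, or None."""
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("FFmpeg not found on PATH; the audio fallback will likely fail.")
    else:
        print(f"FFmpeg found at: {path}")
```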

Quick Start

Basic Usage

from getoutvideo import process_youtube_playlist

# Process a single video
files = process_youtube_playlist(
    url="https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    output_dir="./output",
    openai_api_key="your-openai-api-key"
)

print(f"Generated {len(files)} documents!")
# Creates: video_title [Summary].md, video_title [Educational].md, etc.
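
Since `process_youtube_playlist` is shown accepting both video and playlist URLs, you may want to tell them apart yourself, for example to pick an output directory. The sketch below uses only the standard library; `classify_youtube_url` is a hypothetical helper, and the way GetOutVideo itself detects URL types is not documented here.

```python
from urllib.parse import parse_qs, urlparse


def classify_youtube_url(url: str) -> str:
    """Classify a URL as 'video', 'playlist', or 'unknown'.

    Only the two URL shapes used in this README are recognized;
    short links such as youtu.be/... are reported as 'unknown'.
    """
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    if parsed.path == "/playlist" and "list" in query:
        return "playlist"
    if parsed.path == "/watch" and "v" in query:
        return "video"
    return "unknown"
```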

Process Specific Styles

from getoutvideo import GetOutVideoAPI

api = GetOutVideoAPI(openai_api_key="your-openai-api-key")

# Generate only summaries and key points
files = api.process_youtube_url(
    url="https://www.youtube.com/watch?v=VIDEO_ID",
    output_dir="./summaries",
    styles=["Summary", "Key Points"]
)

Process Playlists

from getoutvideo import process_youtube_playlist

# Process entire playlist
files = process_youtube_playlist(
    url="https://www.youtube.com/playlist?list=PLAYLIST_ID",
    output_dir="./course_materials",
    openai_api_key="your-openai-api-key",
    start_index=1,    # First video
    end_index=5       # Stop after 5 videos (0 = all)
)
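
To make the index semantics concrete, here is a small sketch of how a 1-based `start_index` and an `end_index` where 0 means "all" could map onto a list of videos. It assumes `end_index` is inclusive, which the comments above suggest but do not state outright; `select_playlist_range` is a hypothetical helper, not the library's code.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_playlist_range(
    items: Sequence[T], start_index: int = 1, end_index: int = 0
) -> List[T]:
    """Select items using 1-based, inclusive indices (end_index=0 means all)."""
    if start_index < 1:
        raise ValueError("start_index is 1-based and must be >= 1")
    if end_index < 0:
        raise ValueError("end_index must be >= 0")
    # end_index == 0 means "through the last item".
    stop = len(items) if end_index == 0 else end_index
    return list(items[start_index - 1:stop])
```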

How It Works

  1. Extract - Downloads transcripts from YouTube, falling back to OpenAI's audio transcription when no YouTube transcript is available
  2. Process - Uses OpenAI's GPT models to format and structure content
  3. Generate - Creates professional markdown documents in multiple styles
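
The steps above can be sketched as two explicit calls, using the `extract_transcripts` and `process_with_ai` methods shown in the API Reference. `run_pipeline` is a hypothetical wrapper; it takes the API object as a parameter so it can be tried with a stand-in object, and the return types of both methods are assumptions since they are not specified in this document.

```python
from typing import Any, List, Sequence


def run_pipeline(
    api: Any, url: str, output_dir: str, styles: Sequence[str]
) -> List[Any]:
    """Run extract -> process/generate as two separate steps."""
    # Step 1: extract transcripts (the API handles any fallback internally).
    transcripts = api.extract_transcripts(url)
    if not transcripts:
        # Nothing to process; skip the AI step entirely.
        return []
    # Steps 2 and 3: AI processing and document generation.
    return api.process_with_ai(transcripts, output_dir, styles=list(styles))
```

With the real package you would pass `GetOutVideoAPI(openai_api_key=...)` as `api`. Splitting the steps this way lets you inspect or cache transcripts before paying for AI processing.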

Available Processing Styles

GetOutVideo creates different document types from the same video:

| Style | Best For | Output Format |
| --- | --- | --- |
| Summary | Quick overviews | Concise main points |
| Educational | Learning materials | Structured lessons with examples |
| Key Points | Study notes | Bullet-pointed highlights |
| Q&A | Training materials | Question and answer format |
| Technical | Documentation | Step-by-step instructions |
| Balanced | Comprehensive reports | Full detailed coverage |

# Get all available styles
from getoutvideo import GetOutVideoAPI
api = GetOutVideoAPI(openai_api_key="your-key")
print(api.get_available_styles())
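
A misspelled style name is an easy mistake, so it may help to validate names before starting a long run. The sketch below checks requested styles against the six names in the table above; `validate_styles` and `DOCUMENTED_STYLES` are hypothetical, and the exact strings the library accepts (in particular "Q&A") are an assumption taken from the table. In practice you could pass the result of `api.get_available_styles()` as `available`, assuming it returns an iterable of names.

```python
from typing import Iterable, List

# Style names as listed in this README's table (assumed to be the exact strings).
DOCUMENTED_STYLES = [
    "Summary", "Educational", "Key Points", "Q&A", "Technical", "Balanced",
]


def validate_styles(
    requested: Iterable[str], available: Iterable[str] = DOCUMENTED_STYLES
) -> List[str]:
    """Return the requested styles unchanged, or raise if any is unknown."""
    available_set = set(available)
    requested_list = list(requested)
    unknown = [s for s in requested_list if s not in available_set]
    if unknown:
        raise ValueError(
            f"Unknown styles: {unknown}; choose from {sorted(available_set)}"
        )
    return requested_list
```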

Configuration

Using Environment Variables

# In your shell
export OPENAI_API_KEY="your-openai-api-key"
export LANGUAGE="English"

# In Python
from getoutvideo import load_api_from_env
api = load_api_from_env()  # Uses environment variables
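
If you want to fail early when the key is missing, you can check the same two variables yourself before calling the library. This is a sketch under stated assumptions: `read_getoutvideo_env` is a hypothetical helper, and the "English" default for `LANGUAGE` is a guess, not documented behavior of `load_api_from_env`.

```python
import os
from typing import Dict, Mapping


def read_getoutvideo_env(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read OPENAI_API_KEY and LANGUAGE, raising if the key is missing."""
    api_key = environ.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return {
        "openai_api_key": api_key,
        # Assumed default; the library's real default is not documented here.
        "language": environ.get("LANGUAGE", "English"),
    }
```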

API Reference

Main Functions

process_youtube_playlist()

One-line processing for videos and playlists:

from getoutvideo import process_youtube_playlist

files = process_youtube_playlist(
    url="https://www.youtube.com/watch?v=VIDEO_ID",
    output_dir="./output",
    openai_api_key="your-openai-api-key",
    styles=["Summary", "Key Points"],  # Optional
    start_index=1,                     # Optional: playlist start
    end_index=0,                       # Optional: end (0 = all)
    output_language="English"          # Optional
)

GetOutVideoAPI Class

For advanced control:

from getoutvideo import GetOutVideoAPI

api = GetOutVideoAPI(openai_api_key="your-openai-api-key")

url = "https://www.youtube.com/watch?v=VIDEO_ID"
output_dir = "./output"

# Process videos
files = api.process_youtube_url(url, output_dir, styles=["Summary"])

# Extract transcripts only
transcripts = api.extract_transcripts(url)

# Process existing transcripts
results = api.process_with_ai(transcripts, output_dir, styles=["Technical"])

extract_transcripts_only()

Get raw transcripts without AI processing:

from getoutvideo import extract_transcripts_only

transcripts = extract_transcripts_only(
    url="https://www.youtube.com/watch?v=VIDEO_ID",
    openai_api_key="your-openai-api-key"
)

Use Cases

Course Materials

from getoutvideo import process_youtube_playlist

# Convert lectures to study materials
study_files = process_youtube_playlist(
    url="https://www.youtube.com/playlist?list=COURSE_PLAYLIST",
    output_dir="./course_materials",
    openai_api_key="your-key",
    styles=["Educational", "Key Points"]
)

Technical Documentation

from getoutvideo import GetOutVideoAPI

# Turn tutorial videos into documentation
api = GetOutVideoAPI(openai_api_key="your-key")
transcripts = api.extract_transcripts("https://www.youtube.com/watch?v=TUTORIAL_ID")
docs = api.process_with_ai(transcripts, "./docs", styles=["Technical"])

Research and Analysis

from getoutvideo import process_youtube_playlist

# Process a conference talk for research
files = process_youtube_playlist(
    url="https://www.youtube.com/watch?v=CONFERENCE_TALK",
    output_dir="./research",
    openai_api_key="your-key",
    styles=["Balanced", "Summary"]
)

Output Files

Generated files follow this naming pattern:

{video_title} [{style_name}].md

Example output for "Python Tutorial":

📁 output/
├── Python_Tutorial [Summary].md
├── Python_Tutorial [Educational].md  
├── Python_Tutorial [Key Points].md
└── Python_Tutorial [Technical].md

Each file contains:

  • Original video URL
  • Structured content in markdown format
  • Style-specific formatting (bullets, sections, Q&A, etc.)
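
Because the style is encoded in the file name, you can recover the title and style from a generated path, for example to group outputs by style. The sketch below parses the naming pattern shown above; `parse_output_name` is a hypothetical helper, and it assumes the returned file entries can be treated as path strings.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

# Matches "{video_title} [{style_name}].md" as described in this README.
_OUTPUT_NAME = re.compile(r"^(?P<title>.+) \[(?P<style>[^\[\]]+)\]\.md$")


def parse_output_name(path: str) -> Optional[Tuple[str, str]]:
    """Return (video_title, style_name), or None if the name does not match."""
    match = _OUTPUT_NAME.match(Path(path).name)
    if match is None:
        return None
    return match.group("title"), match.group("style")
```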

Error Handling

from getoutvideo import GetOutVideoAPI, GetOutVideoError

try:
    api = GetOutVideoAPI(openai_api_key="your-key")
    files = api.process_youtube_url(url="...", output_dir="./output")
    print(f"Success: {len(files)} files generated")
except GetOutVideoError as e:
    print(f"API Error: {e}")
except Exception as e:
    print(f"Error: {e}")
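
For failures that are likely transient, such as network hiccups, a retry with exponential backoff can wrap the call. This is a generic sketch, not a GetOutVideo feature: `call_with_retries` is a hypothetical helper, and whether `GetOutVideoError` indicates a retryable condition is not stated in this document, so choose `retry_on` with care.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Call func(), retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error.
            # Delay doubles each time: base_delay, 2*base_delay, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A call such as `call_with_retries(lambda: api.process_youtube_url(url, output_dir))` would then retry the whole processing step.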

Rate Limits and Costs

  • Respects OpenAI rate limits automatically
  • Costs depend on transcript length and models used
  • Pass the styles parameter to generate only the styles you need, which reduces processing
  • Adjust chunk_size for cost optimization
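
As a rough way to reason about cost, you can estimate how many model requests a transcript would need. The arithmetic below is a sketch under loud assumptions: that `chunk_size` is measured in characters and that each chunk is processed once per style. Neither is confirmed by this document, and both function names are hypothetical.

```python
import math


def estimate_chunk_count(transcript_length: int, chunk_size: int) -> int:
    """Number of chunks needed to cover a transcript of the given length."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if transcript_length <= 0:
        return 0
    return math.ceil(transcript_length / chunk_size)


def estimate_requests(transcript_length: int, chunk_size: int, style_count: int) -> int:
    """Assumed request count: one request per chunk per style."""
    return estimate_chunk_count(transcript_length, chunk_size) * style_count
```

Under these assumptions, fewer styles reduce the request count proportionally, while a larger chunk size means fewer but larger requests.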

Development

git clone https://github.com/yourusername/getoutvideo.git
cd getoutvideo
pip install -e ".[dev]"
pytest tests/          # Run tests
black getoutvideo/     # Format code

License

MIT License - see LICENSE file for details.

Support

  • Issues: GitHub Issues
  • Documentation: Full API docs and examples available

Credits

Built with OpenAI GPT models, YouTube Transcript API, and FFmpeg for audio processing.
