llama-index readers youtube transcript integration
Youtube Transcript Loader
This loader fetches the text transcripts of YouTube videos using the youtube_transcript_api Python package.
Usage
To use this loader, first pip install youtube_transcript_api.
Then pass a list of YouTube links to load_data:
from llama_hub.youtube_transcript import YoutubeTranscriptReader
loader = YoutubeTranscriptReader()
documents = loader.load_data(
    ytlinks=["https://www.youtube.com/watch?v=i3OYlaoj-BM"]
)
Supported URL formats:
+ youtube.com/watch?v={video_id} (with or without 'www.')
+ youtube.com/embed/{video_id} (with or without 'www.')
+ youtu.be/{video_id} (never includes the 'www.' subdomain)
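The URL shapes listed above can be sketched as regular expressions. This is a minimal illustration, not the loader's own implementation: the names `_URL_PATTERNS` and `extract_video_id` are hypothetical, and the embed pattern accepts both the `embed/{video_id}` path form and an `embed?v={video_id}` query form as an assumption.

```python
import re
from typing import Optional

# Hypothetical patterns mirroring the supported URL shapes.
# Assumption: video IDs consist of letters, digits, "_" and "-".
_URL_PATTERNS = [
    re.compile(r"^https?://(?:www\.)?youtube\.com/watch\?v=([\w-]+)"),
    # Accept either spelling of the embed form (path or query style).
    re.compile(r"^https?://(?:www\.)?youtube\.com/embed(?:/|\?v=)([\w-]+)"),
    # youtu.be short links never carry a "www." prefix.
    re.compile(r"^https?://youtu\.be/([\w-]+)"),
]


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID if the URL matches a supported shape, else None."""
    for pattern in _URL_PATTERNS:
        match = pattern.match(url)
        if match:
            return match.group(1)
    return None
```

Returning the captured ID rather than a plain boolean lets the same check double as the extraction step a transcript lookup would need.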
To programmatically check if a URL is supported:
from llama_hub.youtube_transcript import is_youtube_video
is_youtube_video("https://youtube.com/watch?v=j83jrh2") # => True
is_youtube_video("https://vimeo.com/272134160") # => False
This loader is designed to load data into LlamaIndex and/or to be used subsequently as a Tool in a LangChain Agent. See here for examples.
Hashes for flying_delta_readers_youtube_transcript-0.1.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | d4a6c6bbc7a1b34dfb3dd7d5f756ea204ab018c38c24aee8e98146c377b6abf0
MD5 | 62370f94a0543e9c1556ca255dfebcfc
BLAKE2b-256 | c6862412904a963e85f418697cc2c74057da3794ce7cc1c5672add0335be8699
Hashes for flying_delta_readers_youtube_transcript-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 4b591ea97a558ce07769ef49f33d7fd476d9ec3c4b3f354527511b3f7eb7b455
MD5 | b46ce0d44cd5f75de87d4036ec4409d6
BLAKE2b-256 | a6acfdfe4362ad705838b3f5f7a5df51cbcd9ffa6de56af996e439e2ca31f512
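A downloaded file can be compared against the SHA256 digests listed above using Python's standard hashlib module. This is a minimal sketch: the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("flying_delta_readers_youtube_transcript-0.1.0-py3-none-any.whl", "4b591ea97a558ce07769ef49f33d7fd476d9ec3c4b3f354527511b3f7eb7b455")` would check a wheel saved in the current directory under that name.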