YouTube loader for LangChain using yt-dlp
langchain-yt-dlp
langchain-yt-dlp is a Python package that extends LangChain by providing an improved YouTube integration using yt-dlp.
This package addresses a critical limitation in the existing LangChain YoutubeLoader. The original implementation, which relied on pytube, became unable to fetch YouTube metadata due to changes in YouTube's structure. langchain-yt-dlp resolves this by leveraging the robust yt-dlp library, providing a more reliable YouTube document loader.
Key Features
- Retrieve metadata (e.g., title, description, author, view count, publish date) using the yt-dlp library.
- Maintain compatibility with LangChain's existing loader interface.
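As a rough illustration of the metadata fields listed above, the sketch below maps a yt-dlp info dictionary onto a small metadata dictionary. The helper names `info_to_metadata` and `fetch_metadata` and the output key names are hypothetical and are not taken from this package. The input keys (`title`, `description`, `uploader`, `view_count`, `upload_date`) are ones yt-dlp commonly reports for YouTube videos.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def info_to_metadata(info: Dict[str, Any]) -> Dict[str, Any]:
    """Map a yt-dlp info dict onto a small metadata dict (hypothetical key names)."""
    raw_date: Optional[str] = info.get("upload_date")  # yt-dlp reports "YYYYMMDD"
    publish_date = (
        datetime.strptime(raw_date, "%Y%m%d").date().isoformat() if raw_date else None
    )
    return {
        "title": info.get("title"),
        "description": info.get("description"),
        "author": info.get("uploader"),
        "view_count": info.get("view_count"),
        "publish_date": publish_date,
    }


def fetch_metadata(url: str) -> Dict[str, Any]:
    """Fetch metadata for one video without downloading it (needs network access)."""
    import yt_dlp  # imported lazily so info_to_metadata works without yt-dlp installed

    options = {"quiet": True, "skip_download": True}
    with yt_dlp.YoutubeDL(options) as ydl:
        info = ydl.extract_info(url, download=False)
    return info_to_metadata(info)
```

Missing keys come back as `None` rather than raising, which is one reasonable choice when a video does not expose every field.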
Installation
To install the package, use:
pip install langchain-yt-dlp
Ensure you have the following dependencies installed:
- langchain
- yt-dlp
Install them with:
pip install langchain yt-dlp
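To confirm that the installation worked, a small check like the following can report which packages are not importable. This is a minimal sketch: `missing_packages` is a hypothetical helper, and it assumes the import names are `langchain`, `yt_dlp`, and `langchain_yt_dlp` (the underscore forms of the pip package names).

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_packages(import_names: Iterable[str]) -> List[str]:
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if find_spec(name) is None]


if __name__ == "__main__":
    # Import names use underscores even though the pip names use hyphens.
    missing = missing_packages(["langchain", "yt_dlp", "langchain_yt_dlp"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```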
Usage
Here's how to use the YoutubeLoaderDL class from langchain-yt-dlp:
Basic Example
Loading From a YouTube URL
from langchain_yt_dlp.youtube_loader import YoutubeLoaderDL

# Initialize using a YouTube URL
loader = YoutubeLoaderDL.from_youtube_url(
    youtube_url="https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    add_video_info=True,
)

documents = loader.load()
print(documents)
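Printing the raw list can be hard to read. Assuming the loader returns standard LangChain Document objects with `page_content` and `metadata` attributes, a helper along these lines can summarize them. `summarize_documents` is a hypothetical name, and the `title` metadata key is an assumption about what the loader emits when `add_video_info=True`.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def summarize_documents(documents: Iterable[Any], preview_chars: int = 80) -> List[str]:
    """Build one readable line per document from its metadata and content."""
    lines = []
    for doc in documents:
        metadata = getattr(doc, "metadata", None) or {}
        title = metadata.get("title", "<no title>")  # "title" key is an assumption
        preview = (getattr(doc, "page_content", "") or "")[:preview_chars]
        lines.append(f"{title}: {preview}")
    return lines


# Stand-in object with the same two attributes as a LangChain Document,
# used here so the sketch runs without network access.
example = SimpleNamespace(page_content="hello world", metadata={"title": "Demo"})
for line in summarize_documents([example], preview_chars=5):
    print(line)
```

With a real loader, `summarize_documents(loader.load())` would be the equivalent call.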
Parameters
YoutubeLoaderDL Constructor
Parameter | Type | Default | Description |
---|---|---|---|
video_id | str | None | The YouTube video ID to fetch data for. |
add_video_info | bool | False | Whether to fetch additional metadata. |
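Because the constructor takes a video ID rather than a URL, it can help to see how an ID might be pulled out of common YouTube URL shapes. The sketch below uses only the standard library. `extract_video_id` is a hypothetical helper, not part of this package, and it covers only the `watch`, `youtu.be`, `embed`, and `shorts` forms.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a few common YouTube URL shapes, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/").split("/")[0]
        return candidate or None

    if host == "youtube.com" or host.endswith(".youtube.com"):
        # Standard links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        # Embedded players and Shorts: /embed/<id>, /shorts/<id>
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                candidate = parsed.path[len(prefix):].split("/")[0]
                return candidate or None

    return None
```

The resulting ID could then be passed as `video_id`, for example `YoutubeLoaderDL(video_id=extract_video_id(url), add_video_info=True)`, assuming the constructor accepts the two parameters from the table as keyword arguments.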
Testing
To run the tests:
- Clone the repository:
  git clone https://github.com/aqib0770/langchain-yt-dlp
  cd langchain-yt-dlp
- Install development dependencies:
  pip install -r requirements.txt
- Run the tests:
  pytest tests/test_youtube_loader.py
Contributing
Contributions are welcome! If you have ideas for new features or spot a bug, feel free to:
- Open an issue on GitHub.
- Submit a pull request.
License
This project is licensed under the MIT License. See the LICENSE file for details.
File details
Details for the file langchain_yt_dlp-0.0.8.tar.gz.
File metadata
- Download URL: langchain_yt_dlp-0.0.8.tar.gz
- Upload date:
- Size: 5.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.10.16
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 10f77ad8ca86dcaf9d94a118eed26999e63071b543f6a765da10daa773001e43 |
MD5 | 72a965edd7640d6a75e54e8b92436f1d |
BLAKE2b-256 | 0c90f09dde067ea4c836a3f4af83307310cc2624e8690f5ce3d2f78e6d525d21 |
File details
Details for the file langchain_yt_dlp-0.0.8-py3-none-any.whl.
File metadata
- Download URL: langchain_yt_dlp-0.0.8-py3-none-any.whl
- Upload date:
- Size: 4.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.10.16
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 8226fd95edbf3cc70607640b50c3e09f407a113a085f25c0f7575ed9602f4098 |
MD5 | 5edb0ffaff0c2d558f3a01af726d052b |
BLAKE2b-256 | 12c1e4721f6381e16dd69f1f119fcefa27666276d90d24ac9e8b4494cf550c27 |