A lightweight YouTube metadata library.

yt-meta

A Python library for finding video and channel metadata from YouTube.

Purpose

This library is designed to provide a simple and efficient way to collect metadata for YouTube videos and channels, such as titles, view counts, likes, and descriptions. It is built to support data analysis, research, or any application that needs structured information from YouTube.

Installation

This project uses uv for package management. You can install yt-meta from PyPI:

uv pip install yt-meta

To enable persistent caching, you need to install an optional dependency:

# For disk-based caching
uv pip install "yt-meta[persistent_cache]"

Inspiration

This project extends the great youtube-comment-downloader library, inheriting its session management and adding metadata capabilities.

Core Features

The library offers several ways to fetch metadata.

1. Get Video Metadata

Fetches metadata for a specific YouTube video.

Example:

from yt_meta import YtMeta

client = YtMeta()
video_url = "https://www.youtube.com/watch?v=B68agR-OeJM"
metadata = client.get_video_metadata(video_url)
print(f"Title: {metadata['title']}")

2. Get Channel Metadata

Fetches metadata for a specific YouTube channel.

Example:

from yt_meta import YtMeta

client = YtMeta()
channel_url = "https://www.youtube.com/@samwitteveenai"
channel_metadata = client.get_channel_metadata(channel_url)
print(f"Channel Name: {channel_metadata['title']}")

3. Get All Videos from a Channel

Returns a generator that yields metadata for all videos on a channel's "Videos" tab, handling pagination automatically.

Example:

import itertools
from yt_meta import YtMeta

client = YtMeta()
channel_url = "https://www.youtube.com/@AI-Makerspace/videos"
videos_generator = client.get_channel_videos(channel_url)

# Print the first 5 videos
for video in itertools.islice(videos_generator, 5):
    print(f"- {video['title']} (ID: {video['video_id']})")
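The value of a generator here is that pages are only requested as they are consumed. The following is a minimal sketch of that idea, not the library's implementation: `FAKE_PAGES`, `fetch_page`, and `iter_videos` are hypothetical stand-ins for a paginated listing with continuation tokens.

```python
import itertools
from typing import Iterator, List, Optional, Tuple

# Hypothetical stand-in for a paginated listing: each page holds some video
# IDs plus a continuation token for the next page (None marks the last page).
FAKE_PAGES = {
    None: (["v1", "v2", "v3"], "page2"),
    "page2": (["v4", "v5", "v6"], "page3"),
    "page3": (["v7"], None),
}
pages_fetched: List[Optional[str]] = []  # records which pages were requested


def fetch_page(token: Optional[str]) -> Tuple[List[str], Optional[str]]:
    """Pretend network call: return (items, next_token) for one page."""
    pages_fetched.append(token)
    return FAKE_PAGES[token]


def iter_videos() -> Iterator[dict]:
    """Yield one item at a time, requesting the next page only when needed."""
    token = None
    while True:
        items, token = fetch_page(token)
        for video_id in items:
            yield {"video_id": video_id}
        if token is None:
            break


first_four = list(itertools.islice(iter_videos(), 4))
print([v["video_id"] for v in first_four])
print(pages_fetched)  # the third page is never requested
```

Because `itertools.islice` stops pulling items after the fourth one, the sketch requests only the first two pages.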

4. Get All Videos from a Playlist

Returns a generator that yields metadata for all videos in a playlist, handling pagination automatically.

Example:

import itertools
from yt_meta import YtMeta

client = YtMeta()
playlist_id = "PL-osiE80TeTt2d9bfVyTiXJA-UTHn6WwU"
videos_generator = client.get_playlist_videos(playlist_id)

# Print the first 5 videos
for video in itertools.islice(videos_generator, 5):
    print(f"- {video['title']} (ID: {video['video_id']})")

5. Get All Shorts from a Channel

You can fetch all "Shorts" from a channel in the same way as regular videos. The method supports a fast path (basic metadata) and a slow path (full metadata).

Fast Path Example:

This is the most efficient way to get a list of shorts, but it provides limited metadata.

import itertools
from yt_meta import YtMeta

client = YtMeta()
channel_url = "https://www.youtube.com/@bashbunni"
shorts_generator = client.get_channel_shorts(channel_url)

# Print the first 5 shorts
for short in itertools.islice(shorts_generator, 5):
    print(f"- {short['title']} (ID: {short['video_id']})")

Slow Path Example (Full Metadata):

Set fetch_full_metadata=True to retrieve all details for each short, such as like_count and publish_date.

import itertools
from yt_meta import YtMeta

client = YtMeta()
channel_url = "https://www.youtube.com/@bashbunni"
shorts_generator = client.get_channel_shorts(
    channel_url,
    fetch_full_metadata=True
)

# Print the first 5 shorts with full metadata
for short in itertools.islice(shorts_generator, 5):
    likes = short.get('like_count', 'N/A')
    print(f"- {short['title']} (Likes: {likes})")

6. Get Comments from a Video

Fetches comments for a specific video, with options for sorting and filtering. This method returns a generator that yields standardized comment data.

Example:

import itertools
from yt_meta import YtMeta, SORT_BY_POPULAR

client = YtMeta()
video_url = "https://www.youtube.com/watch?v=B68agR-OeJM"

# Find the most popular comments that have been hearted by the creator
comment_filters = {
    "is_hearted_by_owner": {"eq": True}
}

comments_generator = client.get_video_comments(
    video_url,
    sort_by=SORT_BY_POPULAR,
    filters=comment_filters
)

print(f"Top 5 hearted comments for video: {video_url}")
for comment in itertools.islice(comments_generator, 5):
    likes = comment.get('like_count', 0)
    print(f"- \"{comment['text']}\" (Likes: {likes})")

Caching

yt-meta includes a flexible caching system to improve performance and avoid re-fetching data from YouTube.

Default In-Memory Cache

By default, YtMeta uses a simple in-memory dictionary to cache results. This cache is temporary and only lasts for the lifetime of the client instance.

from yt_meta import YtMeta

client = YtMeta()
# The first call fetches from the network
meta1 = client.get_video_metadata("some_url")
# The second call is served from the in-memory cache
meta2 = client.get_video_metadata("some_url")

Persistent Caching

For caching results across different runs or scripts, you can provide a persistent, dictionary-like object to the client. The library provides an optional diskcache integration for this purpose.

First, install the necessary extra:

uv pip install "yt-meta[persistent_cache]"

Then, instantiate a diskcache.Cache object and pass it to the client:

from yt_meta import YtMeta
from diskcache import Cache

# The cache object can be any dict-like object.
# Here, we use diskcache for a persistent, file-based cache.
persistent_cache = Cache(".my_yt_meta_cache")

client = YtMeta(cache=persistent_cache)

# The first time this script runs, it will be slow (fetches from network).
# Subsequent runs will be very fast, reading directly from the disk cache.
metadata = client.get_video_metadata("some_url")

Any object that implements the MutableMapping protocol (e.g., __getitem__, __setitem__, __delitem__) can be used as a cache. See examples/features/19_alternative_caching_sqlite.py for a demonstration using sqlitedict.
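As an illustration of the dict-like contract, here is a minimal sketch of a custom cache whose entries expire after a fixed time. `TTLCache` is a hypothetical helper written for this example, not part of yt-meta; it only relies on `collections.abc.MutableMapping` from the standard library.

```python
import time
from collections.abc import MutableMapping


class TTLCache(MutableMapping):
    """Dict-like cache whose entries expire after `ttl_seconds` (hypothetical helper)."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be demonstrated without sleeping
        self._data = {}      # key -> (expires_at, value)

    def __getitem__(self, key):
        expires_at, value = self._data[key]  # raises KeyError if the key is missing
        if self._clock() >= expires_at:
            del self._data[key]              # drop the stale entry
            raise KeyError(key)
        return value

    def __setitem__(self, key, value):
        self._data[key] = (self._clock() + self._ttl, value)

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        # May include entries that have expired but were not looked up yet.
        return iter(self._data)

    def __len__(self):
        return len(self._data)


# Demonstrate expiry with a fake clock instead of waiting.
now = [0.0]
cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])
cache["video:abc"] = {"title": "Example"}
fresh = "video:abc" in cache    # looked up while the entry is still valid
now[0] = 61.0
expired = "video:abc" in cache  # looked up after the TTL has passed
print(fresh, expired)
```

Based on the constructor signature shown in the API Reference, an instance like this could presumably be passed as `YtMeta(cache=cache)`; which mapping methods the client actually calls is not documented here, so treat that as an assumption.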

Advanced Features

Filtering Videos, Shorts, and Comments

The library provides a filtering system via the filters argument, available on methods like get_channel_videos, get_channel_shorts, and get_video_comments. It lets you select only the items that match specific criteria. The client applies the filters to the data it fetches, as described in the sections below.

Robust Filter Validation

To improve the developer experience and prevent errors, yt-meta validates your filters dictionary before making any network requests. If you provide a filter field that doesn't exist, an invalid operator for a field, or an incorrect value type, the library will immediately raise a ValueError or TypeError.

This "fail-fast" approach saves you from waiting for a long-running query to complete only to find out there was a typo in your request. See examples/features/23_filter_validation.py for a demonstration.
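To make the fail-fast idea concrete, here is a minimal sketch of what such a check could look like. It is an illustration under stated assumptions, not the library's validator: `ALLOWED`, `NUMERIC_FIELDS`, and `validate_filters` are hypothetical names, only a few rows of the Supported Fields and Operators table are included, and which exception type yt-meta raises in each case is assumed.

```python
from numbers import Number

# A few rows copied from the "Supported Fields and Operators" table.
ALLOWED = {
    "title": {"contains", "re", "eq"},
    "view_count": {"gt", "gte", "lt", "lte", "eq"},
    "keywords": {"contains_any", "contains_all"},
    "is_hearted_by_owner": {"eq"},
}
NUMERIC_FIELDS = {"view_count"}


def validate_filters(filters: dict) -> None:
    """Raise ValueError or TypeError before any request would be made."""
    if not isinstance(filters, dict):
        raise TypeError("filters must be a dict")
    for field, conditions in filters.items():
        if field not in ALLOWED:
            raise ValueError(f"Unknown filter field: {field!r}")
        if not isinstance(conditions, dict):
            raise TypeError(f"Conditions for {field!r} must be a dict")
        for op, value in conditions.items():
            if op not in ALLOWED[field]:
                raise ValueError(f"Operator {op!r} is not valid for {field!r}")
            # bool is a Number subclass in Python, so reject it explicitly.
            is_number = isinstance(value, Number) and not isinstance(value, bool)
            if field in NUMERIC_FIELDS and not is_number:
                raise TypeError(f"{field!r} expects a number, got {type(value).__name__}")


candidates = [
    {"veiw_count": {"gt": 1000}},     # typo in the field name
    {"title": {"gt": 5}},             # operator not valid for this field
    {"view_count": {"gt": "a lot"}},  # wrong value type
]
for candidate in candidates:
    try:
        validate_filters(candidate)
    except (ValueError, TypeError) as exc:
        print(f"{type(exc).__name__}: {exc}")
```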

Two-Stage Filtering: Fast vs. Slow

The library uses an efficient two-stage filtering process for videos and shorts:

1. Fast Filters: Applied first, using metadata that is available on the main channel or playlist page (e.g., title, view_count). This is very efficient.
2. Slow Filters: Applied second, only on items that pass the fast filters. This requires fetching full metadata for each item individually, which is much slower.

The client automatically detects when a slow filter is used and sets fetch_full_metadata=True for you.
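The two stages described above can be sketched as a small generator. This is an illustration of the pattern only: `fetch_full`, `two_stage`, and the sample data are hypothetical, and the real client decides internally which filters are fast or slow.

```python
from typing import Callable, Iterable, Iterator, List

full_fetches: List[str] = []  # records which items needed the expensive lookup

# Hypothetical detailed fields that are only available per item.
DETAILS = {"a": {"like_count": 50}, "c": {"like_count": 5000}}


def fetch_full(item: dict) -> dict:
    """Pretend per-item request that adds detailed fields to a listing entry."""
    full_fetches.append(item["video_id"])
    return {**item, **DETAILS[item["video_id"]]}


def two_stage(
    items: Iterable[dict],
    fast: Callable[[dict], bool],
    slow: Callable[[dict], bool],
) -> Iterator[dict]:
    """Apply the cheap predicate first; fetch full data only for survivors."""
    for item in items:
        if not fast(item):
            continue  # rejected without any extra request
        full = fetch_full(item)
        if slow(full):
            yield full


listing = [
    {"video_id": "a", "view_count": 2_000_000},
    {"video_id": "b", "view_count": 10},
    {"video_id": "c", "view_count": 3_000_000},
]
results = list(
    two_stage(
        listing,
        fast=lambda v: v["view_count"] > 1_000_000,
        slow=lambda v: v["like_count"] > 1000,
    )
)
print([v["video_id"] for v in results])
print(full_fetches)  # "b" never triggered a full fetch
```

The cost of a query is driven by how many items survive the fast stage, which is why combining a slow filter with a selective fast filter tends to pay off.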

Note: Comment filtering does not use the fast/slow system. All comment filters are applied after fetching the comment data.

Supported Fields and Operators

The following table lists all supported fields and their valid operators. The validation system will enforce these rules.

Field	Supported Operators	Content Type(s)	Filter Speed
`title`	`contains`, `re`, `eq`	Video, Short	Fast
`description_snippet`	`contains`, `re`, `eq`	Video	Fast
`view_count`	`gt`, `gte`, `lt`, `lte`, `eq`	Video, Short	Fast
`duration_seconds`	`gt`, `gte`, `lt`, `lte`, `eq`	Video, Short	Fast
`publish_date`	`gt`, `gte`, `lt`, `lte`, `eq`	Video, Short, Comment	Fast (Video), Slow (Short, Playlist)
`like_count`	`gt`, `gte`, `lt`, `lte`, `eq`	Video, Short, Comment	Slow
`category`	`contains`, `re`, `eq`	Video, Short	Slow
`keywords`	`contains_any`, `contains_all`	Video, Short	Slow
`full_description`	`contains`, `re`, `eq`	Video	Slow
`text`	`contains`, `re`, `eq`	Comment	N/A
`is_by_owner`	`eq`	Comment	N/A
`is_hearted_by_owner`	`eq`	Comment	N/A

Note: Some fields like publish_date can be "fast" for channel videos but "slow" for shorts or playlists because the basic metadata is not always available on those pages.
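One way to read the operator names in the table is as simple predicates over a metadata dictionary. The sketch below is an interpretation, not the library's code: `OPS` and `matches` are hypothetical, `contains` is assumed to be case-insensitive, and several operators on one field are assumed to be AND-ed in the same way as conditions on different fields.

```python
import operator
import re

# Hypothetical mapping from operator names to predicates (actual, expected) -> bool.
OPS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    # Assumption: substring matching ignores case.
    "contains": lambda actual, expected: expected.lower() in actual.lower(),
    "re": lambda actual, pattern: re.search(pattern, actual) is not None,
    "contains_any": lambda actual, wanted: any(w in actual for w in wanted),
    "contains_all": lambda actual, wanted: all(w in actual for w in wanted),
}


def matches(item: dict, filters: dict) -> bool:
    """Return True only if every condition on every field holds."""
    for field, conditions in filters.items():
        if field not in item:
            return False  # a missing field cannot satisfy a condition
        for op, expected in conditions.items():
            if not OPS[op](item[field], expected):
                return False
    return True


video = {
    "title": "Intro to LangChain Agents",
    "view_count": 1500,
    "keywords": ["ai", "agents"],
}
in_view_range = matches(
    video,
    {"title": {"contains": "langchain"}, "view_count": {"gte": 1000, "lt": 2000}},
)
has_all_keywords = matches(video, {"keywords": {"contains_all": ["ai", "rag"]}})
starts_with_intro = matches(video, {"title": {"re": r"^Intro"}})
print(in_view_range, has_all_keywords, starts_with_intro)
```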

Example: Basic Filtering (Fast)

This example finds popular, short videos. Since both view_count and duration_seconds are fast filters, this query is very efficient.

import itertools
from yt_meta import YtMeta

client = YtMeta()
channel_url = "https://www.youtube.com/@TED/videos"

# Find videos over 1M views AND shorter than 5 minutes (300s)
adv_filters = {
    "view_count": {"gt": 1_000_000},
    "duration_seconds": {"lt": 300}
}

# This is fast because both view_count and duration are available
# in the basic metadata returned from the main channel page.
videos = client.get_channel_videos(
    channel_url,
    filters=adv_filters
)

for video in itertools.islice(videos, 5):
    views = video.get('view_count', 0)
    duration = video.get('duration_seconds', 0)
    print(f"- {video.get('title')} ({views:,} views, {duration}s)")

Example: Filtering by Date

The easiest way to filter by date is to use the start_date and end_date arguments. The library also optimizes this for channels by stopping the search early once videos are older than the specified start_date.

You can provide datetime.date objects or a relative date string (e.g., "30d", "6 months ago").

Using datetime.date objects:

from datetime import date
from yt_meta import YtMeta
import itertools

client = YtMeta()
channel_url = "https://www.youtube.com/@samwitteveenai/videos"

# Get videos from a specific window
start = date(2024, 1, 1)
end = date(2024, 3, 31)

videos = client.get_channel_videos(
    channel_url,
    start_date=start,
    end_date=end
)

for video in itertools.islice(videos, 5):
    p_date = video.get('publish_date', 'N/A')
    print(f"- {video.get('title')} (Published: {p_date})")

Using relative date strings:

from yt_meta import YtMeta
import itertools

client = YtMeta()
channel_url = "https://www.youtube.com/@samwitteveenai/videos"

recent_videos = client.get_channel_videos(
    channel_url,
    start_date="6 months ago"
)

for video in itertools.islice(recent_videos, 5):
    p_date = video.get('publish_date', 'N/A')
    print(f"- {video.get('title')} (Published: {p_date})")
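For intuition about relative date strings, here is a minimal sketch of how inputs like "30d" or "6 months ago" could be turned into a date. It is not the library's parser: `parse_relative` is hypothetical, the set of accepted units is assumed, and months and years are approximated as 30 and 365 days.

```python
import re
from datetime import date, timedelta

# Assumed unit table; months and years are rough approximations in days.
_UNIT_DAYS = {"d": 1, "day": 1, "w": 7, "week": 7, "month": 30, "y": 365, "year": 365}
# A number, a unit (optionally plural), and an optional trailing "ago".
_PATTERN = re.compile(r"^\s*(\d+)\s*([a-z]+?)s?(?:\s+ago)?\s*$", re.IGNORECASE)


def parse_relative(text: str, today: date) -> date:
    """Convert a relative date string into the date that many days before `today`."""
    match = _PATTERN.match(text)
    if match is None or match.group(2).lower() not in _UNIT_DAYS:
        raise ValueError(f"Unrecognized relative date: {text!r}")
    amount = int(match.group(1))
    days = amount * _UNIT_DAYS[match.group(2).lower()]
    return today - timedelta(days=days)


reference_day = date(2025, 7, 1)  # fixed so the results are reproducible
print(parse_relative("30d", reference_day))
print(parse_relative("2 weeks ago", reference_day))
print(parse_relative("6 months ago", reference_day))
```

Passing `today` explicitly keeps the function deterministic; a real caller would typically supply `date.today()`.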

Important Note on Playlist Filtering: When filtering a playlist by date, the library must fetch metadata for all videos first, as playlists are not guaranteed to be chronological. This can be very slow for large playlists.

Important Note on Shorts Filtering: Similarly, the Shorts feed does not provide a publish date on its fast path. Any date-based filter on get_channel_shorts will automatically trigger the slower, full metadata fetch for each short.
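The difference between the channel optimization and the playlist caveat can be sketched with plain iterators. This is an illustration with made-up data, assuming the channel listing is newest-first (which the early-stop behaviour implies) and the playlist is in arbitrary order; `track` and `in_range` are hypothetical helpers.

```python
from datetime import date
from itertools import takewhile

# Made-up listing, newest first, as assumed for a channel's "Videos" tab.
channel_feed = [
    {"video_id": "n1", "publish_date": date(2024, 3, 10)},
    {"video_id": "n2", "publish_date": date(2024, 2, 1)},
    {"video_id": "n3", "publish_date": date(2023, 12, 20)},
    {"video_id": "n4", "publish_date": date(2023, 11, 5)},
]
# The same videos in an arbitrary (non-chronological) playlist order.
playlist = [channel_feed[2], channel_feed[0], channel_feed[3], channel_feed[1]]

start = date(2024, 1, 1)
examined = []


def track(items):
    """Yield items unchanged while recording which ones were looked at."""
    for item in items:
        examined.append(item["video_id"])
        yield item


def in_range(video):
    return video["publish_date"] >= start


# Sorted input: stop at the first video older than the start date.
channel_hits = [v["video_id"] for v in takewhile(in_range, track(channel_feed))]
channel_examined = list(examined)

# Unsorted input: every item has to be checked.
examined.clear()
playlist_hits = [v["video_id"] for v in filter(in_range, track(playlist))]
playlist_examined = list(examined)

print(channel_hits, channel_examined)
print(playlist_hits, playlist_examined)
```

Both approaches find the same two videos, but the sorted scan stops after three items while the unsorted one has to look at all four.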

Logging

yt-meta uses Python's logging module to provide insights into its operations. To see the log output, you can configure a basic logger.

Example:

import logging

# Configure logging to print INFO-level messages
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')

# Now, when you use the client, you will see logs
# ...
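If INFO output from every package is too noisy, the standard logging module lets you raise the verbosity for a single logger namespace instead. The sketch below assumes the library's loggers are named after its import package, `yt_meta`; that name and the child logger `yt_meta.client` are assumptions, so check the `%(name)s` field in your own log output.

```python
import logging

# Keep other libraries quiet by default.
logging.basicConfig(
    level=logging.WARNING,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
)

# Assumption: yt-meta's loggers live under the "yt_meta" namespace.
yt_logger = logging.getLogger("yt_meta")
yt_logger.setLevel(logging.DEBUG)

# Child loggers inherit the level of their closest configured ancestor.
child_logger = logging.getLogger("yt_meta.client")  # hypothetical module name
print(logging.getLevelName(child_logger.getEffectiveLevel()))
```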

API Reference

`YtMeta(cache: Optional[MutableMapping] = None)`

The main client for interacting with the library. It inherits from youtube-comment-downloader and handles session management.

cache: An optional dictionary-like object to use for caching. If None, a temporary in-memory cache is used.

`get_video_metadata(youtube_url: str) -> dict`

Fetches metadata for a single YouTube video.

youtube_url: The full URL of the YouTube video.
Returns: A dictionary containing metadata such as title, description, view_count, like_count, publish_date, category, and more.
Raises: VideoUnavailableError if the video page cannot be fetched or the video is private/deleted.

`get_video_comments(youtube_url: str, sort_by: int = SORT_BY_RECENT, limit: int = -1, filters: Optional[dict] = None) -> Generator[dict, None, None]`

Fetches comments for a specific YouTube video. This is an "enrichment" call and is slower than fetching bulk metadata.

youtube_url: The full URL of the YouTube video.
sort_by: The sort order for comments. Use SORT_BY_RECENT (default) or SORT_BY_POPULAR.
limit: The maximum number of comments to fetch. -1 means no limit.
filters: A dictionary of filter conditions to apply (see the Supported Fields and Operators table above).
Returns: A generator that yields a standardized dictionary for each comment.

`get_channel_metadata(channel_url: str) -> dict`

Fetches metadata for a specific channel. Results are cached.

channel_url: The URL of the channel.
Returns: A dictionary with channel metadata like title, description, subscriber_count, vanity_url, etc.
Raises: VideoUnavailableError, MetadataParsingError.

`get_channel_videos(channel_url: str, ..., stop_at_video_id: str = None, max_videos: int = -1) -> Generator[dict, None, None]`

Yields metadata for videos from a channel.

start_date: The earliest date for videos to include (e.g., date(2023, 1, 1) or "30d").
end_date: The latest date for videos to include.
fetch_full_metadata: If True, fetches detailed metadata for every video. Automatically enabled if a "slow filter" is used.
filters: A dictionary of advanced filter conditions (see above).
stop_at_video_id: Stops fetching when this video ID is found.
max_videos: The maximum number of videos to return.
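A typical use for stop_at_video_id is incremental syncing: remember the newest video ID from the last run and stop when it reappears. The sketch below models the two stopping parameters over a plain list. It is not the library's code: `limited` is hypothetical, and it assumes the matching video itself is not yielded, which the description above does not specify.

```python
from typing import Iterable, Iterator, Optional


def limited(
    videos: Iterable[dict],
    stop_at_video_id: Optional[str] = None,
    max_videos: int = -1,
) -> Iterator[dict]:
    """Yield videos until a known ID is seen or the limit is reached."""
    if max_videos == 0:
        return
    count = 0
    for video in videos:
        if video["video_id"] == stop_at_video_id:
            return  # assumption: the already-known video is not yielded again
        yield video
        count += 1
        if max_videos > 0 and count >= max_videos:
            return  # -1 (the default) means no limit


# Made-up feed, newest first.
feed = [{"video_id": vid} for vid in ["v5", "v4", "v3", "v2", "v1"]]

new_since_last_run = [v["video_id"] for v in limited(feed, stop_at_video_id="v3")]
only_latest = [v["video_id"] for v in limited(feed, max_videos=1)]
print(new_since_last_run, only_latest)
```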

`get_playlist_videos(playlist_id: str, ..., stop_at_video_id: str = None, max_videos: int = -1) -> Generator[dict, None, None]`

Yields metadata for videos from a playlist.

start_date: The earliest date for videos to include (e.g., date(2023, 1, 1) or "30d").
end_date: The latest date for videos to include.
fetch_full_metadata: If True, fetches detailed metadata for every video.
filters: A dictionary of advanced filter conditions.
stop_at_video_id: Stops fetching when this video ID is found.
max_videos: The maximum number of videos to return.

`clear_cache()`

Clears all items from the configured cache, whether it is in-memory or persistent.

Error Handling

The library uses custom exceptions to signal specific error conditions.

`YtMetaError`

The base exception for all errors in this library.

`MetadataParsingError`
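MetadataParsingError is listed in the API Reference as one of the errors get_channel_metadata can raise, alongside VideoUnavailableError. Since YtMetaError is described as the base for all library errors, a caller can handle specific cases first and fall back to the base class. The sketch below uses stand-in classes with the same names so that it runs without the library; the subclass relationships and the `fetch` helper are assumptions for illustration.

```python
class YtMetaError(Exception):
    """Stand-in for the library's base exception."""


class MetadataParsingError(YtMetaError):
    """Stand-in; assumed to derive from YtMetaError."""


class VideoUnavailableError(YtMetaError):
    """Stand-in; assumed to derive from YtMetaError."""


def fetch(kind: str) -> str:
    """Hypothetical call that fails in different ways depending on `kind`."""
    if kind == "private":
        raise VideoUnavailableError("video is private")
    if kind == "changed_layout":
        raise MetadataParsingError("could not parse page data")
    return "metadata"


def safe_fetch(kind: str) -> str:
    try:
        return fetch(kind)
    except VideoUnavailableError:
        return "skipped: unavailable"  # expected for private or deleted videos
    except YtMetaError as exc:
        return f"failed: {type(exc).__name__}"  # any other library error


outcomes = [safe_fetch(kind) for kind in ("ok", "private", "changed_layout")]
print(outcomes)
```

Ordering the except clauses from most specific to most general matters here, because the base-class handler would otherwise catch every library error first.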

Release history

Version	Release date
0.4.0	Jul 3, 2025
0.3.1	Jun 27, 2025
0.3.0	Jun 27, 2025
0.2.8 (this version)	Jun 27, 2025
0.2.7	Jun 27, 2025
0.2.6	Jun 26, 2025
0.2.5	Jun 26, 2025
0.2.4	Jun 26, 2025
0.2.3	Jun 26, 2025
0.2.2	Jun 24, 2025
0.2.1	Jun 24, 2025
0.2.0	Jun 23, 2025
0.1.1	Jun 22, 2025

Download files

Source Distribution

yt_meta-0.2.8.tar.gz (36.1 kB)

Uploaded Jun 27, 2025 Source

Built Distribution

yt_meta-0.2.8-py3-none-any.whl (23.1 kB)

Uploaded Jun 27, 2025 Python 3

File details

Details for the file yt_meta-0.2.8.tar.gz.

File metadata

Download URL: yt_meta-0.2.8.tar.gz
Upload date: Jun 27, 2025
Size: 36.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.1

File hashes

Hashes for yt_meta-0.2.8.tar.gz
Algorithm	Hash digest
SHA256	`7d4b1665960df88379161239da31ce6d2fedc5ba7028610c9a8059f51c80926d`
MD5	`8b82eb3ab0f3fbee5d4b2980edafd8a7`
BLAKE2b-256	`594b27b15b0ddf584d867025f5aba81932e21750f18d989f78bd4637b9dd6c84`

File details

Details for the file yt_meta-0.2.8-py3-none-any.whl.

File metadata

Download URL: yt_meta-0.2.8-py3-none-any.whl
Upload date: Jun 27, 2025
Size: 23.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.1

File hashes

Hashes for yt_meta-0.2.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`022be78d0cb80b00b9af136a5878aedf45cea85753b0e8b94c5f2439c1c90221`
MD5	`005a65d225a4f184f6b11e76308106d2`
BLAKE2b-256	`b56cae5a6f417ce93380d1a70ea07a8158409daa8b454f64ec6b33051317940c`
