Scrape YouTube search results, channels, transcripts, and playlists — no API key needed. Sync + async. CLI + REST API included.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

zaidkx37

These details have not been verified by PyPI

Project description

TubeScrape

Scrape YouTube search results, channels, transcripts, and playlists — no API key needed.

Built on YouTube's internal InnerTube API. Three interfaces: Python SDK, CLI, and REST API.

Features

Search with filters — type, duration, upload date, sort order, features (4K, HDR, live, CC)
Channel browsing — videos, shorts, playlists, and in-channel search
Transcripts — fetch, translate, format (SRT / WebVTT / JSON), save to file
Playlists — full pagination with position tracking and metadata
Async-first — every method has an async variant for concurrent workloads
JSON-ready — every result object has .to_dict() for instant serialization
Proxy rotation — built-in support for residential proxy lists
Three interfaces — Python SDK, CLI with Rich tables, REST API with Swagger docs
Zero config — no API key, no OAuth, no setup
Lightweight — only httpx as a core dependency

Installation

recommended to install using pip

You can also integrate it into an existing project or use it via a CLI.

# Core SDK only
pip install tubescrape

# With CLI (adds click + rich)
pip install "tubescrape[cli]"

# With REST API server (adds fastapi + uvicorn)
pip install "tubescrape[api]"

# Everything
pip install "tubescrape[all]"

Requirements: Python 3.10+

Quick Start

import json
from tubescrape import YouTube

with YouTube() as yt:
    # Search YouTube
    results = yt.search('python tutorial', max_results=5)
    for video in results.videos:
        print(f'{video.title} — {video.duration} — {video.view_count}')

    # Browse a channel
    channel = yt.get_channel_videos('@lexfridman', max_results=10)
    for video in channel.videos:
        print(f'{video.title} ({video.published_text})')

    # Get a transcript and save as subtitles
    transcript = yt.get_transcript('dQw4w9WgXcQ')
    print(transcript.text)
    transcript.save('subtitles.srt')

    # Every result serializes to JSON instantly
    data = results.to_dict()
    print(json.dumps(data, indent=2))

Every method accepts plain IDs, full URLs, or @handles — parsed automatically.

Why tubescrape?

	tubescrape	youtube-transcript-api	pytube / pytubefix	yt-dlp
Search videos	Yes	No	No	Limited
Channel browse	Yes	No	No	Yes
Transcripts	Yes	Yes	No	Yes
Playlists	Yes	No	Yes	Yes
Async support	Yes	No	No	No
Built-in REST API	Yes	No	No	No
CLI tool	Yes	No	No	Yes
Core dependencies	1 (httpx)	1 (requests)	0	Many
API key needed	No	No	No	No

Search

results = yt.search('python tutorial', max_results=5)

for video in results.videos:
    print(f'{video.title} — {video.url}')
    print(f'  {video.view_count} | {video.duration} | {video.published_text}')
    print(f'  Channel: {video.channel} (verified: {video.is_verified})')

Search Filters

All filters can be combined in a single call:

results = yt.search(
    'podcast interview',
    max_results=10,
    type='video',              # video | channel | playlist | movie
    duration='long',           # short (<4m) | medium (4-20m) | long (>20m)
    upload_date='this_month',  # last_hour | today | this_week | this_month | this_year
    sort_by='view_count',      # relevance | upload_date | view_count | rating
    features=['hd', 'subtitles'],  # live | 4k | hd | subtitles | cc | creative_commons | 360 | vr180 | 3d | hdr
)

Channel Browsing

All channel methods accept @handle, channel ID (UC...), or full URL.

# Videos (newest first, with pagination)
videos = yt.get_channel_videos('@lexfridman', max_results=10)
all_videos = yt.get_channel_videos('@lexfridman', max_results=0)  # fetch ALL

# Shorts
shorts = yt.get_channel_shorts('@lexfridman')
for short in shorts.shorts:
    print(f'{short.title} — {short.view_count} — {short.url}')

# Playlists
playlists = yt.get_channel_playlists('@lexfridman')
for pl in playlists.playlists:
    print(f'{pl.title} — {pl.video_count} — {pl.url}')

# Search within a channel
results = yt.search_channel('@lexfridman', 'artificial intelligence', max_results=10)

Transcripts

# Fetch transcript (auto-detects best language)
transcript = yt.get_transcript('dQw4w9WgXcQ')
print(transcript.text)  # full text as a single string

# Choose language (priority order fallback)
transcript = yt.get_transcript('dQw4w9WgXcQ', languages=['de', 'en'])

# Translate to any language
transcript = yt.get_transcript('dQw4w9WgXcQ', translate_to='es')

# Without timestamps (plain text blob)
transcript = yt.get_transcript('dQw4w9WgXcQ', timestamps=False)

# List available languages
languages = yt.list_transcripts('dQw4w9WgXcQ')
for entry in languages:
    print(f'{entry.language} ({entry.language_code}) — {"auto" if entry.is_generated else "manual"}')

Formatting & Saving

transcript = yt.get_transcript('dQw4w9WgXcQ')

# Format as SRT, WebVTT, JSON, or plain text
srt = YouTube.format_transcript(transcript, fmt='srt')
vtt = YouTube.format_transcript(transcript, fmt='vtt')

# Save to file (format auto-detected from extension)
transcript.save('subtitles.srt')
transcript.save('subtitles.vtt')
transcript.save('transcript.json')
transcript.save('transcript.txt')

Playlists

# Accepts playlist ID or full URL
playlist = yt.get_playlist('PLrAXtmErZgOeiKm4sgNOknGvNjby9efdf')

print(f'{playlist.title} by {playlist.channel} — {len(playlist.videos)} videos')

for entry in playlist.videos:
    print(f'#{entry.position}  {entry.title} — {entry.duration}')

Serialization (`.to_dict()`)

Every result object converts to a plain Python dictionary — ready for JSON, databases, or APIs:

import json

# Works on every result type
results.to_dict()       # SearchResult → dict
video.to_dict()         # VideoResult → dict
channel.to_dict()       # BrowseResult → dict
shorts.to_dict()        # ShortsResult → dict
playlists.to_dict()     # ChannelPlaylistsResult → dict
playlist.to_dict()      # PlaylistResult → dict
transcript.to_dict()    # Transcript → dict

# Sparse output: optional fields excluded when empty/default
# is_verified=False → omitted | badges=[] → omitted | None fields → omitted
print(json.dumps(results.to_dict(), indent=2, ensure_ascii=False))

Async Support

Every method has an async variant prefixed with a. Use in FastAPI, Discord bots, or any async application:

import asyncio
from tubescrape import YouTube

async def main():
    async with YouTube() as yt:
        # All methods have async variants
        results = await yt.asearch('python', max_results=5)
        transcript = await yt.aget_transcript('dQw4w9WgXcQ')

        # Run multiple requests concurrently
        r1, r2, r3 = await asyncio.gather(
            yt.asearch('python'),
            yt.asearch('javascript'),
            yt.asearch('rust'),
        )

asyncio.run(main())

Proxy Support

# Single proxy
yt = YouTube(proxy='http://user:pass@proxy.example.com:8080')

# Proxy rotation (round-robin per request)
yt = YouTube(proxies=[
    'http://user:pass@proxy1:8080',
    'http://user:pass@proxy2:8080',
])

# SOCKS5
yt = YouTube(proxy='socks5://user:pass@proxy:1080')

# Custom timeout and retries
yt = YouTube(proxy='http://proxy:8080', timeout=60.0, max_retries=5)

Tip: YouTube blocks cloud IPs aggressively. Use rotating residential proxies (BrightData, SmartProxy, Oxylabs) for production.

CLI

Install with pip install "tubescrape[cli]".

tubescrape search tubescrape channel search tubescrape transcript tubescrape transcript srt

tubescrape search "python tutorial" -n 5
tubescrape search "podcast" --type video --duration long --sort-by view_count
tubescrape search "python" --json                    # JSON output

tubescrape channel @lexfridman                       # videos (default)
tubescrape channel @lexfridman shorts                # shorts
tubescrape channel @lexfridman playlists             # playlists
tubescrape channel @lexfridman search "podcast"      # search within channel

tubescrape playlist PLrAXtmErZgOeiKm4sgNOknGvNjby9efdf

tubescrape transcript dQw4w9WgXcQ                    # plain text
tubescrape transcript dQw4w9WgXcQ --format srt       # SRT subtitles
tubescrape transcript dQw4w9WgXcQ --translate es      # translate
tubescrape transcript dQw4w9WgXcQ --save output.srt  # save to file
tubescrape transcript dQw4w9WgXcQ --list-languages   # available languages

tubescrape --proxy http://user:pass@host:port search "python"  # with proxy
export TUBESCRAPE_PROXY="http://user:pass@host:port"           # env variable

REST API

Install with pip install "tubescrape[api]".

tubescrape serve                          # starts on localhost:8000
tubescrape serve --host 0.0.0.0 --port 3000

Interactive Swagger docs at http://localhost:8000/docs.

Method	Endpoint	Description
GET	`/api/v1/search?q=python`	Search videos
GET	`/api/v1/channel/{id}/videos`	Channel videos
GET	`/api/v1/channel/{id}/shorts`	Channel shorts
GET	`/api/v1/channel/{id}/playlists`	Channel playlists
GET	`/api/v1/channel/{id}/search?q=...`	Search within channel
GET	`/api/v1/playlist/{id}`	Fetch playlist
GET	`/api/v1/transcript/{video_id}`	Fetch transcript
GET	`/api/v1/transcript/{video_id}/languages`	List languages
GET	`/health`	Health check

curl "http://localhost:8000/api/v1/search?q=python+tutorial&max_results=5"
curl "http://localhost:8000/api/v1/transcript/dQw4w9WgXcQ?format=srt&translate_to=es"

Error Handling

All exceptions inherit from YouTubeError:

YouTubeError
├── RequestError
│   ├── RateLimitError          # HTTP 429
│   └── BotDetectedError        # HTTP 403
├── VideoUnavailableError       # private, deleted, region-locked
│   └── AgeRestrictedError
├── TranscriptsDisabledError
├── TranscriptsNotAvailableError
├── TranscriptFetchError
├── TranslationNotAvailableError
├── ChannelNotFoundError
├── PlaylistNotFoundError
├── APIKeyNotFoundError
└── ParsingError

from tubescrape import YouTube, YouTubeError, RateLimitError

try:
    results = yt.search('python')
except RateLimitError:
    print('Rate limited — use a proxy')
except YouTubeError as e:
    print(f'YouTube error: {e}')

Full Documentation

For detailed examples, all field references, and advanced usage, see the Complete Usage Guide.

Warning

This library uses YouTube's undocumented InnerTube API. It may break if YouTube changes their internal API. If it does, please open an issue.

Contributing

git clone https://github.com/zaidkx37/tubescrape.git
cd tubescrape
pip install -e ".[all,dev]"

pytest                    # run tests
ruff check src/           # lint
mypy src/tubescrape/      # type check

License

MIT License. See LICENSE for details.

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

zaidkx37

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.1.2

Mar 9, 2026

0.1.1

Mar 9, 2026

0.1.0

Mar 9, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tubescrape-0.1.2.tar.gz (644.7 kB view details)

Uploaded Mar 9, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

tubescrape-0.1.2-py3-none-any.whl (52.8 kB view details)

Uploaded Mar 9, 2026 Python 3

File details

Details for the file tubescrape-0.1.2.tar.gz.

File metadata

Download URL: tubescrape-0.1.2.tar.gz
Upload date: Mar 9, 2026
Size: 644.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tubescrape-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`043e37bd0c9a59293882c4da731342b64cbf1608734a2bd659b72a92a5e64208`
MD5	`84e4777edd3a46c9e5a9034132b867fd`
BLAKE2b-256	`08aa1fa9b5d432b958ed25d0cd5ab1d656c70a51cc5668c249daef704594e7af`

See more details on using hashes here.

Provenance

The following attestation bundles were made for tubescrape-0.1.2.tar.gz:

Publisher: publish.yml on zaidkx37/tubescrape

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tubescrape-0.1.2.tar.gz
- Subject digest: 043e37bd0c9a59293882c4da731342b64cbf1608734a2bd659b72a92a5e64208
- Sigstore transparency entry: 1066427477
- Sigstore integration time: Mar 9, 2026
Source repository:
- Permalink: zaidkx37/tubescrape@9b806726fa17448aa59632cf61d54397136f3096
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/zaidkx37
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@9b806726fa17448aa59632cf61d54397136f3096
- Trigger Event: release

File details

Details for the file tubescrape-0.1.2-py3-none-any.whl.

File metadata

Download URL: tubescrape-0.1.2-py3-none-any.whl
Upload date: Mar 9, 2026
Size: 52.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for tubescrape-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e068fe4037fe73c822909664ea83520ddbc3f20cb1164a1b3db26d885e5996bd`
MD5	`cd0e9552b865af8847c9dd2267ba5a1f`
BLAKE2b-256	`fe19d1b8805a0dd7a24e7174a11c9abeff481a311a1401dc302f582de52cde7d`

See more details on using hashes here.

Provenance

The following attestation bundles were made for tubescrape-0.1.2-py3-none-any.whl:

Publisher: publish.yml on zaidkx37/tubescrape

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: tubescrape-0.1.2-py3-none-any.whl
- Subject digest: e068fe4037fe73c822909664ea83520ddbc3f20cb1164a1b3db26d885e5996bd
- Sigstore transparency entry: 1066427488
- Sigstore integration time: Mar 9, 2026
Source repository:
- Permalink: zaidkx37/tubescrape@9b806726fa17448aa59632cf61d54397136f3096
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/zaidkx37
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@9b806726fa17448aa59632cf61d54397136f3096
- Trigger Event: release

tubescrape 0.1.2

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

TubeScrape

Features

Installation

Quick Start

Why tubescrape?

Search

Search Filters

Channel Browsing

Transcripts

Formatting & Saving

Playlists

Serialization (.to_dict())

Async Support

Proxy Support

CLI

REST API

Error Handling

Full Documentation

Warning

Contributing

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance

Serialization (`.to_dict()`)