Python library for extracting downloadable content metadata from file hosting platforms
Project description
[pkg]: megaloader (core)
Library for extracting downloadable content metadata from file hosting platforms. Provides automatic URL detection and a plugin architecture for multi-platform support.
Installation
pip install megaloader
The library has minimal dependencies: requests for HTTP, plus beautifulsoup4
and lxml for HTML parsing.
Basic usage
Call extract() with any supported URL. The function detects the platform
automatically and returns a generator of file metadata:
from megaloader import extract

for item in extract("https://pixeldrain.com/l/abc123"):
    print(f"{item.filename} - {item.download_url}")
Each item contains the download URL, filename, and optional metadata like collection name and file size. Network requests happen lazily during iteration.
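To illustrate what lazy iteration means in practice, the following minimal sketch uses a stand-in generator rather than the library itself. The names FakeItem and fake_extract are hypothetical and exist only for this example; the point is that items not consumed are never produced.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterator, List


@dataclass
class FakeItem:
    """Stand-in for a megaloader item; not the library's own class."""
    filename: str
    download_url: str


# Records which items the generator has actually produced so far.
fetched: List[int] = []


def fake_extract(count: int) -> Iterator[FakeItem]:
    """Hypothetical stand-in for extract(); yields one item at a time."""
    for i in range(count):
        fetched.append(i)  # a real extractor might issue a network request here
        yield FakeItem(f"file{i}.bin", f"https://example.invalid/file{i}.bin")


# Consume only the first three items; the generator is never advanced further.
first_three = list(islice(fake_extract(100), 3))
print([item.filename for item in first_three])
print(f"items produced: {len(fetched)}")
```

The same islice pattern should apply to any generator, so it is a reasonable way to preview a large album without walking all of it.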
Downloading files
The library extracts metadata only. Use requests or similar to download:
import requests
from pathlib import Path
from megaloader import extract

def download_file(item, output_dir):
    dest = Path(output_dir) / item.filename
    # Create the output directory if it does not exist yet
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Stream the response so large files are not held in memory at once
    response = requests.get(item.download_url, headers=item.headers, stream=True)
    response.raise_for_status()
    with open(dest, 'wb') as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)

for item in extract("https://pixeldrain.com/l/abc123"):
    download_file(item, "./downloads")
The headers attribute contains any required HTTP headers for the download
request.
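Because filenames come from remote platforms, it may be worth normalizing them before writing to disk. The sketch below is one possible approach using only the standard library; safe_destination is a hypothetical helper, not part of megaloader, and it simply drops any directory components from the name.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath


def safe_destination(output_dir: str, filename: str) -> Path:
    """Return a path directly inside output_dir, ignoring directory parts of filename."""
    # Keep only the final component under both POSIX ("/") and Windows ("\\") rules.
    name = PureWindowsPath(PurePosixPath(filename).name).name
    # Fall back to a placeholder when nothing usable remains.
    if name in ("", ".", ".."):
        name = "unnamed"
    return Path(output_dir) / name


print(safe_destination("downloads", "photo.jpg"))
print(safe_destination("downloads", "../../etc/passwd"))
```

A download helper could call this in place of building the destination from the raw filename.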
Supported platforms
Four core platforms receive active development. Seven extended platforms are maintained best-effort and may break without immediate fixes.
Core platforms:
Bunkr (bunkr.si, bunkr.la, bunkr.is, bunkr.ru, bunkr.su), PixelDrain (pixeldrain.com), Cyberdrop (cyberdrop.cr, cyberdrop.me, cyberdrop.to), GoFile (gofile.io).
Extended platforms:
Fanbox ({creator}.fanbox.cc), Pixiv (pixiv.net), Rule34 (rule34.xxx), ThotsLife (thotslife.com), ThotHub.VIP (thothub.vip), ThotHub.TO (thothub.to), Fapello (fapello.com).
All platforms support albums, galleries, or lists. Single-file URLs work where applicable.
The extended platforms listed above were confirmed working as of November 2025.
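As a rough illustration of how domain-based detection could work, the sketch below maps hostnames to platform names. The domains are taken from the lists above, but KNOWN_DOMAINS and detect_platform are hypothetical and do not describe the library's actual implementation.

```python
from typing import Optional
from urllib.parse import urlparse

# Domains copied from the platform lists; the mapping itself is illustrative.
KNOWN_DOMAINS = {
    "bunkr.si": "Bunkr", "bunkr.la": "Bunkr", "bunkr.is": "Bunkr",
    "bunkr.ru": "Bunkr", "bunkr.su": "Bunkr",
    "pixeldrain.com": "PixelDrain",
    "cyberdrop.cr": "Cyberdrop", "cyberdrop.me": "Cyberdrop", "cyberdrop.to": "Cyberdrop",
    "gofile.io": "GoFile",
    "fanbox.cc": "Fanbox",
    "pixiv.net": "Pixiv",
    "rule34.xxx": "Rule34",
    "thotslife.com": "ThotsLife",
    "thothub.vip": "ThotHub.VIP",
    "thothub.to": "ThotHub.TO",
    "fapello.com": "Fapello",
}


def detect_platform(url: str) -> Optional[str]:
    """Return the platform name for a URL, or None if the host is unknown."""
    host = (urlparse(url).hostname or "").lower()
    for domain, platform in KNOWN_DOMAINS.items():
        # Match the domain itself or any subdomain such as creator.fanbox.cc.
        if host == domain or host.endswith("." + domain):
            return platform
    return None


print(detect_platform("https://pixeldrain.com/l/abc123"))
print(detect_platform("https://creator.fanbox.cc"))
print(detect_platform("https://example.com/file"))
```

Matching on a leading dot avoids treating a lookalike host such as evilpixeldrain.com as a supported domain.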
Platform-specific features
GoFile supports password-protected folders through the password parameter:
items = extract("https://gofile.io/d/folder", password="secret123")
Fanbox and Pixiv require session cookies for full results. Without authentication, only limited data is returned:
items = extract("https://creator.fanbox.cc", session_id="your_session_cookie")
items = extract("https://pixiv.net/artworks/12345", session_id="your_session_cookie")
Rule34 accepts optional API credentials for higher rate limits:
items = extract(
    "https://rule34.xxx/index.php?page=post&s=list&tags=example",
    api_key="your_api_key",
    user_id="your_user_id"
)
Authentication improves results but is not required.
[!WARNING]
Free-tier accounts on Pixiv and Fanbox may still return incomplete file sets.
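Credentials like these are often kept out of source code. The following sketch shows one way to collect the options shown above from environment variables. The option names (password, session_id, api_key, user_id) come from the examples in this section; the environment variable names and the options_from_env helper are assumptions made for illustration.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical environment variable names mapped to the documented option names.
ENV_TO_OPTION: Dict[str, Dict[str, str]] = {
    "gofile": {"MEGALOADER_GOFILE_PASSWORD": "password"},
    "fanbox": {"MEGALOADER_FANBOX_SESSION": "session_id"},
    "pixiv": {"MEGALOADER_PIXIV_SESSION": "session_id"},
    "rule34": {
        "MEGALOADER_RULE34_API_KEY": "api_key",
        "MEGALOADER_RULE34_USER_ID": "user_id",
    },
}


def options_from_env(platform: str, env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build keyword options for a platform from non-empty environment variables."""
    env = os.environ if env is None else env
    mapping = ENV_TO_OPTION.get(platform, {})
    # Skip variables that are unset or empty so no blank credentials are passed on.
    return {option: env[var] for var, option in mapping.items() if env.get(var)}


example_env = {"MEGALOADER_RULE34_API_KEY": "key", "MEGALOADER_RULE34_USER_ID": "42"}
print(options_from_env("rule34", example_env))
```

The resulting dictionary could then be unpacked into the call, for example extract(url, **options_from_env("rule34")).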
Working with items
The DownloadItem dataclass contains file metadata:
for item in extract(url):
    item.download_url     # Direct download URL (required)
    item.filename         # Original filename (required)
    item.collection_name  # Album/gallery name (optional)
    item.source_id        # Platform-specific ID (optional)
    item.size_bytes       # File size in bytes (optional)
    item.headers          # Required HTTP headers (optional)
Required fields are always populated. Optional fields may be None depending on
platform and content type.
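Since optional fields may be None, code that aggregates them needs to account for missing values. The sketch below uses a stand-in dataclass, ItemStandIn, that mirrors the field names listed above so the example runs without the library; the field types and the summarize helper are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class ItemStandIn:
    """Stand-in mirroring the documented field names; types are assumed."""
    download_url: str
    filename: str
    collection_name: Optional[str] = None
    source_id: Optional[str] = None
    size_bytes: Optional[int] = None
    headers: Optional[Dict[str, str]] = None


def summarize(items: Iterable[ItemStandIn]) -> Dict[str, int]:
    """Count files and total the sizes that are known, tracking unknown ones separately."""
    files = known_bytes = unknown_sizes = 0
    for item in items:
        files += 1
        if item.size_bytes is None:
            unknown_sizes += 1  # the platform did not report a size
        else:
            known_bytes += item.size_bytes
    return {"files": files, "known_bytes": known_bytes, "unknown_sizes": unknown_sizes}


sample = [
    ItemStandIn("https://example.invalid/a.jpg", "a.jpg", size_bytes=1000),
    ItemStandIn("https://example.invalid/b.jpg", "b.jpg"),
    ItemStandIn("https://example.invalid/c.jpg", "c.jpg", size_bytes=500),
]
print(summarize(sample))
```

Checking with "is None" rather than truthiness keeps a legitimate size of 0 bytes from being counted as unknown.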
Direct plugin usage
Import plugin classes directly when you need fine-grained control or want to force a specific plugin:
from megaloader.plugins import Cyberdrop
plugin = Cyberdrop("https://cyberdrop.me/a/album_id")
items = list(plugin.extract())
print(f"Found {len(items)} files")
This bypasses automatic detection. Useful when a platform introduces new domains before the package updates.
Error handling
Handle extraction failures as needed:
from megaloader import extract, ExtractionError, UnsupportedDomainError
try:
    items = list(extract(url))
except UnsupportedDomainError:
    print("Platform not supported")
except ExtractionError as e:
    print(f"Extraction failed: {e}")
except ValueError:
    print("Invalid URL format")
Network failures raise ExtractionError. Unsupported URLs raise
UnsupportedDomainError. Malformed URLs raise ValueError.
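Since network failures surface as ExtractionError, a caller might choose to retry a few times before giving up. The sketch below is one possible retry helper with exponential backoff. It defines a stand-in ExtractionError so it runs without the library, and with_retries is a hypothetical name rather than something megaloader provides.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class ExtractionError(Exception):
    """Stand-in for megaloader's ExtractionError so the sketch is self-contained."""


def with_retries(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on ExtractionError with delays of base_delay * 2**attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except ExtractionError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


# Demonstration with a function that fails twice, then succeeds.
calls = {"n": 0}
delays: List[float] = []


def flaky() -> List[str]:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ExtractionError("temporary failure")
    return ["a.jpg", "b.jpg"]


result = with_retries(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```

With the real library, the callable would typically wrap the whole iteration, for example lambda: list(extract(url)), because requests happen lazily and errors may only appear while the generator is being consumed.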
API reference
The extract() function takes a URL and platform-specific options. It returns a
generator of DownloadItem objects. It raises ValueError for invalid URLs,
UnsupportedDomainError when no plugin exists, and ExtractionError on network or
parsing failures.
The DownloadItem dataclass has required fields download_url and filename.
Optional fields are collection_name, source_id, size_bytes, and headers.
The BasePlugin abstract class defines the plugin interface. Override
extract() to yield items. Override _configure_session() to add custom
headers or authentication. The session property provides a configured requests
session. The url and options properties contain constructor arguments.
Exception hierarchy: ExtractionError covers network and parsing failures, and
UnsupportedDomainError covers unknown domains. Both inherit from Exception.
Contributing
The project welcomes contributions. Install dependencies with uv sync from the
repository root. Run uv run ruff format . and uv run mypy packages/core
before committing. See the repository contributing guide for plugin development
details.
Report bugs and request features through GitHub Discussions. Include Python version, error messages, and problematic URLs.