Skip to main content

Python library for extracting downloadable content metadata from file hosting platforms

Project description

[pkg]: megaloader (core)

PyPI version CodeQL lint and format check codecov

Library for extracting downloadable content metadata from file hosting platforms. Provides automatic URL detection and a plugin architecture for multi-platform support.

Installation

pip install megaloader

The library has minimal dependencies: requests for HTTP, beautifulsoup4 and lxml for HTML parsing.

Basic usage

Call extract() with any supported URL. The function detects the platform automatically and returns a generator of file metadata:

from megaloader import extract

for item in extract("https://pixeldrain.com/l/abc123"):
    print(f"{item.filename} - {item.download_url}")

Each item contains the download URL, filename, and optional metadata like collection name and file size. Network requests happen lazily during iteration.

Downloading files

The library extracts metadata only. Use requests or similar to download:

import requests
from pathlib import Path
from megaloader import extract

def download_file(item, output_dir):
    dest = Path(output_dir) / item.filename
    response = requests.get(item.download_url, headers=item.headers, stream=True)
    response.raise_for_status()

    with open(dest, 'wb') as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)

for item in extract("https://pixeldrain.com/l/abc123"):
    download_file(item, "./downloads")

The headers attribute contains any required HTTP headers for the download request.

Supported platforms

Four core platforms receive active development. Seven extended platforms are maintained best-effort and may break without immediate fixes.

Core platforms:

Bunkr (bunkr.si, bunkr.la, bunkr.is, bunkr.ru, bunkr.su), PixelDrain (pixeldrain.com), Cyberdrop (cyberdrop.cr, cyberdrop.me, cyberdrop.to), GoFile (gofile.io).

Extended platforms:

Fanbox ({creator}.fanbox.cc), Pixiv (pixiv.net), Rule34 (rule34.xxx), ThotsLife (thotslife.com), ThotHub.VIP (thothub.vip), ThotHub.TO (thothub.to), Fapello (fapello.com).

All platforms support albums, galleries, or lists. Single-file URLs work where applicable.

Extended platforms marked as working as of November 2025.

Platform-specific features

GoFile supports password-protected folders through the password parameter:

items = extract("https://gofile.io/d/folder", password="secret123")

Fanbox and Pixiv require session cookies for full results. Without authentication, only limited data is returned:

items = extract("https://creator.fanbox.cc", session_id="your_session_cookie")
items = extract("https://pixiv.net/artworks/12345", session_id="your_session_cookie")

Rule34 accepts optional API credentials for higher rate limits:

items = extract(
    "https://rule34.xxx/index.php?page=post&s=list&tags=example",
    api_key="your_api_key",
    user_id="your_user_id"
)

Authentication improves results but is not required.

[!WARNING]
Free-tier accounts on Pixiv and Fanbox may still return incomplete file sets.

Working with items

The DownloadItem dataclass contains file metadata:

for item in extract(url):
    item.download_url     # Direct download URL (required)
    item.filename         # Original filename (required)
    item.collection_name  # Album/gallery name (optional)
    item.source_id        # Platform-specific ID (optional)
    item.size_bytes       # File size in bytes (optional)
    item.headers          # Required HTTP headers (optional)

Required fields are always populated. Optional fields may be None depending on platform and content type.

Direct plugin usage

Import plugin classes directly when you need fine-grained control or want to force a specific plugin:

from megaloader.plugins import Cyberdrop

plugin = Cyberdrop("https://cyberdrop.me/a/album_id")
items = list(plugin.extract())
print(f"Found {len(items)} files")

This bypasses automatic detection. Useful when a platform introduces new domains before the package updates.

Error handling

Handle extraction failures as needed:

from megaloader import extract, ExtractionError, UnsupportedDomainError

try:
    items = list(extract(url))
except UnsupportedDomainError:
    print("Platform not supported")
except ExtractionError as e:
    print(f"Extraction failed: {e}")
except ValueError:
    print("Invalid URL format")

Network failures raise ExtractionError. Unsupported URLs raise UnsupportedDomainError. Malformed URLs raise ValueError.

API reference

The extract() function takes a URL and platform-specific options. Returns a generator of DownloadItem objects. Raises ValueError for invalid URLs, UnsupportedDomainError when no plugin exists, ExtractionError on network or parsing failures.

The DownloadItem dataclass has required fields download_url and filename. Optional fields are collection_name, source_id, size_bytes, and headers.

The BasePlugin abstract class defines the plugin interface. Override extract() to yield items. Override _configure_session() to add custom headers or authentication. The session property provides a configured requests session. The url and options properties contain constructor arguments.

Exception hierarchy: ExtractionError for network and parsing failures, UnsupportedDomainError for unknown domains, both inherit from Exception.

Contributing

The project welcomes contributions. Install dependencies with uv sync from the repository root. Run uv run ruff format . and uv run mypy packages/core before committing. See the repository contributing guide for plugin development details.

Report bugs and request features through GitHub Discussions. Include Python version, error messages, and problematic URLs.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

megaloader-0.1.0.tar.gz (19.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

megaloader-0.1.0-py3-none-any.whl (25.0 kB view details)

Uploaded Python 3

File details

Details for the file megaloader-0.1.0.tar.gz.

File metadata

  • Download URL: megaloader-0.1.0.tar.gz
  • Upload date:
  • Size: 19.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for megaloader-0.1.0.tar.gz
Algorithm Hash digest
SHA256 6770efe27d176db91420f46f7776c2b0af583071fb6d0f5cfa759e87138dfdbe
MD5 b34986889e92ede4a9eb1b956f7e6c56
BLAKE2b-256 09f6ce0b9dbb505737351dfbb1ff743aa04085f0842d3d50719f8448b7f5ebfd

See more details on using hashes here.

Provenance

The following attestation bundles were made for megaloader-0.1.0.tar.gz:

Publisher: release-core.yml on totallynotdavid/megaloader

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file megaloader-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: megaloader-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 25.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for megaloader-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 9d8e982d93661818ca4a481db2f13d063bb234044a1a8cf0ce048f08ae8f3f0b
MD5 b3ec334a5ffe23aeead7cddeed36f167
BLAKE2b-256 69328fb4019ffa2349a60a48eb130ee96d46fe242324e28af3fb33eef5880ba2

See more details on using hashes here.

Provenance

The following attestation bundles were made for megaloader-0.1.0-py3-none-any.whl:

Publisher: release-core.yml on totallynotdavid/megaloader

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page