Download Google Drive files/folders and upload them to the Internet Archive

IAdrive

IAdrive is a tool for archiving Google Drive files and folders to the Internet Archive. It downloads the Google Drive content, generates the metadata, and then uploads everything to IA.

  • This project is heavily based on tubeup by bibanon; credit to them.

Features

  • Downloads files and/or folders from Google Drive using gdown
  • Preserves folder structure when uploading (can be disabled with --disable-slash-files)
  • Extracts file modification dates to determine the creation date for the item
  • Pass custom metadata to Archive.org using --metadata=<key:value>
  • Supports quiet mode (--quiet) and debug mode (--debug) for log output
  • Automatically cleans up downloaded files after upload
  • Sanitizes identifiers and truncates subject tags to fit Archive.org requirements
  • Falls back to "IAdrive" as publisher, since fetching Google Drive collaborators is not yet implemented
  • Improved error handling and debug output

Installation

Requires Python 3.9 or newer

pip install iadrive

Installing the package creates a console script named iadrive. You can also install from source using pip install .

Configuration

ia configure

You will be prompted to enter your IA account's email and password.

Optional environment variables:

  • GOOGLE_API_KEY – if set, the tool attempts to look up the owner names of the Google Drive file or folder for the creator field in metadata (not yet implemented)

Usage

iadrive <url> [--metadata=<key:value>...] [--disable-slash-files] [--quiet] [--debug]

Arguments:

  • <url> – Google Drive file or folder URL to mirror (required)

Options:

  • --metadata=<key:value> – custom metadata to add to the Archive.org item (can be used multiple times)
  • --disable-slash-files – upload files without preserving folder structure
  • --quiet – only print errors
  • --debug – print all logs to stdout (for troubleshooting)
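
As a rough illustration of how repeated --metadata=<key:value> options could be combined, here is a minimal sketch. The function name parse_metadata is hypothetical and this is not necessarily how iadrive parses its arguments; the sketch assumes the split happens on the first colon only, so that values containing colons (such as URLs) stay intact.

```python
from collections import defaultdict


def parse_metadata(pairs):
    """Turn ['key:value', ...] into a dict; repeated keys become lists.

    Hypothetical helper for illustration, not taken from iadrive itself.
    """
    collected = defaultdict(list)
    for pair in pairs:
        # Split on the first colon only, so "url:https://..." keeps its value.
        key, sep, value = pair.partition(":")
        if not sep or not key.strip():
            raise ValueError(f"expected key:value, got {pair!r}")
        collected[key.strip()].append(value.strip())
    # Collapse single-value lists to plain strings.
    return {k: v[0] if len(v) == 1 else v for k, v in collected.items()}


print(parse_metadata(["collection:placeholder", "mediatype:data"]))
```

Splitting on the first colon is a design choice of this sketch; a stricter parser might reject empty values as well.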

Examples:

# Upload with folder structure preserved (default)
iadrive https://drive.google.com/drive/folders/placeholder --metadata=collection:placeholder

# Upload with flat structure
iadrive https://drive.google.com/drive/folders/placeholder --disable-slash-files

# Debug mode with custom metadata
iadrive https://drive.google.com/drive/folders/placeholder --metadata=collection:placeholder \
        --metadata=mediatype:data --debug

Folder Structure Preservation

By default, IAdrive preserves the folder structure from Google Drive when uploading to the Internet Archive. For example, if your Google Drive link contains:

placeholder.txt
placeholder.mp3
folder/
  ├── placeholder.pdf
  └── folder/
      └── placeholder.mp4

The files will be uploaded to Internet Archive as:

  • placeholder.txt
  • placeholder.mp3
  • folder/placeholder.pdf
  • folder/folder/placeholder.mp4
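
The mapping above can be sketched as a directory walk that records each file's path relative to the download root. This is only an illustration under assumptions: remote_names is a hypothetical helper, and the real tool may build its file list differently.

```python
import os
import tempfile
from pathlib import Path


def remote_names(root):
    """Map slash-separated names (relative to root) to local file paths.

    Hypothetical helper for illustration, not taken from iadrive itself.
    """
    root = Path(root)
    mapping = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            local = Path(dirpath) / name
            # as_posix() gives forward slashes on every platform.
            mapping[local.relative_to(root).as_posix()] = str(local)
    return mapping


# Small demonstration on a temporary directory tree.
with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "folder" / "folder"
    nested.mkdir(parents=True)
    (Path(tmp) / "placeholder.txt").write_text("a")
    (nested / "placeholder.mp4").write_text("b")
    for remote in sorted(remote_names(tmp)):
        print(remote)
```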

If you use the --disable-slash-files option, all files are uploaded to the root level:

  • placeholder.txt
  • placeholder.mp3
  • placeholder.pdf
  • placeholder.mp4

Note: When using flat structure, duplicate filenames are automatically handled by adding a number (e.g., placeholder.pdf, placeholder_1.pdf).
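
One way such duplicate handling could work is sketched below. The helper flatten_names is hypothetical, and the exact numbering scheme iadrive uses is assumed from the placeholder_1.pdf example in the note.

```python
from pathlib import PurePosixPath


def flatten_names(relative_paths):
    """Map each relative path to a unique root-level file name.

    Hypothetical helper for illustration, not taken from iadrive itself.
    """
    used = set()
    result = {}
    for rel in relative_paths:
        path = PurePosixPath(rel)
        candidate = path.name
        counter = 1
        # Append _1, _2, ... before the extension until the name is free.
        while candidate in used:
            candidate = f"{path.stem}_{counter}{path.suffix}"
            counter += 1
        used.add(candidate)
        result[rel] = candidate
    return result


print(flatten_names([
    "placeholder.pdf",
    "folder/placeholder.pdf",
    "folder/folder/placeholder.pdf",
]))
```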

How it works

  1. iadrive uses gdown to fetch the specified Google Drive file or folder
  2. It walks the downloaded directory and extracts file extensions and modification dates
  3. Metadata is assembled including a file listing (with sizes), oldest file modification date, and original URL. Identifiers are sanitized and subject tags are truncated to fit Archive.org requirements. Publisher defaults to "IAdrive" since collaborator fetching is not yet implemented.
  4. The directory is uploaded to an Archive.org item using the internetarchive library, with a fixed identifier format drive-{drive-id}, collection opensource, and mediatype data. Folder structure is preserved by default (can be disabled with --disable-slash-files).
  5. Downloaded files are automatically cleaned up after upload
  6. Errors are handled gracefully, and debug output is available with --debug
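
The metadata step described above might look roughly like the following sketch. Everything named here is an assumption made for illustration: the helper functions, the length limits (100 characters for identifiers, 255 bytes for the subject field), the metadata key names, and the base subject tags are not taken from iadrive's source. Only the drive-{drive-id} format, the opensource collection, the data mediatype, and the "IAdrive" publisher come from the description above.

```python
import re
from datetime import datetime, timezone
from pathlib import PurePosixPath

MAX_IDENTIFIER_LENGTH = 100  # assumed limit, not confirmed by the source
MAX_SUBJECT_BYTES = 255      # assumed limit, not confirmed by the source


def sanitize_identifier(drive_id):
    """Build drive-{drive-id}, replacing characters outside A-Z a-z 0-9 . _ -"""
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "-", drive_id)
    return f"drive-{cleaned}"[:MAX_IDENTIFIER_LENGTH]


def truncate_subject(tags):
    """Join tags with ';', dropping trailing tags that would exceed the limit."""
    kept, size = [], 0
    for tag in tags:
        extra = len(tag.encode("utf-8")) + (1 if kept else 0)  # +1 for ';'
        if size + extra > MAX_SUBJECT_BYTES:
            break
        kept.append(tag)
        size += extra
    return ";".join(kept)


def oldest_date(mtimes):
    """Return the oldest Unix timestamp as a YYYY-MM-DD date (UTC)."""
    return datetime.fromtimestamp(min(mtimes), tz=timezone.utc).strftime("%Y-%m-%d")


def build_metadata(drive_id, url, files):
    """files: list of (relative name, size in bytes, mtime as Unix timestamp)."""
    extensions = sorted(
        {PurePosixPath(name).suffix.lstrip(".").lower() for name, _, _ in files} - {""}
    )
    metadata = {
        "collection": "opensource",
        "mediatype": "data",
        "publisher": "IAdrive",
        "date": oldest_date([mtime for _, _, mtime in files]),
        "originalurl": url,  # hypothetical key name
        "subject": truncate_subject(["google", "drive"] + extensions),
        "description": "\n".join(f"{name} ({size} bytes)" for name, size, _ in files),
    }
    return sanitize_identifier(drive_id), metadata


identifier, metadata = build_metadata(
    "1AbC_dEf",
    "https://drive.google.com/drive/folders/placeholder",
    [("placeholder.txt", 12, 86400), ("folder/placeholder.pdf", 3400, 0)],
)
print(identifier)
print(metadata["date"], metadata["subject"])
```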

To-do list

  • Google Drive collaborator fetching to use as creator metadata through the Google API
