Download Google Drive files/folders and upload them to the Internet Archive

IAdrive

IAdrive is a tool for archiving Google Drive files and folders to the Internet Archive (IA). It downloads the Drive content, generates the metadata, and then uploads the result to IA.

  • This project is heavily based on tubeup by bibanon; credit to them.

Features

  • Downloads files and/or folders from Google Drive using gdown
  • Preserves folder structure when uploading (can be disabled with --disable-slash-files)
  • Extracts file modification dates to determine the creation date for the item
  • Pass custom metadata to Archive.org using --metadata=<key:value>
  • Supports quiet mode (--quiet) and debug mode (--debug) for log output
  • Automatically cleans up downloaded files after upload
  • Sanitizes identifiers and truncates subject tags to fit Archive.org requirements
  • Falls back to "IAdrive" as publisher, since fetching Google Drive collaborators is not yet implemented
  • Improved error handling and debug output
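As a rough illustration of the sanitizing and truncating feature, here is a minimal sketch. The function names, the allowed character set, and both length limits are assumptions made for this example and are not taken from IAdrive's source; check the Archive.org documentation for the actual rules.

```python
import re

# Both limits are assumed values chosen for illustration only.
MAX_IDENTIFIER_LENGTH = 80
MAX_SUBJECT_BYTES = 255


def sanitize_identifier(raw: str) -> str:
    """Replace runs of characters outside [A-Za-z0-9._-] with '-' and trim."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "-", raw)
    # Strip leading punctuation so the result starts with a letter or digit.
    cleaned = cleaned.lstrip("._-")
    return cleaned[:MAX_IDENTIFIER_LENGTH]


def truncate_subjects(tags: list[str], limit: int = MAX_SUBJECT_BYTES) -> str:
    """Join tags with ';' and stop before the UTF-8 byte limit is exceeded."""
    kept: list[str] = []
    for tag in tags:
        candidate = ";".join(kept + [tag])
        if len(candidate.encode("utf-8")) > limit:
            break
        kept.append(tag)
    return ";".join(kept)
```

Measuring the joined string in bytes rather than characters keeps non-ASCII tags from slipping past the limit.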

Installation

Requires Python 3.9 or newer

pip install iadrive

The package installs a console script named iadrive. You can also install from source with pip install .

Configuration

ia configure

You will be prompted for your Internet Archive account's email and password.

Optional environment variables:

  • GOOGLE_API_KEY – reserved for looking up the owner names of the Google Drive file or folder, to be used as the creator field in the metadata (not yet implemented)

Usage

iadrive <url> [--metadata=<key:value>...] [--disable-slash-files] [--quiet] [--debug]

Arguments:

  • <url> – Google Drive file or folder URL to mirror (required)
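The uploaded item uses the identifier format drive-{drive-id}, so the Drive ID has to be pulled out of the URL. The sketch below shows one way this could be done. The function names and the set of recognised link shapes (path-style /folders/<id> and /file/d/<id>, plus query-style ?id=<id>) are assumptions for illustration; IAdrive itself may rely on gdown or different parsing.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_drive_id(url: str) -> Optional[str]:
    """Return the Drive ID found in the URL, or None if no ID is recognised."""
    parsed = urlparse(url)
    # Path-style links, e.g. /drive/folders/<id> or /file/d/<id>/view
    match = re.search(r"/(?:folders|file/d)/([A-Za-z0-9_-]+)", parsed.path)
    if match:
        return match.group(1)
    # Query-style links, e.g. /open?id=<id>
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None


def make_identifier(url: str) -> str:
    """Build an identifier in the drive-{drive-id} format."""
    drive_id = extract_drive_id(url)
    if drive_id is None:
        raise ValueError(f"no Drive ID found in {url!r}")
    return f"drive-{drive_id}"
```

For example, make_identifier("https://drive.google.com/drive/folders/placeholder") returns "drive-placeholder".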

Options:

  • --metadata=<key:value> – custom metadata to add to the Archive.org item (can be used multiple times)
  • --disable-slash-files – upload files without preserving folder structure
  • --quiet – only print errors
  • --debug – print all logs to stdout (for troubleshooting)
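To show how repeated --metadata=<key:value> options might be combined, here is a hedged sketch. The parse_metadata helper is hypothetical, and collecting repeated keys into a list is an assumption of this example; IAdrive's own parsing may behave differently.

```python
def parse_metadata(pairs: list[str]) -> dict:
    """Turn ["key:value", ...] strings into a metadata dictionary."""
    metadata: dict = {}
    for pair in pairs:
        # Split on the first colon only, so values such as URLs stay intact.
        key, sep, value = pair.partition(":")
        if not sep or not key:
            raise ValueError(f"expected key:value, got {pair!r}")
        if key in metadata:
            existing = metadata[key]
            if isinstance(existing, list):
                existing.append(value)
            else:
                # A second value for the same key promotes it to a list.
                metadata[key] = [existing, value]
        else:
            metadata[key] = value
    return metadata
```

With the options from the usage line, parse_metadata(["collection:placeholder", "mediatype:data"]) yields a dictionary with the keys collection and mediatype.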

Examples:

# Upload with folder structure preserved (default)
iadrive https://drive.google.com/drive/folders/placeholder --metadata=collection:placeholder

# Upload with flat structure
iadrive https://drive.google.com/drive/folders/placeholder --disable-slash-files

# Debug mode with custom metadata
iadrive https://drive.google.com/drive/folders/placeholder --metadata=collection:placeholder \
        --metadata=mediatype:data --debug

Folder Structure Preservation

By default, IAdrive preserves the folder structure from Google Drive when uploading to the Internet Archive. For example, if your Google Drive link contains:

placeholder.txt
placeholder.mp3
folder/
  ├── placeholder.pdf
  └── folder/
      └── placeholder.mp4

The files will be uploaded to Internet Archive as:

  • placeholder.txt
  • placeholder.mp3
  • folder/placeholder.pdf
  • folder/folder/placeholder.mp4
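The mapping from downloaded files to slash-separated remote names can be sketched as follows. The map_remote_names helper is hypothetical and only builds the name mapping; whether IAdrive does it this way is an assumption. The internetarchive library's upload call is understood to accept a mapping of remote name to local path, but confirm this against its documentation before relying on it.

```python
import os


def map_remote_names(root: str) -> dict[str, str]:
    """Map remote names (relative, '/'-separated) to local file paths."""
    mapping: dict[str, str] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            local_path = os.path.join(dirpath, name)
            relative = os.path.relpath(local_path, root)
            # Use forward slashes regardless of the local platform.
            remote_name = relative.replace(os.sep, "/")
            mapping[remote_name] = local_path
    return mapping
```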

If you use the --disable-slash-files option, all files are uploaded to the root level:

  • placeholder.txt
  • placeholder.mp3
  • placeholder.pdf
  • placeholder.mp4

Note: When using flat structure, duplicate filenames are automatically handled by adding a number (e.g., placeholder.pdf, placeholder_1.pdf).
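A minimal sketch of this kind of duplicate handling is shown below. The flatten_names helper and the exact numbering rule (first free suffix starting at _1) are assumptions based on the example in the note, not code from IAdrive.

```python
import posixpath


def flatten_names(remote_names: list[str]) -> dict[str, str]:
    """Map each '/'-separated name to a unique root-level file name."""
    used: set[str] = set()
    flat: dict[str, str] = {}
    for original in remote_names:
        base = posixpath.basename(original)
        stem, ext = posixpath.splitext(base)
        candidate = base
        counter = 1
        # Append _1, _2, ... before the extension until the name is free.
        while candidate in used:
            candidate = f"{stem}_{counter}{ext}"
            counter += 1
        used.add(candidate)
        flat[original] = candidate
    return flat
```

Given placeholder.pdf and folder/placeholder.pdf, the second file would be uploaded as placeholder_1.pdf.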

How it works

  1. iadrive uses gdown to fetch the specified Google Drive file or folder
  2. It walks the downloaded directory and extracts file extensions and modification dates
  3. Metadata is assembled including a file listing (with sizes), oldest file modification date, and original URL. Identifiers are sanitized and subject tags are truncated to fit Archive.org requirements. Publisher defaults to "IAdrive" since collaborator fetching is not yet implemented.
  4. The directory is uploaded to an Archive.org item using the internetarchive library, with a fixed identifier format drive-{drive-id}, collection opensource, and mediatype data. Folder structure is preserved by default (can be disabled with --disable-slash-files).
  5. Downloaded files are automatically cleaned up after upload
  6. Errors are handled gracefully, and debug output is available with --debug
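Steps 2 to 4 can be sketched roughly as below, using only the standard library. The helper names, the originalurl and description field names, the listing format, and the use of UTC for the date are all assumptions for illustration; the identifier format, collection, mediatype, and publisher values come from the description above. The sketch stops before the upload itself.

```python
import os
from datetime import datetime, timezone


def summarize_directory(root: str) -> dict:
    """Collect a file listing with sizes and the oldest modification date."""
    listing = []
    oldest = None
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.stat(path)
            relative = os.path.relpath(path, root).replace(os.sep, "/")
            listing.append(f"{relative} ({info.st_size} bytes)")
            if oldest is None or info.st_mtime < oldest:
                oldest = info.st_mtime
    date = None
    if oldest is not None:
        # Assumption: the date is expressed in UTC as YYYY-MM-DD.
        date = datetime.fromtimestamp(oldest, tz=timezone.utc).strftime("%Y-%m-%d")
    return {"files": sorted(listing), "date": date}


def build_metadata(root: str, url: str, drive_id: str) -> tuple:
    """Return (identifier, metadata) for the item described in step 4."""
    summary = summarize_directory(root)
    metadata = {
        "mediatype": "data",
        "collection": "opensource",
        "publisher": "IAdrive",
        "originalurl": url,  # assumed field name
        "description": "\n".join(summary["files"]),  # assumed field name
    }
    if summary["date"]:
        metadata["date"] = summary["date"]
    return f"drive-{drive_id}", metadata
```

Keeping metadata assembly separate from the upload makes it possible to inspect the result before anything is sent to Archive.org.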

Supported Platforms

For a list of supported platforms for archiving, please see SUPPORTEDPLATFORMS.md

To-do list

  • Google Drive collaborator fetching to use as creator metadata through the Google API

Download files

Source Distribution

iadrive-1.0.4.tar.gz (15.4 kB)

Uploaded Source

Built Distribution

iadrive-1.0.4-py3-none-any.whl (14.7 kB)

Uploaded Python 3

File details

Details for the file iadrive-1.0.4.tar.gz.

File metadata

  • Download URL: iadrive-1.0.4.tar.gz
  • Size: 15.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.3

File hashes

Hashes for iadrive-1.0.4.tar.gz
Algorithm Hash digest
SHA256 3956170a639ca3c1cc8b99caa76a85848413080c621865d75779a9723e95343e
MD5 cf531b85c90ecc41db7cf65b6320fa50
BLAKE2b-256 250ce31d9f63ece25ba3f2817c25e6f73a3ef227e1d9c19af5767417d1cb5281

File details

Details for the file iadrive-1.0.4-py3-none-any.whl.

File metadata

  • Download URL: iadrive-1.0.4-py3-none-any.whl
  • Size: 14.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.3

File hashes

Hashes for iadrive-1.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 e115a0584fd0cfc128ef02b924a3208e05cb2898fc1540dc33a52ffebbb2bb62
MD5 2a6d57fa557064df94bce353c679d63b
BLAKE2b-256 eb51aae0e3f551cf8e1e068b4e6ce3899950d9c43548f05dcc8ebf8a783a8753
