Download Google Drive files/folders and upload them to the Internet Archive

IAdrive

IAdrive is a tool for archiving Google Drive files and folders to the Internet Archive. It downloads the Google Drive content, generates the metadata, and then uploads everything to IA.

  • This project is heavily based on tubeup by bibanon; credit to them.

Features

  • Downloads files and/or folders from Google Drive using gdown
  • Preserves folder structure when uploading (can be disabled with --disable-slash-files)
  • Extracts file modification dates to determine the creation date for the item
  • Pass custom metadata to Archive.org using --metadata=<key:value>
  • Supports quiet mode (--quiet) and debug mode (--debug) for log output
  • Automatically cleans up downloaded files after upload
  • Sanitizes identifiers and truncates subject tags to fit Archive.org requirements
  • Falls back to "IAdrive" as publisher since Google Drive collaborators fetching is not yet implemented
  • Improved error handling and debug output
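
As a rough illustration of the sanitizing and truncating feature above, here is a minimal sketch. It is not IAdrive's actual code: the function names, the 100-character identifier cap, the 255-byte subject cap, and the ";" separator are all assumptions made for this example.

```python
import re

# Assumed limits for illustration only; check the Archive.org
# documentation for the authoritative values.
MAX_IDENTIFIER_LEN = 100
MAX_SUBJECT_BYTES = 255


def sanitize_identifier(raw: str) -> str:
    """Replace characters outside [A-Za-z0-9._-] with '-' and cap the length."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "-", raw)
    # Drop leading punctuation so the identifier starts with an alphanumeric.
    cleaned = cleaned.lstrip("._-")
    return cleaned[:MAX_IDENTIFIER_LEN]


def truncate_subjects(tags, max_bytes=MAX_SUBJECT_BYTES, sep=";"):
    """Join tags with `sep`, keeping only whole tags that fit in `max_bytes`."""
    kept = []
    used = 0
    for tag in tags:
        # Count the separator only when a tag is already in the list.
        extra = len(tag.encode("utf-8")) + (len(sep) if kept else 0)
        if used + extra > max_bytes:
            break
        kept.append(tag)
        used += extra
    return sep.join(kept)
```

Dropping whole tags rather than cutting one in half keeps every remaining subject tag meaningful.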

Installation

Requires Python 3.9 or newer

pip install iadrive

The package installs a console script named iadrive. You can also install from source using pip install .

Configuration

ia configure

You will be prompted to enter your IA account's email and password.

Optional environment variables:

  • GOOGLE_API_KEY – intended for looking up the owner names of the Google Drive file or folder to fill the creator field in the metadata (not yet implemented)

Usage

iadrive <url> [--metadata=<key:value>...] [--disable-slash-files] [--quiet] [--debug]

Arguments:

  • <url> – Google Drive file or folder URL to mirror (required)

Options:

  • --metadata=<key:value> – custom metadata to add to the Archive.org item (can be used multiple times)
  • --disable-slash-files – upload files without preserving folder structure
  • --quiet – only print errors
  • --debug – print all logs to stdout (for troubleshooting)
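
To make the repeated --metadata=<key:value> option more concrete, here is a minimal sketch of how such pairs could be turned into a metadata dictionary. The parse_metadata helper is hypothetical and not taken from IAdrive's source. It assumes the key ends at the first colon, so values such as URLs may contain colons, and that a repeated key collects its values into a list.

```python
def parse_metadata(pairs):
    """Turn ["key:value", ...] into a dict; repeated keys become lists."""
    metadata = {}
    for pair in pairs:
        # Split on the first colon only, so values may contain colons.
        key, sep, value = pair.partition(":")
        if not sep or not key:
            raise ValueError(f"expected key:value, got {pair!r}")
        if key in metadata:
            existing = metadata[key]
            if isinstance(existing, list):
                existing.append(value)
            else:
                metadata[key] = [existing, value]
        else:
            metadata[key] = value
    return metadata
```

For instance, parse_metadata(["collection:placeholder", "mediatype:data"]) yields a dictionary with the keys collection and mediatype.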

Examples:

# Upload with folder structure preserved (default)
iadrive https://drive.google.com/drive/folders/placeholder --metadata=collection:placeholder

# Upload with flat structure
iadrive https://drive.google.com/drive/folders/placeholder --disable-slash-files

# Debug mode with custom metadata
iadrive https://drive.google.com/drive/folders/placeholder --metadata=collection:placeholder \
        --metadata=mediatype:data --debug

Folder Structure Preservation

By default, IAdrive preserves the folder structure from Google Drive when uploading to the Internet Archive. For example, if your Google Drive link contains:

placeholder.txt
placeholder.mp3
folder/
  ├── placeholder.pdf
  └── folder/
      └── placeholder.mp4

The files will be uploaded to Internet Archive as:

  • placeholder.txt
  • placeholder.mp3
  • folder/placeholder.pdf
  • folder/folder/placeholder.mp4
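
The mapping from downloaded files to slash-separated remote names can be sketched as follows. This is only an illustration under assumptions: build_upload_mapping is a hypothetical helper, and IAdrive's own implementation may differ.

```python
from pathlib import Path


def build_upload_mapping(root):
    """Map each remote name (relative POSIX path) to its local file path."""
    root = Path(root)
    mapping = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # as_posix() gives forward slashes on every platform.
            remote_name = path.relative_to(root).as_posix()
            mapping[remote_name] = str(path)
    return mapping
```

A dictionary of remote name to local path like this one keeps the folder hierarchy in the file names themselves.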

If you use the --disable-slash-files option, all files will be uploaded to the root level:

  • placeholder.txt
  • placeholder.mp3
  • placeholder.pdf
  • placeholder.mp4

Note: When using flat structure, duplicate filenames are automatically handled by adding a number (e.g., placeholder.pdf, placeholder_1.pdf).
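
The duplicate handling described in the note can be sketched roughly as below. The flatten_names helper and its exact numbering scheme are assumptions for illustration; only the placeholder.pdf, placeholder_1.pdf pattern comes from the note itself.

```python
from pathlib import PurePosixPath


def flatten_names(relative_paths):
    """Map each relative path to a unique root-level file name."""
    used = set()
    mapping = {}
    for rel in relative_paths:
        path = PurePosixPath(rel)
        candidate = path.name
        counter = 1
        # Append _1, _2, ... before the extension until the name is free.
        while candidate in used:
            candidate = f"{path.stem}_{counter}{path.suffix}"
            counter += 1
        used.add(candidate)
        mapping[rel] = candidate
    return mapping
```

In this sketch the first file seen keeps its original name, and later files with the same name get the numbered suffix.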

How it works

  1. iadrive uses gdown to fetch the specified Google Drive file or folder
  2. It walks the downloaded directory and extracts file extensions and modification dates
  3. Metadata is assembled including a file listing (with sizes), oldest file modification date, and original URL. Identifiers are sanitized and subject tags are truncated to fit Archive.org requirements. Publisher defaults to "IAdrive" since collaborator fetching is not yet implemented.
  4. The directory is uploaded to an Archive.org item using the internetarchive library with a fixed identifier format drive-{drive-id}, collection opensource, and mediatype data. Folder structure is preserved by default (can be disabled with --disable-slash-files).
  5. Downloaded files are automatically cleaned up after upload
  6. Errors are handled gracefully, and debug output is available with --debug
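
Two of the steps above, deriving the drive-{drive-id} identifier and finding the oldest file modification date, can be sketched as follows. Both helpers are hypothetical. The URL patterns handled, the use of UTC, and the YYYY-MM-DD date format are assumptions rather than confirmed details of IAdrive.

```python
import os
import re
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse


def extract_drive_id(url):
    """Pull the ID out of a /folders/<id>, /file/d/<id>, or ?id=<id> URL."""
    parsed = urlparse(url)
    match = re.search(r"/(?:folders|file/d)/([^/?#]+)", parsed.path)
    if match:
        return match.group(1)
    ids = parse_qs(parsed.query).get("id")
    if ids:
        return ids[0]
    raise ValueError(f"could not find a Drive ID in {url!r}")


def oldest_modification_date(root):
    """Return the oldest file mtime under `root` as YYYY-MM-DD (UTC), or None."""
    oldest = None
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            mtime = os.path.getmtime(os.path.join(dirpath, name))
            if oldest is None or mtime < oldest:
                oldest = mtime
    if oldest is None:
        return None
    return datetime.fromtimestamp(oldest, tz=timezone.utc).strftime("%Y-%m-%d")


def make_identifier(url):
    """Build the fixed identifier format drive-{drive-id}."""
    return f"drive-{extract_drive_id(url)}"
```

For the example URL used earlier, make_identifier("https://drive.google.com/drive/folders/placeholder") gives drive-placeholder.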

To-do list

  • Google Drive collaborator fetching to use as creator metadata through the Google API
