A Python package to scrape and download files from the Internet Archive.

Internet Archive Manager (iadl)

Installation

You can install the package using pip:

pip install ia-manager

Usage

Basic Usage

To download all files from an Internet Archive collection:

iadl --url https://archive.org/details/some-collection --dest ./downloads

Filter by File Type

You can filter files by specific types using the following arguments:

  • Download only archive files (e.g., .zip, .rar):

    iadl --url https://archive.org/details/some-collection --dest ./downloads --archive
    
  • Download only video files (e.g., .mp4, .avi):

    iadl --url https://archive.org/details/some-collection --dest ./downloads --video
    
  • Download only audio files (e.g., .mp3, .flac):

    iadl --url https://archive.org/details/some-collection --dest ./downloads --audio
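
The type filters above can be pictured as a match on file extensions. The following is a minimal sketch, not iadl's actual implementation: the `CATEGORIES` table and the `filter_by_category` function are hypothetical names, and the extension sets contain only the examples listed above (the tool may recognise more).

```python
from pathlib import PurePosixPath

# Hypothetical category table; it holds only the example extensions
# mentioned in this README, not necessarily everything iadl matches.
CATEGORIES = {
    "archive": {".zip", ".rar"},
    "video": {".mp4", ".avi"},
    "audio": {".mp3", ".flac"},
}


def filter_by_category(filenames, categories):
    """Keep the names whose extension belongs to any selected category.

    An empty selection means "no filter": every name is kept.
    """
    if not categories:
        return list(filenames)
    # Union of the extension sets for all selected categories.
    allowed = set().union(*(CATEGORIES[c] for c in categories))
    return [
        name for name in filenames
        if PurePosixPath(name).suffix.lower() in allowed
    ]


names = ["set.zip", "clip.MP4", "song.mp3", "notes.txt"]
print(filter_by_category(names, ["video"]))
print(filter_by_category(names, ["video", "audio"]))
```

Passing several categories keeps a file if it matches any one of them, which is the behaviour you would expect when several filter flags are given together.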

Limit the Number of Files

To limit the number of files downloaded:

iadl --url https://archive.org/details/some-collection --dest ./downloads --limit 5
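
Conceptually, a limit takes the first N entries of the file list and ignores the rest. The sketch below illustrates that idea only. `apply_limit` is a hypothetical helper, and this README does not say whether iadl applies the limit before or after any type filters.

```python
from itertools import islice


def apply_limit(filenames, limit=None):
    """Return at most `limit` names, preserving order.

    `limit=None` means no limit. Works on any iterable, including
    generators, without consuming more items than needed.
    """
    if limit is None:
        return list(filenames)
    if limit < 0:
        raise ValueError("limit must be non-negative")
    return list(islice(filenames, limit))


print(apply_limit(["a.mp3", "b.mp3", "c.mp3"], 2))
```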

Show File Links

To display the direct link to each file in the terminal:

iadl --url https://archive.org/details/some-collection --show-links
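
For context, direct links on archive.org follow a predictable pattern: an item at `https://archive.org/details/<identifier>` serves its files under `https://archive.org/download/<identifier>/<filename>`, and its file list is available as JSON from `https://archive.org/metadata/<identifier>`. The sketch below builds such links by hand. The function names are hypothetical, and it does not claim to show how iadl itself discovers files.

```python
from urllib.parse import quote, urlparse


def identifier_from_details_url(url):
    """Extract the item identifier from an archive.org details URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2 or parts[0] != "details":
        raise ValueError(f"not an archive.org details URL: {url!r}")
    return parts[1]


def metadata_url(identifier):
    """URL of the JSON metadata document listing the item's files."""
    return f"https://archive.org/metadata/{identifier}"


def download_url(identifier, filename):
    """Direct link to one file; spaces and similar characters are escaped."""
    return f"https://archive.org/download/{identifier}/{quote(filename)}"


item = identifier_from_details_url("https://archive.org/details/some-collection")
print(metadata_url(item))
print(download_url(item, "disc 1/track.mp3"))
```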

Combine Filters

You can combine multiple filters. For example, to download only video and audio files:

iadl --url https://archive.org/details/some-collection --dest ./downloads --video --audio

Simultaneous Downloads

To download multiple files at the same time through separate processes, use --concurrent. The number sets how many files are downloaded at any given moment until the whole job is finished. A value of 2-3 is recommended; be nice to the servers.

iadl --url https://archive.org/details/some-collection --dest ./downloads --audio --concurrent 3
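
The idea of a fixed concurrency cap can be sketched with a bounded worker pool. This is an illustration under stated assumptions, not iadl's code: it uses a thread pool from the standard library rather than the separate processes described above, and `download_all` and the `fetch` callable are hypothetical names. `fetch` stands in for whatever actually downloads one URL.

```python
from concurrent.futures import ThreadPoolExecutor


def download_all(urls, fetch, concurrent=3):
    """Call fetch(url) for every URL, with at most `concurrent` in flight.

    Results come back in the same order as `urls`. A new download starts
    only when one of the running ones finishes, so the server never sees
    more than `concurrent` simultaneous requests from this call.
    """
    if concurrent < 1:
        raise ValueError("concurrent must be at least 1")
    with ThreadPoolExecutor(max_workers=concurrent) as pool:
        return list(pool.map(fetch, urls))


# Stand-in fetch that only reports what it would download.
results = download_all(
    ["a.mp3", "b.mp3", "c.mp3", "d.mp3"],
    fetch=lambda url: f"fetched {url}",
    concurrent=2,
)
print(results)
```

Keeping the cap small (2-3) follows the recommendation above: the pool queues the remaining files instead of opening more connections.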

Help

For a full list of options, use the --help flag:

iadl --help

Uninstall

If you also want to remove the dependencies, run the cleanup command first. It must come before pip uninstall, because pip uninstall removes the uninstaller itself. If you want the dependencies to stay, skip this command:

iadl-cleanup

To remove the module:

pip uninstall ia-manager

If you ran pip uninstall ia-manager first and still want the dependencies removed, no problem: reinstall the package with pip install ia-manager and repeat the two commands above in order.
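
If you are unsure which state you are in, a quick check from Python can tell you whether the package is still installed and whether the cleanup command is still on your PATH. This is a small sketch using only the standard library. The helper names are hypothetical; the distribution name `ia-manager` and the command name `iadl-cleanup` are taken from the commands above.

```python
import shutil
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def command_available(command):
    """True if an executable with this name can be found on PATH."""
    return shutil.which(command) is not None


print("ia-manager version:", installed_version("ia-manager"))
print("iadl-cleanup available:", command_available("iadl-cleanup"))
```

If the version prints as None, reinstall the package before running the cleanup command.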

Install in Virtual Environment

For Windows

python -m venv env
env\Scripts\Activate.ps1
pip install ia-manager

When done:

deactivate

For Linux

python3 -m venv env
source env/bin/activate
pip install ia-manager

When done:

deactivate
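
To confirm that a virtual environment is actually active before installing, you can compare the interpreter's prefixes. In an environment created by `python -m venv`, `sys.prefix` points at the environment while `sys.base_prefix` points at the base installation. The helper name below is hypothetical; the two attributes are part of the standard library.

```python
import sys


def in_virtual_environment():
    """True when the running interpreter belongs to a venv."""
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_environment())
print("packages will install under:", sys.prefix)
```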

Source Distribution

ia_manager-1.0.4.tar.gz (9.0 kB, source)

  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes for ia_manager-1.0.4.tar.gz

  • SHA256: c590adae2dc7aac3ba420e53aa49e6d09ea09a4c76aa6363ab8736c008e3c639
  • MD5: f612d24605e1bc4fd7658c24212655df
  • BLAKE2b-256: a4c907d03689038e52f58812e9ddc9d5815097d549f1127a3a006fe86b144e71
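
If you download the source archive by hand, you can check it against the SHA256 digest listed above. The following is a minimal sketch using the standard library's `hashlib`. The function names are hypothetical, and the path passed to `verify` is wherever you saved the file.

```python
import hashlib

# SHA256 digest for ia_manager-1.0.4.tar.gz, copied from the listing above.
EXPECTED_SHA256 = "c590adae2dc7aac3ba420e53aa49e6d09ea09a4c76aa6363ab8736c008e3c639"


def read_chunks(path, chunk_size=65536):
    """Yield the file's bytes in chunks so large files are not loaded at once."""
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            yield chunk


def sha256_hex(chunks):
    """Hex SHA256 digest of an iterable of byte chunks."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """True if the file at `path` hashes to the expected digest."""
    return sha256_hex(read_chunks(path)) == expected.strip().lower()
```

For example, `verify("ia_manager-1.0.4.tar.gz")` returns True only when the downloaded file matches the published digest.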
