Internet Archive Manager (iadl)

A Python package to scrape and download files from the Internet Archive.

Installation

You can install the package using pip:

pip install ia-manager

Usage

Basic Usage

To download all files from an Internet Archive collection:

iadl --url https://archive.org/details/some-collection --dest ./downloads

Filter by File Type

You can filter files by specific types using the following arguments:

  • Download only archive files (e.g., .zip, .rar):

    iadl --url https://archive.org/details/some-collection --dest ./downloads --archive
    
  • Download only video files (e.g., .mp4, .avi):

    iadl --url https://archive.org/details/some-collection --dest ./downloads --video
    
  • Download only audio files (e.g., .mp3, .flac):

    iadl --url https://archive.org/details/some-collection --dest ./downloads --audio
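As a rough illustration of what filtering by type means, the sketch below groups filenames by extension. It is not iadl's implementation: the `CATEGORIES` table and the `matches` helper are hypothetical, and only the extensions named above are listed (the real tool may recognize more).

```python
from pathlib import PurePosixPath

# Hypothetical extension groups, limited to the examples given in the text.
CATEGORIES = {
    "archive": {".zip", ".rar"},
    "video": {".mp4", ".avi"},
    "audio": {".mp3", ".flac"},
}


def matches(filename: str, selected: set[str]) -> bool:
    """Return True if the file belongs to any selected category.

    An empty selection keeps every file, mirroring a run without filter flags.
    """
    if not selected:
        return True
    suffix = PurePosixPath(filename).suffix.lower()  # case-insensitive match
    return any(suffix in CATEGORIES[name] for name in selected)


files = ["talk.mp4", "notes.txt", "album.FLAC", "backup.zip"]
print([name for name in files if matches(name, {"audio"})])
```

Passing more than one category name to `matches` keeps a file if it fits any of them.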

Limit the Number of Files

To limit the number of files downloaded:

iadl --url https://archive.org/details/some-collection --dest ./downloads --limit 5

Show File Links

To display the direct file links of each file in the terminal:

iadl --url https://archive.org/details/some-collection --show-links
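For orientation, direct links on archive.org generally follow the pattern `https://archive.org/download/<identifier>/<filename>` for a single item. The sketch below derives such a link from a details URL; the helper names are hypothetical, the filename is made up, and this is not necessarily how iadl builds the links it prints.

```python
from urllib.parse import quote, urlparse


def identifier_from_details_url(url: str) -> str:
    """Extract the item identifier from an archive.org /details/ URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2 or parts[0] != "details":
        raise ValueError(f"not a details URL: {url}")
    return parts[1]


def direct_link(identifier: str, filename: str) -> str:
    """Build a download URL for one file, percent-encoding the filename."""
    return f"https://archive.org/download/{identifier}/{quote(filename)}"


item = identifier_from_details_url("https://archive.org/details/some-collection")
print(direct_link(item, "side a.mp3"))  # "side a.mp3" is an invented filename
```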

Combine Filters

You can combine multiple filters. For example, to download only video and audio files:

iadl --url https://archive.org/details/some-collection --dest ./downloads --video --audio

Simultaneous Downloads

To download several files at the same time, each in a separate process, pass --concurrent with the number of simultaneous downloads. That many files are in flight at any given moment until the whole job is finished. A value of 2-3 is recommended; be nice to the servers.

iadl --url https://archive.org/details/some-collection --dest ./downloads --audio --concurrent 3
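The effect of a concurrency cap can be sketched with a small worker pool. Note the substitutions: this example uses threads rather than the separate processes iadl is described as using, and `fake_download` only sleeps instead of fetching anything. It is meant to show the "at most N at a time" behaviour, not the tool's internals.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

active = 0  # downloads currently running
peak = 0    # highest number seen running at once
lock = threading.Lock()


def fake_download(name: str) -> str:
    """Stand-in for a real download; sleeps instead of touching the network."""
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.05)
    with lock:
        active -= 1
    return name


files = [f"track{i:02d}.mp3" for i in range(8)]

# max_workers plays the role of --concurrent 3: at most three jobs run at once.
with ThreadPoolExecutor(max_workers=3) as pool:
    done = list(pool.map(fake_download, files))

print(f"finished {len(done)} files, peak concurrency {peak}")
```

As soon as one job finishes, the pool starts the next, so the cap holds until the queue is empty.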

Help

For a full list of options, use the --help flag:

iadl --help
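When iadl is driven from a script, the flags covered in the sections above can be assembled into an argument list. The `build_iadl_command` helper below is hypothetical and uses only those documented flags; whether every combination is accepted by the tool is an assumption, so check `iadl --help` before relying on it.

```python
import shlex


def build_iadl_command(url, dest=None, kinds=(), limit=None,
                       concurrent=None, show_links=False):
    """Assemble an iadl argument list from the documented flags."""
    cmd = ["iadl", "--url", url]
    if dest is not None:
        cmd += ["--dest", dest]
    for kind in kinds:
        if kind not in ("archive", "video", "audio"):
            raise ValueError(f"unknown file type filter: {kind}")
        cmd.append(f"--{kind}")
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if concurrent is not None:
        cmd += ["--concurrent", str(concurrent)]
    if show_links:
        cmd.append("--show-links")
    return cmd


cmd = build_iadl_command(
    "https://archive.org/details/some-collection",
    dest="./downloads",
    kinds=("video", "audio"),
    limit=5,
)
print(shlex.join(cmd))
# To run it (requires iadl on PATH): subprocess.run(cmd, check=True)
```

Building a list rather than a single string avoids shell quoting problems when a URL or path contains spaces.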

Uninstall

To also remove the dependencies, run the cleanup command first. It must come before pip uninstall, because pip uninstall removes the uninstaller. If you want the dependencies to stay, skip this command:

iadl-cleanup

To remove the module:

pip uninstall ia-manager

If you ran pip uninstall ia-manager first and still want the dependencies removed, reinstall the package with pip install ia-manager and repeat the two commands above in order.

Install in a Virtual Environment

For Windows

python -m venv env
env\Scripts\Activate.ps1
pip install ia-manager

When done:

deactivate

For Linux

python3 -m venv env
source env/bin/activate
pip install ia-manager

When done:

deactivate
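To confirm that the environment is active before installing, one option is to ask the interpreter itself. This small check relies on standard `sys` attributes; the `in_virtualenv` name is just for illustration.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs inside an environment created by venv."""
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("packages will install under:", sys.prefix)
```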

Source Distribution

ia_manager-1.0.2.tar.gz (8.6 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for ia_manager-1.0.2.tar.gz

Algorithm    Hash digest
SHA256       ce9ff28c68ad87da12a9a67286625bc7b9918db7a643723aad973d6f6a5be60f
MD5          496b32136f9b7d478362596f5f528dae
BLAKE2b-256  ce77ccce9104fb27030d326292fa7163e4dc6e09be81396a254d056cb53ab368
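
A downloaded copy of the source distribution can be checked against the SHA256 digest listed above. The sketch below uses the standard `hashlib` module; the local path is an assumption (the file sitting in the current directory), and the helper names are illustrative.

```python
import hashlib
import os

# SHA256 digest as listed above for ia_manager-1.0.2.tar.gz
EXPECTED_SHA256 = "ce9ff28c68ad87da12a9a67286625bc7b9918db7a643723aad973d6f6a5be60f"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the expected hex string."""
    return sha256_of(path) == expected.lower()


path = "ia_manager-1.0.2.tar.gz"  # assumed local download location
if os.path.exists(path):
    print("hash OK" if verify(path) else "hash MISMATCH")
```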
