A Python library for interacting with the Archive-It API

🚨 THIS LIBRARY IS UNDER ACTIVE DEVELOPMENT. USE AT YOUR OWN RISK. 🚨

Pyarchiveit

Pyarchiveit is a Python library for interacting with the Internet Archive's Archive-It API. It provides a simple interface for managing the seeds and collections in an Archive-It account.

Features

  • Create and update seeds with metadata validation
  • Retrieve seed lists with their metadata for single or multiple collections

Example usage

First, initialize the Archive-It API client with your account credentials.

from pyArchiveit.api import ArchiveItAPI

# Initialize the Archive-It API client with your credentials
archive_it_client = ArchiveItAPI(
    account_name='your_username',
    account_password='your_password'
)
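To keep credentials out of source code, they can be read from environment variables before the client is created. The following is a minimal sketch, not part of the library: the helper `load_archiveit_credentials` and the variable names `ARCHIVEIT_USERNAME` and `ARCHIVEIT_PASSWORD` are hypothetical and chosen only for this example. The keyword names it returns match the constructor arguments shown above.

```python
import os


def load_archiveit_credentials(env=os.environ):
    """Return keyword arguments for ArchiveItAPI from environment variables.

    ARCHIVEIT_USERNAME and ARCHIVEIT_PASSWORD are hypothetical names used
    for this example only; the library itself does not define them.
    """
    try:
        return {
            "account_name": env["ARCHIVEIT_USERNAME"],
            "account_password": env["ARCHIVEIT_PASSWORD"],
        }
    except KeyError as missing:
        # Fail early with a readable message instead of passing None along
        raise RuntimeError(
            f"Missing environment variable: {missing.args[0]}"
        ) from None


# Usage, assuming the constructor shown above:
# from pyArchiveit.api import ArchiveItAPI
# archive_it_client = ArchiveItAPI(**load_archiveit_credentials())
```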

To create a new seed with metadata:

# Create a new seed with metadata
metadata = [
    {"value": "Example Metadata 1"},
    {"value": "Example Metadata 2"}
]
new_seed = archive_it_client.create_seed(
    collection_id=123456,
    url='http://example.com',
    crawl_definition_id=789012,
    other_params=None,
    metadata=metadata
)

To update an existing seed's metadata:

# Update an existing seed's metadata
updated_metadata = [
    {"value": "Updated Metadata 1"},
    {"value": "Updated Metadata 2"}
]
updated_seed = archive_it_client.update_seed_metadata(
    seed_id=123456,
    metadata=updated_metadata
)
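Both examples above pass metadata as a list of dictionaries, each with a single `"value"` key. When the values start out as plain strings, a small helper can build that list. This is a sketch under that assumption only: `build_metadata` is a hypothetical helper, not a library function, and the library's own validation may require more than this shape.

```python
def build_metadata(values):
    """Wrap plain strings in the {"value": ...} shape used in the examples above."""
    metadata = []
    for value in values:
        text = str(value).strip()
        if not text:
            continue  # skip blank entries rather than sending empty values
        metadata.append({"value": text})
    return metadata


updated_metadata = build_metadata(["Updated Metadata 1", "Updated Metadata 2"])
```

The resulting `updated_metadata` list has the same form as the hand-written lists above and could be passed as the `metadata` argument.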

To retrieve the seed list of a collection or multiple collections:

# Get seed list of a collection
seeds = archive_it_client.get_seeds(collection_ids=123456)

# Or get seeds from multiple collections
seeds = archive_it_client.get_seeds(collection_ids=[123456, 789012])
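As shown above, `collection_ids` can be a single ID or a list of IDs. Calling code that gathers IDs from several places may want to normalize them into one de-duplicated list first. The helper below is a hypothetical sketch for that purpose and is not part of the library; it makes no assumption about what `get_seeds` returns.

```python
def normalize_collection_ids(collection_ids):
    """Return a de-duplicated list of integer collection IDs.

    Accepts a single ID or an iterable of IDs; the original order is kept.
    """
    if isinstance(collection_ids, (int, str)):
        collection_ids = [collection_ids]

    seen = set()
    normalized = []
    for raw_id in collection_ids:
        collection_id = int(raw_id)  # raises ValueError for non-numeric input
        if collection_id not in seen:
            seen.add(collection_id)
            normalized.append(collection_id)
    return normalized


collection_ids = normalize_collection_ids([123456, 789012, 123456])
# seeds = archive_it_client.get_seeds(collection_ids=collection_ids)
```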

⚫ Issues

For questions or support, please open an issue on the GitHub repository.

🖊️ Author

Ken Lui - Data Curation Specialist at Map & Data Library, University of Toronto
