A Python library to interact with the Archive-It API
🚨 THIS LIBRARY IS UNDER ACTIVE DEVELOPMENT. USE AT YOUR OWN RISK. 🚨
📦 Pyarchiveit
Pyarchiveit is a Python library for interacting with the Internet Archive's Archive-It API. It provides a simple interface for managing the seeds and collections in an Archive-It account.
✨ Features
- Create and update seeds with metadata validation
- Retrieve seed lists with their metadata for single or multiple collections
📥 Installation
You can install the library using pip:
pip install pyarchiveit
Or use uv if you have it installed:
uv add pyarchiveit
💡 Example usage
First, initialize the Archive-It API client with your account credentials.
from pyArchiveit.api import ArchiveItAPI
# Initialize the Archive-It API client with your credentials
archive_it_client = ArchiveItAPI(
    account_name='your_username',
    account_password='your_password'
)
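Hard-coding credentials in a script makes them easy to leak. The following is a minimal sketch of reading them from the environment instead. The helper `load_credentials` and the variable names `ARCHIVEIT_USERNAME` and `ARCHIVEIT_PASSWORD` are hypothetical and not part of pyarchiveit; only the keyword names `account_name` and `account_password` come from the example above.

```python
import os


def load_credentials(env=None):
    """Return keyword arguments for ArchiveItAPI read from environment variables.

    The variable names are illustrative assumptions, not defined by the library.
    """
    env = os.environ if env is None else env
    try:
        return {
            "account_name": env["ARCHIVEIT_USERNAME"],
            "account_password": env["ARCHIVEIT_PASSWORD"],
        }
    except KeyError as missing:
        # Fail early with a readable message instead of a bare KeyError
        raise RuntimeError(f"Missing environment variable: {missing}") from None


# Usage (requires pyarchiveit to be installed and both variables to be set):
# archive_it_client = ArchiveItAPI(**load_credentials())
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.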
To create a new seed with metadata:
# Create a new seed with metadata
metadata = [
    {"value": "Example Metadata 1"},
    {"value": "Example Metadata 2"}
]
new_seed = archive_it_client.create_seed(
    collection_id=123456,
    url='http://example.com',
    crawl_definition_id=789012,
    other_params=None,
    metadata=metadata
)
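The metadata in the example above is a list of dictionaries, each with a single `"value"` key. If your values start out as plain strings (for instance, from a spreadsheet column), a small helper can build that shape. This is a sketch only: `build_metadata` is a hypothetical helper, not a pyarchiveit function, and it assumes the library accepts exactly the list-of-dicts format shown above.

```python
def build_metadata(values):
    """Turn plain strings into the list-of-dicts shape used in the example above.

    Surrounding whitespace is stripped and empty entries are dropped.
    """
    cleaned = [str(v).strip() for v in values]
    return [{"value": v} for v in cleaned if v]


metadata = build_metadata(["Example Metadata 1", "  ", "Example Metadata 2 "])
```

Dropping blank entries up front is a design choice made here to avoid sending empty metadata values; adjust it if blank values are meaningful in your workflow.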
To update an existing seed's metadata:
# Update an existing seed's metadata
updated_metadata = [
    {"value": "Updated Metadata 1"},
    {"value": "Updated Metadata 2"}
]
updated_seed = archive_it_client.update_seed_metadata(
    seed_id=123456,
    metadata=updated_metadata
)
To retrieve the seed list of a collection or multiple collections:
# Get seed list of a collection
seeds = archive_it_client.get_seeds(collection_ids=123456)
# Or get seeds from multiple collections
seeds = archive_it_client.get_seeds(collection_ids=[123456, 789012])
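When seeds come back from several collections at once, it can help to group them by collection before further processing. The sketch below assumes, without confirmation from the library's documentation, that `get_seeds` returns a list of dictionaries and that each one carries `"collection"` and `"url"` keys. Check the actual return value before relying on it. `group_urls_by_collection` is a hypothetical helper, and the sample records are placeholders rather than real API output.

```python
from collections import defaultdict


def group_urls_by_collection(seeds):
    """Group seed URLs by collection ID.

    Assumes each seed is a dict with "collection" and "url" keys (an assumption).
    """
    grouped = defaultdict(list)
    for seed in seeds:
        grouped[seed["collection"]].append(seed["url"])
    return dict(grouped)


# Placeholder records standing in for a real response
sample_seeds = [
    {"collection": 123456, "url": "http://example.com"},
    {"collection": 789012, "url": "http://example.org"},
    {"collection": 123456, "url": "http://example.net"},
]
grouped = group_urls_by_collection(sample_seeds)
```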
⚫ Issues
For questions or support, please open an issue on the GitHub repository.
🖊️ Author
Ken Lui - Data Curation Specialist at Map & Data Library, University of Toronto
📄 License
This project is licensed under the GNU GPLv3 - see the LICENSE file for details.
File details
Details for the file pyarchiveit-0.1.1.tar.gz.
File metadata
- Download URL: pyarchiveit-0.1.1.tar.gz
- Upload date:
- Size: 4.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | facf878c45a7ff5f260a7659d795c5790d4fe3971860b29fa601341d7f0431ac |
| MD5 | c2aec6f73095282b55d1d9b5c6fd8560 |
| BLAKE2b-256 | 591fbd671bf5bc7edfcc7fea746bd6b3dc0003f93651e0721b831f350d752641 |
File details
Details for the file pyarchiveit-0.1.1-py3-none-any.whl.
File metadata
- Download URL: pyarchiveit-0.1.1-py3-none-any.whl
- Upload date:
- Size: 6.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.9.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9318873a10a047f071c018d2e254babdee99c05af4cf5662171ea08e836df0eb |
| MD5 | 2fe4c3944293c7dc5918fea3fed77fdd |
| BLAKE2b-256 | 47142227f48a4b1c2707c5186c87569357241a01e461392935e5c8edd9cf84a9 |