
A database builder for digital preservation information.


Preservation Status Database Builder

Returns the preservation status of a Crossref DOI matched against mainstream digital preservation platforms.

This application allows you to build a database of digital preservation sources and then to match a DOI against common digital preservation systems.

Installation

The easiest install is via pip:

pip install preservation-database

Then add "preservationdatabase" (no hyphen) to your list of INSTALLED_APPS.
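As a minimal sketch of that step, the settings excerpt below shows the app registered alongside two of Django's built-in apps. The surrounding entries are an assumption for illustration; your own project will list whatever apps it already uses.

```python
# settings.py (excerpt): a hypothetical INSTALLED_APPS list.
# Only the "preservationdatabase" entry comes from this README;
# the other entries are placeholders for whatever your project uses.
INSTALLED_APPS = [
    "django.contrib.contenttypes",
    "django.contrib.auth",
    "preservationdatabase",  # module name, no hyphen (the pip name has one)
]
```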

Usage

export DJANGO_SETTINGS_MODULE=import_settings.settings

Usage: python -m preservationdatabase.cli [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
    clear-cache                    Clear the import cache
    import-all                     Download and import all data (excluding...
    import-cariniana               Download and import data from Cariniana
    import-clockss                 Download and import data from CLOCKSS
    import-hathi                   Import data from Hathi (requires local...
    import-internet-archive        Import data from Internet Archive...
    import-internet-archive-items  Import item data from Internet Archive
    import-issnl                   Import ISSN-L mappings
    import-lockss                  Download and import data from LOCKSS
    import-ocul                    Import data from Ocul (requires local...
    import-pkp                     Download and import data from PKP's...
    import-portico                 Download and import data from Portico
    random-samples                 Return random samples that occur in and...
    show-archives                  Clear the import cache
    show-cache                     Show last fill date/times and cache status
    show-issn                      Show preservation items that match an ISSN
    show-preservation              Determine whether a DOI is preserved
    stamp-cache-today              Mark the latest imports as today
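If you want to drive the CLI from a script, the sketch below builds the argument list and environment for one command, using only the module path and settings module shown above. The helper name build_cli_invocation is hypothetical and is not part of the package; any arguments a command takes should be checked against its own --help output.

```python
import os
import sys


def build_cli_invocation(command, *args, settings_module="import_settings.settings"):
    """Return (argv, env) for running one preservationdatabase CLI command.

    Hypothetical helper: it only assembles the invocation and does not run it.
    """
    # Equivalent to: python -m preservationdatabase.cli COMMAND [ARGS]...
    argv = [sys.executable, "-m", "preservationdatabase.cli", command, *args]
    # Equivalent to: export DJANGO_SETTINGS_MODULE=import_settings.settings
    env = dict(os.environ, DJANGO_SETTINGS_MODULE=settings_module)
    return argv, env


# Usage (requires the package to be installed):
#   import subprocess
#   argv, env = build_cli_invocation("show-cache")
#   subprocess.run(argv, env=env, check=True)
```

Setting DJANGO_SETTINGS_MODULE in the child environment, rather than in the calling process, keeps the script from changing the settings of any Django code that runs alongside it.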

Features

  • Cariniana import.
  • CLOCKSS import.
  • HathiTrust import.
  • Internet Archive import.
  • Internet Archive item-level import.
  • LOCKSS import.
  • PKP PLN import.
  • Portico import.
  • Crossref DOI lookup.

First-Run Setup

First, copy example_settings.py to settings.py, then check that settings.py points at the database you want to use. The default is a SQLite database, db.sqlite3. Read through the rest of settings.py carefully before continuing.

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': BASE_DIR / 'db.sqlite3',
    }
}

Next, run the database build commands:

python3 manage.py makemigrations
python3 manage.py makemigrations preservationdatabase
python3 manage.py migrate

You should then have a working database into which you can import new preservation data.
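One way to confirm that the build commands did something, assuming you kept the default SQLite backend, is to list the tables in the database file with Python's standard sqlite3 module. The function name list_tables is hypothetical. After a successful migrate you would expect to see Django's own django_migrations table among the results; the names of the application's tables are not stated here, so check them against your own output.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path, sorted.

    Note: sqlite3.connect creates an empty file if db_path does not exist,
    in which case this returns an empty list.
    """
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]


# Usage, assuming the default database name from settings.py:
#   print(list_tables("db.sqlite3"))
```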

Archive Notes

Internet Archive

The Internet Archive provides a KBART file for the Keepers Registry, which we use as the primary ingest source: https://archive.org/details/ia-keepers-registry-kbart. This file does not cover everything the Internet Archive holds. The alternatives are not practical, though: the Internet Archive snapshots do not contain external identifiers, and the container-level snapshots do not report coverage extent. While it is possible to download the entire 217GB FATCAT database snapshot, this will not be viable for many users. We have therefore stuck with the KBART file that Keepers uses, so the extent of coverage in the Internet Archive may be under-reported.
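To illustrate what a KBART file carries, here is a minimal sketch that reads KBART-style tab-separated text with the standard csv module. It is not this package's importer. The column names are the standard KBART field names; whether the Internet Archive file uses exactly these columns is an assumption, and the sample row is invented for illustration.

```python
import csv
import io


def read_kbart_titles(kbart_text):
    """Parse KBART tab-separated text into a list of per-title dicts.

    Illustrative only: missing columns are returned as empty strings.
    """
    reader = csv.DictReader(io.StringIO(kbart_text), delimiter="\t")
    titles = []
    for row in reader:
        titles.append({
            "title": row.get("publication_title", ""),
            "print_issn": row.get("print_identifier", ""),
            "online_issn": row.get("online_identifier", ""),
            # Coverage extent is expressed as first and last issue dates.
            "first_issue": row.get("date_first_issue_online", ""),
            "last_issue": row.get("date_last_issue_online", ""),
        })
    return titles


# Invented sample row, for illustration only (not real Internet Archive data).
SAMPLE = (
    "publication_title\tprint_identifier\tonline_identifier\t"
    "date_first_issue_online\tdate_last_issue_online\n"
    "Example Journal\t1234-5678\t8765-4321\t2001\t2010\n"
)
```

Because a KBART row records coverage per title rather than per article, a title-level match of this kind can only say that a journal is covered for a date range, which is consistent with the under-reporting caveat above.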

Credits

© Crossref 2023


