A simple utility to check and harvest metadata records from an OAI request when they meet the DLTN requirements

About

Tests whether records from an OAI-PMH feed meet the minimum requirements of DLTN and, optionally, harvests only the good records from a request to disk so that they can be added to Repox and included in the DPLA.
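
To make the check-then-harvest idea concrete, here is a minimal sketch that splits the records in an OAI-PMH ListRecords response into good and bad lists. It is not the dltnchecker implementation. The actual DLTN requirements are not listed in this README, so the rule used here (a Dublin Core record needs a non-empty title and rights statement) and the names REQUIRED and split_records are assumptions made only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OAI-PMH and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical rule for illustration only, NOT the real DLTN requirements.
REQUIRED = ("dc:title", "dc:rights")


def split_records(xml_text):
    """Return (good, bad) lists of record identifiers from a ListRecords response."""
    root = ET.fromstring(xml_text)
    good, bad = [], []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        # A record passes only if every required field exists and is non-empty.
        has_all = all(
            (record.findtext(f".//{field}", default="", namespaces=NS) or "").strip()
            for field in REQUIRED
        )
        (good if has_all else bad).append(identifier)
    return good, bad
```

A harvester built on this would write only the records in the good list to disk and report the bad list back to the provider.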

Install

Running with Built-in Argument Parsing from a CLI

If you want to run it this way, you’ll need to clone this repository. Building it with pipenv is also recommended.

$ git clone https://github.com/DigitalLibraryofTennessee/check_and_harvest
$ cd check_and_harvest
$ pipenv install
$ pipenv shell

Using OAIChecker from the dltnchecker module

If you’re cool 😎:

$ pipenv install dltn_checker

Otherwise:

$ pip install dltn_checker

Examples with the Built-in Argument Parser

  1. Check for bad DC records in an entire OAI-PMH feed.
$ python run -e http://my-oai-endpoint:8080/OAIHandler -m oai_dc
  2. Check and harvest good DC records from an entire OAI-PMH feed.
$ python run -e http://my-oai-endpoint:8080/OAIHandler -m oai_dc -H True
  3. Check and harvest good xoai records from a specific set.
$ python run -e http://my-oai-endpoint:8080/OAIHandler -m xoai -s my_awesome_xoai_set -H True
  4. Check and harvest good MODS records from an entire provider in Repox.
$ python run -e http://my-oai-endpoint:8080/OAIHandler -m MODS -p CrossroadstoFreedomr0 -H True
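
For readers wondering how flags like the ones above could be wired up, here is a rough argparse sketch. Only the short flags (-e, -m, -s, -p, -H) come from the examples. The long option names, the defaults, and the str_to_bool helper are assumptions for illustration and may not match the parser that ships with this project.

```python
import argparse


def str_to_bool(value):
    """Turn a command-line string such as 'True' into a boolean."""
    return value.strip().lower() in ("true", "1", "yes")


def build_parser():
    # Long option names and defaults below are hypothetical.
    parser = argparse.ArgumentParser(
        description="Check, and optionally harvest, records from an OAI-PMH feed."
    )
    parser.add_argument("-e", "--endpoint", required=True, help="OAI-PMH endpoint URL")
    parser.add_argument("-m", "--metadata_prefix", default="oai_dc",
                        help="metadata format, e.g. oai_dc, xoai, or MODS")
    parser.add_argument("-s", "--set", dest="oai_set", default=None,
                        help="limit the request to one set")
    parser.add_argument("-p", "--provider", default=None,
                        help="limit the request to one Repox provider")
    parser.add_argument("-H", "--harvest", type=str_to_bool, default=False,
                        help="pass True to write good records to disk")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Because argparse option strings are case sensitive, -H does not collide with the automatic -h help flag.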

Examples using the OAIChecker Class from dltnchecker

Check whether a set contains any bad records.

from dltnchecker.harvest import OAIChecker
request = OAIChecker("https://dpla.lib.utk.edu/repox/OAIHandler", "crossroads_sanitation", "MODS")
request.list_records()
print(request.bad_records)

By default, this will try to download the good records to a directory called output. If you don’t want to download anything, pass the additional parameter harvest and set it to False.

from dltnchecker.harvest import OAIChecker
request = OAIChecker("https://dpla.lib.utk.edu/repox/OAIHandler", "crossroads_sanitation", "MODS", harvest=False)
request.list_records()
print(request.bad_records)
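
The three positional values passed to OAIChecker above (endpoint, set, metadata prefix) correspond to the arguments of an OAI-PMH ListRecords request. The sketch below shows how such a request URL is formed under the OAI-PMH protocol. It does not reproduce the internals of dltnchecker, and the function name list_records_url is hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(endpoint, metadata_prefix, oai_set=None, resumption_token=None):
    """Build an OAI-PMH ListRecords URL (illustrative helper, not part of dltnchecker)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument: when paging,
        # it is sent with the verb alone.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if oai_set:
            params["set"] = oai_set
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    print(list_records_url(
        "https://dpla.lib.utk.edu/repox/OAIHandler", "MODS", "crossroads_sanitation"
    ))
```

A large set is returned in pages, so a client keeps requesting with the resumptionToken from each response until the server returns an empty token.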

Download files

Files for dltn-checker, version 0.0.2
Filename (size) | File type | Python version
dltn_checker-0.0.2-py3-none-any.whl (9.1 kB) | Wheel | py3
dltn_checker-0.0.2.tar.gz (5.4 kB) | Source | None
