A simple utility to check and harvest metadata records from an OAI request when they meet the DLTN requirements
Tests whether records from an OAI-PMH feed pass minimum requirements of DLTN and optionally harvests only the good records from a request to disk so that they can be added to Repox and included in the DPLA.
Running with the Built-in Argument Parser from a CLI
If you want to do it this way, you’re going to need to clone this repository. It’s also suggested that you build it with pipenv.
$ git clone https://github.com/DigitalLibraryofTennessee/check_and_harvest
$ cd check_and_harvest
$ pipenv install
$ pipenv shell
Using OAIChecker from the dltnchecker module
If you’re cool :sunglasses: :
$ pipenv install dltn_checker
Or with plain pip:
$ pip install dltn_checker
Examples with the Built-in Argument Parser
- Check for bad DC records in an entire OAI-PMH feed.
$ python run -e http://my-oai-endpoint:8080/OAIHandler -m oai_dc
- Check and harvest good DC records from an entire OAI-PMH feed.
$ python run -e http://my-oai-endpoint:8080/OAIHandler -m oai_dc -H True
- Check and harvest good xoai records from a specific set.
$ python run -e http://my-oai-endpoint:8080/OAIHandler -m xoai -s my_awesome_xoai_set -H True
- Check and harvest good MODS records from an entire provider in Repox.
$ python run -e http://my-oai-endpoint:8080/OAIHandler -m MODS -p CrossroadstoFreedomr0 -H True
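If you need to script several of these runs, the command lines above can be assembled programmatically. The following is only a sketch: `build_command` is a hypothetical helper, not part of this project, and it assumes nothing beyond the `-e`, `-m`, `-s`, `-p`, and `-H True` flags shown in the examples.

```python
def build_command(endpoint, metadata_prefix, set_name=None, provider=None, harvest=False):
    """Build the argument list for the `python run` examples above.

    Hypothetical helper: it only mirrors the flags shown in this README.
    """
    cmd = ["python", "run", "-e", endpoint, "-m", metadata_prefix]
    if set_name is not None:
        cmd += ["-s", set_name]   # restrict the request to one set
    if provider is not None:
        cmd += ["-p", provider]   # restrict the request to one Repox provider
    if harvest:
        cmd += ["-H", "True"]     # the examples pass the literal string True
    return cmd


# Example: the same arguments as the xoai set example above.
command = build_command(
    "http://my-oai-endpoint:8080/OAIHandler",
    "xoai",
    set_name="my_awesome_xoai_set",
    harvest=True,
)
print(" ".join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` from inside the cloned repository.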
Examples using the OAIChecker Class from dltnchecker
Check a set to see if it contains any bad files.
from dltnchecker.harvest import OAIChecker
request = OAIChecker("https://dpla.lib.utk.edu/repox/OAIHandler", "crossroads_sanitation", "MODS")
request.list_records()
print(request.bad_records)
By default, this will try to download the good files to a directory called output. If you don’t want to download, you need to pass an additional parameter called harvest and set it to False.
from dltnchecker.harvest import OAIChecker
request = OAIChecker("https://dpla.lib.utk.edu/repox/OAIHandler", "crossroads_sanitation", "MODS", harvest=False)
request.list_records()
print(request.bad_records)
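The same pattern can be repeated over several sets without downloading anything. This is a minimal sketch: `collect_bad_records` is a hypothetical helper, and it relies only on the constructor arguments, `list_records()`, and `bad_records` shown above. The checker class is passed in as a parameter so the helper can also be tried with a stand-in class.

```python
def collect_bad_records(endpoint, set_names, metadata_prefix, checker_cls):
    """Check each set without harvesting and gather the reported bad records.

    Hypothetical helper. `checker_cls` is expected to behave like OAIChecker
    as used in the examples above.
    """
    results = {}
    for set_name in set_names:
        # harvest=False: only check, do not write good records to disk
        request = checker_cls(endpoint, set_name, metadata_prefix, harvest=False)
        request.list_records()
        # Keep bad_records exactly as the checker exposes it
        results[set_name] = request.bad_records
    return results


# Usage with the real class (requires dltn_checker and network access):
#
#   from dltnchecker.harvest import OAIChecker
#   report = collect_bad_records(
#       "https://dpla.lib.utk.edu/repox/OAIHandler",
#       ["crossroads_sanitation"],
#       "MODS",
#       OAIChecker,
#   )
#   for set_name, bad in report.items():
#       print(set_name, bad)
```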