
A simple utility to check and harvest metadata records from an OAI request when they meet the DLTN requirements.

Project description


About

Tests whether records from an OAI-PMH feed pass the minimum requirements of the DLTN (Digital Library of Tennessee) and optionally harvests only the passing records to disk so that they can be added to Repox and included in the DPLA.
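The check-then-harvest idea can be sketched roughly as follows. This is only an illustration, not the code dltn_checker actually runs: the required fields (title and rights) are a hypothetical stand-in for the real DLTN requirements, and split_records is an invented helper name.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical requirement list for illustration; NOT the actual DLTN rules.
REQUIRED_DC_FIELDS = ("title", "rights")


def split_records(list_records_xml, required=REQUIRED_DC_FIELDS):
    """Return (good, bad) lists of OAI identifiers from a ListRecords response."""
    root = ET.fromstring(list_records_xml)
    good, bad = [], []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        # A record passes only if every required DC element is present and non-empty.
        has_all = all(
            any((el.text or "").strip() for el in record.iter(f"{{{DC_NS}}}{name}"))
            for name in required
        )
        (good if has_all else bad).append(identifier)
    return good, bad


# Made-up sample response with one passing and one failing record.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1</identifier></header>
      <metadata>
        <oai_dc:dc>
          <dc:title>A title</dc:title>
          <dc:rights>http://rightsstatements.org/vocab/InC/1.0/</dc:rights>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example:2</identifier></header>
      <metadata>
        <oai_dc:dc>
          <dc:title>No rights statement here</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

good, bad = split_records(SAMPLE)
print("good:", good)
print("bad:", bad)
```

Only the identifiers in the good list would then be written to disk for loading into Repox.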

Install

Running with Built-in Argument Parsing from a CLI

To use the built-in argument parser, you need to clone this repository. Building the environment with pipenv is also recommended.

$ git clone https://github.com/DigitalLibraryofTennessee/check_and_harvest
$ cd check_and_harvest
$ pipenv install
$ pipenv shell

Using OAIChecker from the dltnchecker module

If you’re cool 😎:

$ pipenv install dltn_checker

Otherwise:

$ pip install dltn_checker

Examples with the Built-in Argument Parser

  1. Check for bad DC records in an entire OAI-PMH feed.

$ python run -e http://my-oai-endpoint:8080/OAIHandler -m oai_dc

  2. Check and harvest good DC records from an entire OAI-PMH feed.

$ python run -e http://my-oai-endpoint:8080/OAIHandler -m oai_dc -H True

  3. Check and harvest good xoai records from a specific set.

$ python run -e http://my-oai-endpoint:8080/OAIHandler -m xoai -s my_awesome_xoai_set -H True

  4. Check and harvest good MODS records from an entire provider in Repox.

$ python run -e http://my-oai-endpoint:8080/OAIHandler -m MODS -p CrossroadstoFreedomr0 -H True
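For readers curious how flags like these could be wired up, here is a minimal argparse sketch. The short flags (-e, -m, -s, -p, -H) come from the examples above; the long option names, the defaults, and the str_to_bool helper are assumptions made for illustration and may not match the project's real parser.

```python
import argparse


def str_to_bool(value):
    """Interpret strings such as 'True' or 'false' as booleans (assumed behaviour)."""
    return value.strip().lower() in ("true", "1", "yes")


def build_parser():
    parser = argparse.ArgumentParser(
        description="Check and optionally harvest records from an OAI-PMH feed."
    )
    # Long option names and defaults below are hypothetical.
    parser.add_argument("-e", "--endpoint", required=True, help="OAI-PMH endpoint URL")
    parser.add_argument("-m", "--metadata_prefix", default="oai_dc", help="metadata format")
    parser.add_argument("-s", "--set", dest="oai_set", default=None, help="OAI set to check")
    parser.add_argument("-p", "--provider", default=None, help="Repox provider to check")
    parser.add_argument("-H", "--harvest", type=str_to_bool, default=False,
                        help="write good records to disk when True")
    return parser


if __name__ == "__main__":
    # Parse the third example from the list above instead of the real command line.
    args = build_parser().parse_args(
        ["-e", "http://my-oai-endpoint:8080/OAIHandler",
         "-m", "xoai", "-s", "my_awesome_xoai_set", "-H", "True"]
    )
    print(args)
```

Because -H takes a value rather than acting as a switch, the sketch converts the string explicitly; argparse's built-in -h help flag is case-sensitive and does not clash with it.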

Examples using the OAIChecker Class from dltnchecker

Check whether a set contains any bad records.

from dltnchecker.harvest import OAIChecker
request = OAIChecker("https://dpla.lib.utk.edu/repox/OAIHandler", "crossroads_sanitation", "MODS")
request.list_records()
print(request.bad_records)

By default, this will also try to download the good records to a directory called output. If you don’t want to download them, pass the additional parameter harvest and set it to False.

from dltnchecker.harvest import OAIChecker
request = OAIChecker("https://dpla.lib.utk.edu/repox/OAIHandler", "crossroads_sanitation", "MODS", harvest=False)
request.list_records()
print(request.bad_records)
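The three positional arguments above (endpoint, set, metadata prefix) map naturally onto an OAI-PMH ListRecords request. The sketch below shows how such a request URL is formed under the OAI-PMH protocol; it is not taken from OAIChecker's source, and list_records_url is a hypothetical helper name.

```python
from urllib.parse import urlencode


def list_records_url(endpoint, metadata_prefix, oai_set=None, resumption_token=None):
    """Build an OAI-PMH ListRecords URL (hypothetical helper, not part of dltnchecker)."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument: it replaces
        # metadataPrefix and set when requesting the next page of results.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if oai_set:
            params["set"] = oai_set
    return f"{endpoint}?{urlencode(params)}"


print(list_records_url(
    "https://dpla.lib.utk.edu/repox/OAIHandler", "MODS", "crossroads_sanitation"
))
```

Large sets are returned in pages, so a harvester typically repeats the request with each resumptionToken it receives until the provider stops sending one.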
