E-ARK Python Information Package Validation

E-ARK Python Information Package Validator

Core package and command line utility for E-ARK Information Package validation.

The validation core component implements validation rules defined by E-ARK specifications which can be found on the website of the Digital Information LifeCycle Interoperability Standards Board (DILCIS Board):

https://dilcis.eu/specifications/

Quick Start

Pre-requisites

Python 3.10 or later is required to run the E-ARK Python Information Package Validator.
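If you are unsure which interpreter you have, a minimal sketch like the following can confirm it before you install anything. The function name meets_minimum is hypothetical and used only for illustration; the 3.10 threshold is the one stated above.

```python
import sys

# Minimum version stated in the pre-requisites above.
MIN_VERSION = (3, 10)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) pair is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Report on the interpreter that is running this script.
print("Python", sys.version.split()[0], "OK" if meets_minimum() else "too old")
```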

You must be running either a Debian/Ubuntu Linux distribution or Windows Subsystem for Linux (WSL) on Windows to follow these commands. If you are running a different Linux distribution, replace the apt commands with the equivalents for your package manager. To get WSL up and running, follow the guide further down and then come back to this step.

Getting up and running with the E-ARK Python Information Package Validator

Setting up the environment

It is recommended that you create a directory for your EARK work. Write the following:

mkdir EARK

To enter the directory use the following command:

cd EARK/

To retrieve the source code from GitHub use the following command:

git clone https://github.com/E-ARK-Software/eark-validator.git

To enter the new directory containing the source code do:

cd eark-validator/

It is recommended that you create a virtual environment for Python. This avoids "polluting" the host operating system with dynamically fetched dependencies and gives you a reproducible environment for your validator.

To create a virtual environment we need virtualenv (not to be confused with the venv module in the Python standard library). First, though, we need python3-pip to handle our Python packages. Install it by issuing the following command:

sudo apt install python3-pip

It will list a number of dependencies. Confirm that you wish to install python3-pip by pressing Y followed by ENTER.

Now we can install virtualenv with the following command:

sudo apt install python3-virtualenv

It will list a number of dependencies. Confirm that you wish to install python3-virtualenv by pressing Y followed by ENTER.

Finally we will need unzip. Install that by doing:

sudo apt install unzip

It may list a number of dependencies. Confirm that you wish to install unzip by pressing Y followed by ENTER.

Installing the application

Set up a local virtual environment by issuing the following commands (one line at a time):

virtualenv -p python3 venv
source venv/bin/activate
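To check that the activation took effect, you can ask the interpreter itself. The sketch below is illustrative only (the function name in_virtual_environment is hypothetical); it relies on sys.prefix and sys.base_prefix differing inside an environment, which holds for venv and current virtualenv releases, while very old virtualenv releases signalled this differently.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a virtual environment.

    Inside an environment sys.prefix points at the environment directory,
    while sys.base_prefix still points at the base installation.
    """
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base_prefix


print("virtual environment active:", in_virtual_environment())
```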

Update pip to ensure you have the latest version, then install the validator and all the packages it requires:

pip install -U pip
pip install .

You are now able to run the application "eark-validator". It will validate an Information Package for you.

Testing a valid package

You can test a valid package by first retrieving it from the test corpus:

wget https://github.com/DILCISBoard/eark-ip-test-corpus/raw/integration/corpora/csip/metadata/metshdr/CSIP12/valid/mets-xml_metsHdr_agent_TYPE_exist.zip

Unzip the package:

unzip mets-xml_metsHdr_agent_TYPE_exist.zip

Delete the .zip file you just downloaded:

rm mets-xml_metsHdr_agent_TYPE_exist.zip
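If wget or unzip is not available, the three steps above can be approximated with the Python standard library. This is a sketch, not part of the validator: the helper names download and extract_and_remove are hypothetical, and the URL is the one from the wget command above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL of the test package quoted in the wget command above.
CORPUS_URL = (
    "https://github.com/DILCISBoard/eark-ip-test-corpus/raw/integration/"
    "corpora/csip/metadata/metshdr/CSIP12/valid/"
    "mets-xml_metsHdr_agent_TYPE_exist.zip"
)


def download(url, target):
    """Download `url` to the file `target` (needs network access)."""
    urllib.request.urlretrieve(url, target)
    return Path(target)


def extract_and_remove(zip_path, dest="."):
    """Extract `zip_path` into `dest`, delete the archive, return member names."""
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        archive.extractall(dest)
    # Mirror the "rm" step: the archive is no longer needed once extracted.
    zip_path.unlink()
    return names
```

Calling extract_and_remove(download(CORPUS_URL, "mets-xml_metsHdr_agent_TYPE_exist.zip")) from the eark-validator directory should leave you in the same state as the shell commands.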

Run the eark-validator:

eark-validator mets-xml_metsHdr_agent_TYPE_exist/

Result:

('Path mets-xml_metsHdr_agent_TYPE_exist/ is dir, struct result is: '
 'StructureStatus.WellFormed')
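If you want to call the validator from a script, for example to check several packages in a row, one option is to wrap the command line shown above. The sketch below assumes only the invocation documented here (eark-validator followed by a path). The helper names are hypothetical, and this guide does not describe the exit code or whether the report goes to standard output or standard error, so the sketch returns both streams without interpreting them.

```python
import shutil
import subprocess

# Console command installed by "pip install ." in the steps above.
VALIDATOR = "eark-validator"


def build_command(package_path):
    """Return the argument list for validating `package_path`."""
    return [VALIDATOR, str(package_path)]


def run_validator(package_path):
    """Run the validator and return (exit code, stdout, stderr).

    Raises FileNotFoundError if the command is not on PATH, for example
    because the virtual environment has not been activated.
    """
    if shutil.which(VALIDATOR) is None:
        raise FileNotFoundError(f"{VALIDATOR} not found on PATH")
    completed = subprocess.run(
        build_command(package_path),
        capture_output=True,
        text=True,
        check=False,  # leave interpretation of the exit code to the caller
    )
    return completed.returncode, completed.stdout, completed.stderr
```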

A note on testing a directory

If the path passed is a directory, it must contain a single folder which contains the information package (and no other files or folders):

user@machine:~$ tree input
<path to directory>
  ├── documentation
  ├── metadata
  ├── METS.ipxml
  ├── representations
  │  └── rep1
  │      ├── data
  │      ├── metadata
  │      └── METS.ipxml
  └── schemas
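The "single folder" rule can be checked before running the validator. The sketch below is one literal reading of the sentence above (exactly one entry, and that entry is a folder). It is not the validator's own check, and the function name single_package_folder is hypothetical.

```python
from pathlib import Path


def single_package_folder(directory):
    """Return the only sub-folder of `directory`.

    Raises ValueError unless `directory` holds exactly one entry and
    that entry is a folder.
    """
    entries = sorted(Path(directory).iterdir())
    if len(entries) != 1 or not entries[0].is_dir():
        names = [entry.name for entry in entries]
        raise ValueError(f"expected exactly one folder, found: {names}")
    return entries[0]
```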

Installing Windows Subsystem for Linux (WSL)

If you do not have Linux and have not previously used WSL please perform the following steps. You must either be logged in as Administrator on the machine or as a user with Administrator rights on the machine.

Start a command prompt (cmd.exe) and then enter the following command:

wsl --install

Confirm that the app is allowed to make changes to your device. Installation begins.

Confirm once more that an app is allowed to make changes to your device.

Retrieving and installing the necessary components takes a while. Please do not reboot or shut down your computer during this process. Even if it seems stalled, it is working.

Installation concludes with the message: "The requested operation is successful. Changes will not be effective until the system is rebooted."

Please reboot your computer.

After reboot

You will be prompted to create a new "UNIX username". By convention this is often an all-lowercase username fewer than nine characters long. It does not need to match your Windows username.

You will be prompted to set a password.

You are now logged into Ubuntu (the default Linux distribution used by Windows Subsystem for Linux).

Update the system

No matter how fresh the install, there will almost always be updates available. To fetch them write the following:

sudo apt update

And to install them:

sudo apt upgrade

Confirm that you wish to upgrade your packages by pressing Y followed by ENTER.

Please resume the guide above.

For Developers

Developers should also install the testing dependencies (e.g. pytest), using the --editable flag so that changes to the source take effect without reinstalling:

pip install -U pip
pip install --editable ".[testing]"

Running tests

You can run unit tests from the project root:

pytest ./tests/

To generate test coverage figures:

pytest --cov=ip_validation ./tests/

If you want to see which parts of your code aren't tested:

pytest --cov=ip_validation --cov-report=html ./tests/

After this you can open the file <projectRoot>/htmlcov/index.html in your browser and survey the gory details.


File details

Details for the file eark-validator-1.1.2.tar.gz.

File metadata

  • Download URL: eark-validator-1.1.2.tar.gz
  • Upload date:
  • Size: 109.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.9.18

File hashes

Hashes for eark-validator-1.1.2.tar.gz
Algorithm Hash digest
SHA256 5630bfe3cca3b3e0a3dc99159800f51d1cb6d7bf436db3bdad3400e683d81682
MD5 067c6e8af65c3494107bca2f41649f5c
BLAKE2b-256 5f2180d1e96880c26f1c1465d461b2478f59935f968a6f4fcc18b29f20af243f

File details

Details for the file eark_validator-1.1.2-py3-none-any.whl.

File metadata

  • Download URL: eark_validator-1.1.2-py3-none-any.whl
  • Upload date:
  • Size: 150.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.9.18

File hashes

Hashes for eark_validator-1.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 fc64d9c53f34892df3b73d9454209355f13776698ec9a188aab7782e164fd35f
MD5 f7f7f55fe7d6f5ed91b2f99d3bc44183
BLAKE2b-256 e3fc2e8896d9b127305b05343cef073f846602418d349ea0224fcd609d537af1
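If you download one of these files by hand, you can compare it against the published SHA256 digest. The following is a minimal sketch using the standard hashlib module; the helper names sha256_of and matches_published are hypothetical, and the digests are copied from the "File hashes" listings above.

```python
import hashlib

# SHA256 digests copied from the "File hashes" listings above.
PUBLISHED_SHA256 = {
    "eark-validator-1.1.2.tar.gz":
        "5630bfe3cca3b3e0a3dc99159800f51d1cb6d7bf436db3bdad3400e683d81682",
    "eark_validator-1.1.2-py3-none-any.whl":
        "fc64d9c53f34892df3b73d9454209355f13776698ec9a188aab7782e164fd35f",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, filename):
    """Return True if the file at `path` has the digest listed for `filename`."""
    return sha256_of(path) == PUBLISHED_SHA256[filename]
```

For example, matches_published("eark-validator-1.1.2.tar.gz", "eark-validator-1.1.2.tar.gz") should return True for an intact download in the current directory.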
