
MOAI, an Open Access Server Platform for Institutional Repositories


MOAI — Open Access Server Platform for Institutional Repositories


MOAI is a platform for aggregating content from different sources and publishing it through the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). It can harvest data from various sources (OAI feeds, SQL databases, XML files, Fedora Commons, EPrints, DSpace) and serve multiple OAI feeds from a single server, each with independent configuration.

Support graciously provided by

IPL Web

About this fork

This is a maintained fork of MOAI by Infrae, adding Python 3 support, modern packaging (pyproject.toml, uv), and GitHub Actions CI. Changes were offered upstream via PR #5.

Note: Other than modernizing the tooling, there are no major functional changes. Some parts of the documentation below may be outdated. Patches welcome.

Installation

Supported Python versions

Python 3.9, 3.10, 3.11, 3.12, 3.13

We recommend using uv for dependency management. Instructions below are for Unix, but MOAI should also work on Windows.

Using uv (recommended)

cd moai
uv sync

Using pip

pip install MOAI-iplweb

Running tests

uv sync --extra test
uv run pytest

Running in development mode

The development server should never be used in production. It is convenient for testing and development.

cd moai
uv run paster serve settings.ini

This will print something like:

Starting server in PID 7306.
Starting HTTP server on http://127.0.0.1:8080

You can now visit localhost:8080/oai to view the MOAI OAI-PMH feed.

Configuring MOAI

Configuration is done in the settings.ini file. The default settings file uses the Paste#urlmap application to map WSGI applications to a URL.

In the [composite:main] section there is a line:

/oai = moai_example

This line maps the /oai URL to a MOAI instance. Because each URL prefix maps to its own application section, one server can run many MOAI instances, each with its own configuration.

The [app:moai_example] configuration lets you specify the following options:

Option         Description
name           The name of the OAI feed (returned in Identify verb)
url            The URL of the OAI feed (returned in OAI-PMH XML output)
admin_email    The email address of the admin (returned in Identify verb)
formats        Available metadata formats
disallow_sets  List of setspecs that are not allowed in the output of this feed
allow_sets     If used, only sets listed here will be returned
database       SQLAlchemy URI to identify the database used for storage
provider       Provider identifier where MOAI retrieves content from
content        Class that maps metadata from provider format to MOAI format
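
As a rough illustration of how these options fit together, the sketch below parses a settings.ini-style fragment with Python's standard configparser module. The feed name, URL and email address are placeholders invented for this example, not MOAI defaults, and this is not how MOAI itself loads its configuration (that is done through Paste Deploy).

```python
import configparser

# Hypothetical settings fragment: the option names come from the table above,
# the values for name, url, admin_email and formats are made-up placeholders.
SETTINGS = """
[composite:main]
use = egg:Paste#urlmap
/oai = moai_example

[app:moai_example]
name = Example Feed
url = http://localhost:8080/oai
admin_email = admin@example.org
formats = oai_dc
database = sqlite:///moai-example.db
provider = file://moai/example-*.xml
content = moai_example
"""


def _parse(text):
    # interpolation=None keeps any '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def load_feed_options(text, app_name):
    """Return the options of one [app:<name>] section as a plain dict."""
    return dict(_parse(text)[f"app:{app_name}"])


def mounted_apps(text):
    """Map the URL prefixes in [composite:main] to application names."""
    section = _parse(text)["composite:main"]
    return {key: value for key, value in section.items() if key.startswith("/")}


options = load_feed_options(SETTINGS, "moai_example")
print(options["provider"])
print(mounted_apps(SETTINGS))
```

Adding a second URL prefix and a second [app:...] section to the fragment would show up as another entry in the mapping returned by mounted_apps.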

Adding content

The MOAI system is designed to periodically fetch content from a provider, and convert it to MOAI's internal format, which can then be translated to the different metadata formats for the OAI-PMH feed.

MOAI comes with an example that shows this principle:

In the moai/moai directory there are two XML files. Let's pretend these files are from a remote system, and we want to publish them with MOAI.

In the settings.ini file, the following option is specified:

provider = file://moai/example-*.xml

This tells MOAI that we want to use a file provider, with some files located in moai/example-*.xml.
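
To make the wildcard in the provider setting concrete, here is a minimal sketch that expands such a URI into file paths. It assumes that everything after file:// is a glob pattern relative to a base directory; MOAI's own file provider may resolve paths differently. The function name and the file name example-1234.xml are hypothetical.

```python
import glob
import os
import tempfile


def expand_file_provider(uri, base_dir="."):
    """Expand a file:// provider URI into a sorted list of matching paths.

    Assumption: the part after 'file://' is a glob pattern relative to
    base_dir. This is an illustration, not MOAI's implementation.
    """
    prefix = "file://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a file provider URI: {uri!r}")
    pattern = os.path.join(base_dir, uri[len(prefix):])
    return sorted(glob.glob(pattern))


# Try it on throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "moai"))
    for name in ("example-1234.xml", "example-2345.xml", "notes.txt"):
        open(os.path.join(tmp, "moai", name), "w").close()
    matches = expand_file_provider("file://moai/example-*.xml", base_dir=tmp)
    found = [os.path.basename(path) for path in matches]

print(found)
```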

The following option points to the class that we want to use for converting the example content XML data to MOAI's internal format:

content = moai_example

The last option tells MOAI where to store its data. This is usually a SQLite database:

database = sqlite:///moai-example.db

Now let's try to add these two XML files. First visit the OAI-PMH feed to make sure nothing is already being served:

http://localhost:8080/oai?verb=ListRecords&metadataPrefix=oai_dc

This should return a noRecordsMatch error.
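
If you want to check for this error from a script rather than a browser, something like the following sketch could work. The sample response is hand-written in the shape that OAI-PMH defines for errors; it was not captured from a MOAI server, and the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample shaped like an OAI-PMH error response.
SAMPLE_ERROR = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2024-01-01T00:00:00Z</responseDate>
  <request verb="ListRecords" metadataPrefix="oai_dc">http://localhost:8080/oai</request>
  <error code="noRecordsMatch">No records match the request</error>
</OAI-PMH>
"""


def oai_error_codes(xml_text):
    """Return the 'code' attribute of every <error> element in a response."""
    root = ET.fromstring(xml_text.strip())
    return [error.get("code") for error in root.findall("oai:error", OAI_NS)]


codes = oai_error_codes(SAMPLE_ERROR)
print(codes)
```

Against a running server, the response body could be fetched with urllib.request.urlopen and passed to the same function.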

To add the content, run the update_moai script with the section name from the settings.ini as argument:

uv run update_moai moai_example

This will produce the following output:

/ Updating content provider: example-2345.xml
Content provider returned 2 new/modified objects

100.0%[====================================================================>] 2
Updating database with 2 objects took 0 seconds

Now when you visit the OAI-PMH feed again you should see the two records:

http://localhost:8080/oai?verb=ListRecords&metadataPrefix=oai_dc
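
The same check can be scripted. The sketch below builds the ListRecords URL and pulls identifiers and titles out of a response. The sample XML is hand-written to follow the OAI-PMH and oai_dc layout, with made-up identifiers and titles; it is not the actual output for the example files.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for the given feed."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Hand-written sample; identifiers and titles are placeholders.
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example:1</identifier>
        <datestamp>2024-01-01T00:00:00Z</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>First title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header>
        <identifier>oai:example:2</identifier>
        <datestamp>2024-01-02T00:00:00Z</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Second title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def summarize_records(xml_text):
    """Return (identifier, [titles]) for each record in a ListRecords response."""
    root = ET.fromstring(xml_text.strip())
    summary = []
    for record in root.findall("oai:ListRecords/oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        titles = [title.text for title in record.findall(".//dc:title", NS)]
        summary.append((identifier, titles))
    return summary


print(list_records_url("http://localhost:8080/oai"))
print(summarize_records(SAMPLE))
```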

When you run the update_moai script again, it will create a new database with all the records. It is also possible to specify a date with the --date switch. When a date is specified, only records that were modified after this date will be added. The update_moai script can be run from a daily or hourly cron job to update the database.
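
The idea behind a date-limited update can be sketched in a few lines. This is only an illustration of the filtering rule described above, not MOAI's code; it does not call update_moai, and the date format that --date accepts is not shown here. The record dictionaries are hypothetical.

```python
from datetime import datetime, timezone


def modified_since(records, since):
    """Keep only records whose 'modified' timestamp is after `since`."""
    return [record for record in records if record["modified"] > since]


# Hypothetical records with timezone-aware UTC timestamps.
records = [
    {"id": "oai:example:1", "modified": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": "oai:example:2", "modified": datetime(2024, 3, 1, tzinfo=timezone.utc)},
]

cutoff = datetime(2024, 2, 1, tzinfo=timezone.utc)
recent_ids = [record["id"] for record in modified_since(records, cutoff)]
print(recent_ids)
```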

Adding your own Provider / Content and Metadata classes

It is possible, and usually necessary, to extend MOAI for your own use cases. The Provider and Content classes from the example are a good starting point. All your customizations should be registered with MOAI through entry_points. Have a look at MOAI's pyproject.toml for more information.

The best approach would be to create your own Python package with pyproject.toml and install it in the same environment as MOAI. This will let MOAI find your customizations. Note that when you change something in your package metadata, you have to reinstall the package for MOAI to pick up the changes.
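
To verify that your package's entry points are visible in the current environment, you can list them with the standard importlib.metadata module. This is a minimal sketch; the group name "moai.provider" is an assumption, so check MOAI's pyproject.toml for the group names it actually uses. The helper handles both the older dict-style return value (Python 3.9) and the newer selectable one.

```python
from importlib.metadata import entry_points


def list_entry_points(group):
    """Return the sorted names of all installed entry points in a group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10 and later: selectable entry point collection.
        selected = eps.select(group=group)
    else:
        # Python 3.9: plain dict mapping group name to entry points.
        selected = eps.get(group, [])
    return sorted(ep.name for ep in selected)


# "moai.provider" is an assumed group name used only for illustration.
print(list_entry_points("moai.provider"))
```

If your own entry point name is missing from the output after a change to your package metadata, reinstalling the package is the first thing to try.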

The moai.interfaces file contains documentation about the different classes that you can implement.

Adding your own database

Instead of writing your own provider/content classes, you can also register your own custom database. Implementing a replacement for moai.database.SQLDatabase can be more complicated than writing a provider/content class, but it has the advantage that MOAI is always up to date and you don't need a second SQLite database.

Have a look at the pyproject.toml file — it registers several databases. You could use this mechanism to register your own database from your own Python package.

In the settings.ini configuration you can then reference your database (mydb://some+config+variables).

For the database, have a look at the generic database provider in database.py. The only methods that you need to implement are: oai_sets, oai_earliest_datestamp and oai_query.

The oai_query method returns dictionaries with record data. The keys of these dictionaries are defined in the metadata files (for example metadata.py) — have a look at the source.

For oai_dc there are the following names:

title, creator, subject, description, publisher, contributor, type, format, identifier, source, language, date, relation, coverage, rights

So a return value would look like:

{'id': '<oai record id>',
 'deleted': '<bool>',
 'modified': '<utc datetime>',
 'sets': ['<list of setspecs>'],
 'metadata': {
   'title': ['<list with publication title>'],
   'creator': ['<list of creator names>'],
   ...}
}
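
Putting the three methods together, a toy in-memory backend might look like the sketch below. The parameter names, the shape of the dictionaries yielded by oai_sets, and the class name are assumptions made for illustration; the authoritative signatures are in moai.interfaces and database.py. Registration through entry points is not shown.

```python
from datetime import datetime, timezone


class InMemoryDatabase:
    """Toy stand-in for a custom MOAI database (illustrative only)."""

    def __init__(self, records):
        self._records = list(records)

    def oai_sets(self, offset=0, batch_size=20):
        # Assumed shape: one dict per setspec with id, name and description.
        specs = sorted({spec for r in self._records for spec in r["sets"]})
        for spec in specs[offset:offset + batch_size]:
            yield {"id": spec, "name": spec, "description": ""}

    def oai_earliest_datestamp(self):
        return min(r["modified"] for r in self._records)

    def oai_query(self, offset=0, batch_size=20, from_date=None,
                  until_date=None, identifier=None):
        matches = []
        for r in self._records:
            if identifier is not None and r["id"] != identifier:
                continue
            # OAI-PMH treats the from/until bounds as inclusive.
            if from_date is not None and r["modified"] < from_date:
                continue
            if until_date is not None and r["modified"] > until_date:
                continue
            matches.append(r)
        return matches[offset:offset + batch_size]


# Hypothetical records in the format shown above.
db = InMemoryDatabase([
    {"id": "oai:example:1", "deleted": False,
     "modified": datetime(2024, 1, 1, tzinfo=timezone.utc),
     "sets": ["articles"],
     "metadata": {"title": ["First title"], "creator": ["A. Author"]}},
    {"id": "oai:example:2", "deleted": False,
     "modified": datetime(2024, 3, 1, tzinfo=timezone.utc),
     "sets": ["articles", "theses"],
     "metadata": {"title": ["Second title"], "creator": ["B. Author"]}},
])

recent = db.oai_query(from_date=datetime(2024, 2, 1, tzinfo=timezone.utc))
print([r["id"] for r in recent])
print([s["id"] for s in db.oai_sets()])
```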

License

BSD-3-Clause — see LICENSE.txt for details.
