MOAI, an Open Access Server Platform for Institutional Repositories

MOAI is a platform for aggregating content from different sources and publishing it through the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). It can harvest data from various sources (OAI feeds, SQL databases, XML files, Fedora Commons, EPrints, DSpace) and serve multiple OAI feeds from a single server, each with independent configuration.

Support graciously provided by IPL Web.

About this fork

This is a maintained fork of MOAI by Infrae, adding Python 3 support, modern packaging (pyproject.toml, uv), and GitHub Actions CI. Changes were offered upstream via PR #5.

Note: Other than modernizing the tooling, there are no major functional changes. Some parts of the documentation below may be outdated. Patches welcome.

Installation

Supported Python versions

Python 3.9, 3.10, 3.11, 3.12, and 3.13.

We recommend using uv for dependency management. Instructions below are for Unix, but MOAI should also work on Windows.

Using uv (recommended)

cd moai
uv sync

Using pip

pip install MOAI-iplweb

Running tests

uv sync --extra test
uv run pytest

Running in development mode

The development server should never be used in production. It is convenient for testing and development.

cd moai
uv run paster serve settings.ini

This will print something like:

Starting server in PID 7306.
Starting HTTP server on http://127.0.0.1:8080

You can now visit localhost:8080/oai to view the MOAI OAI-PMH feed.

Configuring MOAI

Configuration is done in the settings.ini file. The default settings file uses the Paste#urlmap application to map WSGI applications to a URL.

In the [composite:main] section there is a line:

/oai = moai_example

This maps the /oai URL to a MOAI instance, which makes it easy to run many MOAI instances in one server, each with its own configuration.

The [app:moai_example] configuration lets you specify the following options:

| Option | Description |
| --- | --- |
| name | The name of the OAI feed (returned in Identify verb) |
| url | The URL of the OAI feed (returned in OAI-PMH XML output) |
| admin_email | The email address of the admin (returned in Identify verb) |
| formats | Available metadata formats |
| disallow_sets | List of setspecs that are not allowed in the output of this feed |
| allow_sets | If used, only sets listed here will be returned |
| database | SQLAlchemy URI to identify the database used for storage |
| provider | Provider identifier where MOAI retrieves content from |
| content | Class that maps metadata from provider format to MOAI format |
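
To make the relationship between the two sections concrete, here is a minimal sketch that parses a settings fragment with Python's standard configparser and follows the /oai mapping to its app section. Only the /oai mapping and the provider, content, and database values come from this README; the `use` lines and the name, url, admin_email, formats, and disallow_sets values are illustrative assumptions, and `load_feed_options` is a hypothetical helper, not part of MOAI. Check the settings.ini shipped with the package for the real values.

```python
import configparser

# Hypothetical settings.ini fragment. The "use" lines and the name, url,
# admin_email, formats and disallow_sets values are assumptions made for
# illustration; compare with the settings.ini that ships with MOAI.
SETTINGS = """
[composite:main]
use = egg:Paste#urlmap
/oai = moai_example

[app:moai_example]
use = egg:moai
name = MOAI Example Feed
url = http://localhost:8080/oai
admin_email = webmaster@localhost
formats = oai_dc
disallow_sets = private
database = sqlite:///moai-example.db
provider = file://moai/example-*.xml
content = moai_example
"""


def load_feed_options(text, url_path="/oai"):
    """Return the options of the app section mapped to `url_path`."""
    # Split only on "=" so values such as "egg:Paste#urlmap" stay intact,
    # and disable interpolation so "%" in values is never interpreted.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(text)
    app_name = parser["composite:main"][url_path]
    return dict(parser["app:" + app_name])


options = load_feed_options(SETTINGS)
print(options["provider"])
```

Adding a second feed would then amount to one more URL line in [composite:main] and one more [app:...] section with its own options.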

Adding content

The MOAI system is designed to periodically fetch content from a provider, and convert it to MOAI's internal format, which can then be translated to the different metadata formats for the OAI-PMH feed.

MOAI comes with an example that shows this principle:

In the moai/moai directory there are two XML files. Let's pretend these files are from a remote system, and we want to publish them with MOAI.

In the settings.ini file, the following option is specified:

provider = file://moai/example-*.xml

This tells MOAI that we want to use a file provider, with some files located in moai/example-*.xml.

The following option points to the class that we want to use for converting the example content XML data to MOAI's internal format:

content = moai_example

The last option tells MOAI where to store its data, usually a SQLite database:

database = sqlite:///moai-example.db

Now let's try to add these two XML files. First visit the OAI-PMH feed to make sure nothing is already being served:

http://localhost:8080/oai?verb=ListRecords&metadataPrefix=oai_dc

This should return a noRecordsMatch error.
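
If you prefer to script this check, the sketch below builds the same ListRecords URL and extracts OAI-PMH error codes from a response body using only the standard library. The helper names and the sample response are hypothetical; the sample is hand-written to follow the OAI-PMH 2.0 error format and is not captured MOAI output.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL for the given feed."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def oai_error_codes(xml_text):
    """Return the code attribute of every <error> element in a response."""
    root = ET.fromstring(xml_text)
    return [err.get("code") for err in root.iter(OAI_NS + "error")]


# Hand-written sample of an empty feed's response (an assumption, not real
# server output).
SAMPLE_EMPTY_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2024-01-01T00:00:00Z</responseDate>
  <request verb="ListRecords" metadataPrefix="oai_dc">http://localhost:8080/oai</request>
  <error code="noRecordsMatch">No records match the request</error>
</OAI-PMH>"""

print(list_records_url("http://localhost:8080/oai"))
print(oai_error_codes(SAMPLE_EMPTY_RESPONSE))
```

Against a running development server, the response body could be fetched with `urllib.request.urlopen(list_records_url(...)).read()` and passed to `oai_error_codes`.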

To add the content, run the update_moai script with the section name from the settings.ini as argument:

uv run update_moai moai_example

This will produce the following output:

/ Updating content provider: example-2345.xml
Content provider returned 2 new/modified objects

100.0%[====================================================================>] 2
Updating database with 2 objects took 0 seconds

Now when you visit the OAI-PMH feed again you should see the two records:

http://localhost:8080/oai?verb=ListRecords&metadataPrefix=oai_dc

When you run the update_moai script again, it will create a new database with all the records. It is also possible to specify a date with the --date switch. When a date is specified, only records that were modified after this date will be added. The update_moai script can be run from a daily or hourly cron job to update the database.

Adding your own Provider / Content and Metadata classes

It's possible, and usually necessary, to extend MOAI for your use cases. The Provider and Content classes from the example might be a good starting point. All your customizations should be registered with MOAI through entry_points. Have a look at MOAI's pyproject.toml for more information.

The best approach would be to create your own Python package with pyproject.toml and install it in the same environment as MOAI. This will let MOAI find your customizations. Note that when you change something in your package metadata, you have to reinstall the package for MOAI to pick up the changes.

The moai.interfaces file contains documentation about the different classes that you can implement.
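
To check that your package's entry points are visible after installation, a small sketch like the following can list what is registered in the current environment. The group names used here (moai.provider, moai.content, moai.database, moai.format) are assumptions; confirm the actual names in MOAI's pyproject.toml. `list_plugins` is a hypothetical helper.

```python
from importlib.metadata import entry_points


def list_plugins(group):
    """Return {name: 'module:attr'} for every entry point in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10 and later expose a selectable collection.
        selected = eps.select(group=group)
    else:
        # Python 3.9 returns a plain dict of group name -> entry points.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


# Assumed group names; adjust to whatever pyproject.toml declares.
for group in ("moai.provider", "moai.content", "moai.database", "moai.format"):
    print(group, list_plugins(group))
```

If your own entry is missing from the output, reinstalling the package (as noted above) is the first thing to try.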

Adding your own database

Instead of writing your own provider/content classes, you can also register your own custom database. Implementing a replacement for moai.database.SQLDatabase can be more complicated than writing a provider/content class, but it has the advantage that MOAI is always up to date and you don't need a second SQLite database.

Have a look at the pyproject.toml file — it registers several databases. You could use this mechanism to register your own database from your own Python package.

In the settings.ini configuration you can then reference your database (mydb://some+config+variables).

For the database, have a look at the generic database provider in database.py. The only methods that you need to implement are: oai_sets, oai_earliest_datestamp and oai_query.

The oai_query method returns dictionaries with record data. The keys of these dictionaries are defined in the metadata files (for example metadata.py) — have a look at the source.

For oai_dc there are the following names:

title, creator, subject, description, publisher, contributor, type, format, identifier, source, language, date, relation, coverage, rights

So a return value would look like:

{'id': '<oai record id>',
 'deleted': '<bool>',
 'modified': '<utc datetime>',
 'sets': ['<list of setspecs>'],
 'metadata': {
   'title': ['<list with publication title>'],
   'creator': ['<list of creator names>'],
   ...}
}
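
As a rough illustration of the three required methods, here is a minimal in-memory sketch. The class name, constructor, and method parameters (offset, batch_size, from_date, until_date, identifier) are assumptions for illustration only; the signatures MOAI actually calls are defined in database.py and moai.interfaces and should be checked there. Set filtering is omitted to keep the sketch short.

```python
from datetime import datetime, timezone


class InMemoryDatabase:
    """Hypothetical read-only backend serving records from a list."""

    def __init__(self, records, sets=()):
        self._records = list(records)
        self._sets = list(sets)

    def oai_sets(self, offset=0, batch_size=20):
        # Set descriptions are passed through unchanged, one batch at a time.
        return self._sets[offset:offset + batch_size]

    def oai_earliest_datestamp(self):
        # Oldest modification time, or None when there are no records.
        return min((r["modified"] for r in self._records), default=None)

    def oai_query(self, offset=0, batch_size=20,
                  from_date=None, until_date=None, identifier=None):
        matches = []
        for record in self._records:
            if identifier is not None and record["id"] != identifier:
                continue
            if from_date is not None and record["modified"] < from_date:
                continue
            if until_date is not None and record["modified"] > until_date:
                continue
            matches.append(record)
        return matches[offset:offset + batch_size]


# Two made-up records in the dictionary shape shown above.
db = InMemoryDatabase([
    {"id": "oai:example:1", "deleted": False,
     "modified": datetime(2020, 1, 1, tzinfo=timezone.utc),
     "sets": ["public"],
     "metadata": {"title": ["First title"], "creator": ["A. Author"]}},
    {"id": "oai:example:2", "deleted": False,
     "modified": datetime(2021, 6, 1, tzinfo=timezone.utc),
     "sets": ["public"],
     "metadata": {"title": ["Second title"], "creator": ["B. Author"]}},
])

print([r["id"] for r in db.oai_query(from_date=datetime(2021, 1, 1, tzinfo=timezone.utc))])
```

A real implementation would query your existing data store inside these methods instead of filtering a Python list.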

License

BSD-3-Clause — see LICENSE.txt for details.
