Skip to main content

Handling Draft and Production invenio records in one package

Project description

Invenio Records Draft

image image image image

Beta version, use at your own risk!!!

What the library does

Easily adds draft (a.k.a deposit) to all your invenio data models. REST API stays the same with extra operations for publish, unpublish, edit published record. REST example (some links omitted for brevity):

$ curl --header "Content-Type: application/json" \
  --request POST \
  --data '{"title":"blah"}' \
  https://localhost:5000/api/draft/records/

returns:
    {
      "metadata": {
        "title": "blah",
        "oarepo:validity": {
          "valid": false,
          "marshmallow": [
            "field": "title",
            "message": "too short"
          ]
        }
      }
    }


$ curl --header "Content-Type: application/json" \
  --request PUT \
  --data '{"title":"longer blah"}' \
  https://localhost:5000/api/draft/records/1

returns:
    {
      "links": {
        "publish": "https://localhost:5000/api/draft/records/1/publish/"
      },
      "metadata": {
        "title": "longer blah",
        "oarepo:validity": {
          "valid": true
        }
      }
    }


$ curl --request POST \
  https://localhost:5000/api/draft/records/1/publish/

returns:
    302 Location https://localhost:5000/api/records/1

Installation

pip install oarepo-records-draft oarepo-validate

Configuration

To enable the library for your data model, you have to perform the following steps:

  • write the "published" version of your model, including marshmallow, mapping and json schemas.
  • Inherit the record from SchemaKeepingRecordMixin and MarshmallowValidatedRecordMixin from oarepo-validate modules. See oarepo-validate library for details on on-record validation vs. rest-access validation.
    Have a look at a sample record
  • Drop marshmallow loader & serializer from loaders and serializers, see oarepo-validate for details
  • Move the configuration of rest endpoint from RECORDS_REST_ENDPOINTS to RECORDS_DRAFT_ENDPOINTS. Add "draft": "<draftpid>" to the configuration, where draftpid is any unused pid type less or equal 6 chars in length.
RECORDS_DRAFT_ENDPOINTS = {             # <--- moved here
    'recid': dict(
        draft='drecid',                 # <--- added here
        pid_type='recid',
        pid_minter='recid',
        pid_fetcher='recid',
        list_route='/records/',
        item_route='/records/<{0}:pid_value>'.format(RECORD_PID),
        # rest of the stuff here
    ),
}
RECORDS_REST_ENDPOINTS = {}             # <--- made empty 
  • Do not forget to propagate this variable to the application's config (for example, in ext/init_config)
  • Add endpoint for draft to the same dict:
RECORDS_DRAFT_ENDPOINTS = {        
    'recid': dict(
        draft='drecid',
        # as above
    ),
    'drecid': dict(                     # <--- added here
    
    )
}
  • move create/update/delete permission factories from the published endpoint to the draft one
    • published record will automatically get deny_all for all modification operations
    • in the example below everyone can create/update/delete draft record, which is probably not what you want
RECORDS_DRAFT_ENDPOINTS = {            
    'recid': dict(
        # as above                      
    ),
    'drecid': dict(                     # <--- moved here
        create_permission_factory_imp=allow_all,
        delete_permission_factory_imp=allow_all,
        update_permission_factory_imp=allow_all,
    )
}
  • on published endpoint, add permission factories for publish/unpublish/edit
RECORDS_DRAFT_ENDPOINTS = {            
    'recid': dict(
        # as above                      # <--- added here    
        publish_permission_factory_imp=allow_logged_in,
        unpublish_permission_factory_imp=allow_logged_in,
        edit_permission_factory_imp=allow_logged_in,
    ),
    'drecid': dict(                     
        create_permission_factory_imp=allow_all,
        delete_permission_factory_imp=allow_all,
        update_permission_factory_imp=allow_all,
    )
}

Run invenio index init/create, start server and you're done. A new endpoint has been created for you and is at /api/draft/records. The whole configuration is in sample app

Library principles:

  1. Draft records follow the same json schema as published records with the exception that:

    1. invalid records are stored and indexed
    2. all properties are not required even though they are marked as such.
    3. Extra properties not defined in marshmallow/json schema can be stored but are marked invalid and the properties are not indexed in ES.
  2. "Draft" records live at a different endpoint and different ES index than published ones. The recommended URL is /api/records for the published records and /api/draft/records for drafts

  3. Draft and published records share the same value of pid but have two different pid types

  4. Published records can not be directly created/updated/patched. Draft records can be created/updated/patched.

  5. Invenio record contains Link header and links section in the JSON payload. Links of a published record contain (apart from self):

    1. draft - a url that links to the "draft" version of the record. This url is present only if the draft version of the record exists and the caller has the rights to edit the draft
    2. edit - URL to a handler that creates a draft version of the record and then returns HTTP 302 redirect to the draft version. This url is present only if the draft version does not exist
    3. unpublish - URL to a handler that creates a draft version of the record if it does not exist, deletes the published version and then returns HTTP 302 to the draft.
  6. On a draft record the links contain (apart from self):

    1. published - a url that links to the "published" version of the record. This url is present only if the published version of the record exists
    2. publish - a POST to this url publishes the record. The JSONSchema and marshmallow schema of the published record must pass. After the publishing the draft record is deleted. HTTP 302 is returned pointing to the published record.
  7. The serialized representation of a draft record contains a section named oarepo:validity. This section contains the result of marshmallow and JSONSchema validation against original schemas.

  8. Deletion of a published record does not delete the draft record.

  9. Deletion of a draft record does not delete the published record.

  10. All properties on published RECORDS_REST_ENDPOINTS are propagated to the draft endpoint, some of those modified. For a complete list/algorithm, see setup_draft_endpoint method in endpoints.py

REST API

Normal invenio-records-rest API is available both on draft and published endpoints, with the exception that published endpoints have the implicit permissions of create/update/delete operations set to deny_all.

Publishing draft record

Let's create a record (security not shown here):

$ curl --header "Content-Type: application/json" \
  --request POST \
  --data '{"title":"blah"}' \
  https://localhost:5000/api/draft/records/

201 Location https://localhost:5000/api/draft/records/1

And get it to see, if it is valid:

$ curl https://localhost:5000/api/draft/records/1

{
  links: {
    self: https://localhost:5000/api/draft/records/1,
    publish: https://localhost:5000/api/draft/records/1/publish/
  },
  metadata: {
    "title": "blah",
    "oarepo:validity": {
       "valid": true
     }
  }
}

As it is valid, we can publish the record (security not shown here):

$ curl --request POST \
  https://localhost:5000/api/draft/records/1/publish/

302 Location https://localhost:5000/api/records/1

And when retrieved, it does not contain the validation section:

$ curl https://localhost:5000/api/records/1

{
  links: {
    self: https://localhost:5000/api/records/1,
    unpublish: https://localhost:5000/api/records/1/unpublish/,
    edit: https://localhost:5000/api/records/1/edit/
  },
  metadata: {
    "title": "blah"
  }
}

Editing published record

Published record can not be edited in place, at first a draft record should be created. See the links section above for edit url:

$ curl --request POST \
  https://localhost:5000/api/records/1/edit/

302 Location https://localhost:5000/api/draft/records/1

Now the published record is still available (and not modified) and you have a url for the draft record for making your changes. When finished, run publish action above to publish your changes.

Unpublishing published record

To remove a published record and "move" it to draft, call unpublish:

$ curl --request POST \
  https://localhost:5000/api/records/1/unpublish/

302 Location https://localhost:5000/api/draft/records/1

After this action, the published record is deleted (and says 410 gone) and draft record is created. You can delete the draft record if desired or update and publish it again.

Files

If invenio-files-rest is installed, the configuration of RECORDS_DRAFT_ENDPOINTS might contain a file section:

RECORDS_DRAFT_ENDPOINTS = {
    'recid': dict(
        draft='drecid',
        # ...
    ),
    'drecid': dict(
        # ...
        files=dict(
            put_file_factory=Permission(RoleNeed('role1')),
            get_file_factory=Permission(RoleNeed('role1')),
            delete_file_factory=Permission(RoleNeed('role1')),
            # restricted=False,
            # as_attachment=False
        )
    )
}

Permission factories have a signature:

permission_factory(view: [oarepo_records_draft.actions.files.FileResource|FileListResource],
                   record: Record, 
                   key: str, 
                   file_object: invenio_record_files.api.FileObject)

and returns an object with can() method.

Adding the factory creates urls in the same manner as invenio-records-rest:

/api/<records or draft records>/<recid>/files/
    GET  ... lists all the files uploaded to a record
    POST<form data with a file> ... uploads a file and associates metadata with the file

/api/<records>/<recid>/files/<key>
   GET  ... downloads file with the given key
   PUT  ... creates or updates file at the given key, payload is the file stream
   DELETE ... removes file at the given key 

Extra REST actions

You can add extra rest actions for each resource that are registered within draft blueprint. They can be registered either globally via entry points (see python api later) or locally in the configuration:

RECORDS_DRAFT_ENDPOINTS = {
    'recid': dict(
        draft='drecid',
        # ...
    ),
    'drecid': dict(
        # ...
        actions={
            'path': <handler_function_registered_to_blueprint>,
            # ...
        }
    )
}

Python API

The entrypoint to python API is at oarepo_records_draft.current_drafts. It has the following public methods:

publish(record: Record, record_pid: PersistentIdentifier)

Publishes an instance of draft record with draft pid record_pid. Raises InvalidRecordException if the record is not valid (according to jsonschema/marshmallow) and can not be published. Returns a list of (draft_record: Record, published_record_context: RecordContext) tuples.

What it does:

  1. invokes collect_records_for_action signal to collect all records that should be published (sometimes linked records should be published as well)
  2. calls check_can_publish signal for each collected record
  3. calls before_publish signal
  4. for each record in reversed collected records publishes the record and deletes draft one
  5. calls after_publish signal
  6. for each record removes draft record from elasticsearch and indexes published one
  7. refreshes affected ES indices

unpublish(record: Record, record_pid: PersistentIdentifier)

Removes published instance and creates a draft one. record is the published record being unpublished, record_pid is its persistent identifier.

Returns a list of (published_record: Record, draft_record_context: RecordContext) tuples.

What it does:

  1. invokes collect_records_for_action signal to collect all records that should be published (sometimes linked records should be published as well)
  2. calls check_can_unpublish signal for each collected record
  3. calls before_unpublish signal
  4. for each record in reversed collected records removes the published record and creates draft
  5. calls after_unpublish signal
  6. for each record removes published record from elasticsearch and indexes draft one
  7. refreshes affected ES indices

edit(record: Record, record_pid: PersistentIdentifier)

Keeps published instance and creates a draft one. record is the published record to be edited , record_pid is its persistent identifier.

Returns a list of (published_record: Record, draft_record_context: RecordContext) tuples.

What it does:

  1. invokes collect_records_for_action signal to collect all records that should be published (sometimes linked records should be published as well)
  2. calls check_can_edit signal for each collected record
  3. calls before_edit signal
  4. for each record in reversed collected records removes creates draft record
  5. calls after_edit signal
  6. Indexes each created draft record in ES
  7. refreshes affected ES indices

Signals

See signals.py for the exhaustive list of signals

Globally-defined actions

To define actions on all draft-managed resources, create your view and a factory function. The function is called for that returns a dictionary with actions for the given endpoint:

class TestResource(MethodView):
    view_name = '{endpoint}_test'

    @pass_record
    def get(self, pid, record):
        return jsonify({'status': 'ok'})


def action_factory(code, files, rest_endpoint, extra, is_draft):
    # decide if view should be created on this resource and return blueprint mapping
    # rest path -> view 
    return {
        'files/_test': TestResource.as_view(
            TestResource.view_name.format(endpoint=code)
        )
    }

Register the factory to entry points and all records will have an additional rest action at path /api/<records>/1/files/_test

entry_points={
    'oarepo_records_draft.extra_actions': [
        'sample = sample.test:extras'
    ]
}

Custom uploaders

Sometimes you might want to define a custom uploader that creates record.files[key] instead of the default implementation of record.files[key] = <stream from request>. This might be usable for example when the stream will be provided in a different way (direct link to S3, for example).

Create an uploader function (this one is from tests and replaces content for test-uploader key):

def uploader(record, key, files, pid, request, resolver, endpoint, **kwargs):
    if key == 'test-uploader':
        bt = BytesIO(b'blah')
        files[key] = bt
        return lambda: ({
            'test-uploader': True,
            'url': resolver(TestResource.view_name)
        })

The function returns either None if it did not handle the upload or a no-arg function that is used later to return JSON that will be serialized as HTTP 200/201 response.

Register the function to entry points:

entry_points={
    'oarepo_records_draft.uploaders': [
        'sample = sample.test:uploader'
    ]
}

Facets and filters

To add facets and filters for validation errors, import and use DRAFT_FACETS and DRAFT_FILTERS

from oarepo_records_draft import DRAFT_FACETS, DRAFT_FILTERS

RECORDS_REST_FACETS = {
    'draft-records-record-v1.0.0': {
        'aggs': {
            # ...
            **DRAFT_FACETS
        },
        'filters': {
            # ...
            **DRAFT_FILTERS
        }
    }
}

Q&A

Can I provide my own record class for draft records?

Yes. The class has to inherit DraftRecordMixin, SchemaKeepingRecordMixin and MarshmallowValidatedRecordMixin. DraftRecordMixin should be the first mixin.

Set record_class property in RECORDS_DRAFT_ENDPOINTS/<your draft endpoint>

Can I provide my own link factory for either draft or published record?

Yes. Provide links_factory_imp as usual and it will get called. URLs for publishing/unpublishing/editing will be added automatically

Can I change XXX in rest config?

Yes, you can. Some of the values are automatically provided for drafts, please consult endpoints.py to see if the value should be modified for draft record or not.

I have my own record_to_index implementation. Is this library compatible?

Might be. See record.py to see which index must be used for draft records.

I really need to raise exception when user submits incorrect data

Think twice if you really want to do it, as it prevents users to pass incomplete record and work on it later.

If you really want to abort the process, raise an instance of FatalDraftException from within the marshmallow validation. This exception will cause the validation to terminate and the draft record will not be modified/created.

You can use the exception to re-raise a custom exception, for example:

from oarepo_records_draft import FatalDraftException
from flask import abort
try:
    # do something that can lead to
    abort(403)
except Exception as e:
    raise FatalDraftException() from e

The wrapped exception will later be retrieved and passed on to later stages of processing, and the correct HTTP 403 error page/json will be sent to the user.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

oarepo-records-draft-5.5.8.tar.gz (37.2 kB view details)

Uploaded Source

Built Distribution

oarepo_records_draft-5.5.8-py3-none-any.whl (39.6 kB view details)

Uploaded Python 3

File details

Details for the file oarepo-records-draft-5.5.8.tar.gz.

File metadata

  • Download URL: oarepo-records-draft-5.5.8.tar.gz
  • Upload date:
  • Size: 37.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.14

File hashes

Hashes for oarepo-records-draft-5.5.8.tar.gz
Algorithm Hash digest
SHA256 8f412c510e13aebcacb68aa880d16a601f935f05a77e274e583b3ee2b760ef1a
MD5 ea880ed15d58770b6cfc917a3fbcbd1f
BLAKE2b-256 a37e0876b8513e54756e4bdf4d87a7f7b0fcda2c9c0b58d66cfa02c31f04d1b5

See more details on using hashes here.

File details

Details for the file oarepo_records_draft-5.5.8-py3-none-any.whl.

File metadata

File hashes

Hashes for oarepo_records_draft-5.5.8-py3-none-any.whl
Algorithm Hash digest
SHA256 012e08849d84705216914b3ff3aa859a939c494a347a1230323250545f3afdce
MD5 5ec1edb5e792ef2b815b982f09770771
BLAKE2b-256 36bf22170e464a4e33bc8b738d33e023e0f435c2460e3fcd59ac139ec9d57b40

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page