An utility library that generates OARepo required data model files from a JSON specification file
Project description
OARepo model builder
A library and command-line tool to generate invenio model from a single model file.
CLI Usage
oarepo-compile-model model.yaml
will compile the model.yaml into the current directory. Options:
--output-directory <dir> Output directory where the generated files will be
placed. Defaults to "."
--package <name> Package into which the model is generated. If not
passed, the name of the current directory,
converted into python package name, is used.
--set <name=value> Overwrite option in the model file.
Example --set settings.elasticsearch.keyword-ignore-above=20
-v Increase the verbosity. This option can be used
multiple times.
--config <filename> Load a config file and replace parts of the model
with it. The config file can be a json, yaml or a
python file. If it is a python file, it is
evaluated with the current model stored in the
"oarepo_model" global variable and after the
evaluation all globals are set on the model.
--isort / --skip-isort Call isort on generated sources (default: yes)
--black / --skip-black Call black on generated sources (default: yes)
Model file structure
A model is a json/yaml file with the following structure:
settings:
python:
elasticsearch:
model:
properties:
title: { type: 'fulltext' }
There might be more sections (documentation etc.), but only the settings
and model
are currently processed.
settings section
The settings section might contain the following keys (default values below):
settings:
package: basename(output dir) with '-' converted to '_'
kebap-package: to_kebap(package)
package-path: path to package as python Path instance
schema-version: 1.0.0
schema-name: { kebap-package }-{schema-version}.json
schema-file: full path to generated json schema
mapping-file: full path to generated mapping
collection-url: camel_case(last component of package)
processing-order: [ 'settings', '*', 'model' ]
python:
record-prefix: camel_case(last component of package)
templates: { } # overridden templates
marshmallow:
top-level-metadata: true
mapping: { }
record-prefix-snake: snake_case(record_prefix)
record-class: { settings.package }.record.{record_prefix}Record
# full record class name with package
record-schema-class: { settings.package }.schema.{record_prefix}Schema
# full record schema class name (apart from invenio stuff, contains only metadata field)
record-schema-metadata-class: { settings.package }.schema.{record_prefix}MetadataSchema
# full record schema metadata class name (contains model schema as marshmallow)
record-schema-metadata-alembic: { settings.package_base }
# name of key in pyproject.toml invenio_db.alembic entry point
record-metadata-class: { settings.package }.metadata.{record_prefix}Metadata
# db class to store record's metadata
record-metadata-table-name: { record_prefix.lower() }_metadata
# name of database table for storing metadata
record-permissions-class: { settings.package }.permissions.{record_prefix}PermissionPolicy
# class containing permissions for the record
record-dumper-class: { settings.package }.dumper.{record_prefix}Dumper
# record dumper class for elasticsearch
record-search-options-class: { settings.package }.search_options.{record_prefix}SearchOptions
# search options for the record
record-service-config-class: { settings.package }.service_config.{record_prefix}ServiceConfig
# configuration of record's service
record-resource-config-class: { settings.package }.resource.{record_prefix}ResourceConfig
# configuration of record's resource
record-resource-class: { settings.package }.resource.{record_prefix}Resource
# record resource
record-resource-blueprint-name: { record_prefix }
# blueprint name of the resource
register-blueprint-function: { settings.package }.blueprint.register_blueprint'
# name of the blueprint registration function
elasticsearch:
keyword-ignore-above: 50
plugins:
packages: [ ]
# list of extra packages that should be installed in compiler's venv
output|builder|model|property:
# plugin types - file outputs, builders, model preprocessors, property preprocessors
disabled: [ ]
# list of plugin names to disable
# string "__all__" to disable all plugins in this category
enabled:
# list of plugin names to enable. The plugins will be used
# in the order defined. Use with disabled: __all__
# list of "module:className" that will be added at the end of
# plugin list
model section
The model section is a json schema that might be annotated with extra sections. For example:
model:
properties:
title:
type: multilingual
oarepo:ui:
label: Title
class: bold-text
oarepo:documentation: |
Lorem ipsum ...
Dolor sit ...
Note: multilingual
is a special type (not defined in this library) that is translated to the correct schema,
mapping and marshmallow files with a custom PropertyPreprocessor
.
oarepo:ui
gives information for the ui output
oarepo:documentation
is a section that is currently ignored
Referencing a model
API Usage
To generate invenio model from a model file, perform the following steps:
-
Load the model into a
ModelSchema
instancefrom oarepo_model_builder.schema import ModelSchema from oarepo_model_builder.loaders import yaml_loader included_models = { 'my_model': lambda parent_model: {'test': 'abc'} } loaders = {'yaml': yaml_loader} model = ModelSchema(file_path='test.yaml', included_models=included_models, loaders=loaders)
You can also path directly the content of the file path in
content
attributeThe
included_models
is a mapping between model key and its accessor. It is used to replace anyoarepo:use
element. See the Referencing a model above.The
loaders
handle loading of files - the key is lowercased file extension, value a function taking (schema, path) and returning loaded content -
Create an instance of
ModelBuilder
To use the pre-installed set of builders and preprocessors, invoke:
from oarepo_model_builder.entrypoints \ import create_builder_from_entrypoints builder = create_builder_from_entrypoints()
To have a complete control of builders and preprocessors, invoke:
from oarepo_model_builder.builder import ModelBuilder from oarepo_model_builder.builders.jsonschema import JSONSchemaBuilder from oarepo_model_builder.builders.mapping import MappingBuilder from oarepo_model_builder.outputs.jsonschema import JSONSchemaOutput from oarepo_model_builder.outputs.mapping import MappingOutput from oarepo_model_builder.outputs.python import PythonOutput from oarepo_model_builder.property_preprocessors.text_keyword import TextKeywordPreprocessor from oarepo_model_builder.model_preprocessors.default_values import DefaultValuesModelPreprocessor from oarepo_model_builder.model_preprocessors.elasticsearch import ElasticsearchModelPreprocessor builder = ModelBuilder( output_builders=[JSONSchemaBuilder, MappingBuilder], outputs=[JSONSchemaOutput, MappingOutput, PythonOutput], model_preprocessors=[DefaultValuesModelPreprocessor, ElasticsearchModelPreprocessor], property_preprocessors=[TextKeywordPreprocessor] )
-
Invoke
builder.build(schema, output_directory)
Extending the builder
Builder pipeline
At first, an instance of ModelSchema is obtained. The schema can be either passed the content of the schema as text, or just a path pointing to the file. The extension of the file determines which loader is used. JSON, JSON5 and YAML are supported out of the box ( if you have json5 and pyyaml packages installed)
Then ModelBuilder.build(schema, output_dir) is called.
It begins with calling all ModelPreprocessors. They get the whole schema and settings and can modify both. See ElasticsearchModelPreprocessor as an example. The deepmerge function does not overwrite values if they already exist in settings.
For each of the outputs (jsonschema, mapping, record, resource, ...)
the top-level properties of the transformed schema are then iterated.
The order of the top-level properties is given by settings.processing-order
.
The top-level property and all its descendants (a visitor patern, visiting property by property), a PropertyPreprocessor is called.
The preprocessor can either modify the property, decide to remove it or replace it with a new set of properties (see multilang in tests ).
The property is then passed to the OutputBuilder (an example is JSONSchemaBuilder) that serializes the tree of properties into the output.
The output builder does not create files on the filesystem explicitly but uses instances of OutputBase, for example JSONOutput or more specialized JSONSchemaOutput.
See JSONBaseBuilder for an example of how to get an output and write to it (in this case, the json-based output).
This way, even if more output builders access the same file, their access is coordinated.
Registering Preprocessors, Builders and Outputs for commandline client
The model & property preprocessors, output builders and outputs are registered in entry points. In poetry, it looks as:
[tool.poetry.plugins."oarepo_model_builder.builders"]
010-jsonschema = "oarepo_model_builder.builders.jsonschema:JSONSchemaBuilder"
020-mapping = "oarepo_model_builder.builders.mapping:MappingBuilder"
030-python_structure = "oarepo_model_builder.builders.python_structure:PythonStructureBuilder"
040-invenio_record = "oarepo_model_builder.invenio.invenio_record:InvenioRecordBuilder"
[tool.poetry.plugins."oarepo_model_builder.ouptuts"]
jsonschema = "oarepo_model_builder.outputs.jsonschema:JSONSchemaOutput"
mapping = "oarepo_model_builder.outputs.mapping:MappingOutput"
python = "oarepo_model_builder.outputs.python:PythonOutput"
[tool.poetry.plugins."oarepo_model_builder.property_preprocessors"]
010-text_keyword = "oarepo_model_builder.preprocessors.text_keyword:TextKeywordPreprocessor"
[tool.poetry.plugins."oarepo_model_builder.model_preprocessors"]
01-default = "oarepo_model_builder.transformers.default_values:DefaultValuesModelPreprocessor"
10-invenio = "oarepo_model_builder.transformers.invenio:InvenioModelPreprocessor"
20-elasticsearch = "oarepo_model_builder.transformers.elasticsearch:ElasticsearchModelPreprocessor"
[tool.poetry.plugins."oarepo_model_builder.loaders"]
json = "oarepo_model_builder.loaders:json_loader"
json5 = "oarepo_model_builder.loaders:json_loader"
yaml = "oarepo_model_builder.loaders:yaml_loader"
yml = "oarepo_model_builder.loaders:yaml_loader"
[tool.poetry.plugins."oarepo_model_builder.templates"]
99-base_templates = "oarepo_model_builder.invenio.templates"
Generating python files
The default python output is based on libCST that enables merging generated code with a code that is already present in output files. The transformer provided in this package can:
- Add imports
- Add a new class or function on top-level
- Add a new method to an existing class
- Add a new const/property to an existing class
The transformer will not touch an existing function/method. Increase verbosity level to get a list of rejected patches
or add --set settings.python.overwrite=true
(use with caution, with sources stored in git and do diff afterwards).
Overriding default templates
The default templates are written as jinja2-based templates.
To override a single or multiple templates, create a package containing the templates and register it
in oarepo_model_builder.templates
. Be sure to specify the registration key smaller than 99-
. The template loader
iterates the sorted set of keys and your templates would be loaded before the default ones. Example:
my_package
+-- __init__.py
+-- templates
+-- invenio_record.py.jinja2
# my_package/__init__.py
TEMPLATES = {
# resolved relative to the package
"record": "templates/invenio_record.py.jinja2"
}
[tool.poetry.plugins."oarepo_model_builder.templates"]
20-my_templates = "my_package"
To override a template for a single model, in your model file (or configuration file with -c option or via --set option) , specify the relative path to the template:
settings:
python:
templates:
record: ./test/my_invenio_record.py.jinja2
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for oarepo-model-builder-0.9.1.tar.gz
Algorithm | Hash digest | |
---|---|---|
SHA256 | d815ae300975d513e0ec7a464427a0711db793ef0264df5ae726a840d36b86f4 |
|
MD5 | c77c9048a776d5285efa3b0175c12164 |
|
BLAKE2b-256 | 304937bcc09e19710c7d0e9e7a0ed8376c053b560f90be7bca01bba93cac5303 |
Hashes for oarepo_model_builder-0.9.1-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 6808d54d1a63e9ca5a96873f41729b5788db1db642b6ee3a1b2e31d631b69d52 |
|
MD5 | 726e20e87b67473ff62d6c96643ef7fa |
|
BLAKE2b-256 | 1b373ec83c069a2c3bc34b0f0714915bfddaa96d30af0a18f253c97a485c2331 |