Support for custom fields and hierarchy on Invenio vocabularies
OARepo Vocabularies
An extension of Invenio-Vocabularies that adds hierarchical vocabulary support, custom fields integration, and advanced permission controls for managing vocabularies in Invenio-based repositories.
Features
- Hierarchical Vocabularies: Parent-child relationships with automatic level tracking, ancestor chains, and leaf detection
- Custom Fields: Extend vocabulary records with custom metadata fields using the Invenio custom fields system
- Advanced Permissions: Fine-grained vocabulary-type-specific permission policies with dangerous operation detection
- Multi-language Support: ICU collation support for sorting and suggestions across multiple languages
- UI Components: Complete UI resource layer with search, detail, edit, and create views
- API Extensions: REST API endpoints for vocabulary types and hierarchy operations
Installation
pip install oarepo-vocabularies
Configuration
Add to invenio.cfg:
from oarepo_vocabularies.services.config import VocabulariesConfig
from oarepo_vocabularies.resources.config import VocabulariesResourceConfig
VOCABULARIES_SERVICE_CONFIG = VocabulariesConfig
VOCABULARIES_RESOURCE_CONFIG = VocabulariesResourceConfig
Core Components
Vocabulary Record Model
The Vocabulary record class extends invenio_vocabularies.records.api.Vocabulary:
- System Fields:
  - hierarchy: HierarchySystemField - manages hierarchy metadata (level, titles, ancestors, leaf status)
  - parent: ParentSystemField - handles parent-child relationships
  - custom_fields: DictField - stores custom metadata
  - relations: MultiRelationsField - manages parent and custom field relations
- Hierarchy Database Model (VocabularyHierarchy):
  - id: UUID reference to vocabulary metadata
  - parent_id: UUID reference to parent hierarchy record
  - pid: String PID of the vocabulary term
  - level: Integer depth in hierarchy (1 for root)
  - titles: JSON array of title objects in hierarchy order (self first, then ancestors)
  - ancestors: JSON array of ancestor PIDs (parent to root)
  - ancestors_or_self: JSON array including self PID
  - leaf: Boolean indicating whether the term has no children
Hierarchy Operations
Automatic Hierarchy Management:
- On record creation: sets level, initializes ancestors, marks parent as non-leaf
- On parent change: updates entire descendant subtree, fixes previous parent leaf status
- On deletion: validates no children exist, updates parent leaf status
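The bookkeeping described above can be illustrated without a database. The following is a minimal sketch in plain Python, not the package's implementation: the `build_hierarchy` helper and the child-to-parent map are hypothetical, and the input is assumed to contain no cycles. It derives the same metadata the hierarchy model stores (level, ancestors, ancestors_or_self, leaf).

```python
from typing import Optional


def build_hierarchy(parents: dict[str, Optional[str]]) -> dict[str, dict]:
    """Derive hierarchy metadata for every term from a child -> parent map.

    Hypothetical helper for illustration; assumes the map contains no cycles.
    """
    # Any term that appears as somebody's parent is not a leaf.
    has_children = {parent for parent in parents.values() if parent is not None}
    result = {}
    for term in parents:
        # Walk from the direct parent up to the root.
        ancestors = []
        current = parents[term]
        while current is not None:
            ancestors.append(current)
            current = parents[current]
        result[term] = {
            "level": len(ancestors) + 1,  # root terms are level 1
            "ancestors": ancestors,  # ordered parent to root
            "ancestors_or_self": [term, *ancestors],
            "leaf": term not in has_children,
        }
    return result


terms = {"eng": None, "eng.US": "eng", "eng.UK": "eng"}
hierarchy = build_hierarchy(terms)
print(hierarchy["eng.US"])
```

In this toy model, adding a child to "eng.US" would flip its leaf flag to False on the next rebuild, which mirrors the "marks parent as non-leaf" step on record creation.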
Query Methods:
# Get direct children
VocabularyHierarchy.get_direct_subterms_ids(parent_id)
# Get all descendants (recursive)
VocabularyHierarchy.get_subterms_ids(parent_id)
# Access from record
record.hierarchy.query_subterms() # direct children
record.hierarchy.query_descendants() # all descendants
Hierarchy Properties:
record.hierarchy.level # depth in tree
record.hierarchy.leaf # has no children
record.hierarchy.titles # [self_title, parent_title, ...]
record.hierarchy.ancestors_ids # ['parent', 'grandparent', ...]
record.hierarchy.parent_id # direct parent PID
Parent Management
Setting Parent:
# On creation
data = {
    "id": "eng.US",
    "title": {"en": "English (US)"},
    "parent": {"id": "eng"},
    "type": "languages"
}
vocab = Vocabulary.create(data=data)
# On update
record.parent.set("new_parent_id") # or None to remove
record.commit()
Constraints:
- Cannot delete record with children (raises ValidationError)
- Cannot create cycles (parent cannot be descendant)
- Parent change triggers full descendant tree update
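The cycle constraint amounts to walking up from the proposed parent and checking that the term being moved is never reached. The sketch below shows that check in plain Python; `would_create_cycle` and the `parents` map are hypothetical names used only for illustration, and the package's own validation may be implemented differently.

```python
from typing import Optional


def would_create_cycle(
    parents: dict[str, Optional[str]], term: str, new_parent: Optional[str]
) -> bool:
    """Return True if making new_parent the parent of term would create a cycle.

    Hypothetical helper: walks from new_parent up to the root and reports a
    cycle when the walk reaches term (new_parent is term or one of its descendants).
    """
    current = new_parent
    while current is not None:
        if current == term:
            return True
        current = parents.get(current)
    return False


parents = {"eng": None, "eng.US": "eng", "eng.US.NY": "eng.US"}
# Moving the root under its own grandchild is rejected.
print(would_create_cycle(parents, "eng", "eng.US.NY"))
# Moving a grandchild directly under the root is allowed.
print(would_create_cycle(parents, "eng.US.NY", "eng"))
```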
Custom Fields
Configuration:
from invenio_records_resources.services.custom_fields import TextCF
VOCABULARIES_CF = [
    TextCF(name="blah"),
    # ... other custom fields
]
Usage:
vocab_service.create(
    system_identity,
    {
        "id": "eng",
        "title": {"en": "English"},
        "type": "languages",
        "custom_fields": {"blah": "Hello"}
    }
)
Services
VocabulariesConfig (oarepo_vocabularies.services.config.VocabulariesConfig):
- Extends invenio_vocabularies.services.VocabulariesServiceConfig
- Adds KeepVocabularyIdComponent - prevents ID changes on update
- Adds ScanningOrderComponent - manages vocabulary ordering
- Schema: VocabularySchema with hierarchy and custom_fields support
- Search: VocabularySearchOptions with hierarchy filters
VocabularyTypeService (oarepo_vocabularies.services.service.VocabularyTypeService):
- Lists available vocabulary types with metadata
- Aggregates term counts per vocabulary type
- Provides links to vocabulary listings
Service Components:
- KeepVocabularyIdComponent: Ensures vocabulary ID remains constant during updates
- ScanningOrderComponent: Handles vocabulary item ordering logic
Search Options
VocabularySearchOptions parameters:
- type: vocabulary type filter
- h-level: filter by hierarchy level
- h-parent: filter by direct parent PID
- h-ancestor: filter by any ancestor PID
- h-ancestor-or-self: filter including self
- tags: filter by tags
- updated_after: filter by update timestamp
- ids: list of (type, id) tuples for specific records
- source: specify returned fields
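The hierarchy filters differ only in which part of a term's hierarchy metadata they match. The following in-memory sketch shows the intended semantics as described above; it does not call the search service, and `filter_terms` and the sample `TERMS` list are hypothetical.

```python
from typing import Optional

# Sample terms carrying the hierarchy metadata described earlier (illustrative data).
TERMS = [
    {"id": "eng", "level": 1, "parent": None, "ancestors": []},
    {"id": "eng.US", "level": 2, "parent": "eng", "ancestors": ["eng"]},
    {"id": "eng.US.NY", "level": 3, "parent": "eng.US", "ancestors": ["eng.US", "eng"]},
]


def filter_terms(
    terms: list[dict],
    *,
    h_parent: Optional[str] = None,
    h_ancestor: Optional[str] = None,
    h_ancestor_or_self: Optional[str] = None,
    h_level: Optional[int] = None,
) -> list[str]:
    """Return IDs of terms matching every filter that is set (hypothetical helper)."""
    matched = []
    for term in terms:
        if h_parent is not None and term["parent"] != h_parent:
            continue  # h-parent: direct parent only
        if h_ancestor is not None and h_ancestor not in term["ancestors"]:
            continue  # h-ancestor: any ancestor, excluding the term itself
        if h_ancestor_or_self is not None and h_ancestor_or_self not in (
            [term["id"], *term["ancestors"]]
        ):
            continue  # h-ancestor-or-self: any ancestor or the term itself
        if h_level is not None and term["level"] != h_level:
            continue  # h-level: exact depth in the tree
        matched.append(term["id"])
    return matched


print(filter_terms(TERMS, h_parent="eng"))
print(filter_terms(TERMS, h_ancestor="eng"))
print(filter_terms(TERMS, h_ancestor_or_self="eng"))
```

With this data, h-parent returns only the direct child, h-ancestor returns the whole subtree below "eng", and h-ancestor-or-self additionally includes "eng" itself.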
Sort Options:
- bestmatch: relevance score (default for queries)
- title: alphabetical by title (language-aware)
- newest: most recently created first
- oldest: creation date ascending
Query Parser:
- Boosts matches in current language (10x for title, 5x for hierarchy titles)
- Default operator: AND
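One way to picture the boosting is as an OpenSearch query_string query with per-field boost factors. The sketch below only builds such a query body as a Python dict. The field names (`title.<lang>`, `hierarchy.title.<lang>`) and the `build_vocabulary_query` helper are assumptions made for illustration; the query the package actually generates may be structured differently.

```python
def build_vocabulary_query(text: str, language: str) -> dict:
    """Build a query_string body that favours matches in the given language.

    Hypothetical helper; field names are assumptions, not the package's mapping.
    """
    return {
        "query_string": {
            "query": text,
            "default_operator": "AND",  # every search word must match
            "fields": [
                f"title.{language}^10",  # title in the current language, 10x boost
                f"hierarchy.title.{language}^5",  # hierarchy titles, 5x boost
                "title.*",  # titles in other languages, unboosted
            ],
        }
    }


query = build_vocabulary_query("english us", "en")
print(query["query_string"]["fields"])
```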
Resources
VocabulariesResourceConfig:
- Extends invenio_vocabularies.resources.config.VocabulariesResourceConfig
- Adds hierarchy search parameters (h-parent, h-ancestor, h-level)
- Adds UI JSON serializer (application/vnd.inveniordm.v1+json)
Vocabulary Type Resource:
- Endpoint: /api/vocabularies/
- Lists available vocabulary types with counts
- Returns configured metadata (name, description, icons)
API Links (generated for each vocabulary):
- self: API detail endpoint
- self_html: UI detail page
- vocabulary: API list for vocabulary type
- vocabulary_html: UI search page
- edit_html: UI edit form
- parent: parent record (if exists)
- parent_html: parent UI page
- children: API list of direct children
- children_html: UI list of children
- descendants: API list of all descendants
- descendants_html: UI list of descendants
Permissions
Permission Generators:
IfVocabularyType(vocabulary_type, then_, else_):
- Conditional permission based on vocabulary type
- Example: Allow all users to manage "languages" but restrict "countries"
can_create = [
    IfVocabularyType("languages", then_=[AnyUser()], else_=[])
]
IfNonDangerousVocabularyOperation(then_, else_):
- Detects dangerous operations (ID change, parent change)
- Example: Allow custom field updates but restrict hierarchy changes
can_update = [
    IfNonDangerousVocabularyOperation(then_=[AnyUser()], else_=[Admin()])
]
Configuration:
from invenio_vocabularies.services.permissions import PermissionPolicy
# Set a custom permission policy (MyCustomPermissionPolicy stands for your own policy class)
VOCABULARIES_PERMISSIONS_POLICY = MyCustomPermissionPolicy
# Or use presets
OAREPO_PERMISSIONS_PRESETS = {
    "vocabularies": PermissionPolicy
}
Dangerous Operations:
- Changing vocabulary id field
- Changing hierarchy parent (adding/removing/changing)
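The two conditions above can be expressed as a comparison between the stored record and the incoming update payload. This is a minimal sketch under stated assumptions, not the generator's actual code: `is_dangerous_update` is a hypothetical helper, the parent is read from `hierarchy.parent` as in the API examples, and a payload that omits the hierarchy is treated as removing the parent.

```python
def is_dangerous_update(current: dict, incoming: dict) -> bool:
    """Return True if the update changes the ID or the hierarchy parent.

    Hypothetical helper for illustration only.
    """
    # A payload without "id" keeps the current ID.
    if incoming.get("id", current.get("id")) != current.get("id"):
        return True
    # Assumption: a missing "hierarchy" or "parent" key means "no parent".
    old_parent = (current.get("hierarchy") or {}).get("parent")
    new_parent = (incoming.get("hierarchy") or {}).get("parent")
    return old_parent != new_parent


current = {
    "id": "eng.US",
    "title": {"en": "English (US)"},
    "hierarchy": {"parent": "eng"},
}
# Only a custom field changes: not dangerous.
print(is_dangerous_update(current, {**current, "custom_fields": {"blah": "Hello"}}))
# The parent changes: dangerous.
print(is_dangerous_update(current, {**current, "hierarchy": {"parent": "eng.UK"}}))
```

A policy built with IfNonDangerousVocabularyOperation would then route the first update to the then_ generators and the second to the else_ generators.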
UI Views
Blueprints:
- oarepo_vocabularies_ui: main vocabulary UI (search, detail, edit, create)
- oarepo_vocabulary_type_ui: vocabulary type listing UI
UI Components:
- VocabularyTypeAndProps: provides vocabulary type metadata to templates
- DepositVocabularyOptionsComponent: vocabulary selector for deposits
Templates:
- Search results page with hierarchy breadcrumbs
- Detail page showing hierarchy position
- Edit form with parent selector
- Create form with vocabulary type selection
CLI Commands
# Vocabulary management (extensible)
invenio oarepo vocabularies --help
Configuration Options
# Permission policy factory
VOCABULARIES_PERMISSIONS_POLICY = "path.to.PermissionPolicy"
# Vocabulary type metadata
INVENIO_VOCABULARY_TYPE_METADATA = {
    "languages": {
        "name": {"en": "Languages", "cs": "Jazyky"},
        "description": {"en": "Language vocabulary", "cs": "Slovník jazyků"},
        # ... additional metadata
    }
}
# Custom fields definition
VOCABULARIES_CF = [
    TextCF(name="custom_field_1"),
    # ...
]
# Sort/suggest custom fields
OAREPO_VOCABULARIES_SORT_CF = ["field1", "field2"]
OAREPO_VOCABULARIES_SUGGEST_CF = ["field1"]
# Cache settings for facets
VOCABULARIES_FACET_CACHE_SIZE = 2048
VOCABULARIES_FACET_CACHE_TTL = 60 * 60 * 24 # 24 hours
# Service/resource overrides
OAREPO_VOCABULARY_TYPE_SERVICE = VocabularyTypeService
OAREPO_VOCABULARY_TYPE_SERVICE_CONFIG = VocabularyTypeServiceConfig
OAREPO_VOCABULARY_TYPE_RESOURCE = VocabularyTypeResource
OAREPO_VOCABULARY_TYPE_RESOURCE_CONFIG = VocabularyTypeResourceConfig
API Examples
Create Hierarchical Vocabulary
from invenio_access.permissions import system_identity
from invenio_vocabularies.proxies import current_service as vocab_service
# Create parent
parent = vocab_service.create(
    system_identity,
    {
        "id": "eng",
        "title": {"en": "English"},
        "type": "languages"
    }
)
# Create child
child = vocab_service.create(
    system_identity,
    {
        "id": "eng.US",
        "title": {"en": "English (US)"},
        "hierarchy": {"parent": "eng"},
        "type": "languages"
    }
)
# Access hierarchy data
print(child.data["hierarchy"])
# {
# "level": 2,
# "titles": [{"en": "English (US)"}, {"en": "English"}],
# "ancestors": ["eng"],
# "ancestors_or_self": ["eng.US", "eng"],
# "leaf": True,
# "parent": "eng"
# }
Search with Hierarchy Filters
# Get all children of a term
results = vocab_service.search(
    system_identity,
    {"h-parent": "eng"},
    type="languages"
)
# Get all descendants
results = vocab_service.search(
    system_identity,
    {"h-ancestor": "eng"},
    type="languages"
)
# Filter by level
results = vocab_service.search(
    system_identity,
    {"h-level": 2},
    type="languages"
)
Update Parent Relationship
# Change parent
vocab_service.update(
    system_identity,
    ("languages", "eng.US"),
    {
        "hierarchy": {"parent": "eng.UK"},
        "title": {"en": "English (US)"},
        "type": "languages"
    }
)
# Remove parent (make root)
vocab_service.update(
    system_identity,
    ("languages", "eng.US"),
    {
        "hierarchy": {"parent": None},
        "title": {"en": "English (US)"},
        "type": "languages"
    }
)
Using Record API
from oarepo_vocabularies.records.api import Vocabulary
# Get record
record = Vocabulary.pid.with_type_ctx("languages").resolve("eng.US")
# Access hierarchy
print(record.hierarchy.level) # 2
print(record.hierarchy.parent_id) # "eng"
print(record.hierarchy.leaf) # True
print(record.hierarchy.ancestors_ids) # ["eng"]
# Query children
children = record.hierarchy.query_subterms()
# Update parent programmatically
record.parent.set("new_parent_id")
record.commit()
Testing
# Run all tests
./run.sh test
# Run specific test
pytest tests/test_hierarchy.py -v
# Run with coverage
pytest --cov=oarepo_vocabularies tests/
Development
# Install development dependencies
pip install -e ".[dev,tests]"
# Format code
black oarepo_vocabularies tests
isort oarepo_vocabularies tests
autoflake --in-place --remove-all-unused-imports -r oarepo_vocabularies
# Type checking
mypy oarepo_vocabularies
Entry Points
The package registers several Invenio entry points:
[project.entry-points."invenio_base.apps"]
oarepo_vocabularies = "oarepo_vocabularies.ext:OARepoVocabularies"
oarepo_vocabularies_ui = "oarepo_vocabularies.ui.ext:InvenioVocabulariesAppExtension"
[project.entry-points."invenio_base.api_apps"]
oarepo_vocabularies = "oarepo_vocabularies.ext:OARepoVocabularies"
oarepo_vocabularies_ui = "oarepo_vocabularies.ui.ext:InvenioVocabulariesAppExtension"
[project.entry-points."invenio_jsonschemas.schemas"]
oarepo_vocabularies = "oarepo_vocabularies.records.jsonschemas"
[project.entry-points."invenio_base.blueprints"]
oarepo_ui = "oarepo_vocabularies.views.app:create_app_blueprint"
oarepo_vocabularies_ui = "oarepo_vocabularies.ui.views:create_blueprint"
oarepo_vocabulary_type_ui = "oarepo_vocabularies.ui.views:create_vocabulary_type_blueprint"
[project.entry-points."invenio_base.api_blueprints"]
oarepo_vocabulary_type_api = "oarepo_vocabularies.views.api:create_api_blueprint"
[project.entry-points."invenio_assets.webpack"]
oarepo_vocabularies_ui_theme = "oarepo_vocabularies.ui.theme.webpack:theme"
[project.entry-points."invenio_i18n.translations"]
oarepo_vocabularies_ui = "oarepo_vocabularies"
License
Copyright (c) 2025 CESNET z.s.p.o.
OARepo Vocabularies is free software; you can redistribute it and/or modify it under the terms of the MIT License. See the LICENSE file for more details.
Links
- Documentation: https://narodni-repozitar.github.io/developer-docs/docs/technology/invenio/nrp-toolchain/plugins/vocabularies
- Repository: https://github.com/oarepo/oarepo-vocabularies
- PyPI: https://pypi.org/project/oarepo-vocabularies/
- Issues: https://github.com/oarepo/oarepo-vocabularies/issues
- OARepo Project: https://github.com/oarepo
Acknowledgments
This project builds upon Invenio Framework and is developed as part of the OARepo ecosystem.