Link your data to authority lists or your own controlled lists
Django Controlled Vocabulary
This app provides models and an admin interface to link your data to standard vocabularies (e.g. ISO language codes, Wikidata). Linking to shared vocabularies makes your project data more consistent and easier to understand.
Requirements: Python 3.5+, Django 2.2+
Development Status: Beta
Screenshot: a ControlledTerm field in the Django admin interface. The user selects the vocabulary (here: Wikidata), then starts typing a term in the text box. Suggestions are fetched from Wikidata. When the user saves the changes, information about the selected term (url, identifier, label) is copied into the database.
Features
- create your own controlled lists of terms (i.e. local, project-specific lists)
- look up terms from standard vocabularies (i.e. authority files maintained by other organisations)
- extensible plug-in architecture for lookups into particular vocabularies (see table below for built-in plugins)
- stores used terms from remote vocabularies in your database:
  - space efficient (doesn't clutter the database with unused terms)
  - self-contained (i.e. still works offline and the DB is always 'semantically' complete)
- autocomplete widget for Django admin; reusable ControlledTermField for your models
- command line tool to download vocabulary files from authoritative sources
- [TODO] possibility to store additional metadata (e.g. geographic coordinates)
- [TODO] simple REST API to publish your own terms
Standard vocabularies included
Built-in plugins for the following authority files:
Vocabulary | Description |
---|---|
Schema.org | High-level categories of content |
Wikidata | High-level concepts or specific instances (e.g. places, people) |
ISO 639-2 | Language codes |
DCMI Type | Dublin Core Format Type |
MIME | Media/File types |
FAST Topics | Topic categorisation |
FAST Forms and Genres | Genres of a piece of work |
VIAF | Various: regions, people, companies, ... |
Data Model & Software Design
Django models
ControlledVocabulary (vocabularies)
- prefix: the vocabulary standard prefix, see http://prefix.cc/wikidata
- label: the short name of the vocabulary
- base_url: the url used as a base for all terms in the vocabulary
- concept: the type of terms this vocabulary represents (e.g. language, people)
- description: a longer description
ControlledTerm (terms)
- termid: a unique code for the term within a vocabulary, it is case sensitive
- label: standard name for the term, as provided by the authority
- vocabulary: a reference to the ControlledVocabulary this term belongs to
Conventions:
- joining base_url (e.g. http://schema.org) with termid (e.g. Movie) must give the exact standard/canonical URI for the term, e.g. http://schema.org/Movie
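The two models and the URI convention can be pictured with plain Python dataclasses. This is only an illustrative sketch, not the app's actual Django models: the field names mirror the lists above, while the `uri` property, its slash handling and the sample values (such as the `schema` prefix) are assumptions made for the example. It also assumes a slash-separated base URL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlledVocabulary:
    prefix: str        # short vocabulary prefix (sample value below is assumed)
    label: str         # short name of the vocabulary
    base_url: str      # base for the URIs of all terms
    concept: str       # type of terms, e.g. language, people
    description: str = ""


@dataclass(frozen=True)
class ControlledTerm:
    termid: str        # case-sensitive code, unique within the vocabulary
    label: str         # standard name, as provided by the authority
    vocabulary: ControlledVocabulary

    @property
    def uri(self) -> str:
        # Hypothetical helper: join base_url and termid with exactly one slash.
        return self.vocabulary.base_url.rstrip("/") + "/" + self.termid


schema = ControlledVocabulary(
    prefix="schema",
    label="Schema.org",
    base_url="http://schema.org",
    concept="content type",
)
movie = ControlledTerm(termid="Movie", label="Movie", vocabulary=schema)
print(movie.uri)  # http://schema.org/Movie
```

Because termid is case sensitive, `Movie` and `movie` would produce two different URIs, and only the first matches the canonical Schema.org URI.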
Vocabulary plug-ins / managers
A vocabulary plug-in / manager is a Python class that provides services for a vocabulary:
- autocomplete terms from local or remote datasets (see ControlledTermField)
- supplies metadata for the vocabulary (see ControlledVocabulary)
Managers can provide terms from a CSV file downloaded from an authoritative source.
Some vocabularies can contain thousands of terms or more. A plugin will only insert the terms used by your application. The rest will be accessed on demand from a file on disk or on a third-party server. This approach saves database space and keeps your application data self-contained.
This project comes with built-in plugins such as Wikidata or Schema.org. Those plugins are enabled by default; see below how to selectively enable them.
This architecture allows third-party plugins to be supplied via separate Python packages.
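The on-demand approach can be sketched in a few lines of plain Python. Everything here is hypothetical and only illustrates the idea: `CsvVocabularyManager`, `use_term` and the `stored_terms` dictionary are invented names, the CSV text stands in for a downloaded vocabulary file, and the dictionary stands in for the ControlledTerm table. The real plug-in classes may look quite different.

```python
import csv
import io

# Stand-in for a vocabulary file downloaded from an authority
# (small illustrative excerpt, not the real file layout).
VOCAB_CSV = """termid,label
eng,English
fra,French
deu,German
"""


class CsvVocabularyManager:
    """Hypothetical manager: searches the CSV source, never the database."""

    def __init__(self, csv_text):
        self.rows = list(csv.DictReader(io.StringIO(csv_text)))

    def search(self, query):
        """Return (termid, label) pairs whose label starts with the query."""
        q = query.lower()
        return [
            (row["termid"], row["label"])
            for row in self.rows
            if row["label"].lower().startswith(q)
        ]


# Stand-in for the database table: only terms actually in use end up here.
stored_terms = {}


def use_term(termid, label):
    """Copy a selected term into local storage the first time it is used."""
    return stored_terms.setdefault(termid, label)


manager = CsvVocabularyManager(VOCAB_CSV)
suggestions = manager.search("fr")   # autocomplete lookup against the file
use_term(*suggestions[0])            # the user picks the first suggestion
print(suggestions)                   # [('fra', 'French')]
print(stored_terms)                  # {'fra': 'French'}
```

The source file holds three terms, but only the one that was selected is copied into local storage, which is the space saving described above.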
Limitations
- controlled lists rather than fully fledged vocabularies (i.e. just a bag of terms with unique IDs/URIs; no support for taxonomic relationships among terms such as broader, narrower, synonyms, ...)
- no notion of granularity (e.g. geonames country, region, city, street are all treated as part of the same vocabulary)
Setup
Installation
Install into your environment:
pip install django-controlled-vocabulary
Add the app to the INSTALLED_APPS list in your Django settings file:
INSTALLED_APPS = [
    ...
    'controlled_vocabulary',
    ...
]
Add the following path to your project urls.py:
from django.urls import path, include
...
urlpatterns = [
    ...
    path('vocabularies/', include('controlled_vocabulary.urls')),
    ...
]
Run the migrations:
./manage.py migrate
Download vocabulary data and add metadata to the database:
./manage.py vocab init
Configuration
Enabling specific vocabulary plug-ins (optional)
Currently all built-in plugins / managers are enabled by default. Add the following code to your settings.py to enable only specific vocabularies, identified by the import path of their classes. You can also use this setting to enable your own or third-party plugins.
# List of import paths to vocabularies lookup classes
CONTROLLED_VOCABULARY_VOCABULARIES = [
    'controlled_vocabulary.vocabularies.iso639_2',
    'controlled_vocabulary.vocabularies.dcmitype',
]
After enabling a new plug-in / manager, always run ./manage.py vocab init.
ControlledTerm(s)Field
Use ControlledTermField to add a field with autocomplete over controlled terms to your Django model:
from django.db import models
from controlled_vocabulary.models import ControlledTermField
...
class MyModel(models.Model):
    ...
    language_code = ControlledTermField(
        'iso639-2',
        null=True, blank=True
    )
Here iso639-2 is the prefix of a controlled vocabulary in your database.
ControlledTermField is essentially syntactic sugar for a ForeignKey with an adapted Select widget.
For multiple values, you can use ControlledTermsField (note the 's' in the name), which inherits from ManyToManyField with an adapted SelectMultiple widget. The usage is identical, except that null=True should be omitted (null has no effect on a ManyToManyField).
By default the widget proposes the given vocabulary to the end user, but they can use the dropdown to switch to any other available vocabulary (see screenshot at the top of this page). To lock the selection to a single vocabulary, use this expression instead:
language_code = ControlledTermField(
    ['iso639-2'],
    null=True, blank=True
)
You can have more than one prefix in that list if you want. The first item is always the one proposed by default on page load.
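The difference between passing a single prefix and passing a list can be summarised with a small stand-alone function. This is a hypothetical helper that only restates the rule described above; it is not part of the app, and the `wikidata` and `dcmitype` prefixes are assumed values used for illustration.

```python
def selectable_vocabularies(spec, available):
    """Return (default_prefix, allowed_prefixes) for a field declaration.

    Hypothetical helper mirroring the rule in the text, not the app's code.
    """
    if isinstance(spec, str):
        # Single prefix: proposed by default, every vocabulary stays selectable.
        return spec, list(available)
    # List of prefixes: selection is locked to the list, first item is the default.
    prefixes = list(spec)
    return prefixes[0], prefixes


# Prefixes assumed to be available in the database (illustrative values).
available = ["iso639-2", "wikidata", "dcmitype"]

print(selectable_vocabularies("iso639-2", available))
# ('iso639-2', ['iso639-2', 'wikidata', 'dcmitype'])
print(selectable_vocabularies(["iso639-2"], available))
# ('iso639-2', ['iso639-2'])
print(selectable_vocabularies(["wikidata", "iso639-2"], available))
# ('wikidata', ['wikidata', 'iso639-2'])
```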
vocab (command line tool)
vocab is a Django command line tool that lets you manipulate the vocabularies and the plugins. To find out more, use the help:
./manage.py vocab help