An open-source metadata standard for documenting and sharing Dzaleka's digital heritage.
Dzaleka Metadata Standard (DMS)
An open-source metadata specification and toolkit for describing, organising, and sharing digital heritage content from Dzaleka Refugee Camp.
What is DMS?
The Dzaleka Metadata Standard (DMS) is an open-source metadata specification and toolkit designed to describe, organise, and share digital heritage content from Dzaleka Refugee Camp in Malawi.
It provides a standardised, interoperable, and reusable schema for heritage items such as stories, photos, documents, audio, and events.
Purpose
The purpose of DMS is to:
- Enable consistent metadata creation for heritage assets
- Support discoverability, interoperability, and reuse of heritage data
- Provide tools to validate, manage, and export metadata
- Serve as an open standard for heritage documentation
DMS helps both technical systems and community contributors work with heritage content in a structured way.
What DMS Includes
Metadata Schema
A machine-readable specification defining fields, types, and constraints for heritage metadata.
Available formats:
- JSON Schema: dms/data/schema/dms.json (Draft 2020-12)
- YAML Schema: dms/data/schema/dms.yaml
- JSON-LD / RDF: dms/data/schema/dms.jsonld, for semantic web / linked data use
Python Tools & Web UI
A suite of tools for creating, validating, and converting metadata records. Includes both a command-line interface (CLI) and a local premium Web UI (DMS Vault). See Quick Start below.
Documentation
Field definitions, best practices, and tutorials for metadata entry. See Documentation.
Example Records
Sample records covering stories, photos, documents, audio, and events. See Examples.
Quick Start
Installation
# Clone the repository
git clone https://github.com/Dzaleka-Connect/Dzaleka-Metadata-Standard.git
cd Dzaleka-Metadata-Standard
# Install the CLI tools
pip install -e .
The Web UI (DMS Vault)
The easiest way to build, validate, and manage records is using the built-in local Web UI:
dms web --port 8080 --dir records/
This also exposes a local vocabulary API at http://127.0.0.1:8080/api/taxonomy for DMS term lookups, deprecations, and change logs, with output available as JSON-LD, Turtle, or RDF/XML.
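As a rough sketch of how a script might call this endpoint, the snippet below builds a request with Python's standard library. Only the base URL comes from the text above. The function names are hypothetical, and the use of an `Accept` header to pick an output format is an assumption; docs/taxonomy-api.md is the place to confirm the actual parameters.

```python
import urllib.request

# Base URL taken from the text above; it assumes `dms web --port 8080` is running.
TAXONOMY_URL = "http://127.0.0.1:8080/api/taxonomy"


def build_taxonomy_request(accept: str = "application/json") -> urllib.request.Request:
    """Build a GET request for the local vocabulary API.

    Selecting a format through the Accept header is an assumption; the
    API may use a query parameter or a path suffix instead.
    """
    return urllib.request.Request(TAXONOMY_URL, headers={"Accept": accept})


def fetch_taxonomy(accept: str = "application/json", timeout: float = 5.0) -> bytes:
    """Send the request and return the raw response body."""
    request = build_taxonomy_request(accept)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With the Web UI running, `fetch_taxonomy()` would return the raw bytes of the response, which can then be decoded or parsed according to whatever format the server actually returns.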
Create a Record via CLI
# Interactive wizard
dms init
# Skip type selection prompt
dms init --type poem
# Save to specific file
dms init --output my-record.json
Validate a Record
# Single file
dms validate examples/story.json
# All files in a directory
dms validate --dir examples/
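To give a feel for the kind of checks validation involves, here is a minimal pure-Python sketch. It is not the project's validator (dms/validator.py validates against the JSON Schema); it only checks the five required fields, the UUID form of `id`, and the listed `type` values from the Schema Overview. Treating every required field as a non-empty string is an assumption, and `check_record` is a hypothetical name.

```python
import uuid

# Required fields and item types as listed in the Schema Overview table.
REQUIRED_FIELDS = ("id", "title", "type", "description", "language")
ITEM_TYPES = {"story", "photo", "document", "audio", "video",
              "event", "map", "artwork", "site", "poem"}


def check_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    # Assumption: each required field must be a non-empty string.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty required field: {field}")

    record_id = record.get("id")
    if isinstance(record_id, str) and record_id.strip():
        try:
            uuid.UUID(record_id)
        except ValueError:
            problems.append("id is not a valid UUID")

    record_type = record.get("type")
    if isinstance(record_type, str) and record_type.strip():
        if record_type not in ITEM_TYPES:
            problems.append(f"unknown type: {record_type}")
    return problems
```

The constraints in dms.json are likely stricter than this, so a record that passes here may still fail `dms validate`.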
Search & Analyze
# Search records by type, subject, or free-text
dms search --dir records/ --type poem -q "displacement"
# View collection analytics (types, languages, completion)
dms stats --dir records/
# Generate a browsable HTML catalogue of your collection
dms report --dir records/ --output catalogue.html
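For readers who want similar numbers inside their own scripts, the sketch below counts records by `type` and `language` across a directory. It is an illustration only, not how `dms stats` is implemented; it assumes each `*.json` file holds a single record object, and `collection_stats` is a hypothetical name.

```python
import json
from collections import Counter
from pathlib import Path


def collection_stats(directory: str) -> dict[str, Counter]:
    """Count records by type and language across all *.json files in a directory."""
    by_type, by_language = Counter(), Counter()
    for path in sorted(Path(directory).glob("*.json")):
        # Assumption: each file contains exactly one JSON object (one record).
        record = json.loads(path.read_text(encoding="utf-8"))
        by_type[record.get("type", "unknown")] += 1
        by_language[record.get("language", "unknown")] += 1
    return {"type": by_type, "language": by_language}
```

Calling `collection_stats("records/")["type"].most_common()` would list item types from most to least frequent.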
Interoperability Tools
# Export record(s) as JSON-LD for semantic web
dms export examples/story.json
# Convert CSV batch to JSON records
dms convert csv2json examples/batch.csv
# Compare two records field-by-field
dms diff record_v1.json record_v2.json
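A field-by-field comparison can be pictured with a few lines of Python. The sketch below compares top-level fields only and is not the logic behind `dms diff`, whose output format and handling of nested fields are not described here; `diff_records` is a hypothetical name.

```python
def diff_records(old: dict, new: dict) -> dict[str, tuple]:
    """Return {field: (old_value, new_value)} for each top-level field that differs.

    A field absent from one record is reported with None on that side.
    """
    changes = {}
    for field in sorted(old.keys() | new.keys()):
        before, after = old.get(field), new.get(field)
        if before != after:
            changes[field] = (before, after)
    return changes
```

Nested values such as `creator` or `rights` are compared as whole objects, so a change to one nested key reports the entire field as changed.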
View Schema Info
dms info
Schema Overview
A DMS record describes a single heritage item with these fields:
| Field | Required | Description |
|---|---|---|
| id | Yes | Unique identifier (UUID) |
| title | Yes | Name of the item |
| type | Yes | Category: story, photo, document, audio, video, event, map, artwork, site, poem |
| description | Yes | Narrative context |
| language | Yes | Language code (BCP 47) |
| creator | Recommended | Who created it (name, role, affiliation) |
| date | Recommended | When it was created or occurred |
| subject | Recommended | Controlled tags and keywords |
| subject_ref | Optional | Structured subject identifiers and scheme references |
| location | Recommended | Place name, area, coordinates |
| rights | Recommended | License, access level, holder |
| source | Optional | Contributor, collection, original format |
| format | Optional | MIME type of the digital object |
| technical | Optional | File-level technical metadata |
| relation | Optional | IDs of related records |
| relation_detail | Optional | Typed relationships to related records or resources |
| coverage | Optional | Time period covered |
All fields map to Dublin Core for broad interoperability, with Dzaleka-specific extensions for camp areas and access levels.
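To make the field list concrete, the sketch below builds a placeholder record as a Python dictionary and prints it as JSON. The top-level field names come from the table above; the nested key names (for example `name`, `role`, `license`, `access_level`) and every value are illustrative assumptions, so consult the field guide and the files in examples/ for the real structure.

```python
import json
import uuid

# Hypothetical record: top-level names follow the Schema Overview table,
# but nested keys and all values are placeholders, not real DMS data.
record = {
    # Required fields
    "id": str(uuid.uuid4()),
    "title": "Example poem title",
    "type": "poem",
    "description": "Placeholder description of the item.",
    "language": "en",
    # Recommended fields (nested key names are assumptions)
    "creator": [{"name": "Example Creator", "role": "author"}],
    "date": "2024-01-01",
    "subject": ["poetry"],
    "location": {"name": "Dzaleka Refugee Camp"},
    "rights": {"license": "CC-BY-4.0", "access_level": "public"},
    # Optional field
    "format": "text/plain",
}

print(json.dumps(record, indent=2, ensure_ascii=False))
```

Saving the printed JSON to a file and running `dms validate` on it is the reliable way to see which of these assumed keys the schema actually accepts.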
Repository Structure
├── dms/                          Python CLI tools
│   ├── cli.py                    Command entry points
│   ├── validator.py              Schema validation engine
│   ├── generator.py              Interactive record creator
│   ├── converter.py              CSV to JSON converter
│   ├── taxonomy.py               Local vocabulary service and serializers
│   │
│   ├── data/schema/              Schema definitions
│   │   ├── dms.json              JSON Schema (Draft 2020-12)
│   │   ├── dms.yaml              YAML version
│   │   └── dms.jsonld            JSON-LD context for linked data
│   │
│   └── data/taxonomy/            DMS vocabulary files
│       └── types.json            Curated heritage item type vocabulary
│
├── docs/                         Documentation
│   ├── field-guide.md            Field definitions & guidelines
│   ├── best-practices.md         Metadata entry best practices
│   ├── semantic-tagging.md       Controlled vocabularies and richer subject metadata
│   ├── taxonomy-api.md           Local vocabulary API endpoints and formats
│   └── getting-started.md        Installation & tutorial
│
├── examples/                     Sample records
│   ├── story.json                Oral history
│   ├── photo.json                Photograph
│   ├── document.json             Administrative record
│   ├── audio.json                Music recording
│   ├── event.json                Community event
│   ├── site.json                 Heritage site (Site Register)
│   ├── mural.json                Public artwork (Art Catalogue)
│   ├── poem.json                 Poetry
│   └── batch.csv                 CSV batch import example
│
└── tests/                        Test suite
Examples
The examples/ directory contains sample records for common heritage item types:
- story.json: "Journey to Dzaleka: A Story of Hope" (oral history)
- photo.json: "Market Day at Dzaleka" (daily life photography)
- document.json: "Community School Registration Records, 2018"
- audio.json: "Traditional Songs of the Great Lakes Region"
- event.json: "World Refugee Day Celebration 2024"
- site.json: "Dzaleka Health Centre" (from Site Register)
- mural.json: "Child Early Marriage Awareness Mural" (from Art Catalogue)
- poem.json: "Home Is a Word I Carry" (poetry)
- batch.csv: Records in CSV format for batch import
Documentation
- Field Guide: Detailed definitions for every schema field
- Best Practices: Guidelines for quality metadata entry
- Semantic Tagging: DMS guidance for controlled vocabularies and richer subject metadata
- Taxonomy API: Query vocabularies, terms, deprecations, and semantic formats
- Getting Started: Installation and first steps tutorial
Interoperability
DMS is designed to work with existing standards and systems. The dms.jsonld context enables linked data publishing with mappings to:
| Vocabulary | Prefix | Used for |
|---|---|---|
| Dublin Core | dc:, dcterms: | Core metadata fields (title, creator, subject, rights, etc.) |
| FOAF | foaf: | Person/Agent descriptions (foaf:name, foaf:Person, foaf:Image) |
| BIBO | bibo: | Bibliographic roles (bibo:editor, bibo:translator, bibo:interviewer) |
| Schema.org | schema: | Creative works, places, events, affiliations |
| W3C Geo | geo: | Geographic coordinates (geo:lat, geo:long) |
| SKOS | skos: | Subject vocabularies and concept schemes |
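The idea behind a JSON-LD context can be sketched in a few lines: plain field names are mapped to vocabulary terms so that the same record can be read as linked data. The mapping below is an assumption covering a handful of fields and Dublin Core only; the project's dms.jsonld is the authoritative context and may map fields differently. `to_jsonld` is a hypothetical helper, and the `urn:uuid:` identifier form is likewise a choice made for this example.

```python
DCTERMS = "http://purl.org/dc/terms/"

# Assumed field-to-term mapping for illustration; see dms.jsonld for the real one.
FIELD_TERMS = {
    "title": "dcterms:title",
    "description": "dcterms:description",
    "language": "dcterms:language",
    "type": "dcterms:type",
    "format": "dcterms:format",
}


def to_jsonld(record: dict) -> dict:
    """Return a JSON-LD document with an inline @context and the mapped fields."""
    context = {"dcterms": DCTERMS, **FIELD_TERMS}
    document = {"@context": context, "@id": f"urn:uuid:{record['id']}"}
    # Copy only the fields this sketch knows how to map.
    for field in FIELD_TERMS:
        if field in record:
            document[field] = record[field]
    return document
```

Fields without a mapping are dropped here to keep the output valid against the small context; a full export would map every field, including the FOAF, Schema.org, and SKOS terms listed in the table.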
Additional format support:
- CSV: Import/export for spreadsheet-based workflows
- JSON Schema: Machine-readable validation for any language or platform
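As an illustration of the spreadsheet workflow, the sketch below turns CSV text into a list of record dictionaries using only the standard library. The column names, the `;` separator for multi-valued cells, and the function name `csv_to_records` are assumptions made for this example; the layout expected by `dms convert csv2json` is defined by the project (see examples/batch.csv).

```python
import csv
import io


def csv_to_records(text: str, list_fields=("subject",), sep=";") -> list[dict]:
    """Parse CSV text into one dict per row, dropping empty cells.

    Columns named in `list_fields` are split on `sep` into lists.
    """
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        record = {}
        for key, value in row.items():
            # Skip surplus cells (key is None) and empty or missing values.
            if key is None or not isinstance(value, str) or not value.strip():
                continue
            value = value.strip()
            if key in list_fields:
                record[key] = [part.strip() for part in value.split(sep) if part.strip()]
            else:
                record[key] = value
        records.append(record)
    return records
```

Each resulting dictionary could then be written out with `json.dump` and checked with `dms validate`.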
Contributing
We welcome contributions! See CONTRIBUTING.md for guidelines.
Areas where you can help:
- Adding example records from the Dzaleka community
- Translating documentation into Swahili, French, or Kinyarwanda
- Improving the CLI tools
- Writing guides for specific use cases
- Reporting bugs and suggesting improvements
License
- Code (Python tools): MIT License
- Schema & Documentation: Creative Commons Attribution 4.0
Acknowledgments
- The Dzaleka refugee community for their heritage, stories, and resilience
- Dublin Core Metadata Initiative for the foundational metadata standard
- All contributors and volunteers who help preserve Dzaleka's digital heritage
Preserving heritage. Empowering community. Building the future.