Schema resources for the National Microbiome Data Collaborative (NMDC)
National Microbiome Data Collaborative Schema
The NMDC is a multi-organizational effort to integrate microbiome data across diverse areas in medicine, agriculture, bioenergy, and the environment. This integrated platform facilitates comprehensive discovery of and access to multidisciplinary microbiome data in order to unlock new possibilities with microbiome data science.
This repository mainly defines a LinkML schema for managing metadata from the National Microbiome Data Collaborative (NMDC).
Repository Contents Overview
The products maintained and tasks orchestrated within this repository include:
- Maintenance of the LinkML YAML that specifies the NMDC Schema
  - src/schema/nmdc.yaml
  - various other YAML schemas imported by it, such as prov.yaml and annotation.yaml, all of which you can find in the src/schema folder
- Makefile targets for converting the schema from its native LinkML YAML format to other artifacts, such as JSON Schema
- Build, deployment and distribution of the schema as a PyPI package
- Automatic publishing of refreshed documentation upon change to the schema, accessible here
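The import relationship among the YAML files listed above can be pictured as a small dependency graph. The sketch below is only an illustration: the file names come from this README, but which file imports which is assumed, and `resolve_imports` is a hypothetical helper, not part of LinkML or of the nmdc-schema package.

```python
from typing import Dict, List

# Hypothetical import graph. The names appear in this README, but the
# edges (who imports whom) are assumed purely for illustration.
IMPORTS: Dict[str, List[str]] = {
    "nmdc": ["prov", "annotation"],
    "prov": [],
    "annotation": ["prov"],
}


def resolve_imports(root: str, graph: Dict[str, List[str]]) -> List[str]:
    """Return root and everything it imports, with dependencies listed first."""
    ordered: List[str] = []
    seen = set()

    def visit(name: str) -> None:
        # Skip schemas that were already visited so shared imports appear once.
        if name in seen:
            return
        seen.add(name)
        for dep in graph.get(name, []):
            visit(dep)
        ordered.append(name)

    visit(root)
    return ordered


if __name__ == "__main__":
    print(resolve_imports("nmdc", IMPORTS))
```

Listing dependencies first mirrors the order in which a tool would need to load the files before it could resolve the top-level schema.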
Background
The NMDC Introduction to metadata and ontologies primer provides some of the context for this project.
See also these slides describing the schema.
Maintaining the Schema
See MAINTAINERS.md for instructions on maintaining and updating the schema.
NMDC metadata downloads
See https://github.com/microbiomedata/nmdc-runtime/#data-exports
Ecosystem Diagram
flowchart TD
subgraph nmdc-schema repo
ly([NMDC LinkML YAML files])
lg(generated artifacts)
ly-.make all.->lg
end
subgraph Data Validation
click ly href "https://github.com/microbiomedata/nmdc-schema/tree/main/src/schema" _top
d[(Some data)]
v[[Validation process]]
v--Has input-->d
v--Has input-->ly
end
subgraph MIxS
m([MIxS Schema])
end
subgraph SubmissionPortal
sppg[(Postgres)]
spa[Portal API]
sppg<-->spa
click spa href "https://data.dev.microbiomedata.org/docs" _top
ps[Pydantic schema]
end
subgraph MongoDB
mc[(Collections)]
ms[Implicit schema]
ma[Search API]
mc<-->ma
click ma href "https://api.dev.microbiomedata.org/docs" _top
end
mc --Ingest--> sppg
subgraph DH Template Prep
saf[sheets_and_friends repo]
sps([Submission Portal Schema])
dhjs[Data Harmonizer JS, etc.]
saf-->sps-->dhjs
end
dhjs-->SubmissionPortal
subgraph DataMapping
sa[sample-annotator repo]
end
spa-->sa-..->ma
ly-..->ps
sj[some json]
ly-..->sj-..->MongoDB-..->ps
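The "Data Validation" box in the diagram takes two inputs: some data and the schema. The following is a minimal sketch of that idea using only the standard library. `TOY_SCHEMA`, its field names, and the `validate` helper are all hypothetical stand-ins; a real workflow would more likely run a JSON Schema validator against the artifact generated from the LinkML YAML.

```python
from typing import Any, Dict, List

# Toy stand-in for a generated JSON Schema artifact. The field names are
# hypothetical and are not taken from the actual NMDC schema.
TOY_SCHEMA: Dict[str, Any] = {
    "required": ["id", "name"],
    "properties": {
        "id": {"type": "string"},
        "name": {"type": "string"},
        "depth": {"type": "number"},
    },
}

# Maps the schema's type names to the Python types accepted for them.
TYPE_MAP = {"string": str, "number": (int, float)}


def validate(record: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: List[str] = []
    # First pass: every required field must be present.
    for field in schema.get("required", []):
        if field not in record:
            errors.append(f"missing required field: {field}")
    # Second pass: every supplied field must be known and correctly typed.
    for field, value in record.items():
        spec = schema["properties"].get(field)
        if spec is None:
            errors.append(f"unknown field: {field}")
        elif not isinstance(value, TYPE_MAP[spec["type"]]):
            errors.append(f"wrong type for {field}: expected {spec['type']}")
    return errors


if __name__ == "__main__":
    print(validate({"id": "sample-1", "name": "soil core", "depth": 1.5}, TOY_SCHEMA))
    print(validate({"id": 1}, TOY_SCHEMA))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a record at once.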
File details
Details for the file nmdc_schema-7.3.0.tar.gz.
File metadata
- Download URL: nmdc_schema-7.3.0.tar.gz
- Upload date:
- Size: 398.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | d994a67cecf5885325096e47def13491e481d2fc37b3c3d23ed6cd6f386cc416
MD5 | 4987e0ad7e055a0c895a99552bcf28d7
BLAKE2b-256 | 7dcc29ccf7f6dd175eff131a40ce56a4a57775996e7c67bd8824f4f994c7cf1b
File details
Details for the file nmdc_schema-7.3.0-py3-none-any.whl.
File metadata
- Download URL: nmdc_schema-7.3.0-py3-none-any.whl
- Upload date:
- Size: 408.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.8.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | e53799b52dcb0c6e25217168900df056ef1de2aa6495891c9993d9230d9293b5
MD5 | 64cf9b4ebe1108091772167c4e07b359
BLAKE2b-256 | 30081e32d8b6a867dd95e75504dc55a801f64b2d24409c2c0584d4a0ae1fe311
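A downloaded file can be checked against the published SHA256 digest with Python's standard `hashlib` module. This is a minimal sketch: the helper names `sha256_of` and `matches_published_hash` are hypothetical, and the path in the usage example assumes the file was saved to the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and padding."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed location of the downloaded wheel; adjust to where it was saved.
    wheel = Path("nmdc_schema-7.3.0-py3-none-any.whl")
    published = "e53799b52dcb0c6e25217168900df056ef1de2aa6495891c9993d9230d9293b5"
    if wheel.exists():
        print(matches_published_hash(wheel, published))
```

Reading in chunks keeps memory use constant regardless of file size.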