ontology_loader
A suite of tools to configure and load an ontology from the OBO Foundry into the OntologyClass data object specified by the NMDC schema.
Development Environment
Pre-requisites
- Python >= 3.9
- Poetry
- Docker
- MongoDB
- NMDC materialized schema
- MONGO_PASSWORD environment variable (or pass the password directly to the CLI/runner)
% docker pull mongo
% docker run -d --name mongodb-container -p 27018:27017 mongo
MongoDB Connection Settings
When connecting to MongoDB, you need to set the correct environment variables depending on where your code is running:
- When running from your local machine (CLI or tests):
export MONGO_HOST=localhost
export MONGO_PORT=27018
export ENABLE_DB_TESTS=true
export MONGO_PASSWORD="your_valid_password"
- When running inside Docker containers:
export MONGO_HOST=mongo
export MONGO_PORT=27017
The Docker container networking uses container names (like 'mongo') for internal communication, while your host machine must use 'localhost' with the mapped port (27018).
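The settings above can be folded into a single connection string. The following is a minimal sketch, not the loader's own code: the helper name `build_mongo_uri` is hypothetical, and the `admin` user, `nmdc` database, and `authSource=admin` are assumptions taken from the mongosh URI used elsewhere in this README.

```python
import os
from urllib.parse import quote_plus


def build_mongo_uri(env=None, user="admin", db_name="nmdc"):
    """Assemble a MongoDB URI from MONGO_HOST, MONGO_PORT and MONGO_PASSWORD.

    Hypothetical helper: the defaults mirror the local-machine settings
    (localhost and the mapped port 27018).
    """
    env = os.environ if env is None else env
    host = env.get("MONGO_HOST", "localhost")
    port = int(env.get("MONGO_PORT", "27018"))
    password = env.get("MONGO_PASSWORD")
    if not password:
        raise ValueError("MONGO_PASSWORD is not set")
    # Percent-encode the password so characters such as '@' or '/' are URI-safe.
    return (
        f"mongodb://{user}:{quote_plus(password)}@{host}:{port}/"
        f"{db_name}?authSource=admin"
    )
```

Called with no arguments, the helper reads the current process environment, so the same code works on the host and inside a container as long as the matching variables are exported.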
Basic mongosh commands
% docker ps
% docker exec -it [mongodb-container-id] bash
% mongosh "mongodb://admin:root@mongo:27017/nmdc?authSource=admin"
% show dbs
% use nmdc
% db.ontology_class_set.find().pretty()
% db.ontology_relation_set.find().pretty()
% db.ontology_class_set.find( { id: { $regex: /^PO/ } } ).pretty()
% db.ontology_class_set.find( { id: { $regex: /^UBERON/ } } ).pretty()
% db.ontology_class_set.find( { id: { $regex: /^ENVO/ } } ).pretty()
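The prefix queries above can also be issued from Python. This is a small sketch under stated assumptions: `id_prefix_filter` is a hypothetical helper that only builds the filter document, and the commented usage assumes a pymongo database handle named `db` pointing at the `nmdc` database.

```python
import re


def id_prefix_filter(prefix):
    """Build a MongoDB filter matching documents whose id starts with `prefix`.

    Hypothetical helper; the prefix is escaped so it is matched literally.
    """
    return {"id": {"$regex": "^" + re.escape(prefix)}}


# Assumed usage with pymongo (requires a running MongoDB instance):
#   for doc in db.ontology_class_set.find(id_prefix_filter("ENVO")):
#       print(doc["id"])
```

Passing a prefix that includes the separator, for example `"PO:"`, keeps the match from also picking up ids whose prefix merely begins with the same letters.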
Command line
% poetry install
% poetry run ontology_loader --help
% poetry run ontology_loader --source-ontology "envo"
% poetry run ontology_loader --source-ontology "uberon"
Running the tests
% make test
Running the linter
% make lint
Python example usage
pip install ontology_loader
from ontology_loader.ontology_load_controller import OntologyLoaderController
import tempfile
def load_ontology():
    """Load an ontology using the default MongoDB connection."""
    loader = OntologyLoaderController(
        source_ontology="envo",
        output_directory=tempfile.gettempdir(),
        generate_reports=True,
    )
    loader.run_ontology_loader()
Using with an existing MongoDB connection
If you already have a MongoDB connection established (e.g., in a Dagster/Dagit job), you can pass it directly to the OntologyLoaderController:
from pymongo import MongoClient
from ontology_loader.ontology_load_controller import OntologyLoaderController
import tempfile
# Use an existing MongoDB client
mongo_client = MongoClient("mongodb://admin:password@localhost:27018/nmdc?authSource=admin")
# Pass the client and database name to OntologyLoaderController
loader = OntologyLoaderController(
    source_ontology="envo",
    output_directory=tempfile.gettempdir(),
    generate_reports=True,
    mongo_client=mongo_client,  # Pass the existing client
    db_name="nmdc",  # Required when passing an existing client
)
# The loader will use the provided client instead of creating a new connection
loader.run_ontology_loader()
This approach is particularly useful when:
- You're running in a job scheduler like Dagster/Dagit
- You want to reuse an existing connection pool
- You have custom MongoDB connection settings that are managed externally
- You need to use a connection with specific authentication or configuration
Note: When passing an existing MongoDB client, you must also provide the db_name parameter to specify which database to use. This is required because the database name cannot be determined automatically from a MongoDB client instance.
Testing CRUD operations in a live MongoDB
If you want to test the CRUD operations in a live MongoDB instance, you need to set two environment variables:
MONGO_PASSWORD="your_valid_password"
ENABLE_DB_TESTS=true
This lets the tests actually insert, update, and delete records in your test MongoDB instance instead of only mocking the calls. You can then run the tests with the following command:
make test
The same test command will run without the environment variables, but it will only mock the calls to the database. This is intended to help prevent accidental data loss or corruption in a live database environment and to ensure that MONGO_PASSWORD is not hardcoded in the codebase.
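One way such gating could look in a test module is sketched below. This is an illustration only, not the project's actual test code: `live_db_tests_enabled` and `LiveMongoCrudTest` are hypothetical names, and the sketch skips the live tests when the variables are missing rather than substituting mocks.

```python
import os
import unittest


def live_db_tests_enabled(env=None):
    """Return True only when both ENABLE_DB_TESTS=true and MONGO_PASSWORD are set."""
    env = os.environ if env is None else env
    flag = env.get("ENABLE_DB_TESTS", "").strip().lower() == "true"
    return flag and bool(env.get("MONGO_PASSWORD"))


@unittest.skipUnless(
    live_db_tests_enabled(),
    "set ENABLE_DB_TESTS=true and MONGO_PASSWORD to run live MongoDB tests",
)
class LiveMongoCrudTest(unittest.TestCase):
    """Hypothetical placeholder for tests that touch a real database."""

    def test_gate_is_open(self):
        # Runs only when the gate above evaluated to True at import time.
        self.assertTrue(live_db_tests_enabled())
```

Requiring both variables means a stray ENABLE_DB_TESTS=true on its own never triggers writes against a database.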
Reset collections in dev
docker exec -it nmdc-runtime-test-mongo-1 bash
mongosh "mongodb://admin:root@mongo:27017/nmdc?authSource=admin"
db.ontology_class_set.find({}).pretty()
db.ontology_relation_set.find({}).pretty()
db.biosample_set.find({}).pretty()
db.ontology_class_set.drop()
db.ontology_relation_set.drop()
db.ontology_class_set.countDocuments()
db.ontology_relation_set.countDocuments()
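The same reset can be scripted from Python. The sketch below assumes `db` is a pymongo `Database` object for a development instance; `reset_collections` is a hypothetical helper, and it relies only on `Collection.drop()` and `Collection.count_documents()`.

```python
def reset_collections(db, names=("ontology_class_set", "ontology_relation_set")):
    """Drop each named collection and return the document counts afterwards.

    Hypothetical helper for development databases only: dropping a
    collection is irreversible.
    """
    counts = {}
    for name in names:
        collection = db[name]
        collection.drop()
        # Counting after the drop confirms the collection is empty.
        counts[name] = collection.count_documents({})
    return counts
```

The returned mapping plays the role of the two `countDocuments()` checks above.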