
Code repository for TA2 Knowledge Graph (MinMod) construction & deployment

Project description

Overview

Code for the TA2 Knowledge Graph and related services such as its API, CDR integration, and data browser.

Repository Structure

Installation

Set up the workspace

Set up the workspace by cloning ta2-minmod-data and this repository (ta2-minmod-kg) inside your WORKDIR:

git clone --depth 1 https://github.com/DARPA-CRITICALMAAS/ta2-minmod-data
git clone --depth 1 https://github.com/DARPA-CRITICALMAAS/ta2-minmod-kg
mkdir kgdata
mkdir config

The directory will look like this:

<WORKDIR>
  ├── kgdata                    # for storing databases
  ├── config                    # for storing configuration
  ├── ta2-minmod-data           # ta2-minmod-data repository
  └── ta2-minmod-kg             # code to setup TA2 KG

Set up the environment variables

The following commands will use these environment variables:

  1. USER_ID & GROUP_ID: the current user and group IDs. These ensure that the Docker containers create files owned by the current user. You can set them automatically with:

    export USER_ID=$(id -u)
    export GROUP_ID=$(id -g)
    
  2. CERT_DIR: a directory containing SSL certificate (fullchain.pem and privkey.pem) for your server (see more at Generating an SSL certificate).

  3. CFG_DIR: a directory containing files needed by the API, such as config.yml.

To make it easy to set these environment variables, you can create a copy of env.template named .myenv and update the values accordingly. Then run the following command to set them:

. ./.myenv
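Before continuing, it can help to confirm that everything is actually exported. Here is a minimal Python sketch: the variable names come from this guide, but the helper function itself is hypothetical, not part of minmodkg.

```python
import os

# Environment variables this deployment guide expects (from env.template).
REQUIRED = ["USER_ID", "GROUP_ID", "CERT_DIR", "CFG_DIR"]

def missing_env_vars(required=REQUIRED):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]

# Report anything that still needs to be exported.
for name in missing_env_vars():
    print(f"{name} is not set")
```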

Generating an SSL certificate

You can use Let's Encrypt to create a free SSL certificate (fullchain.pem and privkey.pem) for your server. For local testing, however, you can generate a self-signed certificate with the following command:

openssl req -x509 -newkey rsa:4096 -keyout privkey.pem -out fullchain.pem -sha256 -days 3650 -nodes -subj "/C=XX/ST=StateName/L=CityName/O=CompanyName/OU=CompanySectionName/CN=CommonNameOrHostname"

Generating API configuration

Create a copy of config.yml.template in the config folder and name it config.yml, then update the values accordingly. Note that you must set a new secret key for the API: generate one by running openssl rand -hex 32 and put the value in config.yml.
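If openssl is not at hand, Python's standard library produces an equivalent secret. This is just a convenience sketch; the exact key name inside config.yml comes from config.yml.template.

```python
import secrets

# Equivalent of `openssl rand -hex 32`: 32 random bytes as 64 hex characters.
secret_key = secrets.token_hex(32)
print(secret_key)  # paste this value into config.yml
```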

Set up dependencies

1. Installing required services using Docker:

Services: the knowledge graph, our API, and nginx

cd ta2-minmod-kg
docker network create minmod
docker compose build
cd ..

2. Installing the Python library

Requires Poetry: pip install poetry

cd ta2-minmod-kg
python -m venv .venv
poetry install --only main
cd ..

Set up users

You can create a user using the command: python -m minmodkg.api user -u <username> -n <name> -e <email> (the password will be prompted).

Additionally, you can load users from a JSON file with: python -m minmodkg.api load-user <filepath>. To add a user to a file, run python -m minmodkg.api add-user <filepath> -u <username> -n <name> -e <email>. To add multiple users from a CSV file, run python -m minmodkg.api batch-add-user <filepath> <csvfile>.

These commands must be run on a machine that has access to the database. If you deployed the database inside Docker, you can run them inside the container with docker exec -it <container_name> python -m minmodkg.api ....

Usage

[Figure: MinMod system overview]

The figure above shows how the different components work together to produce mineral site data and grade tonnage models. The rounded rectangles are systems, the blue rectangles are processes, and the parallelogram is the input source.

1. Building TA2 knowledge graph

source ta2-minmod-kg/.venv/bin/activate
python -m statickg ta2-minmod-kg/etl.yml ./kgdata ta2-minmod-data --overwrite-config --refresh 20

Note that this process keeps running to monitor for new changes; it will not terminate unless it receives an explicit termination signal.

2. Starting other services

docker compose -f ./ta2-minmod-kg/docker-compose.yml up nginx api

If you also want to start our dashboard, run docker compose up nginx api dashboard instead. Note that the URLs for TA2 services are currently hardcoded in the dashboard, so it will not query your local services.

Once it starts, you can view our API docs at https://localhost/api/v1/docs

3. Upload data to CDR

To upload data to CDR, you need to obtain a token first. Then, run the following command:

export CDR_AUTH_TOKEN=<your cdr token>
export MINMOD_API=https://localhost/api/v1
export MINMOD_SYSTEM=test
python -m minmodkg.integrations.cdr

The two environment variables MINMOD_API and MINMOD_SYSTEM are for testing purposes. In production, you can simply omit them and our code will use the default values MINMOD_API=https://minmod.isi.edu/api/v1 and MINMOD_SYSTEM=minmod.
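The fallback behavior described above is the usual environment-with-default pattern. A sketch for illustration: only the two variable names and their defaults come from this README; the `resolve` helper is hypothetical.

```python
import os

def resolve(name: str, default: str, env=os.environ) -> str:
    """Return the configured value, falling back to the production default."""
    return env.get(name, default)

# Defaults from this README; export both variables to override them for testing.
api = resolve("MINMOD_API", "https://minmod.isi.edu/api/v1")
system = resolve("MINMOD_SYSTEM", "minmod")
```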

4. Download data from CDR

The CDR endpoint for getting TA2 output is https://api.cdr.land/docs#/Minerals/list_dedup_site_commodity_v1_minerals_dedup_site_searchcommodityget.

There are two required parameters:

  • commodity: name of the commodity (capitalized, case-sensitive), e.g., Lithium. Here is the list of commodities
  • system: minmod

By default, the CDR returns all sites matching the two provided parameters. To further filter for sites with a deposit type classification or sites with grade/tonnage data, apply the following additional filters:

  • Sites with a deposit type classification: with_location = true and with_deposit_types_only = true

    CURL Command:

    curl -X 'GET' \
    'https://api.cdr.land/v1/minerals/dedup-site/search/Lithium?with_location=true&with_deposit_types_only=true&system=minmod&top_n=1&limit=-1' \
    -H 'Authorization: Bearer <your token>'
    
  • Sites with grade/tonnage data: with_contained_metals = true

    CURL Command:

    curl -X 'GET' \
    'https://api.cdr.land/v1/minerals/dedup-site/search/Lithium?with_contained_metals=true&system=minmod&top_n=1&limit=-1' \
    -H 'Authorization: Bearer <your token>'
    

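The two curl calls above differ only in their query parameters, so it can be convenient to build the request URL programmatically. The sketch below assembles the same endpoint; the endpoint path and parameter names are taken from the curl examples, while the helper function itself is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint path as shown in the curl examples above.
CDR_BASE = "https://api.cdr.land/v1/minerals/dedup-site/search"

def cdr_search_url(commodity: str, **filters) -> str:
    """Build a dedup-site search URL for a commodity (capitalized, e.g. 'Lithium')."""
    params = {"system": "minmod", "top_n": 1, "limit": -1, **filters}
    return f"{CDR_BASE}/{commodity}?{urlencode(params)}"

# Sites with grade/tonnage data:
url = cdr_search_url("Lithium", with_contained_metals="true")
```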
5. Browsing the data locally

You can browse the data locally by replacing the hostname minmod.isi.edu with localhost. For example, https://minmod.isi.edu/resource/kg becomes https://localhost/resource/kg
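The hostname substitution can be done mechanically if you are scripting against both deployments. A small sketch (the `localize` helper is hypothetical, not part of minmodkg):

```python
from urllib.parse import urlparse, urlunparse

def localize(url: str, local_host: str = "localhost") -> str:
    """Point a minmod.isi.edu URL at a local deployment instead."""
    parts = urlparse(url)
    return urlunparse(parts._replace(netloc=local_host))

print(localize("https://minmod.isi.edu/resource/kg"))
# https://localhost/resource/kg
```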

6. Querying data

If you know SPARQL, you can query the data by sending your query to https://<hostname>/sparql. If you use Python, we have a helper function here that makes this easier.
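For example, a minimal SPARQL request against that endpoint can be sent with the standard library alone. This is a sketch, not the project's helper: it assumes the endpoint accepts a form-encoded POST with a `query` field and returns JSON, which is standard SPARQL-over-HTTP behavior but not verified here.

```python
import json
from urllib import request, parse

def build_query_body(query: str) -> bytes:
    """Form-encode a SPARQL query for a POST request."""
    return parse.urlencode({"query": query}).encode()

def sparql_select(endpoint: str, query: str) -> dict:
    """POST a SPARQL query and return the parsed JSON results."""
    req = request.Request(
        endpoint,
        data=build_query_body(query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

# Example (requires a running deployment):
# results = sparql_select("https://localhost/sparql",
#                         "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5")
```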

Also, you can download the mineral site data and grade/tonnage models directly from our API. See the API docs at https://localhost/api/v1/docs for more information.



Download files

Download the file for your platform.

Source Distribution

minmodkg-2.7.1.tar.gz (109.8 kB)

Uploaded Source

Built Distribution


minmodkg-2.7.1-py3-none-any.whl (150.1 kB)

Uploaded Python 3

File details

Details for the file minmodkg-2.7.1.tar.gz.

File metadata

  • Download URL: minmodkg-2.7.1.tar.gz
  • Upload date:
  • Size: 109.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.4 CPython/3.12.4 Darwin/24.3.0

File hashes

Hashes for minmodkg-2.7.1.tar.gz:

  • SHA256: 110861b062d0699f610e925e23fabe677afc910eb471f05df757205cc17ec701
  • MD5: 4062f232e5427160092c5b7870bfb52e
  • BLAKE2b-256: 7b6354ffa25b9530a76c243a0bef9ccb9175c464910e933fbd300bbdfdd01916


File details

Details for the file minmodkg-2.7.1-py3-none-any.whl.

File metadata

  • Download URL: minmodkg-2.7.1-py3-none-any.whl
  • Upload date:
  • Size: 150.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.4 CPython/3.12.4 Darwin/24.3.0

File hashes

Hashes for minmodkg-2.7.1-py3-none-any.whl:

  • SHA256: c750ce6c280ea46747495064a6656cb63375fdf0687724c7b58a971c2da1c8ef
  • MD5: de5fe88a97d7d4507391c746b196574e
  • BLAKE2b-256: f69a3e059f6db625d26bbe663283307562be6fc09c524bb96b3ed541f009d98e

