scorep-db
Minimal tooling to keep track of Score-P Profiles + Traces. It relies on Score-P's metadata collection abilities.
Until version 1.0 is reached, this is a proof of concept: the metadata schema may change without notice. It currently relies on feature branches of Score-P and the master branch of cubelib.
Install
Either install via pip
pip install scorep-db
or from source (git).
scorep-db commands
scorep-db add <config> [offline|online] <path/to/experiment>
scorep-db query <config> [offline|online] <path/to/query.sparql>
scorep-db download <config> [offline|online] <path/to/download_query.sparql> <target/path>
scorep-db health-check <config> [offline|online]
scorep-db merge <config>
scorep-db get-id <path/to/experiment>
scorep-db clear <config> [offline|online]
Brief explanation:
- add: Adds an experiment to the database
- query: Queries the database with a SPARQL query file
- download: Downloads experiments matching a query to the target path. The query must follow a certain structure (see below)
- health-check: Tests whether the databases are available
- merge: Merges an offline database into an online database
- get-id: Gets the Score-P experiment ID (same as scorep-info show-metadata --experiment-id)
- clear: Deletes everything within the selected database
Query
See directory example/query/ for some queries.
Download Query
The download capability currently relies on a query with a special format, based on the following minimal example.
It is critical that the variables ?Experiment and ?storePath appear in the result; their case (upper, lower, or mixed) does not matter.
PREFIX scorep: <http://scorep-fair.github.io/ontology#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?Experiment ?storePath
WHERE {
?Experiment rdf:type scorep:Experiment ;
scorep:storePath ?storePath .
}
The query above will download all experiments into the specified target path.
Each experiment keeps its (random) UUID name; no renaming takes place.
The download name can be modified by providing additional search terms.
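The required-variables rule above can be sketched as a small check. This is a hypothetical helper, not part of scorep-db, and the regex-based parsing is a deliberate simplification:

```python
import re

def check_download_query(query: str) -> bool:
    """Check that a SPARQL download query selects both
    ?Experiment and ?storePath (case does not matter)."""
    # Extract the variable list between SELECT and WHERE.
    select = re.search(r"SELECT\s+(.*?)\s+WHERE", query,
                       re.IGNORECASE | re.DOTALL)
    if not select:
        return False
    # Collect the ?variable names, lower-cased for comparison.
    names = {v.lower() for v in re.findall(r"\?(\w+)", select.group(1))}
    return {"experiment", "storepath"} <= names

query = """
PREFIX scorep: <http://scorep-fair.github.io/ontology#>
SELECT ?experiment ?STOREPATH
WHERE { ?experiment scorep:storePath ?STOREPATH . }
"""
print(check_download_query(query))  # True: mixed case is accepted
```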
PREFIX scorep: <http://scorep-fair.github.io/ontology#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?Experiment ?storePath ?program
WHERE {
?Experiment rdf:type scorep:Experiment ;
scorep:storePath ?storePath ;
scorep:program ?program .
}
The query above leads to the name
program_<program_name>.<experiment-id> (e.g. program_sp-mz.A.x.709330_1725173478_308410).
This name is created by concatenating the search keywords and their values.
Each ?Experiment is unique; if it appears multiple times in the search results, the folder name
is created by merging the key-value pairs of all its rows together.
E.g. the following query
PREFIX scorep: <http://scorep-fair.github.io/ontology#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?experiment ?storePath ?program ?n ?c ?toolchain
WHERE {
?experiment rdf:type scorep:Experiment ;
scorep:storePath ?storePath ;
scorep:program ?program ;
scorep:environment ?envVar .
?envVar scorep:envName ?envName ;
scorep:envState ?envState ;
scorep:envValue ?envValue .
FILTER(?envName IN ("SLURM_NTASKS", "SLURM_CPUS_PER_TASK", "TC_NAME"))
FILTER(?envState = "set")
FILTER(?program = "sp-mz.A.x")
BIND(IF(?envName = "SLURM_NTASKS", ?envValue, "") AS ?n)
BIND(IF(?envName = "SLURM_CPUS_PER_TASK", ?envValue, "") AS ?c)
BIND(IF(?envName = "TC_NAME", ?envValue, "") AS ?toolchain)
}
will create a download pattern of, e.g.,
n_2.c_2.toolchain_foss2022a.program_sp-mz.A.x.709330_1725173478_308410/
which allows the user to associate some case setup with the folder name. Note that this naming scheme is close to the one needed for Extra-P; this does not work yet, but might be addressed in the future.
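The merging rule above can be sketched in Python. This is a simplified illustration of the naming logic, not scorep-db's actual implementation, and the row data is made up to mirror the example:

```python
def folder_name(rows, experiment_id):
    """Merge the key-value pairs of all result rows belonging to one
    experiment into a single dotted folder name (hypothetical sketch)."""
    columns = rows[0].keys()
    merged = {}
    # For each column, take the first non-empty value across the rows;
    # empty strings come from non-matching BIND/IF branches.
    for key in columns:
        for row in rows:
            if row[key]:
                merged[key] = row[key]
                break
    parts = [f"{k}_{v}" for k, v in merged.items()]
    return ".".join(parts + [experiment_id])

# Three rows for the same experiment, one per matched env variable.
rows = [
    {"n": "2", "c": "",  "toolchain": "",          "program": "sp-mz.A.x"},
    {"n": "",  "c": "2", "toolchain": "",          "program": "sp-mz.A.x"},
    {"n": "",  "c": "",  "toolchain": "foss2022a", "program": "sp-mz.A.x"},
]
print(folder_name(rows, "709330_1725173478_308410"))
# n_2.c_2.toolchain_foss2022a.program_sp-mz.A.x.709330_1725173478_308410
```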
Inclusion of 'external' data.
Other JSON-LD files may be merged and linked into the metadata.
The user has to link their JSON-LD to the metadata of the Score-P run. The run ID of the Score-P run can be extracted with
SCOREP_RUN_ID=`scorep-db get-id <path/to/experiment_directory>`
echo $SCOREP_RUN_ID
or
SCOREP_RUN_ID=`scorep-info show-metadata --experiment-id <path/to/experiment_directory>`
echo $SCOREP_RUN_ID
which can then be used to link the external JSON-LD to this Score-P Experiment.
See the scripts in cube_x_to_jsonld/* for an example of how this might be done.
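A minimal sketch of what such a link could look like, using only the run ID. The ex: property names and context are assumptions made for illustration; they are not part of the Score-P ontology:

```python
import json

# Hypothetical run ID, as returned by `scorep-db get-id`.
scorep_run_id = "709330_1725173478_308410"

# Illustrative external JSON-LD fragment that references the run.
external = {
    "@context": {
        "scorep": "http://scorep-fair.github.io/ontology#",
        "ex": "http://example.org/my-metadata#",   # assumed namespace
    },
    "@id": "ex:my-simulation-notes",
    # Assumed linking property pointing at the Score-P run.
    "ex:describesRun": {"@id": f"scorep:run-{scorep_run_id}"},
    "ex:comment": "Baseline run on 2 nodes",
}
print(json.dumps(external, indent=2))
```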
Performance
Querying via RDFlib is quite slow and, depending on the query, can be very, very slow. This can be addressed by using a different, more performant triple-store backend.
Config File Layout
Almost all scorep-db commands need a config file (get-id being the exception).
The config file sets paths and credentials of the following type.
# Offline - Data Store
SCOREP_DB_OFFLINE_DIRECTORY=${HOME}/repos/scorep-db/example/showcase_NAS-NPB/showcase_database/
# Offline - Metadata Store
SCOREP_DB_OFFLINE_PATH=${HOME}/repos/scorep-db/example/showcase_NAS-NPB/showcase_database/
SCOREP_DB_OFFLINE_NAME=scorep-experiments.db
# ----------------------------------------- #
# Online - Data Store
SCOREP_DB_ONLINE_OBJ_HOSTNAME=localhost
SCOREP_DB_ONLINE_OBJ_PORT=9000
SCOREP_DB_ONLINE_OBJ_USER=minioadmin
SCOREP_DB_ONLINE_OBJ_PASSWORD=minioadmin
SCOREP_DB_ONLINE_OBJ_BUCKET_NAME=scorep-experiments
# Online - Metadata Store
SCOREP_DB_ONLINE_RDF_HOSTNAME=localhost
SCOREP_DB_ONLINE_RDF_PORT=5432
SCOREP_DB_ONLINE_RDF_USER=postgres
SCOREP_DB_ONLINE_RDF_PASSWORD=mysecretpassword
SCOREP_DB_ONLINE_RDF_DB_NAME=postgres
The available metadata depends on what Score-P emits.
The environment may contain further variables, which must be attributed as well; in that case they are attributed to the run, with the environment variable's name as the property and its content as the value.
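A minimal sketch of reading such a config file in Python, assuming plain KEY=VALUE lines with ${VAR} expansion; scorep-db's actual parsing may differ:

```python
import os

def load_config(text):
    """Parse KEY=VALUE lines, skipping comments and blank lines,
    and expand ${VAR} references (minimal sketch)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore comments and blank lines
        key, _, value = line.partition("=")
        config[key.strip()] = os.path.expandvars(value.strip())
    return config

sample = """
# Offline - Metadata Store
SCOREP_DB_OFFLINE_PATH=${HOME}/scorep-db/showcase_database/
SCOREP_DB_OFFLINE_NAME=scorep-experiments.db
"""
config = load_config(sample)
print(config["SCOREP_DB_OFFLINE_NAME"])  # scorep-experiments.db
```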
Example usage
View the showcase in example/showcase_NAS-NPB/03_run_testcases.sh
Setup 'Online' Infrastructure
You can use Docker to host the online infrastructure.
Make sure the settings below match your config file.
Postgres
$ docker pull postgres
$ docker run --name my_postgres \
-e POSTGRES_PASSWORD=mysecretpassword \
-p 5432:5432 \
-d \
postgres
Minio
$ docker pull minio/minio
$ docker run --name minio \
-v $HOME/.minio-data:/data \
-v $HOME/.minio:/root/.minio \
-e "MINIO_ROOT_USER=minioadmin" \
-e "MINIO_ROOT_PASSWORD=minioadmin" \
-p 9000:9000 \
-p 9001:9001 \
-d \
minio/minio server /data --console-address ":9001"
Download files
Source Distribution: scorep_db-0.1.2.tar.gz
Built Distribution: scorep_db-0.1.2-py3-none-any.whl
File details
Details for the file scorep_db-0.1.2.tar.gz.
File metadata
- Download URL: scorep_db-0.1.2.tar.gz
- Upload date:
- Size: 123.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.10.12 Linux/6.5.0-44-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 47c90206b9d24d9a4bcdccf1c32e494d0b52f5d802ce834b474ee9b09b3f637b |
| MD5 | f2c7fbebf3c9d978afbc7447d90b190b |
| BLAKE2b-256 | 40154e3008a04ca922acffc42461d5258dbd92dbdd4539748427bc360cb542f7 |
File details
Details for the file scorep_db-0.1.2-py3-none-any.whl.
File metadata
- Download URL: scorep_db-0.1.2-py3-none-any.whl
- Upload date:
- Size: 124.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.10.12 Linux/6.5.0-44-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ba958c1772d520893055d6208b5423d12c0dee8ac51a7969c4ed8fa94f19e6ef |
| MD5 | c3dfad29c2e60aad160c846d5a0143f7 |
| BLAKE2b-256 | 1fe021f8870ce619de02177b9c57a4e2c850666d837f487bf463915db8bc366e |