Gear to report on metadata qc results

metadata-error-reporter (Metadata Error Reporter)

Overview

Summary

Gear that creates a report on QC values found in metadata across sections of the hierarchy.

Cite

Developed by Flywheel.

License

MIT

Classification

Analysis gear.

Gear Level:

Project
Subject
Session
Acquisition
Analysis

Inputs

metadata-rules
- Type: YAML file
- Optional: True
- Description: A YAML file describing how to handle non-default QC values.

Config

debug
- Type: boolean
- Description: Log debug messages.
- Default: False

report-on-success
- Type: boolean
- Description: If true, report on QC results that passed as well. Otherwise, report only on failed QC results.
- Default: False

output-format
- Type: string (Allowed values: {csv, json})
- Description: Format of the output report.
- Default: csv

intermediate-containers
- Type: boolean
- Description: If true, report on files attached to all containers under the run level. Otherwise, report only on files attached to acquisitions.
- Default: False

skip-analyses
- Type: boolean
- Description: If true, skip analysis QC results. Otherwise, include analyses.
- Default: True

Outputs

Files

report
- Name: {proj|sub|ses}-{id}-report.{csv|json}
- Type: CSV or JSON report
- Description: Consolidated report of configured QC results in hierarchy below run container

Metadata

N/A

Pre-requisites

Prerequisite Gear Runs

Run every gear whose QC results you want reported before running metadata-error-reporter.

Prerequisite Files

N/A

Prerequisite Metadata

N/A

Usage

Description

metadata-error-reporter relies heavily on Dataviews in Flywheel. The gear essentially does two things:

1. Submits Dataviews that correspond to the config options and waits for them to complete.

   - By default, the gear submits a dataview reporting on the qc namespace for all acquisitions under the run level.
   - With intermediate-containers == True, the gear also submits dataviews for each intermediate level. For example, if the run level is project, the gear submits dataviews at the subject and session levels as well as at the acquisition level.
   - With skip-analyses == False, the gear also submits dataviews for analyses attached to each container level it runs at. For example, running at the subject level with intermediate-containers == True, the gear submits 4 dataviews: session level, session-analysis level, acquisition level, and acquisition-analysis level.

2. Post-processes the dataviews, performing default and any custom operations to report on QC results (see Metadata Rules).
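The dataview selection rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the gear's actual code: the helper name planned_dataview_levels and the HIERARCHY list are hypothetical, and the sketch assumes the run level is project, subject, or session.

```python
# Container levels in hierarchy order (assumed ordering for this sketch).
HIERARCHY = ["project", "subject", "session", "acquisition"]


def planned_dataview_levels(run_level, intermediate_containers=False, skip_analyses=True):
    """Return the levels a dataview would be submitted for (hypothetical helper)."""
    # Only containers below the run level are reported on.
    below = HIERARCHY[HIERARCHY.index(run_level) + 1:]
    # Without intermediate-containers, only the acquisition level is used.
    levels = below if intermediate_containers else below[-1:]
    planned = []
    for level in levels:
        planned.append(level)
        if not skip_analyses:
            # One extra dataview for analyses attached to this level.
            planned.append(f"{level}-analysis")
    return planned


print(planned_dataview_levels("subject", intermediate_containers=True, skip_analyses=False))
# ['session', 'session-analysis', 'acquisition', 'acquisition-analysis']
```

With the default config (intermediate_containers=False, skip_analyses=True), the same helper returns only the acquisition level.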

By default, the gear reports on every key in the qc namespace. Within each key, it reports on every QC result (each assumed to be a key) except job_info. Within each QC result, it reports the state and data keys; see the Output section.

For example, with the following qc namespace structure:

file.info.qc: {
  "gear-name1": {
    "job_info": {},
    "qc_result1": {
      "state": "FAIL",
      "data": "invalid value"
    }
  },
  "gear-name2": {
    "job_info": {},
    "qc_result1": {
      "state": "FAIL",
      "data": "invalid value"
    }
  }
}

The gear would create 2 CSV lines for this particular file.
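As a rough illustration of why this structure yields two rows, the default traversal can be sketched as below. The function name qc_rows and its parameters are hypothetical, and the real gear works on dataview output rather than on a plain dictionary.

```python
def qc_rows(qc_namespace, excluded=("job_info",), report_on_success=False):
    """Flatten a qc namespace into report rows (illustrative sketch only)."""
    rows = []
    for gear_name, results in qc_namespace.items():
        for qc_name, result in results.items():
            if qc_name in excluded:
                continue  # job_info is skipped by default
            state = result.get("state")
            failed = str(state).lower() in ("fail", "failure", "failed")
            if failed or report_on_success:
                rows.append({
                    "qc-namespace": gear_name,
                    "qc": qc_name,
                    "state": state,
                    "data": result.get("data"),
                })
    return rows


example = {
    "gear-name1": {"job_info": {}, "qc_result1": {"state": "FAIL", "data": "invalid value"}},
    "gear-name2": {"job_info": {}, "qc_result1": {"state": "FAIL", "data": "invalid value"}},
}
rows = qc_rows(example)
print(len(rows))  # 2
```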

Metadata Rules

The metadata-rules input is an optional YAML file that provides additional configuration, either globally or for individual qc-results.

Global options

- top_level_namespace: Use this if you want to report on qc-results stored under a different key.
- excluded_qc_results: List of qc-result names to exclude, added to the default list (job_info and gear_info).
- excluded_qc_results_override: List of qc-result names to exclude, overriding the default list.
- fail_names: List of string values which will be interpreted to mean a failed qc-result (case-insensitive). Defaults to ["fail", "failure", "failed"].
- state_name: Name of the key in each qc-result which provides the state (pass vs. fail) information. Defaults to state.
- true_means_fail: For boolean-valued qc-results, defines the mapping between the boolean and pass/fail. If True, a True value is treated as a failure; if False, a True value is treated as a success.
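To make the interplay of state_name, fail_names, and true_means_fail concrete, here is a minimal sketch of how a single qc-result could be classified. The function is_failure is hypothetical, and the default of False for true_means_fail is an assumption, since the default is not stated here.

```python
DEFAULT_FAIL_NAMES = ("fail", "failure", "failed")


def is_failure(qc_result, state_name="state", fail_names=DEFAULT_FAIL_NAMES,
               true_means_fail=False):
    """Classify one qc-result as failed or not (illustrative sketch only)."""
    value = qc_result.get(state_name)
    if isinstance(value, bool):
        # Boolean states are mapped through true_means_fail.
        return value if true_means_fail else not value
    # String states are compared case-insensitively against fail_names.
    # A missing state is treated as "not failed" in this sketch.
    return str(value).lower() in {name.lower() for name in fail_names}


print(is_failure({"state": "FAIL"}))                                            # True
print(is_failure({"result": True}, state_name="result", true_means_fail=True))  # True
print(is_failure({"state": "PASS"}))                                            # False
```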

Field options

Use the fields key to define how to treat specific qc-results. The fields key should contain a mapping under it, where each key is a specific qc-result. Each key should follow the format <gear_name>.<qc_name>. For example, to configure options for the qc-result slice_consistency generated by dicom-qc, you would use the key dicom-qc.slice_consistency.

Under each key (qc-result) in the fields section, the global state_name, true_means_fail, and fail_names can be overridden. Additionally, supporting data within the qc-result can be configured by using the data key.

Under the data key, each entry should be a key-value mapping where the key is the key of the data field you want extracted, and the value is either unfold or default:

- unfold: Unfold lists or dictionaries. For a list value, create a row for each element in the list, with the item value represented as a string in the description column. For a dictionary value, create a row for each item in the dictionary, with the item key in the key column and the item value in the value column.
- default: Represent everything as a JSON object in description. For a list value, create one row whose description is a JSON representation of the whole list, i.e. [<item1>, <item2>]. For a dictionary value, create one row whose description is a JSON representation of the whole dictionary, i.e. [{"<key1>": "<value1>"}, {"<key2>": "<value2>"}]

NOTE: The data key under a field definition does not support nested fields at this time.
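The difference between unfold and default can be sketched as follows. The helper expand_data is hypothetical and only mirrors the description above; the column names description, key, and value are taken from that description.

```python
import json


def expand_data(value, mode="default"):
    """Turn one supporting-data value into report rows (illustrative sketch only)."""
    if mode == "unfold" and isinstance(value, list):
        # One row per list element, stringified into the description column.
        return [{"description": str(item)} for item in value]
    if mode == "unfold" and isinstance(value, dict):
        # One row per dictionary item, split across the key and value columns.
        return [{"key": k, "value": v} for k, v in value.items()]
    # default: a single row holding a JSON representation of the whole value.
    return [{"description": json.dumps(value)}]


print(expand_data(["a", "b"], mode="unfold"))  # [{'description': 'a'}, {'description': 'b'}]
print(expand_data(["a", "b"]))                 # [{'description': '["a", "b"]'}]
```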

Examples

Boolean valued qc-results

The qc-reporter gear is meant to report on qc-results created by the GearToolkitContext's add_qc_result method.

If you have a qc-result that was produced a different way (and therefore looks different), you will need to add a field within the metadata-rules input file to define how to process that result.

For example, if you have a gear called boolean-reporter that produces a single qc-result called value, such as this:

{
  ...
  "qc": {
    "boolean-reporter": {
      "job_info": {...},
      "value": {
        "result": True
      }
    }
  }
}

You could write a metadata-rules like this:

---
fields:
  boolean-reporter.value:
    state_name: "result"
    true_means_fail: True

This would tell the qc-reporter gear to look at the key result within the qc-result to determine state, and that a True value should be interpreted as a failure.
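One way to picture how a field entry overrides the global options is the sketch below. It uses a plain dictionary equivalent to the YAML above rather than parsing a file; the helper options_for and the built-in defaults table are hypothetical, and the False default for true_means_fail is an assumption.

```python
# Assumed built-in defaults for this sketch.
GLOBAL_DEFAULTS = {
    "state_name": "state",
    "fail_names": ["fail", "failure", "failed"],
    "true_means_fail": False,
}


def options_for(rules, gear_name, qc_name):
    """Resolve options for one qc-result: defaults, then globals, then field entry."""
    options = dict(GLOBAL_DEFAULTS)
    # Global options in the rules file override the built-in defaults.
    options.update({k: v for k, v in rules.items() if k in GLOBAL_DEFAULTS})
    # Field keys follow the <gear_name>.<qc_name> format.
    field = rules.get("fields", {}).get(f"{gear_name}.{qc_name}", {})
    options.update({k: v for k, v in field.items() if k in GLOBAL_DEFAULTS})
    return options


rules = {"fields": {"boolean-reporter.value": {"state_name": "result", "true_means_fail": True}}}
opts = options_for(rules, "boolean-reporter", "value")
print(opts["state_name"], opts["true_means_fail"])  # result True
```

Any qc-result without its own field entry falls back to the global options, so other gears are still read through the state key.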

Expanding supporting data

The dicom-qc gear reports on jsonschema validation, which can produce nested data such as this:

{
  ...
  "qc": {
    "dicom-qc": {
      "job_info": {...},
      "jsonschema-validation": {
        "data": [{
          "error_context": "",
          "error_message": "'dicom' is a required property",
          "error_type": "required",
          "error_value": ["dicom", "dicom_array"],
          "item": "file.info.header"
        },
        {
          "error_context": "",
          "error_message": "'dicom_array' is a required property",
          "error_type": "required",
          "error_value": ["dicom", "dicom_array"],
          "item": "file.info.header"
        }],
        "state": "FAIL"
      }
    }
  }
}

By default, if you ran qc-reporter on this, you would get a single row for this failed QC value. Because data is a list of length 2, you can instead get 2 rows (one for each failure) with a metadata-rules file like this:

---
fields:
  dicom-qc.jsonschema-validation:
    data:
      data: "unfold"

This will produce two rows for the one failed QC result with all the supporting data as the data field in the output CSV.

Output

If JSON output is selected, the output looks like the schema below. If CSV is selected, the content is the same, with each object in the list becoming a row in the CSV.

{
  [
    # Schema
    {
      # Machine readable quick access
      "file_id": <file id | None>,
      "version": <file version | None>,
      # Human readable quick access
      "filename": <filename>,
      "subject.label": <label>,
      "session.label": <label | None>,
      "acquisition.label": <label | None>,
      "analysis.label": <label | None>,
      # for easier nav, non-existent for subject
      "session-url": <session-url>,
      # QC result
      "state": <pass | fail>,
      "qc-namespace": <top level key under file.info.qc>,
      "qc": <key of the qc result>,
      "data": <supporting-data>,
      "key": <only used for "unfold" operation in custom optional input>,
      "value": <only used for "unfold" operation in custom optional input>,
    },
    ## Examples
    # For specifically dicom-qc "jsonschema" and dicom-fixer "fixed" (both list types), unfold that list
    {
      …
      "qc-namespace": "dicom-qc",
      "qc": "jsonschema",
      "state": PASS | FAIL,
      "data": <error_message[0]>
    },
    …
    {
      …
      "qc-namespace": "dicom-qc",
      "qc": "jsonschema",
      "state": PASS | FAIL,
      "data": <error_message[n]>
    },
    {
      …
      "qc-namespace": "dicom-fixer",
      "qc": "fixed",
      "state": PASS | FAIL,
      "data": <fix[1]>
    },
    {
      …
      "qc-namespace": "dicom-fixer",
      "qc": "fixed",
      "state": PASS | FAIL,
      "data": <fix[n]>
    },
  ]
}
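The relationship between the JSON and CSV forms can be sketched with the standard library alone. The function render_report and the COLUMNS list are hypothetical; the column names and their order are copied from the schema above, and the real gear may order or serialize them differently.

```python
import csv
import io
import json

# Column names taken from the schema above; the ordering is an assumption.
COLUMNS = [
    "file_id", "version", "filename", "subject.label", "session.label",
    "acquisition.label", "analysis.label", "session-url", "state",
    "qc-namespace", "qc", "data", "key", "value",
]


def render_report(rows, output_format="csv"):
    """Serialize report rows as JSON or CSV text (illustrative sketch only)."""
    if output_format == "json":
        return json.dumps(rows, indent=2)
    buffer = io.StringIO()
    # restval fills columns that a row does not provide with an empty string.
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample_rows = [{"filename": "example.dcm", "state": "FAIL", "qc-namespace": "dicom-qc",
                "qc": "jsonschema", "data": "'dicom' is a required property"}]
csv_text = render_report(sample_rows)
json_text = render_report(sample_rows, output_format="json")
```

Each dictionary in the list becomes one CSV row under a shared header, which matches the one-object-per-row relationship described above.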

Workflow

A general workflow:

1. Upload data to a project with gear rules enabled.
2. Gear rules run.
3. Run any custom QC gears across the project.
4. Run metadata-error-reporter on the project or a subsection of the project.
5. Use the output report to correct QC errors.

Logging

An overview/orientation of the logging and how to interpret it.

FAQ

FAQ.md

Contributing

For more information about how to get started contributing to this gear, check out CONTRIBUTING.md.

Release history

This version

0.3.3

Dec 5, 2024

0.3.2

Dec 3, 2024

fw_gear_qc_reporter-0.3.3-py3-none-any.whl (13.5 kB view details)

Uploaded Dec 5, 2024 Python 3

File details

Details for the file fw_gear_qc_reporter-0.3.3-py3-none-any.whl.

File metadata

Download URL: fw_gear_qc_reporter-0.3.3-py3-none-any.whl
Upload date: Dec 5, 2024
Size: 13.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.4 CPython/3.12.7 Linux/5.15.154+

File hashes

Hashes for fw_gear_qc_reporter-0.3.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`bbc767e7348002a7e4e72712fc0f263752eb7a92b7d1b578d193367f2365972d`
MD5	`22c2eaf9c9bcfafc0fe57bf5564a52c3`
BLAKE2b-256	`24a36f932c68fdef50048afb6da366ed6d28b44c9541f099a530b891cda580eb`
