Utilities for the Laboratory Catalog and Archive Service of the Early Detection Research Network
LabCAS Utilities
This is a hodge-podge collection of various utilities for managing, reporting on, and maintaining instances of the Laboratory Catalog and Archive System (LabCAS).
📀 Installation
Using Python 3.9 or newer, create a virtual environment and install the package into it:
python3 -m venv venv
venv/bin/pip install jpl.labcas.utils
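To confirm that the install landed in the virtual environment, a small check along these lines may help. It is only a sketch: it assumes the distribution is registered under the name `jpl.labcas.utils` (the name used by the source repository and release files), and `installed_version` is a hypothetical helper, not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str = "jpl.labcas.utils"):
    """Return the installed version of a distribution, or None if it is absent.

    The default distribution name is an assumption based on the project's
    repository and release file names.
    """
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Report what the current interpreter can see.
print(installed_version() or "jpl.labcas.utils is not installed in this environment")
```

Run it with `venv/bin/python` so that it inspects the virtual environment rather than the system Python.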
🔧 The Utilities
The numerous utilities are quickly summarized below:
- assign-uuids: finds Solr documents without a UUID and assigns one
- backup-labcas: issues backup commands for the LabCAS Solr cores "collections", "datasets", and "files"
- common-prefixes: allegedly finds common prefixes in FileLocation fields in Solr (but doesn't?)
- date-report: writes a CSV to stdout of protocols with dates and the dates of corresponding collections in LabCAS
- dcm-header-usage: shows usage of DICOM headers in the Lung_Team_Project_2 and Prostate_MRI collections based on a Google spreadsheet for input
- delete-collection: deletes a collection (including all of its datasets and files) from LabCAS Solr
- delete-datasets: deletes datasets from LabCAS Solr while producing a CSV of the corresponding files that will need to be deleted from disk
- delete-field: deletes a field from documents in Solr
- field-usage: tells what fields are in use by collections, datasets, and files in Solr and marks those that appear in collection and dataset .cfg files
- fix-event-ids: repairs event IDs in Solr after publication of a dataset (or collection) that uses event IDs based on a LabCAS publish alias.json file
- fix-patient-ids: overwrites the PatientID field in DICOM files with event IDs from Solr
- fix-principals: sets the OwnerPrincipal fields of the "collections", "datasets", and "files" cores in Solr based on information in metadata .cfg files
- mangle-headers: mangles DICOM headers according to Radka's specifications
- mass-spec-fix: fixes misspelled "mass spectrometry" in Solr cores
- missing-bbd-dcis: finds missing anonymized BBD and DCIS not specified in a given .csv file
- missing-event-ids: given event IDs, reports which are missing in LabCAS Solr
- populate-bbd-dcis: populates the BBD or DCIS files in Solr with filenames in a BBD or DCIS .csv file
- replace-field: replaces a list field in Solr with a new single value
- replace-fields: replaces a list field in Solr with multiple values
- report: generates various reports, including events, privacy, event correlation, availability, or patient IDs
- report-fields: generates a report on requested fields
- report-file-size: reports the total size of all files in LabCAS using Solr metadata
- restructure-bbd-dcis: makes symlinks into the Validation and Discovery folders for BBD and DCIS data on disk
- s3-report: generates a report about files, sub-folders, and average number of files in sub-folders in S3
- split-brsi: splits the contents of a gzip'd tar file into training and validation folders based on a spreadsheet input
- sub-field: substitutes the value of a field in multiple documents
Many of these utilities are one-offs, which is typical for LabCAS.
🔁 Looping
Some of these utilities loop over large collections of data, paginating through results and making updates. You may have to run the utilities multiple times until they report updating no more documents.
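The "run it again until nothing changes" pattern can be sketched as follows. This is an illustration only, not code from the package: `run_until_stable` and `fake_pass` are hypothetical names, and the fake pass simply drains an in-memory list instead of talking to Solr.

```python
from typing import Callable


def run_until_stable(update_batch: Callable[[], int], max_passes: int = 100) -> int:
    """Call update_batch repeatedly until it reports zero updated documents.

    update_batch is a hypothetical stand-in for one run of a utility; it
    returns how many documents it updated on that pass.
    """
    total = 0
    for _ in range(max_passes):
        updated = update_batch()
        if updated == 0:
            return total
        total += updated
    raise RuntimeError(f"still updating after {max_passes} passes")


# Demonstration with fake data: 25 documents need fixing, 10 per pass.
pending = list(range(25))


def fake_pass(page_size: int = 10) -> int:
    """Pretend to update one page of documents and report how many."""
    batch = pending[:page_size]
    del pending[:page_size]
    return len(batch)


total_updated = run_until_stable(fake_pass)
print(total_updated)
```

The `max_passes` cap is a design choice of this sketch: it keeps a utility that never converges from looping forever.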
🛤️ Solr and Tunneling
Many of these utilities operate on a Solr instance that is assumed to be at https://localhost:8984/solr/ with a self-signed certificate. You can override this location with the --solr option.
Feel free to tunnel these connections over ssh to your preferred Solr, or run the utilities directly on a host like edrn-labcas, mcl-labcas, labcas-dev, and so forth.
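For a quick check that Solr is reachable at the default address (directly or through a tunnel), a standard-library sketch like the one below may be useful. It is not how the utilities themselves connect. The helper names are hypothetical, and it assumes the usual Solr `select` handler and JSON response layout. Certificate verification is turned off only because the certificate is self-signed.

```python
import json
import ssl
import urllib.parse
import urllib.request

DEFAULT_SOLR = "https://localhost:8984/solr/"  # default location stated above


def select_url(core: str, query: str = "*:*", rows: int = 10, solr: str = DEFAULT_SOLR) -> str:
    """Build a URL for the standard Solr select handler of the given core."""
    base = solr.rstrip("/") + "/"
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base}{core}/select?{params}"


def self_signed_context() -> ssl.SSLContext:
    """Return a TLS context that accepts a self-signed certificate."""
    context = ssl.create_default_context()
    context.check_hostname = False  # must be disabled before verify_mode
    context.verify_mode = ssl.CERT_NONE
    return context


def count_documents(core: str, solr: str = DEFAULT_SOLR) -> int:
    """Ask Solr how many documents a core holds (requires a reachable Solr)."""
    url = select_url(core, rows=0, solr=solr)
    with urllib.request.urlopen(url, context=self_signed_context(), timeout=30) as response:
        return json.load(response)["response"]["numFound"]
```

With Solr reachable, `count_documents("files")` would return the number of documents in the "files" core; the same call works for "collections" and "datasets".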
🖥️ Development
To install from source:
git clone https://github.com/jpl-labcas/jpl.labcas.utils.git
cd jpl.labcas.utils
pre-commit install
python3 -m venv .venv
source .venv/bin/activate # or activate.csh if you're a csh/tcsh user
pip install --editable .
To release to PyPI:
python3 -m build .
twine upload dist/*
👩‍🎨 Creators
The principal developer is:
To contact the team as a whole, email the Informatics Center.
📃 License
The project is licensed under the Apache version 2 license.