A utility for packaging objects and validating metadata for FAIRSCAPE
fairscape-cli
Documentation: https://fairscape.github.io/fairscape-cli/
Features
fairscape-cli provides a command-line interface (CLI) for creating the following on the client side:
- RO-Crate - a lightweight approach to packaging research data with its metadata. The CLI allows users to:
  - Create Research Object Crates (RO-Crates)
  - Add (transfer) digital objects to the RO-Crate
  - Register metadata of the objects
  - Describe the schema of tabular dataset objects as metadata and perform validation
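For orientation, an RO-Crate is a directory whose contents are described by a JSON-LD file. The sketch below builds a generic, minimal metadata document in the style of the RO-Crate 1.1 specification. It is not the output of fairscape-cli, whose metadata layout and identifiers may differ. The function name `minimal_ro_crate_metadata` is hypothetical, and the sketch omits properties the specification expects on a complete crate (such as a publication date and a license).

```python
import json


def minimal_ro_crate_metadata(name, description, keywords):
    """Build a generic RO-Crate 1.1 style metadata document as a dict.

    Hypothetical helper for illustration only; fairscape-cli may write
    a different structure.
    """
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # Descriptor entity: points at the root dataset of the crate.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # Root dataset: the crate itself.
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "description": description,
                "keywords": list(keywords),
                "hasPart": [],  # datasets and software added later go here
            },
        ],
    }


metadata = minimal_ro_crate_metadata(
    "test rocrate",
    "Example RO Crate for Tests",
    ["b2ai", "cm4ai", "U2OS"],
)
print(json.dumps(metadata, indent=2))
```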
Requirements
Python 3.8+
Installation
$ pip install fairscape-cli
Minimal example
Basic commands
- Show all commands, arguments, and options
$ fairscape-cli --help
- Create an RO-Crate in a specified directory
$ fairscape-cli rocrate create \
--name "test rocrate" \
--description "Example RO Crate for Tests" \
--organization-name "UVA" \
--project-name "B2AI" \
--keywords "b2ai" \
--keywords "cm4ai" \
--keywords "U2OS" \
"./test_rocrate"
- Create an RO-Crate in the current working directory
$ fairscape-cli rocrate init \
--name "test rocrate" \
--description "Example RO Crate for Tests" \
--organization-name "UVA" \
--project-name "B2AI" \
--keywords "b2ai" \
--keywords "cm4ai" \
--keywords "U2OS"
- Add a dataset to the RO-Crate
$ fairscape-cli rocrate add dataset \
--name "AP-MS embeddings" \
--author "Krogan lab (https://kroganlab.ucsf.edu/krogan-lab)" \
--version "1.0" \
--date-published "2021-04-23" \
--description "Affinity purification mass spectrometer (APMS) embeddings for each protein in the study, generated by node2vec predict." \
--keywords "b2ai" \
--keywords "cm4ai" \
--keywords "U2OS" \
--data-format "CSV" \
--source-filepath "./tests/data/APMS_embedding_MUSIC.csv" \
--destination-filepath "./test_rocrate/APMS_embedding_MUSIC.csv" \
"./test_rocrate"
- Add software to the RO-Crate
$ fairscape-cli rocrate add software \
--name "calibrate pairwise distance" \
--author "Qin, Y." \
--version "1.0" \
--description "script written in python to calibrate pairwise distance." \
--keywords "b2ai" \
--keywords "cm4ai" \
--keywords "U2OS" \
--file-format "py" \
--source-filepath "./tests/data/calibrate_pairwise_distance.py" \
--destination-filepath "./test_rocrate/calibrate_pairwise_distance.py" \
--date-modified "2021-04-23" \
"./test_rocrate"
- Register a computation in the RO-Crate
$ fairscape-cli rocrate register computation \
--name "calibrate pairwise distance" \
--run-by "Qin, Y." \
--date-created "2021-05-23" \
--description "Average the predicted proximities" \
--keywords "b2ai" \
--keywords "cm4ai" \
--keywords "U2OS" \
"./test_rocrate"
- Create a schema
$ fairscape-cli schema create-tabular \
--name 'APMS Embedding Schema' \
--description 'Tabular format for APMS music embeddings from PPI networks from the music pipeline from the B2AI Cellmaps for AI project' \
--separator ',' \
--header False \
./schema_apms_music_embedding.json
- Add a string property
$ fairscape-cli schema add-property string \
--name 'Experiment Identifier' \
--index 0 \
--description 'Identifier for the APMS experiment responsible for generating the raw PPI used to create this embedding vector' \
--pattern '^APMS_[0-9]*$' \
./schema_apms_music_embedding.json
- Add another string property
$ fairscape-cli schema add-property string \
--name 'Gene Symbol' \
--index 1 \
--description 'Gene Symbol for the APMS bait protein' \
--pattern '^[A-Za-z0-9\-]*$' \
--value-url 'http://edamontology.org/data_1026' \
./schema_apms_music_embedding.json
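To see which values the two `--pattern` expressions above accept, they can be tried out with Python's `re` module. This is a minimal sketch that assumes the CLI interprets the patterns the way Python does, which the text above does not state. The sample values are made up for illustration.

```python
import re

# Patterns copied from the two add-property commands above.
EXPERIMENT_ID = re.compile(r"^APMS_[0-9]*$")
GENE_SYMBOL = re.compile(r"^[A-Za-z0-9\-]*$")


def matches(pattern, value):
    """Return True if the whole value satisfies the anchored pattern."""
    return pattern.match(value) is not None


# Illustrative sample values (not taken from the real dataset).
for value in ["APMS_1", "APMS_", "APMS_1a", "apms_1"]:
    print(repr(value), matches(EXPERIMENT_ID, value))

for value in ["HLA-A", "TP53", "", "TP 53"]:
    print(repr(value), matches(GENE_SYMBOL, value))
```

Because both patterns use `*`, they also accept a value with zero matching characters: `APMS_` with no digits passes the first pattern and an empty string passes the second. If empty values should be rejected, `+` in place of `*` would be one way to express that.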
- Add an array property
$ fairscape-cli schema add-property array \
--name 'MUSIC APMS Embedding' \
--index '2::' \
--description 'Embedding Vector values for genes determined by running node2vec on APMS PPI networks. Vector has 1024 values for each bait protein' \
--items-datatype 'number' \
--unique-items False \
--min-items 1024 \
--max-items 1024 \
./schema_apms_music_embedding.json
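The array property above can be read as "every column from index 2 onward belongs to the embedding, and there must be exactly 1024 numeric values". The sketch below spells out that reading in plain Python. It assumes `'2::'` behaves like the Python slice `row[2:]`, which the text above does not confirm, and `check_embedding_row` is a hypothetical helper, not part of fairscape-cli.

```python
def check_embedding_row(row, start=2, min_items=1024, max_items=1024):
    """Return a list of problems found in one row (empty list means OK).

    Hypothetical illustration of the array constraints; not the CLI's
    actual validation logic.
    """
    problems = []
    values = row[start:]  # assumed meaning of --index '2::'

    if not (min_items <= len(values) <= max_items):
        problems.append(
            f"expected between {min_items} and {max_items} items, "
            f"found {len(values)}"
        )

    for offset, cell in enumerate(values):
        try:
            float(cell)  # --items-datatype 'number'
        except ValueError:
            problems.append(f"column {start + offset} is not a number: {cell!r}")

    return problems


# Made-up rows: identifier, gene symbol, then the embedding values.
good_row = ["APMS_1", "TP53"] + ["0.5"] * 1024
short_row = ["APMS_1", "TP53"] + ["0.5"] * 10
bad_row = ["APMS_1", "TP53"] + ["0.5"] * 1023 + ["n/a"]

for label, row in [("good", good_row), ("short", short_row), ("bad", bad_row)]:
    print(label, check_embedding_row(row))
```

Note that `float()` also accepts strings such as `nan` and `inf`, so a stricter numeric check may be needed depending on the data.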
- Validate a dataset against the schema (expected to succeed)
$ fairscape-cli schema validate \
--data ./examples/schemas/MUSIC_embedding/APMS_embedding_MUSIC.csv \
--schema ./examples/schemas/MUSIC_embedding/music_apms_embedding_schema.json
- Validate a corrupted dataset against the schema (expected to fail)
$ fairscape-cli schema validate \
--data examples/schemas/MUSIC_embedding/APMS_embedding_corrupted.csv \
--schema examples/schemas/MUSIC_embedding/music_apms_embedding_schema.json
- Validate using default schemas
# validate imageloader files
$ fairscape-cli schema validate \
--data "examples/schemas/cm4ai-rocrates/imageloader/samplescopy.csv" \
--schema "ark:59852/schema-cm4ai-imageloader-samplescopy"
$ fairscape-cli schema validate \
--data "examples/schemas/cm4ai-rocrates/imageloader/uniquecopy.csv" \
--schema "ark:59852/schema-cm4ai-imageloader-uniquecopy"
# validate image embedding outputs
$ fairscape-cli schema validate \
--data "examples/schemas/cm4ai-rocrates/image_embedding/image_emd.tsv" \
--schema "ark:59852/schema-cm4ai-image-embedding-image-emd"
$ fairscape-cli schema validate \
--data "examples/schemas/cm4ai-rocrates/image_embedding/labels_prob.tsv" \
--schema "ark:59852/schema-cm4ai-image-embedding-labels-prob"
# validate apms loader input
$ fairscape-cli schema validate \
--data "examples/schemas/cm4ai-rocrates/apmsloader/ppi_gene_node_attributes.tsv" \
--schema "ark:59852/schema-cm4ai-apmsloader-gene-node-attributes"
$ fairscape-cli schema validate \
--data "examples/schemas/cm4ai-rocrates/apmsloader/ppi_edgelist.tsv" \
--schema "ark:59852/schema-cm4ai-apmsloader-ppi-edgelist"
# validate apms embedding
$ fairscape-cli schema validate \
--data "examples/schemas/cm4ai-rocrates/apms_embedding/ppi_emd.tsv" \
--schema "ark:59852/schema-cm4ai-apms-embedding"
# validate coembedding
$ fairscape-cli schema validate \
--data "examples/schemas/cm4ai-rocrates/coembedding/coembedding_emd.tsv" \
--schema "ark:59852/schema-cm4ai-coembedding"
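The default-schema checks above repeat the same command with different arguments, so they can be driven from a small script. The following is a sketch, not part of fairscape-cli: the helper names are hypothetical, only the `--data` and `--schema` options shown above are used, and the script records each return code without assuming what a particular code means.

```python
import shutil
import subprocess

# (data file, schema identifier) pairs copied from the commands above.
DEFAULT_CHECKS = [
    ("examples/schemas/cm4ai-rocrates/imageloader/samplescopy.csv",
     "ark:59852/schema-cm4ai-imageloader-samplescopy"),
    ("examples/schemas/cm4ai-rocrates/imageloader/uniquecopy.csv",
     "ark:59852/schema-cm4ai-imageloader-uniquecopy"),
    ("examples/schemas/cm4ai-rocrates/image_embedding/image_emd.tsv",
     "ark:59852/schema-cm4ai-image-embedding-image-emd"),
    ("examples/schemas/cm4ai-rocrates/image_embedding/labels_prob.tsv",
     "ark:59852/schema-cm4ai-image-embedding-labels-prob"),
    ("examples/schemas/cm4ai-rocrates/apmsloader/ppi_gene_node_attributes.tsv",
     "ark:59852/schema-cm4ai-apmsloader-gene-node-attributes"),
    ("examples/schemas/cm4ai-rocrates/apmsloader/ppi_edgelist.tsv",
     "ark:59852/schema-cm4ai-apmsloader-ppi-edgelist"),
    ("examples/schemas/cm4ai-rocrates/apms_embedding/ppi_emd.tsv",
     "ark:59852/schema-cm4ai-apms-embedding"),
    ("examples/schemas/cm4ai-rocrates/coembedding/coembedding_emd.tsv",
     "ark:59852/schema-cm4ai-coembedding"),
]


def build_validate_command(data_path, schema_ref):
    """Return the argument list for one `schema validate` call."""
    return ["fairscape-cli", "schema", "validate",
            "--data", data_path, "--schema", schema_ref]


def run_checks(checks):
    """Run each check if the CLI is installed; otherwise just print the commands."""
    if shutil.which("fairscape-cli") is None:
        print("fairscape-cli not found on PATH; printing commands only")
        for data_path, schema_ref in checks:
            print(" ".join(build_validate_command(data_path, schema_ref)))
        return []

    results = []
    for data_path, schema_ref in checks:
        completed = subprocess.run(
            build_validate_command(data_path, schema_ref),
            capture_output=True,
            text=True,
        )
        # Record the return code as-is; its meaning is not assumed here.
        results.append((data_path, completed.returncode))
        print(data_path, "-> return code", completed.returncode)
    return results


if __name__ == "__main__":
    run_checks(DEFAULT_CHECKS)
```

Passing the arguments as a list avoids shell quoting issues with the `ark:` identifiers and file paths.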
Contribution
If you'd like to request a feature or report a bug, please create a GitHub Issue using one of the templates provided.
License
This project is licensed under the terms of the MIT license.