Skip to main content

A utility for packaging objects and validating metadata for FAIRSCAPE

Project description

fairscape-cli

A utility for packaging objects and validating metadata for FAIRSCAPE.


Documentation: https://fairscape.github.io/fairscape-cli/

Features

fairscape-cli provides a Command Line Interface (CLI) that allows the client side to create:

  • RO-Crate - a light-weight approach to packaging research data with their metadata. The CLI allows users to:
    • Create Research Object Crates (RO-Crates)
    • Add (transfer) digital objects to the RO-Crate
    • Register metadata of the objects
    • Describe the schema of tabular dataset objects as metadata and perform validation.

Requirements

Python 3.8+

Installation

$ pip install fairscape-cli

Minimal example

Basic commands

  • Show all commands, arguments, and options
$ fairscape-cli --help
  • Create an RO-Crate in a specified directory
$ fairscape-cli rocrate create \
  --name "test rocrate" \
  --description "Example RO Crate for Tests" \
  --organization-name "UVA" \
  --project-name "B2AI"  \
  --keywords "b2ai" \
  --keywords "cm4ai" \
  --keywords "U2OS" \
  "./test_rocrate"
  • Create an RO-Crate in the current working directory
$ fairscape-cli rocrate init \
  --name "test rocrate" \
  --description "Example RO Crate for Tests" \
  --organization-name "UVA" \
  --project-name "B2AI"  \
  --keywords "b2ai" \
  --keywords "cm4ai" \
  --keywords "U2OS"
  • Add a dataset to the RO-Crate
$ fairscape-cli rocrate add dataset \
  --name "AP-MS embeddings" \
  --author "Krogan lab (https://kroganlab.ucsf.edu/krogan-lab)" \
  --version "1.0" \
  --date-published "2021-04-23" \
  --description "Affinity purification mass spectrometer (APMS) embeddings for each protein in the study,  generated by node2vec predict." \
  --keywords "b2ai" \
  --keywords "cm4ai" \
  --keywords "U2OS" \
  --data-format "CSV" \
  --source-filepath "./tests/data/APMS_embedding_MUSIC.csv" \
  --destination-filepath "./test_rocrate/APMS_embedding_MUSIC.csv" \
  "./test_rocrate"
  • Add a software to the RO-Crate
$ fairscape-cli rocrate add software \
  --name "calibrate pairwise distance" \
  --author "Qin, Y." \
  --version "1.0" \
  --description "script written in python to calibrate pairwise distance." \
  --keywords "b2ai" \
  --keywords "cm4ai" \
  --keywords "U2OS" \
  --file-format "py" \
  --source-filepath "./tests/data/calibrate_pairwise_distance.py" \
  --destination-filepath "./test_rocrate/calibrate_pairwise_distance.py" \
  --date-modified "2021-04-23" \
  "./test_rocrate"
  • Register a computation to the RO-Crate
$ fairscape-cli rocrate register computation \
  --name "calibrate pairwise distance" \
  --run-by "Qin, Y." \
  --date-created "2021-05-23" \
  --description "Average the predicted proximities" \
  --keywords "b2ai" \
  --keywords "cm4ai" \
  --keywords "U2OS" \
  "./test_rocrate"
  • Create a schema
$ fairscape-cli schema create-tabular \
    --name 'APMS Embedding Schema' \
    --description 'Tabular format for APMS music embeddings from PPI networks from the music pipeline from the B2AI Cellmaps for AI project' \
    --separator ',' \
    --header False \
    ./schema_apms_music_embedding.json
  • Add a string property
$ fairscape-cli schema add-property string \
    --name 'Experiment Identifier' \
    --index 0 \
    --description 'Identifier for the APMS experiment responsible for generating the raw PPI used to create this embedding vector' \
    --pattern '^APMS_[0-9]*$' \
    ./schema_apms_music_embedding.json
  • Add annother string property
$ fairscape-cli schema add-property string \
    --name 'Gene Symbol' \
    --index 1 \
    --description 'Gene Symbol for the APMS bait protien' \
    --pattern '^[A-Za-z0-9\-]*$' \
    --value-url 'http://edamontology.org/data_1026' \
    ./schema_apms_music_embedding.json
  • Add an array property
$ fairscape-cli schema add-property array \
    --name 'MUSIC APMS Embedding' \
    --index '2::' \
    --description 'Embedding Vector values for genes determined by running node2vec on APMS PPI networks. Vector has 1024 values for each bait protien' \
    --items-datatype 'number' \
    --unique-items False \
    --min-items 1024 \
    --max-items 1024 \
    ./schema_apms_music_embedding.json
  • Show a successful validation of the schema against the dataset
$ fairscape-cli schema validate \
    --data ./examples/schemas/MUSIC_embedding/APMS_embedding_MUSIC.csv  \
    --schema ./examples/schemas/MUSIC_embedding/music_apms_embedding_schema.json
  • Show an unsuccessful validation of the schema against the dataset
$ fairscape-cli schema validate \
    --data examples/schemas/MUSIC_embedding/APMS_embedding_corrupted.csv \
    --schema examples/schemas/MUSIC_embedding/music_apms_embedding_schema.json
  • Validate using default schemas
# validate imageloader files
$ fairscape-cli schema validate \
        --data "examples/schemas/cm4ai-rocrates/imageloader/samplescopy.csv" \
        --schema "ark:59852/schema-cm4ai-imageloader-samplescopy" 
    
$ fairscape-cli schema validate \
        --data "examples/schemas/cm4ai-rocrates/imageloader/uniquecopy.csv" \
        --schema "ark:59852/schema-cm4ai-imageloader-uniquecopy"
       
# validate image embedding outputs
$ fairscape-cli schema validate \
        --data "examples/schemas/cm4ai-rocrates/image_embedding/image_emd.tsv" \
        --schema "ark:59852/schema-cm4ai-image-embedding-image-emd"
     
$ fairscape-cli schema validate \
        --data "examples/schemas/cm4ai-rocrates/image_embedding/labels_prob.tsv" \
        --schema "ark:59852/schema-cm4ai-image-embedding-labels-prob"

# validate apsm loader input
$ fairscape-cli schema validate \
        --data "examples/schemas/cm4ai-rocrates/apmsloader/ppi_gene_node_attributes.tsv" \
        --schema "ark:59852/schema-cm4ai-apmsloader-gene-node-attributes"

$ fairscape-cli schema validate \
        --data "examples/schemas/cm4ai-rocrates/apmsloader/ppi_edgelist.tsv" \
        --schema "ark:59852/schema-cm4ai-apmsloader-ppi-edgelist"

# validate apms embedding 
$ fairscape-cli schema validate \
        --data "examples/schemas/cm4ai-rocrates/apms_embedding/ppi_emd.tsv" \
        --schema "ark:59852/schema-cm4ai-apms-embedding"    

# validate coembedding 
$ fairscape-cli schema validate \
        --data "examples/schemas/cm4ai-rocrates/coembedding/coembedding_emd.tsv" \
        --schema "ark:59852/schema-cm4ai-coembedding"

Contribution

If you'd like to request a feature or report a bug, please create a GitHub Issue using one of the templates provided.

License

This project is licensed under the terms of the MIT license.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fairscape_cli-1.0.2.tar.gz (34.6 kB view details)

Uploaded Source

Built Distribution

fairscape_cli-1.0.2-py3-none-any.whl (38.8 kB view details)

Uploaded Python 3

File details

Details for the file fairscape_cli-1.0.2.tar.gz.

File metadata

  • Download URL: fairscape_cli-1.0.2.tar.gz
  • Upload date:
  • Size: 34.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for fairscape_cli-1.0.2.tar.gz
Algorithm Hash digest
SHA256 8c091e279d7299cf2f73f656e4385a46a780db96eb05ce036c5e63a0a995a24b
MD5 393d67eb714df59253044a8dfa0bf380
BLAKE2b-256 e356ebb683dd2ceb9f281d8eeec02feb7a964ae0357d11bcb2cb6052d97690d0

See more details on using hashes here.

File details

Details for the file fairscape_cli-1.0.2-py3-none-any.whl.

File metadata

File hashes

Hashes for fairscape_cli-1.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 8ee278ced78a18007628777df4de1ab9e08249dcd4b39b1dbf1076038521492d
MD5 a1444fe84f102688cf8ee64c10313b3f
BLAKE2b-256 38831c6eb81a5fe9c729fb401bbdb3ca647b94a6883840d5815549d3e6074a17

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page