CLI tool and SDK for interacting with the Cirro platform
Cirro Client
A Python 3.8+ library for the Cirro platform.
Installation
You can install Cirro using pip:
pip install cirro
or by cloning the repository and running:
python setup.py install
Set Up
Run a one-time configuration of your login credentials in the command line using:
cirro-cli configure
This will ask you to select an authentication method. If you are a member of Fred Hutch or the University of Washington, select the default method, which will give you a link to log in through the browser. If you are not a member of those institutions, select the non-institutional authentication method and enter your Data Portal username and password in the command line when prompted.
Command Line Usage
Downloading a dataset:
Usage: cirro-cli download [OPTIONS]
Download dataset files
Options:
--project TEXT Name or ID of the project
--dataset TEXT ID of the dataset
--data-directory TEXT Directory to store the files
-i, --interactive Gather arguments interactively
--help Show this message and exit.
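When scripting downloads from Python, the command can be assembled as an argument list and passed to `subprocess`. The following is a minimal sketch that uses only the options listed above; the helper names `build_download_command` and `run_download` are hypothetical and not part of the Cirro library, and the project, dataset, and directory values are placeholders.

```python
import shlex
import subprocess
from typing import List


def build_download_command(project: str, dataset: str, data_directory: str) -> List[str]:
    """Assemble the argument list for `cirro-cli download` (hypothetical helper)."""
    return [
        "cirro-cli", "download",
        "--project", project,
        "--dataset", dataset,
        "--data-directory", data_directory,
    ]


def run_download(project: str, dataset: str, data_directory: str) -> None:
    """Run the download; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_download_command(project, dataset, data_directory), check=True)


# Print the command instead of running it, so this sketch has no side effects.
# All three values below are placeholders.
print(shlex.join(build_download_command("Test project", "my-dataset-id", "./downloads")))
```

Passing a list rather than a single string avoids shell quoting problems when a project name contains spaces.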
Uploading a dataset:
Usage: cirro-cli upload [OPTIONS]
Upload and create a dataset
Options:
--name TEXT Name of the dataset
--description TEXT Description of the dataset (optional)
--project TEXT Name or ID of the project
--process TEXT Name or ID of the ingest process
--data-directory TEXT Directory you wish to upload
-i, --interactive Gather arguments interactively
--help Show this message and exit.
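The same approach works for uploads. This sketch builds the `cirro-cli upload` argument list from the options above and appends `--description` only when one is given, since the help text marks it as optional. The function name `build_upload_command` is hypothetical, and the example values are placeholders.

```python
from typing import List, Optional


def build_upload_command(
    name: str,
    project: str,
    process: str,
    data_directory: str,
    description: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for `cirro-cli upload` (hypothetical helper)."""
    command = [
        "cirro-cli", "upload",
        "--name", name,
        "--project", project,
        "--process", process,
        "--data-directory", data_directory,
    ]
    # --description is optional, so only pass it when a value was provided.
    if description:
        command += ["--description", description]
    return command


# Placeholder values; pass the resulting list to subprocess.run(..., check=True) to execute it.
print(build_upload_command("test", "Test project", "Illumina Sequencing Run", "/shared/biodata/test"))
```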
Listing datasets:
Usage: cirro-cli list-datasets [OPTIONS]
List available datasets
Options:
--project TEXT ID of the project
-i, --interactive Gather arguments interactively
--help Show this message and exit.
Interactive Commands
When running a command, you can specify the --interactive flag to gather the command arguments interactively.
Example:
$ cirro-cli upload --interactive
? What project is this dataset associated with? Test project
? Enter the full path of the data directory /shared/biodata/test
? Please confirm that you wish to upload 20 files (0.630 GB) Yes
? What type of files? Illumina Sequencing Run
? What is the name of this dataset? test
? Enter a description of the dataset (optional)
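The confirmation prompt above reports a file count and total size. To check a directory before uploading, a summary like that can be approximated in Python. This is only a sketch: the function names are hypothetical, and the unit convention (1 GB = 10^9 bytes) is an assumption, so the CLI's figure may differ.

```python
import os
import tempfile
from typing import Tuple


def summarize_directory(path: str) -> Tuple[int, int]:
    """Return (file count, total size in bytes) for all files under `path`."""
    count = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for file_name in files:
            count += 1
            total_bytes += os.path.getsize(os.path.join(root, file_name))
    return count, total_bytes


def format_summary(count: int, total_bytes: int) -> str:
    """Format the summary in the style of the confirmation prompt."""
    # Assumption: 1 GB = 10**9 bytes; the CLI may use a different convention.
    return f"{count} files ({total_bytes / 1e9:.3f} GB)"


# Demonstrate on a throwaway directory containing one small file.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "reads.fastq"), "wb") as handle:
        handle.write(b"ACGT" * 250)
    print(format_summary(*summarize_directory(tmp)))
```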
Python Usage
The following Jupyter notebooks contain examples on these topics:
Jupyter Notebook | Topic |
---|---|
Introduction | Installing and authenticating |
Uploading a dataset | Uploading data |
Downloading a dataset | Downloading data |
Interacting with a dataset | Calling data and reading into tables |
Analyzing a dataset | Running analysis pipelines |
Using references | Managing reference data |
R Usage
Jupyter Notebook | Topic |
---|---|
Downloading a dataset in R | Reading data with R |
Advanced Usage
Supported environment variables
Name | Description | Default |
---|---|---|
PW_HOME | Local configuration directory | ~/.cirro |
PW_BASE_URL | Base URL of the data portal | data-portal.io |
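As an illustration of how these variables and their defaults fit together, the sketch below resolves both settings from an environment mapping. It is not the library's actual lookup code; `resolve_settings` is a hypothetical helper, and only the variable names and defaults come from the table above.

```python
import os
from pathlib import Path
from typing import Any, Dict, Mapping

DEFAULT_HOME = "~/.cirro"
DEFAULT_BASE_URL = "data-portal.io"


def resolve_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Resolve the configuration directory and base URL (hypothetical helper)."""
    # Fall back to the documented defaults when a variable is not set.
    home = Path(env.get("PW_HOME", DEFAULT_HOME)).expanduser()
    base_url = env.get("PW_BASE_URL", DEFAULT_BASE_URL)
    return {"home": home, "base_url": base_url}


# With no overrides, the documented defaults apply.
print(resolve_settings({}))
# An explicit mapping stands in for the real environment here.
print(resolve_settings({"PW_HOME": "/tmp/cirro-test", "PW_BASE_URL": "example.org"}))
```

Taking the environment as a parameter keeps the function easy to test without modifying `os.environ`.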
Configuration
The cirro-cli configure command creates a file in PW_HOME called config.ini.
You can set the base_url property in the config file rather than using the environment variable.
When uploading files to Cirro, network issues or temporary outages can occasionally cause a transfer to fail.
The transfer_max_retries configuration property specifies the maximum number of times to attempt uploading a file in the event of a transfer failure.
The client pauses for an increasing amount of time before each retry attempt.
[General]
base_url = data-portal.io
transfer_max_retries = 15
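To make the retry behavior concrete, the sketch below parses a config like the one above with the standard `configparser` module and retries a failing transfer with an increasing pause. The details are assumptions for illustration, not the client's actual implementation: the pause here doubles on each attempt, `transfer_max_retries` is treated as the total number of attempts, and only `OSError` is retried. The helper names are hypothetical.

```python
import configparser
import time
from typing import Any, Callable, Dict

SAMPLE_CONFIG = """
[General]
base_url = data-portal.io
transfer_max_retries = 15
"""


def load_general_settings(text: str) -> Dict[str, Any]:
    """Read the [General] section of a config.ini-style string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    general = parser["General"]
    return {
        "base_url": general.get("base_url", fallback="data-portal.io"),
        # Returns None when the property is absent.
        "transfer_max_retries": general.getint("transfer_max_retries"),
    }


def transfer_with_retries(
    transfer: Callable[[], Any],
    max_attempts: int,
    sleep: Callable[[float], None] = time.sleep,
    base_seconds: float = 1.0,
) -> Any:
    """Call `transfer`, pausing longer after each failure (assumed schedule)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transfer()
        except OSError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            # Assumption: the pause doubles each time (1s, 2s, 4s, ...).
            sleep(base_seconds * 2 ** (attempt - 1))


settings = load_general_settings(SAMPLE_CONFIG)
print(settings)
```

Injecting `sleep` as a parameter lets the retry schedule be checked without actually waiting.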