CLI tool for interacting with the PubWeb platform
PubWeb Client
A Python 3.8+ library for the PubWeb platform.
Installation
You can install PubWeb using pip:
pip install pubweb
or by cloning the repository and running:
python setup.py install
Set Up
Run a one-time configuration of your login credentials in the command line using:
pubweb-cli configure
This will ask you to select an authentication method. If you are a member of Fred Hutch or the University of Washington, select the default method, which gives you a link for logging in through your browser. If you are not a member of those institutions, select the non-institutional authentication method and enter your Data Portal username and password at the command line when prompted.
Command Line Usage
Downloading a dataset:
Usage: pubweb-cli download [OPTIONS]
Download dataset files
Options:
--project TEXT Name or ID of the project
--dataset TEXT ID of the dataset
--data-directory TEXT Directory to store the files
--interactive Gather arguments interactively
--help Show this message and exit.
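For scripted downloads, the options above can be assembled into an argument list and handed to the CLI from Python. The following is a minimal sketch, not part of the PubWeb library: `build_download_command` is a hypothetical helper, and the project, dataset, and directory values are placeholders. It only uses the flags listed in the help text above.

```python
import shlex
from typing import List, Optional


def build_download_command(project: str, dataset: str,
                           data_directory: Optional[str] = None) -> List[str]:
    """Assemble an argument list for `pubweb-cli download`.

    Hypothetical helper for illustration; it only uses the flags
    documented in the CLI help text.
    """
    cmd = ["pubweb-cli", "download", "--project", project, "--dataset", dataset]
    if data_directory is not None:
        cmd += ["--data-directory", data_directory]
    return cmd


# Placeholder values, not real project or dataset identifiers.
cmd = build_download_command("Test project", "dataset-id", "/tmp/pubweb-data")

# Show the shell-quoted command line that would be executed.
print(shlex.join(cmd))

# To actually run it (requires pubweb-cli to be installed and configured):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids quoting problems when a project name contains spaces.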
Uploading a dataset:
Usage: pubweb-cli upload [OPTIONS]
Upload and create a dataset
Options:
--name TEXT Name of the dataset
--description TEXT Description of the dataset (optional)
--project TEXT Name or ID of the project
--process TEXT Name or ID of the ingest process
--data-directory TEXT Directory you wish to upload
--interactive Gather arguments interactively
--use-third-party-tool Use third party tool for upload (Generate manifest and one-time upload authentication token)
--help Show this message and exit.
Listing datasets:
Usage: pubweb-cli list-datasets [OPTIONS]
List available datasets
Options:
--project TEXT ID of the project
--interactive Gather arguments interactively
--help Show this message and exit.
Interactive Commands
When running a command, you can specify the --interactive flag to gather the command arguments interactively.
Example:
$ pubweb-cli upload --interactive
? What project is this dataset associated with? Test project
? Enter the full path of the data directory /shared/biodata/test
? Please confirm that you wish to upload 20 files (0.630 GB) Yes
? What type of files? Illumina Sequencing Run
? What is the name of this dataset? test
? Enter a description of the dataset (optional)
? How would you like to upload or download your data? PubWeb CLI
Python Usage
The following Jupyter notebooks contain examples on these topics:
Jupyter Notebook | Topic |
---|---|
Introduction | Installing and authenticating |
Uploading a dataset | Uploading data |
Downloading a dataset | Downloading data |
Interacting with a dataset | Calling data and reading into tables |
Analyzing a dataset | Running analysis pipelines |
Using references | Managing reference data |
R Usage
Jupyter Notebook | Topic |
---|---|
Downloading a dataset in R | Reading data with R |
Advanced Usage
Supported environment variables
Name | Description | Default |
---|---|---|
PW_HOME | Local configuration directory | ~/.pubweb |
PW_BASE_URL | Base URL of the data portal | data-portal.io |
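One way to use these variables without touching your shell profile is to set them only for a single CLI invocation. The sketch below builds such an environment in Python. `pubweb_env` is a hypothetical helper, the directory `/tmp/pubweb-test` is a placeholder, and how the CLI treats an unset variable is taken from the Default column above.

```python
import os
from typing import Dict, Mapping, Optional


def pubweb_env(home: Optional[str] = None,
               base_url: Optional[str] = None,
               base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with PubWeb overrides applied.

    Hypothetical helper: only PW_HOME and PW_BASE_URL come from the
    table above; everything else is illustration.
    """
    env = dict(os.environ if base is None else base)
    if home is not None:
        env["PW_HOME"] = home          # local configuration directory
    if base_url is not None:
        env["PW_BASE_URL"] = base_url  # base URL of the data portal
    return env


# Placeholder directory; the base URL shown is the documented default.
env = pubweb_env(home="/tmp/pubweb-test", base_url="data-portal.io")
print(env["PW_HOME"], env["PW_BASE_URL"])

# To run a command with these settings (requires pubweb-cli):
# import subprocess
# subprocess.run(["pubweb-cli", "list-datasets", "--interactive"], env=env)
```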
Configuration
The pubweb-cli configure command creates a file in PW_HOME called config.ini.
You can set the base_url property in the config file rather than using the environment variable.
[General]
base_url = data-portal.io
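To check which base URL a given setup would resolve to, the file can be read with Python's standard configparser module. This is a sketch under stated assumptions, not the library's own lookup code: `resolve_base_url` is a hypothetical function, and the precedence it uses (environment variable first, then config.ini, then the documented default) is an assumption that this README does not confirm.

```python
import configparser
import os
from pathlib import Path

DEFAULT_BASE_URL = "data-portal.io"  # documented default
DEFAULT_HOME = "~/.pubweb"           # documented default for PW_HOME


def resolve_base_url(environ=None) -> str:
    """Resolve the base URL from the environment or config.ini.

    Assumed precedence (not confirmed by the README):
    PW_BASE_URL, then [General] base_url in config.ini, then the default.
    """
    environ = os.environ if environ is None else environ
    if "PW_BASE_URL" in environ:
        return environ["PW_BASE_URL"]

    home = Path(environ.get("PW_HOME", DEFAULT_HOME)).expanduser()
    parser = configparser.ConfigParser()
    # ConfigParser.read silently skips files that do not exist.
    parser.read(home / "config.ini")
    return parser.get("General", "base_url", fallback=DEFAULT_BASE_URL)


print(resolve_base_url())
```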