ReadStore Command Line Interface (CLI) is a Python package for accessing data from the ReadStore API

License
- OSI Approved :: Apache Software License
Operating System
- Unix
Programming Language
- Python :: 3

Project description

ReadStore CLI

This README describes the ReadStore Command Line Interface (CLI). It is also available as a GitHub Page.

The ReadStore CLI is used to upload FASTQ files and Processed Data to the ReadStore database and access Projects, Datasets, metadata and attachment files. The ReadStore CLI enables you to automate your bioinformatics pipelines by providing simple and standardized access to datasets.

Check the ReadStore GitHub repository for more information on how to get started.

More information is available on the ReadStore website: https://evo-byte.com/readstore/

Tutorials and intro videos on how to get started: https://www.youtube.com/@evobytedigitalbio

Blog posts and How-Tos: https://evo-byte.com/blog/

For general questions reach out to info@evo-byte.com

Happy analysis :)

Description
Security and Permissions
Installation
Usage
Contributing
License
Credits and Acknowledgments

The Lean Solution for Managing NGS and Omics Data

ReadStore is a platform for storing, managing, and integrating genomic data. It accelerates analysis and offers an easy way to manage and share FASTQ files, NGS datasets and processed datasets. With built-in project and metadata management, ReadStore structures your workflows, and its collaborative user interface enhances teamwork, so you can focus on generating insights.

The integrated web service (API) enables you to retrieve data from ReadStore directly via the terminal Command-Line Interface (CLI) or the Python / R SDKs.

The ReadStore Basic version provides a local web server with simple user management. For organization-wide deployment, advanced user and group management, or cloud integration, please check out the ReadStore Advanced versions and contact us at info@evo-byte.com.

Description

The ReadStore Command-Line Interface (CLI) is a powerful tool for uploading and managing your omics data. With the ReadStore CLI, you can upload FASTQ files and Processed Data directly to the ReadStore database, as well as access and manage Projects, Datasets, metadata, and attachment files with ease.

The CLI can be run from your shell or terminal and is designed for seamless integration into data pipelines and scripts, enabling efficient automation of data management tasks. This flexibility allows you to integrate the ReadStore CLI within any bioinformatics application or pipeline, streamlining data uploads, access, and organization within ReadStore.

By embedding the ReadStore CLI in your bioinformatics workflows, you can improve efficiency, reduce manual tasks, and ensure your data is readily accessible for analysis and collaboration.

Security and Permissions

PLEASE READ AND FOLLOW THESE INSTRUCTIONS CAREFULLY!

User Accounts and Token

Using the CLI with a ReadStore server requires an active User Account and a Token. You should never enter your user account password when working with the CLI.

To retrieve your token:

Log in to the ReadStore web app via your browser
Navigate to the Settings page and click on Token
If needed, you can regenerate your token (Reset). This will invalidate the previous token

For uploading FASTQ files or Processed Data, your User Account needs to have Staging Permission. You can check this in the Settings page of your account. If you do not have Staging Permission, ask the ReadStore server Admin to grant you permission.

CLI Configuration

After running readstore configure for the first time, a configuration file is created in your home directory (~/.readstore/config) to store your credentials and CLI configuration.

The config file is created with user-exclusive read/write permissions (chmod 600). Please make sure to keep the file permissions restricted.

You can find more information on the configuration file below.
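The permission requirement can also be checked from a script. The following is a minimal sketch, assuming the default config location named above; the helper name is_user_only is hypothetical and not part of the ReadStore CLI.

```python
import os
import stat
from pathlib import Path


def is_user_only(path):
    """Return True if group and others have no permissions on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # S_IRWXG / S_IRWXO cover read, write and execute for group / others.
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


if __name__ == "__main__":
    # Assumption: the default location described in this README.
    config_path = Path.home() / ".readstore" / "config"
    if config_path.exists():
        print(config_path, "restricted:", is_user_only(config_path))
    else:
        print("No config found; run 'readstore configure' first.")
```

If the check reports that the file is not restricted, chmod 600 ~/.readstore/config restores the permissions described above.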

Installation

pip3 install readstore-cli

You can perform the install in a conda or venv virtual environment to simplify package management.

A local (user-level) install is also possible:

pip3 install --user readstore-cli

Make sure that ~/.local/bin is on your $PATH in case you encounter problems when starting the CLI.

Validate the install by running:

readstore -v

This should print the ReadStore CLI version.
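In an automated setup the same validation can be done from Python. This is a small sketch, not part of the ReadStore CLI: find_cli is a hypothetical helper that only looks the executable up on the PATH, and the version call uses the -v flag shown above.

```python
import shutil
import subprocess


def find_cli(name="readstore"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(name)


if __name__ == "__main__":
    cli = find_cli()
    if cli is None:
        print("readstore not found; check that ~/.local/bin is on your PATH")
    else:
        # 'readstore -v' prints the CLI version, as described above.
        subprocess.run([cli, "-v"], check=True)
```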

Usage

Detailed tutorials, videos and explanations are found on YouTube or on the EVOBYTE blog.

Quickstart

Let's upload some FASTQ files.

1. Configure CLI

Make sure you have the ReadStore CLI installed and a running ReadStore server with your user registered.

Run readstore configure
Enter your username and token
Select the default output of your CLI requests. You can choose between text outputs, comma-separated csv or json.
Run readstore configure list and check if your credentials are correct.

2. Upload Files

For uploading FASTQ files, your User Account needs to have Staging Permission. You can check this in the Settings page of your account. If you do not have Staging Permission, ask the ReadStore Server Admin to grant you permission.

Move to a folder that contains some FASTQ files

readstore upload myfile_r1.fastq

This will upload the file and run the QC check. You can select multiple files at once using the * wildcard. The fastq files need to have the default file endings .fastq, .fastq.gz, .fq, .fq.gz.
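Before calling readstore upload in a script, it can help to pre-select the files that carry one of the accepted endings. The sketch below is an illustration only: both helper names are hypothetical, and the matching is a plain suffix comparison, which may differ from how the CLI checks file names internally.

```python
from pathlib import Path

# Default endings taken from the text above; the CLI config may list others.
DEFAULT_EXTENSIONS = (".fastq", ".fastq.gz", ".fq", ".fq.gz")


def has_fastq_extension(filename, extensions=DEFAULT_EXTENSIONS):
    """Return True if the file name ends with one of the given endings."""
    return filename.endswith(tuple(extensions))


def uploadable_files(folder, extensions=DEFAULT_EXTENSIONS):
    """List file names in a folder that carry a recognised FASTQ ending."""
    return sorted(
        entry.name
        for entry in Path(folder).iterdir()
        if entry.is_file() and has_fastq_extension(entry.name, extensions)
    )
```

The returned names can then be passed to readstore upload as positional arguments.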

You can also upload multiple FASTQ files from a template .csv file using the import fastq function. More information below.

3. Stage Files

Log in to the web app in your browser and go to the Staging page. Here you find a list of all FASTQ files that you just uploaded. For larger files, the QC step can take a while to complete.

FASTQ files are grouped into Datasets which you can Check In. Checked In Datasets appear in the Datasets page and can be accessed by the CLI.

Use the Batch Check In button to check in several Datasets at once.

4. Access Datasets via the CLI

The ReadStore CLI enables programmatic access to Datasets and FASTQ files. Some examples are:

readstore list
List all FASTQ files

readstore get --id 25
Get detailed view on Dataset 25

readstore get --id 25 --read1-path
Get path for Read1 FASTQ file

readstore get --id 25 --meta
Get metadata for Dataset 25

readstore project get --name cohort1 --attachment
Get attachment files for Project "cohort1"

You can find a full list of CLI commands below.
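When the commands above are called from a Python pipeline, one option is to wrap them with subprocess. This is a minimal sketch under stated assumptions: build_get_command and run_cli are hypothetical helper names, and only flags documented in this README (--id, --meta, --read1-path, --output) are used.

```python
import subprocess


def build_get_command(dataset_id, *flags, output="json"):
    """Assemble a 'readstore get' call as an argument list."""
    return ["readstore", "get", "--id", str(dataset_id), *flags, "--output", output]


def run_cli(command):
    """Run the command and return its standard output as text."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


# Example (requires a configured CLI and a running ReadStore server):
# read1_path = run_cli(build_get_command(25, "--read1-path", output="text"))
```

Passing the arguments as a list avoids shell quoting issues, and check=True makes a failed CLI call raise an exception instead of passing silently.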

5. Managing Processed Data

Processed Data refer to files generated through processing of raw sequencing data. Depending on the omics technology and assay used, this could be for instance transcript count files, variant files or gene count matrices.

readstore pro-data upload -d test_dataset_1 -n test_dataset_count_matrix -t count_matrix test_count_matrix.h5
Upload count matrix test_count_matrix.h5 with name "test_dataset_count_matrix" for dataset with name "test_dataset_1"

readstore pro-data list
List Processed Data for all Datasets and Projects

readstore pro-data get -d test_dataset_1 -n test_dataset_count_matrix
Get ProData details for Dataset "test_dataset_1" with the name "test_dataset_count_matrix"

readstore pro-data delete -d test_dataset_1 -n test_dataset_count_matrix
Delete ProData for dataset "test_dataset_1" with the name "test_dataset_count_matrix"

The delete operation does not remove the file from the file system, only from the database. A user needs Staging Permission to create or remove datasets.

CLI Configuration

readstore configure manages the CLI configuration. To set up the configuration:

Run readstore configure
Enter your username and token
Select the default output of your CLI requests. You can choose between text outputs, comma-separated csv or json.
Run readstore configure list and check if your credentials are correct.

If you already have a configuration in place, the CLI will ask whether you want to overwrite the existing credentials. Select y if yes.

After running readstore configure for the first time, a configuration file is created in your home directory (~/.readstore/config). The config file is created with user-exclusive read/write permissions (chmod 600). Please make sure to keep the file permissions restricted.

[general]
endpoint_url = http://localhost:8000
fastq_extensions = ['.fastq', '.fastq.gz', '.fq', '.fq.gz']
output = csv

[credentials]
username = myusername
token = myrandomtoken

You can further edit the configuration of the CLI client in this configuration file. If your ReadStore Django server does not run on the default port 8000, you need to update the endpoint_url. If you need to process FASTQ files with file endings other than those listed in fastq_extensions, you can modify the list.
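The file shown above follows an INI-style layout, so it can likely be read with Python's configparser. The sketch below assumes exactly the layout of the example config; read_general_settings is a hypothetical helper, and the list-valued fastq_extensions entry is parsed with ast.literal_eval because it is written as a Python-style list. The credentials section is deliberately not returned.

```python
import ast
import configparser

# Copy of the example config from this README, used as sample input.
SAMPLE_CONFIG = """\
[general]
endpoint_url = http://localhost:8000
fastq_extensions = ['.fastq', '.fastq.gz', '.fq', '.fq.gz']
output = csv

[credentials]
username = myusername
token = myrandomtoken
"""


def read_general_settings(text):
    """Parse the [general] section of a ReadStore-style config string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    general = parser["general"]
    return {
        "endpoint_url": general["endpoint_url"],
        # The value is stored as a Python-style list literal.
        "fastq_extensions": ast.literal_eval(general["fastq_extensions"]),
        "output": general["output"],
    }


settings = read_general_settings(SAMPLE_CONFIG)
print(settings["endpoint_url"], settings["output"])
```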

Upload FASTQ Files

For uploading FASTQ files your User Account needs to have Staging Permission. You can check this in the Settings page of your account. If you do not have Staging Permission, ask the ReadStore Server Admin to grant you permission.

readstore upload myfile_r1.fastq myfile_r2.fastq ...

This will upload the files and run the QC check. You can select several files at once using the * wildcard. It can take some time before FASTQ files are available in your Staging page, depending on how large the files are and how long the QC step takes.

usage: readstore upload [options]

Upload FASTQ Files

positional arguments:
  fastq_files  FASTQ Files to Upload

Import FASTQ files from .csv Template

Import FASTQ files from template .csv file.

A template .csv file can be downloaded from the ReadStore App in the Staging page, and is also available in this repository under assets/readstore_template.csv.

The template .csv file must contain the columns FASTQFileName, ReadType and UploadPath.

FASTQFileName: Name for the FASTQ File in ReadStore DB
ReadType: FASTQ Read Type: R1 (Read 1), R2 (Read 2), I1 (Index 1) or I2 (Index 2)
UploadPath: File path to FASTQ file. Must be accessible from ReadStore server

usage: readstore import fastq [options]

Import FASTQ Files

positional arguments:
  fastq_template  FASTQ Template .csv File
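A template file with the three required columns can also be generated programmatically. This is a sketch only: write_template is a hypothetical helper, the column order is an assumption (the downloadable template is the authoritative reference), and the sample row in the usage note is made up.

```python
import csv

COLUMNS = ["FASTQFileName", "ReadType", "UploadPath"]
VALID_READ_TYPES = {"R1", "R2", "I1", "I2"}


def write_template(rows, csv_path):
    """Write rows (dicts keyed by COLUMNS) to a template .csv file."""
    # Validate all rows first so no partial file is written on error.
    for row in rows:
        if row["ReadType"] not in VALID_READ_TYPES:
            raise ValueError(f"Unknown ReadType: {row['ReadType']}")
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

For example, write_template([{"FASTQFileName": "sample1_R1", "ReadType": "R1", "UploadPath": "/data/sample1_R1.fastq.gz"}], "my_template.csv") produces a file that can then be passed to readstore import fastq my_template.csv.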

Access Projects

There are 3 commands for accessing projects: readstore project list, readstore project get and readstore project download.

list provides an overview of projects, metadata and attachments
get provides detailed information on an individual project, including its metadata and attachments
download lets you download attachment files of a project from the ReadStore database

readstore project list

usage: readstore project ls [options]

List Projects

options:
  -h, --help            show this help message and exit
  -m, --meta            Get Metadata
  -a, --attachment      Get Attachment
  --output {json,text,csv}
                        Format of command output (see config for default)

Show project id and name.

The -m/--meta option includes metadata for projects as a JSON string in the output.

The -a/--attachment option includes attachment names as a list in the output.

Adapt the output format of the command using the --output option.

readstore project get

usage: readstore project get [options]

Get Project

options:
  -h, --help            show this help message and exit
  -id , --id            Get Project by ID
  -n , --name           Get Project by name
  -m, --meta            Get only Metadata
  -a, --attachment      Get only Attachment
  --output {json,text,csv}
                        Format of command output (see config for default)

Show project details for a project selected either by the --id or the --name argument. The project details include description, date of creation, attachments and metadata.

The -m/--meta option shows only the metadata with keys in header.

The -a/--attachment option shows only the attachments.

Adapt the output format of the command using the --output option.

Example: readstore project get --id 2

readstore project download

usage: readstore project download [options]

Download Project Attachments

options:
  -h, --help          show this help message and exit
  -id , --id          Select Project by ID
  -n , --name         Select Project by name
  -a , --attachment   Set Attachment Name to download
  -o , --outpath      Download path or directory (default . )

Download attachment files for a project. Select a project either by the --id or the --name argument.

With the --attachment argument you specify the name of the attachment file to download.

Use the --outpath argument to set a directory to download files to.

Example: readstore project download --id 2 -a ProjectQC.pptx -o ~/downloads

Access Datasets and FASTQ Files

There are 3 commands for accessing datasets: readstore list, readstore get and readstore download.

list provides an overview of datasets, metadata and attachments
get provides detailed information on an individual dataset, including its metadata, attachments, and individual FASTQ read files and statistics
download lets you download attachment files of a dataset

readstore list

usage: readstore ls [options]

List FASTQ Datasets

options:
  -h, --help            show this help message and exit
  -p , --project-name   Subset by Project Name
  -pid , --project-id   Subset by Project ID
  -m, --meta            Get Metadata
  -a, --attachment      Get Attachment
  --output {json,text,csv}
                        Format of command output (see config for default)

Show dataset id, name, description, qc_passed, paired_end, index_read, project_ids and project_names

-p/--project-name subsets datasets by project name

-pid/--project-id subsets datasets by project ID

-m/--meta includes metadata for datasets

-a/--attachment includes attachment names as a list for datasets

Adapt the output format of the command using the --output option.
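With --output json the listing can be post-processed in a script. The sketch below is hedged heavily: it assumes the JSON output is a list of objects whose keys match the field names listed above (id, name, qc_passed, paired_end, ...), which this README does not confirm. The helper name and the sample records are made up for illustration.

```python
import json


def passed_paired_end(json_text):
    """Return names of datasets that passed QC and are paired-end."""
    records = json.loads(json_text)
    return [r["name"] for r in records if r["qc_passed"] and r["paired_end"]]


# Made-up sample records; real output comes from 'readstore list --output json'.
sample_output = json.dumps([
    {"id": 1, "name": "sampleA", "qc_passed": True, "paired_end": True},
    {"id": 2, "name": "sampleB", "qc_passed": False, "paired_end": True},
    {"id": 3, "name": "sampleC", "qc_passed": True, "paired_end": False},
])
print(passed_paired_end(sample_output))
```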

readstore get

usage: readstore get [options]

Get FASTQ Datasets and Files

options:
  -h, --help            show this help message and exit
  -id , --id            Get Dataset by ID
  -n , --name           Get Dataset by name
  -m, --meta            Get only Metadata
  -a, --attachment      Get only Attachments
  -r1, --read1          Get Read 1 Data
  -r2, --read2          Get Read 2 Data
  -r1p, --read1-path    Get Read 1 FASTQ Path
  -r2p, --read2-path    Get Read 2 FASTQ Path
  -i1, --index1         Get Index 1 Data
  -i2, --index2         Get Index 2 Data
  -i1p, --index1-path   Get Index 1 FASTQ Path
  -i2p, --index2-path   Get Index 2 FASTQ Path
  --output {json,text,csv}
                        Format of command output (see config for default)

Show details for a dataset selected either by --id or the --name argument.

-m/--meta shows only the metadata with keys in header.

-a/--attachment shows only the attachments.

-r1/--read1 shows details for dataset Read 1 data (same for --read2, --index1, --index2)

-r1p/--read1-path returns path for dataset Read 1 (same for --read2-path, --index1-path, --index2-path)

Adapt the output format of the command using the --output option.

Example: readstore get --id 2

Example: readstore get --id 2 --read1-path

readstore download

usage: readstore download [options]

Download Dataset attachments

options:
  -h, --help          show this help message and exit
  -id , --id          Select Dataset by ID
  -n , --name         Select Dataset by name
  -a , --attachment   Set Attachment Name to download
  -o , --outpath      Download path or directory (default . )

Download attachment files for a dataset. Select dataset either by --id or the --name argument.

With the --attachment argument you specify the name of the attachment file to download.

Use the --outpath argument to set a directory to download files to.

Example: readstore download --id 2 -a myAttachment.csv -o ~/downloads

Access Processed Data

readstore pro-data upload

usage: readstore pro-data upload [options]

Upload Processed Data

positional arguments:
  pro_data_file         Path to Processed Data File to Upload

options:
  -h, --help            show this help message and exit
  -did , --dataset-id   Set associated Dataset by ID
  -d , --dataset-name   Set associated Dataset by Name
  -n , --name           Set Processed Data Name (required)
  -t , --type           Set Type of Processed Data (e.g. gene_counts) (required)
  --description         Set Description
  -m META, --meta META  Set metadata as JSON string (e.g '{"key": "value"}')

Upload Processed Data to ReadStore database and connect with an existing dataset.

Processed Data can be any file type and typically represent datasets for downstream omics analysis, for instance gene count matrices or variant files.

Your ReadStore user account is required to have Staging Permissions to upload or delete Processed Data.

You need to specify a --dataset-id or --dataset-name to select the dataset to attach files to.

-n/--name defines the name to set for the processed data in the ReadStore DB

-t/--type defines the data type of the processed dataset. The type is free to choose, for instance gene_counts or count_matrix

-m/--meta sets metadata for the processed data (optional). This attribute must be a JSON-formatted string, e.g. '{"key": "value"}'

--description sets a description for the dataset (optional).

Example: readstore pro-data upload -d test_dataset_1 -n test_dataset_count_matrix -t count_matrix -m '{"key":"value"}' test_count_matrix.h5
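Because -m/--meta expects a JSON-formatted string, generating it with json.dumps avoids hand-written quoting mistakes when the upload is scripted. A minimal sketch, assuming only the flags documented above; build_pro_data_upload is a hypothetical helper, not part of the ReadStore CLI.

```python
import json
import shlex


def build_pro_data_upload(dataset_name, name, data_type, file_path, meta=None):
    """Assemble a 'readstore pro-data upload' call as an argument list."""
    command = ["readstore", "pro-data", "upload",
               "-d", dataset_name, "-n", name, "-t", data_type]
    if meta is not None:
        # Serialize the metadata dict into the JSON string expected by -m.
        command += ["-m", json.dumps(meta)]
    command.append(file_path)
    return command


command = build_pro_data_upload(
    "test_dataset_1", "test_dataset_count_matrix", "count_matrix",
    "test_count_matrix.h5", meta={"key": "value"},
)
# shlex.join gives a shell-safe string, e.g. for logging the call.
print(shlex.join(command))
```

The argument list can be passed directly to subprocess.run, so no manual shell quoting of the JSON string is needed.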

readstore pro-data list

usage: readstore pro-data list [options]

List Processed Data

options:
  -h, --help            show this help message and exit
  -pid , --project-id   Subset by Project ID
  -p , --project-name   Subset by Project Name
  -did , --dataset-id   Subset by Dataset ID
  -d , --dataset-name   Subset by Dataset Name
  -n , --name           Subset by ProData Name
  -t , --type           Subset by Data Type
  -m, --meta            Get Metadata
  -a, --archived        Include Archived ProData
  --output {json,text,csv}
                        Format of command output (see config for default)

List Processed Data stored in the ReadStore database.

You can subset the list by Projects (-pid/-p), Datasets (-did/-d) and/or by the specific Name (-n) of the Processed Data stored.

-m/--meta Also show metadata

-a/--archived Show archived Processed Data.

Processed Data are archived when a new file with the same name attribute is uploaded. This invalidates the previous version of the Processed Data.

Example: readstore pro-data list -p TestProject

readstore pro-data get

usage: readstore pro-data get [options]

Get Processed Data

options:
  -h, --help            show this help message and exit
  -id , --id            Get ProData by ID
  -did , --dataset-id   Get ProData by Dataset ID
  -d , --dataset-name   Get ProData by Dataset Name
  -n , --name           Get ProData by Name
  -m, --meta            Get only Metadata
  -p, --upload-path     Get only Upload Path
  -v , --version        Get ProData Version (default: latest)
  --output {json,text,csv}
                        Format of command output (see config for default)

Get a single Processed Data entry by its --id, or by the associated --dataset-id/--dataset-name plus the --name argument.

-m/--meta Return only metadata

-p/--upload-path Return only upload path

-v/--version Select ProData by specific version (Optional). Default: latest version.

Example: readstore pro-data get -d test_dataset_1 -n test_dataset_count_matrix

readstore pro-data delete

usage: readstore pro-data delete [options]

Delete Processed Data

options:
  -h, --help            show this help message and exit
  -id , --id            Delete ProData by ID
  -did , --dataset-id   Delete ProData by Dataset ID
  -d , --dataset-name   Delete ProData by Dataset Name
  -n , --name           Delete ProData by Name
  -v , --version        Delete ProData Version (default: latest)

Delete Processed Data by its --id, or by the associated --dataset-id/--dataset-name plus the --name argument.

-v/--version Delete ProData by specific version (Optional). Default: latest version.

Example: readstore pro-data delete -d test_dataset_1 -n test_dataset_count_matrix

Contributing

Contributions make this project better! Whether you want to report a bug, improve documentation, or add new features, any help is welcome!

How You Can Help

Report Bugs
Suggest Features
Improve Documentation
Code Contributions

Contribution Workflow

Fork the repository and create a new branch for each contribution.
Write clear, concise commit messages.
Submit a pull request and wait for review.

Thank you for helping make this project better!

License

The ReadStore CLI is licensed under an Apache 2.0 Open Source License. See the LICENSE file for more information.

Credits and Acknowledgments

ReadStore CLI is built upon the following open-source Python packages. We would like to thank all contributing authors, developers and partners.

Python (https://www.python.org/)
Requests (https://requests.readthedocs.io/en/latest/)


Release history

1.3.0

Feb 11, 2025

1.2.0

Dec 22, 2024

This version

1.1.0

Dec 1, 2024

1.0.2

Nov 14, 2024

1.0.1

Oct 30, 2024

1.0.0

Oct 30, 2024

Download files

Source Distribution

readstore_cli-1.1.0.tar.gz (34.4 kB), uploaded Dec 1, 2024

Built Distribution

readstore_cli-1.1.0-py3-none-any.whl (30.2 kB), uploaded Dec 1, 2024, Python 3

File details

Details for the file readstore_cli-1.1.0.tar.gz.

File metadata

Download URL: readstore_cli-1.1.0.tar.gz
Upload date: Dec 1, 2024
Size: 34.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.11.4

File hashes

Hashes for readstore_cli-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`894d84673c1bc49f0a85b028965841f0a9f0d4354d3c320f4933d914daed46f7`
MD5	`66825f31dffea918a1d622da4998546c`
BLAKE2b-256	`73c01fc74645b298c153f05388e23fd9af0fd38a149bef13e17fcd1d9f382260`


File details

Details for the file readstore_cli-1.1.0-py3-none-any.whl.

File metadata

Download URL: readstore_cli-1.1.0-py3-none-any.whl
Upload date: Dec 1, 2024
Size: 30.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.11.4

File hashes

Hashes for readstore_cli-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5d9cb48cb4cd86f1af0613b4bdb3107bccd637d9ec976079e560b6b63788791f`
MD5	`540aa4e095d1fca0746dffcde8d71c79`
BLAKE2b-256	`7c679fad0ba4139e7f2c36ad9e6ea9872a9a862aea2a4c9a69fbf887faef4f6d`

