BioTuring Smartbulk Connector

BioTuring SmartBulk-Connector SDK

BioTuring SmartBulk-Connector SDK is a Python package that provides an interface to BioTuring's services.

Installation

You can install the SmartBulk-Connector SDK package using pip:

pip install --upgrade smartbulk-connector

Get an API token from SmartBulk

An API token is a unique identifier that allows a user or application to access an API. It is a secure way to authenticate a user or application and to control what permissions they have.

You do not need to regenerate your API token each time you use it. Regenerate it only if it has been compromised.

First, navigate to the SmartBulk SDK section to get a token. The user's token is generated from the host website.
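Because the token authenticates you to the server, it is usually safer to keep it out of source code. The following is a minimal sketch, assuming the domain and token are stored in environment variables. The variable names SMARTBULK_DOMAIN and SMARTBULK_TOKEN and the helper load_credentials are hypothetical and not part of the SDK.

```python
import os


def load_credentials(env=None):
    """Return (domain, token) read from environment variables.

    SMARTBULK_DOMAIN and SMARTBULK_TOKEN are hypothetical names chosen
    for this sketch; the SDK itself does not define them.
    """
    env = os.environ if env is None else env
    required = ("SMARTBULK_DOMAIN", "SMARTBULK_TOKEN")
    # Treat unset and empty variables the same way.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["SMARTBULK_DOMAIN"], env["SMARTBULK_TOKEN"]
```

The returned pair can then be passed as the domain and token arguments of SmartbulkConnector.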

How To Use

import warnings
from smartbulk_connector import SmartbulkConnector

warnings.filterwarnings("ignore")

Connect to SmartBulk private server

# authentication
DOMAIN = "<your-smartbulk-server-domain>"
TOKEN = "<your-API-token>"
connector = SmartbulkConnector(domain=DOMAIN, token=TOKEN)
Connecting to host at https://dev.bioturing.com/smartbulk
Connect to SmartBulk successfully
# get current version
connector.get_versions()
smartbulk_connector: version 0.1.0

Get user groups available for your token

connector.get_user_groups()
[{'group_id': '<hidden-id>',
  'group_name': 'Personal workspace'},
 {'group_id': '<hidden-id>', 
  'group_name': 'All members'},
 {'group_id': '<hidden-id>',
  'group_name': 'BioTuring Public Studies'}]

Get all projects from a group

connector.get_all_projects_info_in_group(group_id='personal')
[{'project_id': 'prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
  'project_name': 'sample dataset'},
 {'project_id': 'prj_84c1a392-8080-11ef-8f07-0242ac130004',
  'project_name': 'human sample'},
 {'project_id': 'prj_94f6f0ef-d6c8-49f5-96f6-7bb5fa6a3de8',
  'project_name': 'mouse_sample'}]
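
Later calls take a project_id rather than a project name. As a sketch, assuming the returned list has the shape shown in the sample output above, a project ID could be looked up by name as follows. The helper find_project_id is hypothetical and not part of the SDK.

```python
def find_project_id(projects, project_name):
    """Return the project_id of the project whose name matches exactly.

    `projects` is a list of dicts with 'project_id' and 'project_name'
    keys, as in the sample output. Hypothetical helper, not an SDK call.
    """
    matches = [p["project_id"] for p in projects if p["project_name"] == project_name]
    if not matches:
        raise KeyError(f"No project named {project_name!r}")
    if len(matches) > 1:
        raise ValueError(f"Project name {project_name!r} is ambiguous: {matches}")
    return matches[0]


# Sample data copied from the output above.
projects = [
    {"project_id": "prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141", "project_name": "sample dataset"},
    {"project_id": "prj_84c1a392-8080-11ef-8f07-0242ac130004", "project_name": "human sample"},
    {"project_id": "prj_94f6f0ef-d6c8-49f5-96f6-7bb5fa6a3de8", "project_name": "mouse_sample"},
]
project_id = find_project_id(projects, "human sample")
```

Raising on duplicate names avoids silently adding a dataset to the wrong project.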

List files and directories in the workspace

connector.listdir_workspace()
['example_data', 'sample dataset', 'mouse_sample']
connector.listdir_workspace('example_data', fullpath=True)
['/path/to/server/workspace/upload/example_data/count_mat_2.csv',
 '/path/to/server/workspace/upload/example_data/count_mat.csv',
 '/path/to/server/workspace/upload/example_data/metadata_2.csv',
 '/path/to/server/workspace/upload/example_data/metadata.csv',
 '/path/to/server/workspace/upload/example_data/recipes.csv']

List files and directories in cloud storage

connector.listdir_cloud_storage()
['bioturing-lens', 'bioturingdebug', 'bioturingdebug.log.txt']

Upload a single file

connector.upload_file('path/to/local/count_mat.csv', server_folder_name='test', debug_mode=True)
{'status': 0,
 'path': '/path/to/server/workspace/upload/test/v1.count_mat.csv',
 'url_path': '/path/to/server/workspace/upload/test/v1.count_mat.csv'}

Upload a folder

connector.upload_folder('tsv_sample/', debug_mode=True)
  0%|          | 0.00/16.1M [00:00<?, ?B/s]
Upload tsv_sample/matrix_200.csv.gz, chunk index : 1 ...
100%|██████████| 16.1M/16.1M [00:29<00:00, 538kB/s]
{'folder_name': 'tsv_sample',
 'file_path': ['tsv_sample/recipes.csv',
  'tsv_sample/SRP092402.tsv',
  'tsv_sample/matrix_200.csv.gz'],
 'server_path': ['/path/to/server/workspace/upload/tsv_sample/v1.recipes.csv',
  '/path/to/server/workspace/upload/tsv_sample/v1.SRP092402.tsv',
  '/path/to/server/workspace/upload/tsv_sample/v1.matrix_200.csv.gz']}
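
The server renames uploaded files (note the v1. prefix), so the server paths are best read from the upload result rather than typed by hand. The sketch below assumes that file_path and server_path are parallel lists, as they appear to be in the sample output. The helper server_paths_by_local is hypothetical and not part of the SDK.

```python
def server_paths_by_local(upload_result):
    """Map each local file path to the server path it was uploaded to.

    Assumes 'file_path' and 'server_path' are parallel lists, as in the
    sample output. Hypothetical helper, not an SDK call.
    """
    local = upload_result["file_path"]
    remote = upload_result["server_path"]
    if len(local) != len(remote):
        raise ValueError("file_path and server_path differ in length")
    return dict(zip(local, remote))


# Sample data copied from the output above.
upload_result = {
    "folder_name": "tsv_sample",
    "file_path": [
        "tsv_sample/recipes.csv",
        "tsv_sample/SRP092402.tsv",
        "tsv_sample/matrix_200.csv.gz",
    ],
    "server_path": [
        "/path/to/server/workspace/upload/tsv_sample/v1.recipes.csv",
        "/path/to/server/workspace/upload/tsv_sample/v1.SRP092402.tsv",
        "/path/to/server/workspace/upload/tsv_sample/v1.matrix_200.csv.gz",
    ],
}
paths = server_paths_by_local(upload_result)
matrix_path = paths["tsv_sample/matrix_200.csv.gz"]
metadata_path = paths["tsv_sample/SRP092402.tsv"]
```

The resulting values are the kind of server paths that the matrix_paths and metadata_paths arguments expect.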

Create new BulkRNAseq project from the uploaded folder path in the SmartBulk Server

submit_result = connector.create_project(
    group_id='personal',
    species='human',
    project_name='human sample',
    matrix_paths=['/path/to/server/workspace/upload/tsv_sample/v1.matrix_200.csv.gz'],
    metadata_paths=['/path/to/server/workspace/upload/tsv_sample/v1.SRP092402.tsv'],
    dataset_name='Sample Dataset',
    use_gene_symbols=True,
)

Check project creation status

connector.check_project_status(submit_result=submit_result)

Add new BulkRNAseq dataset to a project

submit_result = connector.add_project(
    project_id='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
    group_id='personal',
    species='human',
    matrix_paths=['/path/to/server/workspace//upload/tsv_sample/matrix_200.csv.gz'],
    metadata_paths=['/path/to/server/workspace//upload/tsv_sample/SRP092402.tsv'],
    dataset_name='Another Dataset',
    use_gene_symbols=True,
)
if submit_result:
    connector.check_project_status(submit_result=submit_result)

Create new NanoString project from the uploaded folder path in the SmartBulk Server

# metadata is optional in Nanostring
submit_result = connector.create_project(
    group_id='personal',
    species='human',
    project_name='nanostring',
    dataset_name='RCC',
    rcc_folder_path='/path/to/server/workspace//upload/GSE268196_RCC/',
    aggregate_count=True,
    platform='nanostring',
)
if submit_result:
    connector.check_project_status(submit_result=submit_result)

Add new NanoString dataset to a project

Note that NanoString accepts only one metadata file when creating a new project.

# metadata is optional in NanoString
submit_result = connector.add_project(
    group_id='personal',
    species='human',
    project_id='prj_566b623a-cdab-11ef-859d-48ad9afa0555',
    dataset_name='nanostring_2',
    rcc_folder_path='/path/to/server/workspace//upload/GSE268196_RCC/',
    metadata_paths=['/path/to/server/workspace//upload/RCC_matrix_file/metadata.tsv.gz'],
    aggregate_count=True,
    platform='nanostring',
)
if submit_result:
    connector.check_project_status(submit_result=submit_result)

Add new NanoString dataset from uploaded matrix and metadata to a project

Note that NanoString accepts only one matrix and one metadata file when creating a new project.

submit_result = connector.add_project(
    group_id='personal',
    species='human',
    project_id='prj_566b623a-cdab-11ef-859d-48ad9afa0555',
    dataset_name='uploaded_matrix',
    matrix_paths=['/path/to/server/workspace//upload/RCC_matrix_file/matrix.tsv.gz'],
    metadata_paths=['/path/to/server/workspace//upload/RCC_matrix_file/metadata.tsv.gz'],
    aggregate_count=True,
    platform='nanostring',
)
if submit_result:
    connector.check_project_status(submit_result=submit_result)

Create project with multiple datasets with a recipes file

Create a new project with multiple datasets using a recipes file.

The recipes file is a CSV file with the following columns:

    dataset_name: the name of the dataset
    path_on_server: the server path to a file belonging to the dataset
    file_type: the type of the file at path_on_server; one of matrix, metadata, or rcc
    species: one of human, mouse, rat, or monkey
    platform: either bulk or nanostring
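
A recipes file can be checked locally against the column rules above before it is submitted. The following is a minimal sketch, assuming the file is comma-delimited with a header row. The function validate_recipes is hypothetical and not part of the SDK, and the misspelled species in the sample is deliberate.

```python
import csv
import io

REQUIRED_COLUMNS = ["dataset_name", "path_on_server", "file_type", "species", "platform"]
ALLOWED = {
    "file_type": {"matrix", "metadata", "rcc"},
    "species": {"human", "mouse", "rat", "monkey"},
    "platform": {"bulk", "nanostring"},
}


def validate_recipes(handle):
    """Return a list of problems found in a recipes CSV (empty if none).

    Hypothetical helper; assumes a comma-delimited file with a header row.
    """
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return ["missing columns: " + ", ".join(missing)]
    problems = []
    # Data rows start on line 2, right after the header.
    for line_no, row in enumerate(reader, start=2):
        for column, allowed in ALLOWED.items():
            if row[column] not in allowed:
                problems.append(
                    f"line {line_no}: {column}={row[column]!r} is not one of {sorted(allowed)}"
                )
        if not row["dataset_name"] or not row["path_on_server"]:
            problems.append(f"line {line_no}: dataset_name and path_on_server must be non-empty")
    return problems


# Two data rows; the second has a deliberately misspelled species.
sample = io.StringIO(
    "dataset_name,path_on_server,file_type,species,platform\n"
    "Dataset_1,/path/to/server/workspace/upload/example_data/count_mat.csv,matrix,human,bulk\n"
    "Dataset_1,/path/to/server/workspace/upload/example_data/metadata.csv,metadata,humna,bulk\n"
)
problems = validate_recipes(sample)
```

Here problems holds a single entry pointing at line 3. The check covers only the column values; it does not confirm that the paths exist on the server.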

Sample recipes.csv file:

dataset_name,path_on_server,file_type,species,platform
Dataset_1,/path/to/server/workspace//upload/example_data/count_mat.csv,matrix,human,bulk
Dataset_1,/path/to/server/workspace//upload/example_data/metadata.csv,metadata,human,bulk
Dataset_2,/path/to/server/workspace//upload/example_data/count_mat_2.csv,matrix,human,bulk
Dataset_2,/path/to/server/workspace//upload/example_data/count_mat_3.csv,matrix,human,bulk
Dataset_2,/path/to/server/workspace//upload/example_data/metadata_2.csv,metadata,human,bulk
Dataset_2,/path/to/server/workspace//upload/example_data/metadata_3.csv,metadata,human,bulk
Dataset_3,/path/to/server/workspace//upload/example_data/metadata_3.csv,rcc,human,nanostring

connector.create_project_from_recipes(
    group_id='personal', 
    recipes_path='example_data/recipes.csv', 
    project_name='sample dataset',
    aggregate_count=True, # Parameter for nanostring
    use_gene_symbols=True, # Parameter for bulkRNAseq
)

Add new dataset to a project with a recipes file

connector.add_project_from_recipes(
    project_id='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
    group_id='personal', 
    recipes_path='data/recipes.csv',
    aggregate_count=True, # Parameter for nanostring
    use_gene_symbols=True, # Parameter for bulkRNAseq
)

Resume adding datasets to a project from a recipes file using a trace log

connector.create_project_from_recipes(
    group_id='personal',
    recipes_path='data/recipes.csv', 
    project_name='21_bulk_datasets',
    trace_log='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141/project_trace_log.json',
    aggregate_count=True, # Parameter for nanostring
    use_gene_symbols=True, # Parameter for bulkRNAseq
)

or

connector.add_project_from_recipes(
    project_id='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
    group_id='personal', 
    recipes_path='data/recipes.csv',
    trace_log='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141/project_trace_log.json',
    aggregate_count=True, # Parameter for nanostring
    use_gene_symbols=True, # Parameter for bulkRNAseq
)
