BioTuring Smartbulk Connector
BioTuring SmartBulk-Connector SDK
BioTuring SmartBulk-Connector SDK is a Python package that provides an interface for interacting with BioTuring's services.
Installation
You can install the SmartBulk-Connector SDK package using pip:
pip install --upgrade smartbulk-connector
Get an API token from SmartBulk
An API token is a unique identifier that allows a user or application to access an API. It is a secure way to authenticate a user or application and to control what permissions they have.
You do not need to regenerate your API token every time you use it. However, you may need to regenerate your API token if it is compromised.
First, navigate to the SmartBulk SDK to get a token. The user’s token is generated on the host website.
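Because the token grants access to your account, it is usually safer to keep it out of source code. The following is a minimal sketch, assuming the domain and token are stored in environment variables. The names `SMARTBULK_DOMAIN` and `SMARTBULK_TOKEN` are hypothetical and are not defined by the SDK.

```python
import os


def load_credentials(env=os.environ):
    """Return (domain, token) read from environment variables.

    SMARTBULK_DOMAIN and SMARTBULK_TOKEN are hypothetical names chosen for
    this example; the SDK itself does not define them.
    """
    names = ("SMARTBULK_DOMAIN", "SMARTBULK_TOKEN")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["SMARTBULK_DOMAIN"], env["SMARTBULK_TOKEN"]
```

The returned values can then be passed as `domain` and `token` when constructing `SmartbulkConnector`.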
How To Use
import warnings
from smartbulk_connector import SmartbulkConnector
warnings.filterwarnings("ignore")
Connect to a SmartBulk private server
# authentication
DOMAIN = "<your-smartbulk-server-domain>"
TOKEN = "<your-API-token>"
connector = SmartbulkConnector(domain=DOMAIN, token=TOKEN)
Connecting to host at https://dev.bioturing.com/smartbulk
Connect to SmartBulk successfully
# get current version
connector.get_versions()
smartbulk_connector: version 0.1.0
Get user groups available for your token
connector.get_user_groups()
[{'group_id': '<hidden-id>',
'group_name': 'Personal workspace'},
{'group_id': '<hidden-id>',
'group_name': 'All members'},
{'group_id': '<hidden-id>',
'group_name': 'BioTuring Public Studies'}]
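The list returned above can be searched by name to obtain a `group_id`. This is a small sketch that works on plain data shaped like the output shown; `find_group_id` is a hypothetical helper, not part of the SDK, and the IDs below are placeholders.

```python
def find_group_id(groups, group_name):
    """Return the group_id of the first group with a matching name, else None."""
    for group in groups:
        if group.get("group_name") == group_name:
            return group.get("group_id")
    return None


# Sample data shaped like the get_user_groups() output; IDs are placeholders.
groups = [
    {"group_id": "<hidden-id-1>", "group_name": "Personal workspace"},
    {"group_id": "<hidden-id-2>", "group_name": "All members"},
    {"group_id": "<hidden-id-3>", "group_name": "BioTuring Public Studies"},
]

all_members_id = find_group_id(groups, "All members")
print(all_members_id)
```

The other examples in this document pass `group_id='personal'` directly, so a lookup like this is mainly useful for the remaining groups.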
Get all projects from a group
connector.get_all_projects_info_in_group(group_id='personal')
[{'project_id': 'prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
'project_name': 'sample dataset'},
{'project_id': 'prj_84c1a392-8080-11ef-8f07-0242ac130004',
'project_name': 'human sample'},
{'project_id': 'prj_94f6f0ef-d6c8-49f5-96f6-7bb5fa6a3de8',
'project_name': 'mouse_sample'}]
List files and directories in the workspace
connector.listdir_workspace()
['example_data', 'sample dataset', 'mouse_sample']
connector.listdir_workspace('example_data', fullpath=True)
['/path/to/server/workspace/upload/example_data/count_mat_2.csv',
'/path/to/server/workspace/upload/example_data/count_mat.csv',
'/path/to/server/workspace/upload/example_data/metadata_2.csv',
'/path/to/server/workspace/upload/example_data/metadata.csv',
'/path/to/server/workspace/upload/example_data/recipes.csv']
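A full-path listing like the one above can be split into matrix and metadata files before creating a project. The sketch below filters by file name only; `select_files` is a hypothetical helper, and the `count_mat` and `metadata` prefixes are just the naming used in this example folder, not a convention required by SmartBulk.

```python
import posixpath


def select_files(paths, prefix="", suffix=""):
    """Keep paths whose file name starts with prefix and ends with suffix."""
    selected = []
    for path in paths:
        name = posixpath.basename(path)
        if name.startswith(prefix) and name.endswith(suffix):
            selected.append(path)
    return selected


# Sample data copied from the listing above.
listing = [
    "/path/to/server/workspace/upload/example_data/count_mat_2.csv",
    "/path/to/server/workspace/upload/example_data/count_mat.csv",
    "/path/to/server/workspace/upload/example_data/metadata_2.csv",
    "/path/to/server/workspace/upload/example_data/metadata.csv",
    "/path/to/server/workspace/upload/example_data/recipes.csv",
]

matrix_files = select_files(listing, prefix="count_mat", suffix=".csv")
metadata_files = select_files(listing, prefix="metadata", suffix=".csv")
print(matrix_files)
print(metadata_files)
```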
List files and directories in cloud storage
connector.listdir_cloud_storage()
['bioturing-lens', 'bioturingdebug', 'bioturingdebug.log.txt']
Upload a single file
connector.upload_file('path/to/local/count_mat.csv', server_folder_name='test', debug_mode=True)
{'status': 0,
'path': '/path/to/server/workspace/upload/test/v1.count_mat.csv',
'url_path': '/path/to/server/workspace/upload/test/v1.count_mat.csv'}
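The returned dictionary can be checked before its path is reused in later calls. This sketch assumes that `status` equal to 0 indicates success, which is inferred only from the sample output above and is not confirmed by the SDK documentation; `uploaded_server_path` is a hypothetical helper.

```python
def uploaded_server_path(result):
    """Return the server path from an upload result, or raise on failure.

    Assumption: status == 0 means the upload succeeded (inferred from the
    sample output, not from documented SDK behavior).
    """
    if result.get("status") != 0:
        raise RuntimeError(f"Upload did not report success: {result!r}")
    return result["path"]


# Sample data copied from the output above.
result = {
    "status": 0,
    "path": "/path/to/server/workspace/upload/test/v1.count_mat.csv",
    "url_path": "/path/to/server/workspace/upload/test/v1.count_mat.csv",
}

server_path = uploaded_server_path(result)
print(server_path)
```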
Upload a folder
connector.upload_folder('tsv_sample/', debug_mode=True)
0%| | 0.00/16.1M [00:00<?, ?B/s]
Upload tsv_sample/matrix_200.csv.gz, chunk index : 1 ...
100%|██████████| 16.1M/16.1M [00:29<00:00, 538kB/s]
{'folder_name': 'tsv_sample',
'file_path': ['tsv_sample/recipes.csv',
'tsv_sample/SRP092402.tsv',
'tsv_sample/matrix_200.csv.gz'],
'server_path': ['/path/to/server/workspace/upload/tsv_sample/v1.recipes.csv',
'/path/to/server/workspace/upload/tsv_sample/v1.SRP092402.tsv',
'/path/to/server/workspace/upload/tsv_sample/v1.matrix_200.csv.gz']}
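The server paths in this result are what later calls expect in `matrix_paths` and `metadata_paths`. The sketch below pairs each local file with its server path, assuming `file_path` and `server_path` are listed in the same order, as they are in the sample output; `map_uploaded_files` is a hypothetical helper.

```python
def map_uploaded_files(upload_result):
    """Map each local file path to its server path.

    Assumption: file_path and server_path are parallel lists in the same
    order, as in the sample output.
    """
    local_paths = upload_result["file_path"]
    server_paths = upload_result["server_path"]
    if len(local_paths) != len(server_paths):
        raise ValueError("file_path and server_path have different lengths")
    return dict(zip(local_paths, server_paths))


# Sample data copied from the output above.
upload_result = {
    "folder_name": "tsv_sample",
    "file_path": [
        "tsv_sample/recipes.csv",
        "tsv_sample/SRP092402.tsv",
        "tsv_sample/matrix_200.csv.gz",
    ],
    "server_path": [
        "/path/to/server/workspace/upload/tsv_sample/v1.recipes.csv",
        "/path/to/server/workspace/upload/tsv_sample/v1.SRP092402.tsv",
        "/path/to/server/workspace/upload/tsv_sample/v1.matrix_200.csv.gz",
    ],
}

uploaded = map_uploaded_files(upload_result)
matrix_paths = [uploaded["tsv_sample/matrix_200.csv.gz"]]
metadata_paths = [uploaded["tsv_sample/SRP092402.tsv"]]
print(matrix_paths, metadata_paths)
```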
Create new BulkRNAseq project from the uploaded folder path in the SmartBulk Server
submit_result = connector.create_project(
    group_id='personal',
    species='human',
    project_name='human sample',
    matrix_paths=['/path/to/server/workspace/upload/tsv_sample/v1.matrix_200.csv.gz'],
    metadata_paths=['/path/to/server/workspace/upload/tsv_sample/v1.SRP092402.tsv'],
    dataset_name='Sample Dataset',
    use_gene_symbols=True,
)
Check project creation status
connector.check_project_status(submit_result=submit_result)
Add new BulkRNAseq dataset to a project
submit_result = connector.add_project(
    project_id='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
    group_id='personal',
    species='human',
    matrix_paths=['/path/to/server/workspace/upload/tsv_sample/matrix_200.csv.gz'],
    metadata_paths=['/path/to/server/workspace/upload/tsv_sample/SRP092402.tsv'],
    dataset_name='Another Dataset',
    use_gene_symbols=True,
)
if submit_result:
    connector.check_project_status(submit_result=submit_result)
Create new NanoString project from the uploaded folder path in the SmartBulk Server
# Metadata is optional for NanoString
submit_result = connector.create_project(
    group_id='personal',
    species='human',
    project_name='nanostring',
    dataset_name='RCC',
    rcc_folder_path='/path/to/server/workspace/upload/GSE268196_RCC/',
    aggregate_count=True,
    platform='nanostring',
)
if submit_result:
    connector.check_project_status(submit_result=submit_result)
Add new NanoString dataset to a project
Note that NanoString accepts only one metadata file when creating a new project.
# Metadata is optional for NanoString
submit_result = connector.add_project(
    group_id='personal',
    species='human',
    project_id='prj_566b623a-cdab-11ef-859d-48ad9afa0555',
    dataset_name='nanostring_2',
    rcc_folder_path='/path/to/server/workspace/upload/GSE268196_RCC/',
    metadata_paths=['/path/to/server/workspace/upload/RCC_matrix_file/metadata.tsv.gz'],
    aggregate_count=True,
    platform='nanostring',
)
if submit_result:
    connector.check_project_status(submit_result=submit_result)
Add new NanoString dataset from uploaded matrix and metadata to a project
Note that NanoString accepts only one matrix and one metadata file when creating a new project.
submit_result = connector.add_project(
    group_id='personal',
    species='human',
    project_id='prj_566b623a-cdab-11ef-859d-48ad9afa0555',
    dataset_name='uploaded_matrix',
    matrix_paths=['/path/to/server/workspace/upload/RCC_matrix_file/matrix.tsv.gz'],
    metadata_paths=['/path/to/server/workspace/upload/RCC_matrix_file/metadata.tsv.gz'],
    aggregate_count=True,
    platform='nanostring',
)
if submit_result:
    connector.check_project_status(submit_result=submit_result)
Create project with multiple datasets with a recipes file
Create a new project with multiple datasets using a recipes file.
The recipes file is a CSV file with the following columns:
- dataset_name: the name of the dataset
- path_on_server: the server path to a file in the dataset
- file_type: the type of the file at path_on_server; one of matrix, metadata, or rcc
- species: one of human, mouse, rat, or monkey
- platform: either bulk or nanostring
Sample recipes.csv file:
| dataset_name | path_on_server | file_type | species | platform |
|---|---|---|---|---|
| Dataset_1 | /path/to/server/workspace/upload/example_data/count_mat.csv | matrix | human | bulk |
| Dataset_1 | /path/to/server/workspace/upload/example_data/metadata.csv | metadata | human | bulk |
| Dataset_2 | /path/to/server/workspace/upload/example_data/count_mat_2.csv | matrix | human | bulk |
| Dataset_2 | /path/to/server/workspace/upload/example_data/count_mat_3.csv | matrix | human | bulk |
| Dataset_2 | /path/to/server/workspace/upload/example_data/metadata_2.csv | metadata | human | bulk |
| Dataset_2 | /path/to/server/workspace/upload/example_data/metadata_3.csv | metadata | human | bulk |
| Dataset_3 | /path/to/server/workspace/upload/example_data/metadata_3.csv | rcc | human | nanostring |
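A recipes file with these columns can be generated and checked locally before it is uploaded. The sketch below uses only the Python standard library and assumes the CSV starts with a header row carrying the column names listed above. `validate_recipe_rows` and `write_recipes_csv` are hypothetical helpers, and the allowed values are taken from the column descriptions in this document rather than from the SDK.

```python
import csv
import io

COLUMNS = ["dataset_name", "path_on_server", "file_type", "species", "platform"]

# Allowed values as described in this document (assumed to be exhaustive).
ALLOWED = {
    "file_type": {"matrix", "metadata", "rcc"},
    "species": {"human", "mouse", "rat", "monkey"},
    "platform": {"bulk", "nanostring"},
}


def validate_recipe_rows(rows):
    """Return a list of problems found in the rows; empty if none."""
    problems = []
    for index, row in enumerate(rows, start=1):
        for column in COLUMNS:
            if not row.get(column):
                problems.append(f"row {index}: missing {column}")
        for column, allowed in ALLOWED.items():
            value = row.get(column)
            if value and value not in allowed:
                problems.append(f"row {index}: unexpected {column} {value!r}")
    return problems


def write_recipes_csv(rows, handle):
    """Write the rows as CSV with a header line to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


rows = [
    {"dataset_name": "Dataset_1",
     "path_on_server": "/path/to/server/workspace/upload/example_data/count_mat.csv",
     "file_type": "matrix", "species": "human", "platform": "bulk"},
    {"dataset_name": "Dataset_1",
     "path_on_server": "/path/to/server/workspace/upload/example_data/metadata.csv",
     "file_type": "metadata", "species": "human", "platform": "bulk"},
]

problems = validate_recipe_rows(rows)
buffer = io.StringIO()
if not problems:
    write_recipes_csv(rows, buffer)
print(problems)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, open the target file with `open(path, "w", newline="")` and pass the handle to `write_recipes_csv`.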
connector.create_project_from_recipes(
    group_id='personal',
    recipes_path='example_data/recipes.csv',
    project_name='sample dataset',
    aggregate_count=True,  # Parameter for nanostring
    use_gene_symbols=True,  # Parameter for bulkRNAseq
)
Add new datasets to a project with a recipes file
connector.add_project_from_recipes(
    project_id='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
    group_id='personal',
    recipes_path='data/recipes.csv',
    aggregate_count=True,  # Parameter for nanostring
    use_gene_symbols=True,  # Parameter for bulkRNAseq
)
Resume adding datasets to a project with a recipes file using a trace log
connector.create_project_from_recipes(
    group_id='personal',
    recipes_path='data/recipes.csv',
    project_name='21_bulk_datasets',
    trace_log='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141/project_trace_log.json',
    aggregate_count=True,  # Parameter for nanostring
    use_gene_symbols=True,  # Parameter for bulkRNAseq
)
or
connector.add_project_from_recipes(
    project_id='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
    group_id='personal',
    recipes_path='data/recipes.csv',
    trace_log='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141/project_trace_log.json',
    aggregate_count=True,  # Parameter for nanostring
    use_gene_symbols=True,  # Parameter for bulkRNAseq
)