BioTuring SmartBulk-Connector SDK

The BioTuring SmartBulk-Connector SDK is a Python package that provides an interface for interacting with BioTuring's services.

Installation

You can install the SmartBulk-Connector SDK package using pip:

pip install --upgrade smartbulk-connector
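To confirm which version ended up in the current environment, a small check with the standard library can help. This is a minimal sketch; the helper name installed_version is illustrative and not part of the SDK.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="smartbulk-connector"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version, or None when the package is not installed.
print(installed_version())
```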

Get an API token from SmartBulk

An API token is a unique identifier that allows a user or application to access an API. It is a secure way to authenticate a user or application and to control what permissions they have.

You do not need to regenerate your API token every time you use it. However, you may need to regenerate it if it is compromised.

To get a token, first navigate to the SmartBulk SDK page. The user's token is generated from the host website.
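Because the token grants access to your account, it is usually safer to keep it out of source files. The sketch below reads the domain and token from environment variables; the variable names SMARTBULK_DOMAIN and SMARTBULK_TOKEN and the helper load_credentials are assumptions for illustration, not something the SDK defines.

```python
import os


def load_credentials(env=None):
    """Return (domain, token) from environment variables, or raise if either is unset."""
    env = os.environ if env is None else env
    # Hypothetical variable names; pick whatever fits your deployment.
    names = ("SMARTBULK_DOMAIN", "SMARTBULK_TOKEN")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]
```

The returned pair can then be passed as the domain and token arguments of SmartbulkConnector.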

How To Use

import warnings
from smartbulk_connector import SmartbulkConnector

warnings.filterwarnings("ignore")

Connect to a SmartBulk private server

# authentication
DOMAIN = "<your-smartbulk-server-domain>"
TOKEN = "<your-API-token>"
connector = SmartbulkConnector(domain=DOMAIN, token=TOKEN)

Connecting to host at https://dev.bioturing.com/smartbulk
Connect to SmartBulk successfully

# get current version
connector.get_versions()

smartbulk_connector: version 0.1.0

Get user groups available for your token

connector.get_user_groups()

[{'group_id': '<hidden-id>',
  'group_name': 'Personal workspace'},
 {'group_id': '<hidden-id>', 
  'group_name': 'All members'},
 {'group_id': '<hidden-id>',
  'group_name': 'BioTuring Public Studies'}]
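The result is a list of dictionaries, so a group can be looked up by name with plain Python. This is a minimal sketch that assumes the output shape shown above; find_group_id is a hypothetical helper and the IDs are placeholders.

```python
def find_group_id(groups, group_name):
    """Return the group_id of the first group with a matching name, or None."""
    for group in groups:
        if group.get("group_name") == group_name:
            return group.get("group_id")
    return None


# Sample data shaped like the output above (IDs are placeholders).
groups = [
    {"group_id": "<hidden-id-1>", "group_name": "Personal workspace"},
    {"group_id": "<hidden-id-2>", "group_name": "All members"},
    {"group_id": "<hidden-id-3>", "group_name": "BioTuring Public Studies"},
]
print(find_group_id(groups, "All members"))
```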

Get all projects from a group

connector.get_all_projects_info_in_group(group_id='personal')

[{'project_id': 'prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
  'project_name': 'sample dataset'},
 {'project_id': 'prj_84c1a392-8080-11ef-8f07-0242ac130004',
  'project_name': 'human sample'},
 {'project_id': 'prj_94f6f0ef-d6c8-49f5-96f6-7bb5fa6a3de8',
  'project_name': 'mouse_sample'}]

List files and directories in the workspace

connector.listdir_workspace()

['example_data', 'sample dataset', 'mouse_sample']

connector.listdir_workspace('example_data', fullpath=True)

['/path/to/server/workspace/upload/example_data/count_mat_2.csv',
 '/path/to/server/workspace/upload/example_data/count_mat.csv',
 '/path/to/server/workspace/upload/example_data/metadata_2.csv',
 '/path/to/server/workspace/upload/example_data/metadata.csv',
 '/path/to/server/workspace/upload/example_data/recipes.csv']
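When a folder holds mixed content, the returned paths can be narrowed down by extension before they are used elsewhere. The sketch below assumes the listing is a list of POSIX-style path strings, as in the output above; filter_by_suffix is a hypothetical helper.

```python
from pathlib import PurePosixPath


def filter_by_suffix(paths, suffix=".csv"):
    """Keep only the paths whose final extension equals the given suffix."""
    return [p for p in paths if PurePosixPath(p).suffix == suffix]


# Paths shaped like the listing above, plus one file with another extension.
listing = [
    "/path/to/server/workspace/upload/example_data/count_mat.csv",
    "/path/to/server/workspace/upload/example_data/metadata.csv",
    "/path/to/server/workspace/upload/example_data/notes.txt",
]
print(filter_by_suffix(listing))
```

Note that only the last extension is compared, so a file such as matrix_200.csv.gz matches ".gz" rather than ".csv".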

List files and directories in cloud storage

connector.listdir_cloud_storage()

['bioturing-lens', 'bioturingdebug', 'bioturingdebug.log.txt']

Upload a single file

connector.upload_file('path/to/local/count_mat.csv', server_folder_name='test', debug_mode=True)

{'status': 0,
 'path': '/path/to/server/workspace/upload/test/v1.count_mat.csv',
 'url_path': '/path/to/server/workspace/upload/test/v1.count_mat.csv'}
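It may be worth checking the returned dictionary before using the server path in later calls. The sketch below assumes that a status of 0 means success, which is inferred only from the sample output above and is not confirmed by the SDK; uploaded_path is a hypothetical helper.

```python
def uploaded_path(result):
    """Return the server path from an upload result, or raise if the status is non-zero."""
    # Assumption: status 0 indicates a successful upload.
    if result.get("status") != 0:
        raise RuntimeError(f"Upload did not report success: {result!r}")
    return result["path"]


# A result shaped like the sample output above.
sample = {
    "status": 0,
    "path": "/path/to/server/workspace/upload/test/v1.count_mat.csv",
    "url_path": "/path/to/server/workspace/upload/test/v1.count_mat.csv",
}
print(uploaded_path(sample))
```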

Upload a folder

connector.upload_folder('tsv_sample/', debug_mode=True)

  0%|          | 0.00/16.1M [00:00<?, ?B/s]

Upload tsv_sample/matrix_200.csv.gz, chunk index : 1 ...


100%|██████████| 16.1M/16.1M [00:29<00:00, 538kB/s]



{'folder_name': 'tsv_sample',
 'file_path': ['tsv_sample/recipes.csv',
  'tsv_sample/SRP092402.tsv',
  'tsv_sample/matrix_200.csv.gz'],
 'server_path': ['/path/to/server/workspace/upload/tsv_sample/v1.recipes.csv',
  '/path/to/server/workspace/upload/tsv_sample/v1.SRP092402.tsv',
  '/path/to/server/workspace/upload/tsv_sample/v1.matrix_200.csv.gz']}
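The server renames uploaded files (note the v1. prefix), so later calls need the server paths rather than the local ones. The sketch below pairs each local path with its server path; it assumes the file_path and server_path lists are in the same order, as they appear to be in the sample output. map_uploaded_paths is a hypothetical helper.

```python
def map_uploaded_paths(upload_result):
    """Map each local file path to its server path (assumes both lists share one order)."""
    return dict(zip(upload_result["file_path"], upload_result["server_path"]))


# A result shaped like the sample output above.
upload_result = {
    "folder_name": "tsv_sample",
    "file_path": [
        "tsv_sample/recipes.csv",
        "tsv_sample/SRP092402.tsv",
        "tsv_sample/matrix_200.csv.gz",
    ],
    "server_path": [
        "/path/to/server/workspace/upload/tsv_sample/v1.recipes.csv",
        "/path/to/server/workspace/upload/tsv_sample/v1.SRP092402.tsv",
        "/path/to/server/workspace/upload/tsv_sample/v1.matrix_200.csv.gz",
    ],
}
server_paths = map_uploaded_paths(upload_result)
print(server_paths["tsv_sample/matrix_200.csv.gz"])
```

The looked-up values can then be passed in the matrix_paths and metadata_paths lists when creating a project.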

Create a new BulkRNAseq project from an uploaded folder on the SmartBulk server

submit_result = connector.create_project(
    group_id='personal',
    species='human',
    project_name='human sample',
    matrix_paths=['/path/to/server/workspace/upload/tsv_sample/v1.matrix_200.csv.gz'],
    metadata_paths=['/path/to/server/workspace/upload/tsv_sample/v1.SRP092402.tsv'],
    dataset_name='Sample Dataset',
    use_gene_symbols=True,
)

Check project creation status

connector.check_project_status(submit_result=submit_result)

Add new BulkRNAseq dataset to a project

submit_result = connector.add_project(
    project_id='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
    group_id='personal',
    species='human',
    matrix_paths=['/path/to/server/workspace//upload/tsv_sample/matrix_200.csv.gz'],
    metadata_paths=['/path/to/server/workspace//upload/tsv_sample/SRP092402.tsv'],
    dataset_name='Another Dataset',
    use_gene_symbols=True,
)
if submit_result:
    connector.check_project_status(submit_result=submit_result)

Create a new NanoString project from an uploaded folder on the SmartBulk server

# Metadata is optional for NanoString
submit_result = connector.create_project(
    group_id='personal',
    species='human',
    project_name='nanostring',
    dataset_name='RCC',
    rcc_folder_path='/path/to/server/workspace//upload/GSE268196_RCC/',
    aggregate_count=True,
    platform='nanostring',
)
if submit_result:
    connector.check_project_status(submit_result=submit_result)

Add a new NanoString dataset to a project

Note that NanoString accepts only one metadata file when creating a new project.

# Metadata is optional for NanoString
submit_result = connector.add_project(
    group_id='personal',
    species='human',
    project_id='prj_566b623a-cdab-11ef-859d-48ad9afa0555',
    dataset_name='nanostring_2',
    rcc_folder_path='/path/to/server/workspace//upload/GSE268196_RCC/',
    metadata_paths=['/path/to/server/workspace//upload/RCC_matrix_file/metadata.tsv.gz'],
    aggregate_count=True,
    platform='nanostring',
)
if submit_result:
    connector.check_project_status(submit_result=submit_result)

Add a new NanoString dataset to a project from an uploaded matrix and metadata

Note that NanoString accepts only one matrix file and one metadata file when creating a new project.

submit_result = connector.add_project(
    group_id='personal',
    species='human',
    project_id='prj_566b623a-cdab-11ef-859d-48ad9afa0555',
    dataset_name='uploaded_matrix',
    matrix_paths=['/path/to/server/workspace//upload/RCC_matrix_file/matrix.tsv.gz'],
    metadata_paths=['/path/to/server/workspace//upload/RCC_matrix_file/metadata.tsv.gz'],
    aggregate_count=True,
    platform='nanostring',
)
if submit_result:
    connector.check_project_status(submit_result=submit_result)

Create a project with multiple datasets from a recipes file

The recipes file is a CSV file with the following columns:

    dataset_name: the name of the dataset
    path_on_server: the server path to a file belonging to that dataset
    file_type: the type of the file at path_on_server; one of matrix, metadata, or rcc
    species: one of human, mouse, rat, or monkey
    platform: one of bulk or nanostring
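The allowed values listed above can be checked locally before a recipes file is submitted. This is a minimal sketch built only from that list; validate_recipe_row is a hypothetical helper, and the SDK may apply its own, different validation.

```python
REQUIRED = ("dataset_name", "path_on_server", "file_type", "species", "platform")
ALLOWED = {
    "file_type": {"matrix", "metadata", "rcc"},
    "species": {"human", "mouse", "rat", "monkey"},
    "platform": {"bulk", "nanostring"},
}


def validate_recipe_row(row):
    """Return a list of problems found in one recipes row (empty if none)."""
    problems = [f"missing {column}" for column in REQUIRED if not row.get(column)]
    for column, allowed in ALLOWED.items():
        value = row.get(column)
        if value and value not in allowed:
            problems.append(f"{column}={value!r} is not one of {sorted(allowed)}")
    return problems


row = {
    "dataset_name": "Dataset_1",
    "path_on_server": "/path/to/server/workspace/upload/example_data/count_mat.csv",
    "file_type": "matrix",
    "species": "human",
    "platform": "bulk",
}
print(validate_recipe_row(row))
```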

Sample recipes.csv file:

dataset_name	path_on_server	file_type	species	platform
Dataset_1	/path/to/server/workspace//upload/example_data/count_mat.csv	matrix	human	bulk
Dataset_1	/path/to/server/workspace//upload/example_data/metadata.csv	metadata	human	bulk
Dataset_2	/path/to/server/workspace//upload/example_data/count_mat_2.csv	matrix	human	bulk
Dataset_2	/path/to/server/workspace//upload/example_data/count_mat_3.csv	matrix	human	bulk
Dataset_2	/path/to/server/workspace//upload/example_data/metadata_2.csv	metadata	human	bulk
Dataset_2	/path/to/server/workspace//upload/example_data/metadata_3.csv	metadata	human	bulk
Dataset_3	/path/to/server/workspace//upload/example_data/metadata_3.csv	rcc	human	nanostring
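A recipes file with these columns can be generated with the standard csv module instead of being written by hand. The sketch below assumes a comma delimiter because the file is described as CSV; the sample above does not make the separator explicit, so the delimiter is left as a parameter. write_recipes is a hypothetical helper.

```python
import csv

COLUMNS = ["dataset_name", "path_on_server", "file_type", "species", "platform"]


def write_recipes(path, rows, delimiter=","):
    """Write recipes rows (dicts keyed by COLUMNS) to a delimited text file."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS, delimiter=delimiter)
        writer.writeheader()
        writer.writerows(rows)


rows = [
    {
        "dataset_name": "Dataset_1",
        "path_on_server": "/path/to/server/workspace/upload/example_data/count_mat.csv",
        "file_type": "matrix",
        "species": "human",
        "platform": "bulk",
    },
    {
        "dataset_name": "Dataset_1",
        "path_on_server": "/path/to/server/workspace/upload/example_data/metadata.csv",
        "file_type": "metadata",
        "species": "human",
        "platform": "bulk",
    },
]
```

Calling write_recipes('recipes.csv', rows) writes a header line followed by one line per row.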

connector.create_project_from_recipes(
    group_id='personal', 
    recipes_path='example_data/recipes.csv', 
    project_name='sample dataset',
    aggregate_count=True, # Parameter for nanostring
    use_gene_symbols=True, # Parameter for bulkRNAseq
)

Add new dataset to a project with a recipes file

connector.add_project_from_recipes(
    project_id='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
    group_id='personal', 
    recipes_path='data/recipes.csv',
    aggregate_count=True, # Parameter for nanostring
    use_gene_symbols=True, # Parameter for bulkRNAseq
)

Resume creating a project or adding datasets from a recipes file using a trace log

connector.create_project_from_recipes(
    group_id='personal',
    recipes_path='data/recipes.csv', 
    project_name='21_bulk_datasets',
    trace_log='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141/project_trace_log.json',
    aggregate_count=True, # Parameter for nanostring
    use_gene_symbols=True, # Parameter for bulkRNAseq
)

connector.add_project_from_recipes(
    project_id='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
    group_id='personal', 
    recipes_path='data/recipes.csv',
    trace_log='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141/project_trace_log.json',
    aggregate_count=True, # Parameter for nanostring
    use_gene_symbols=True, # Parameter for bulkRNAseq
)
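The same call can serve both a first run and a resumed run if the trace_log argument is supplied only when a trace log already exists. The sketch below assumes the trace log is a local file at the path you pass in, which is inferred from the examples above rather than documented; recipes_kwargs is a hypothetical helper.

```python
import os


def recipes_kwargs(recipes_path, trace_log_path):
    """Build keyword arguments, adding trace_log only if that file exists locally."""
    kwargs = {"recipes_path": recipes_path}
    if os.path.isfile(trace_log_path):
        # Assumption: an existing trace log means a previous run can be resumed.
        kwargs["trace_log"] = trace_log_path
    return kwargs


kwargs = recipes_kwargs("data/recipes.csv", "no-such-project/project_trace_log.json")
print(kwargs)
```

The resulting dictionary can be unpacked into add_project_from_recipes or create_project_from_recipes together with the other arguments shown above.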

Release history

    1.0.5 (this version): Feb 24, 2025
    1.0.4: Jan 9, 2025
    1.0.3: Dec 16, 2024
    1.0.2: Nov 19, 2024
    1.0.1: Oct 14, 2024
    1.0.0: Oct 4, 2024


File details

Details for the file smartbulk_connector-1.0.5.tar.gz.

File metadata

Download URL: smartbulk_connector-1.0.5.tar.gz
Upload date: Feb 24, 2025
Size: 17.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for smartbulk_connector-1.0.5.tar.gz
Algorithm	Hash digest
SHA256	`0a9fcbffbf5f6ab1458be754d7a58b08d8ac21d81fd0e36ce0cd277a770c8026`
MD5	`d0d3e2d2bf25374e1d8b4355b0af54d6`
BLAKE2b-256	`ce483f85018c73442c6ea87418224601fb4b17be90bfbf841ce4421b36c242f6`

File details

Details for the file smartbulk_connector-1.0.5-py3-none-any.whl.

File metadata

Download URL: smartbulk_connector-1.0.5-py3-none-any.whl
Upload date: Feb 24, 2025
Size: 16.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.9.20

File hashes

Hashes for smartbulk_connector-1.0.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7794194434ea62e48f1db4e2909a32f673ecc4daa6419b85a5eb46e692a2881a`
MD5	`0b715ac11d402caf2c0af1def223cbf2`
BLAKE2b-256	`cf52f7a707d8f31ab9ea9f25e6d97097f8cea2ba1796f324c04757f5d539bc7e`
