
BioTuring SmartBulk-Connector SDK

BioTuring SmartBulk-Connector SDK is a Python package that provides an interface for interacting with BioTuring's services.

Installation

You can install the SmartBulk-Connector SDK package using pip:

pip install --upgrade smartbulk-connector

Get an API token from SmartBulk

An API token is a unique identifier that allows a user or application to access an API. It is a secure way to authenticate a user or application and to control what permissions they have.

You do not need to regenerate your API token every time you use it. However, you may need to regenerate your API token if it is compromised.

First, navigate to the SmartBulk platform to obtain a token. The user's token is generated from the host website.
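To keep the token out of source code, you can read it from environment variables. The variable names below (SMARTBULK_DOMAIN, SMARTBULK_TOKEN) are a suggested convention for this sketch, not something the SDK requires:

```python
import os

# Read credentials from environment variables instead of hardcoding them.
# The fallback placeholders mirror the ones used in the examples below.
DOMAIN = os.environ.get("SMARTBULK_DOMAIN", "<your-smartbulk-server-domain>")
TOKEN = os.environ.get("SMARTBULK_TOKEN", "<your-API-token>")
```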

How To Use

import warnings
from smartbulk_connector import SmartbulkConnector

warnings.filterwarnings("ignore")

Connect to SmartBulk private server

# authentication
DOMAIN = "<your-smartbulk-server-domain>"
TOKEN = "<your-API-token>"
connector = SmartbulkConnector(domain=DOMAIN, token=TOKEN)
Connecting to host at https://dev.bioturing.com/smartbulk
Connect to SmartBulk successfully
# get current version
connector.get_versions()
smartbulk_connector: version 0.1.0

Get user groups available for your token

connector.get_user_groups()
[{'group_id': '<hidden-id>',
  'group_name': 'Personal workspace'},
 {'group_id': '<hidden-id>', 
  'group_name': 'All members'},
 {'group_id': '<hidden-id>',
  'group_name': 'BioTuring Public Studies'}]
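Later calls take a group_id, so it can be convenient to look one up by name in the list returned above. This small helper is an illustration over the response shape shown, not part of the SDK:

```python
def find_group_id(groups, group_name):
    """Return the group_id whose group_name matches, or None.

    `groups` is the list of dicts returned by connector.get_user_groups().
    """
    for group in groups:
        if group.get("group_name") == group_name:
            return group.get("group_id")
    return None

# Example with the response shape shown above (IDs are placeholders):
groups = [
    {"group_id": "grp_1", "group_name": "Personal workspace"},
    {"group_id": "grp_2", "group_name": "All members"},
]
print(find_group_id(groups, "All members"))  # grp_2
```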

Get all projects from a group

connector.get_all_projects_info_in_group(group_id='personal')
[{'project_id': 'prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
  'project_name': 'sample dataset'},
 {'project_id': 'prj_84c1a392-8080-11ef-8f07-0242ac130004',
  'project_name': 'human sample'},
 {'project_id': 'prj_94f6f0ef-d6c8-49f5-96f6-7bb5fa6a3de8',
  'project_name': 'mouse_sample'}]

List files and directories in the workspace

connector.listdir_workspace()
['example_data', 'sample dataset', 'mouse_sample']
connector.listdir_workspace('example_data', fullpath=True)
['/path/to/server/workspace/upload/example_data/count_mat_2.csv',
 '/path/to/server/workspace/upload/example_data/count_mat.csv',
 '/path/to/server/workspace/upload/example_data/metadata_2.csv',
 '/path/to/server/workspace/upload/example_data/metadata.csv',
 '/path/to/server/workspace/upload/example_data/recipes.csv']

List files and directories in cloud storage

connector.listdir_cloud_storage()
['bioturing-lens', 'bioturingdebug', 'bioturingdebug.log.txt']

Upload a single file

connector.upload_file('path/to/local/count_mat.csv', server_folder_name='test', debug_mode=True)
{'status': 0,
 'path': '/path/to/server/workspace/upload/test/v1.count_mat.csv',
 'url_path': '/path/to/server/workspace/upload/test/v1.count_mat.csv'}
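The example output suggests a 'status' of 0 on success; a small wrapper like the one below can fail fast on anything else and hand back the server path for later calls. The success-means-zero rule is an assumption based on the output above, not a documented contract:

```python
def server_path_or_raise(result):
    """Return the uploaded file's server path from an upload_file result.

    Assumes a 'status' of 0 indicates success, as in the example output;
    any other value is treated as an error.
    """
    if result.get("status") != 0:
        raise RuntimeError(f"upload failed: {result}")
    return result["path"]

# With the sample return value shown above:
sample = {"status": 0,
          "path": "/path/to/server/workspace/upload/test/v1.count_mat.csv"}
print(server_path_or_raise(sample))
```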

Upload a folder

connector.upload_folder('tsv_sample/', debug_mode=True)
  0%|          | 0.00/16.1M [00:00<?, ?B/s]
Upload tsv_sample/matrix_200.csv.gz, chunk index : 1 ...
100%|██████████| 16.1M/16.1M [00:29<00:00, 538kB/s]
{'folder_name': 'tsv_sample',
 'file_path': ['tsv_sample/recipes.csv',
  'tsv_sample/SRP092402.tsv',
  'tsv_sample/matrix_200.csv.gz'],
 'server_path': ['/path/to/server/workspace/upload/tsv_sample/v1.recipes.csv',
  '/path/to/server/workspace/upload/tsv_sample/v1.SRP092402.tsv',
  '/path/to/server/workspace/upload/tsv_sample/v1.matrix_200.csv.gz']}
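The 'server_path' list returned by upload_folder can be split into the matrix and metadata lists that create_project expects. The suffix- and substring-based rules below are assumptions made for this example (they match the sample file names above), not SDK behavior:

```python
# Split upload_folder's server paths into inputs for create_project.
upload = {
    "server_path": [
        "/path/to/server/workspace/upload/tsv_sample/v1.recipes.csv",
        "/path/to/server/workspace/upload/tsv_sample/v1.SRP092402.tsv",
        "/path/to/server/workspace/upload/tsv_sample/v1.matrix_200.csv.gz",
    ]
}
matrix_paths = [p for p in upload["server_path"] if "matrix" in p]
metadata_paths = [p for p in upload["server_path"] if p.endswith(".tsv")]
```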

Create a new project from the uploaded folder path on the SmartBulk server

submit_result = connector.create_project(
    group_id='personal',
    species='human',
    project_name='human sample',
    matrix_paths=['/path/to/server/workspace/upload/tsv_sample/v1.matrix_200.csv.gz'],
    metadata_paths=['/path/to/server/workspace/upload/tsv_sample/v1.SRP092402.tsv'],
    dataset_name='Sample Dataset',
)

Check project creation status

connector.check_project_status(submit_result=submit_result)
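Project creation runs asynchronously, so you may want to poll until it finishes. The generic helper below works with any status callable; the commented usage sketch assumes check_project_status returns something whose completion you can test, which is an assumption about its return value rather than a documented contract:

```python
import time

def wait_until(poll, is_done, interval=10, timeout=600):
    """Call poll() every `interval` seconds until is_done(result) is truthy
    or `timeout` seconds elapse; return the last result either way."""
    deadline = time.monotonic() + timeout
    while True:
        result = poll()
        if is_done(result) or time.monotonic() >= deadline:
            return result
        time.sleep(interval)

# Usage sketch (the status check inside is_done is hypothetical):
# result = wait_until(
#     lambda: connector.check_project_status(submit_result=submit_result),
#     lambda r: r is not None,
# )
```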

Add a new dataset to a project

submit_result = connector.add_project(
    project_id='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
    group_id='personal',
    species='human',
    matrix_paths=['/path/to/server/workspace/upload/tsv_sample/matrix_200.csv.gz'],
    metadata_paths=['/path/to/server/workspace/upload/tsv_sample/SRP092402.tsv'],
    dataset_name='Another Dataset'
)

connector.check_project_status(submit_result=submit_result)

Create a project with multiple datasets using a recipes file

The recipes file is a CSV file with the following columns:

    dataset_name: the name of the dataset
    path_on_server: the server path to a file in the dataset
    file_type: either matrix or metadata; identifies the type of the file at path_on_server
    species: either human or mouse

Sample recipes.csv file:

| dataset_name | path_on_server | file_type | species |
| --- | --- | --- | --- |
| Dataset_1 | /path/to/server/workspace/upload/example_data/count_mat.csv | matrix | human |
| Dataset_1 | /path/to/server/workspace/upload/example_data/metadata.csv | metadata | human |
| Dataset_2 | /path/to/server/workspace/upload/example_data/count_mat_2.csv | matrix | human |
| Dataset_2 | /path/to/server/workspace/upload/example_data/count_mat_3.csv | matrix | human |
| Dataset_2 | /path/to/server/workspace/upload/example_data/metadata_2.csv | metadata | human |
| Dataset_2 | /path/to/server/workspace/upload/example_data/metadata_3.csv | metadata | human |

connector.create_project_from_recipes(
    group_id='personal', 
    recipes_path='example_data/recipes.csv', 
    project_name='sample dataset',
)
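A recipes file in this format can be generated with Python's csv module. The rows below use placeholder server paths; substitute the paths returned by upload_folder:

```python
import csv

# Write a recipes file with the four columns described above.
rows = [
    {"dataset_name": "Dataset_1",
     "path_on_server": "/path/to/server/workspace/upload/example_data/count_mat.csv",
     "file_type": "matrix", "species": "human"},
    {"dataset_name": "Dataset_1",
     "path_on_server": "/path/to/server/workspace/upload/example_data/metadata.csv",
     "file_type": "metadata", "species": "human"},
]

with open("recipes.csv", "w", newline="") as f:
    writer = csv.DictWriter(
        f, fieldnames=["dataset_name", "path_on_server", "file_type", "species"])
    writer.writeheader()
    writer.writerows(rows)
```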

Add a new dataset to a project with a recipes file

connector.add_project_from_recipes(
    project_id='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
    group_id='personal', 
    recipes_path='data/recipes.csv',
)

Resume adding datasets to a project with a recipes file using a trace log

connector.create_project_from_recipes(
    group_id='personal',
    recipes_path='data/recipes.csv', 
    project_name='21_bulk_datasets',
    trace_log='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141/project_trace_log.json'
)

or

connector.add_project_from_recipes(
    project_id='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141',
    group_id='personal', 
    recipes_path='data/recipes.csv',
    trace_log='prj_8dfbc6a8-4e22-444f-b4a2-4df000c48141/project_trace_log.json'
)
