Scrapes SABIO-RK for enzyme kinetics data for given BiGG Model for dFBA simulation.

Project description

Acquire SABIO-RK Kinetics Data for an arbitrary BiGG model

Reaction kinetics data is a pillar of biochemical research, and particularly of computational biology. Sources of this data, however, are infrequently accessible to programmatic workflows, such as Dynamic Flux Balance Analysis (dFBA), which hinders research progress. The BiGG_SABIO library attempts to bridge this gap by scraping SABIO-RK kinetics data for any BiGG model-formatted JSON file, which is a powerful capability for metabolic and dFBA researchers. SABIO-RK supports this use of its website in its statement of web services. Example Notebooks are available in the examples directory of the BiGG_SABIO GitHub repository. Please submit errors, inquiries, or suggestions as GitHub issues, where they can be addressed.


Installation

BiGG_SABIO is installed in a command prompt, PowerShell, Terminal, or Anaconda Command Prompt via pip:

pip install bigg_sabio

__init__()

The scraping is initiated through four arguments:

import bigg_sabio
bgsb = bigg_sabio.SABIO_scraping(bigg_model_path, bigg_model_name = None, export_model_content = False, verbose = False)
  • bigg_model_path str: specifies the path to the JSON file of the BiGG model that will be parsed.

  • bigg_model_name str: specifies the name of the BiGG model, which will be used to identify the model and name the output folder directory, where None defaults to the file name from the bigg_model_path parameter.

  • export_model_content bool: specifies whether the parsed content of the BiGG model will be exported to files.

  • verbose & printing bool: specify whether simulation details (which are valuable for troubleshooting) and results, respectively, will be printed.
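The None default for bigg_model_name presumably derives the name from the model file itself; a minimal sketch of that behavior (the path and helper name here are illustrative, not part of the library's API):

```python
import os

def default_model_name(bigg_model_path: str) -> str:
    """Derive a model name from a BiGG JSON file path, as the
    None default of bigg_model_name presumably does."""
    return os.path.splitext(os.path.basename(bigg_model_path))[0]

print(default_model_name("models/iML1515.json"))  # -> iML1515
```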

complete()

The complete scraping process is concisely conducted through a single function, which references the object variables that are defined through the __init__() function:

import bigg_sabio
bgsb = bigg_sabio.SABIO_scraping(bigg_model_path, bigg_model_name = None, export_model_content = False, verbose = False)
bgsb.complete()

Individual functions

The steps of acquiring and processing SABIO data into input files of kinetic data for dFBA simulations can be individually executed on demand. These steps and functions are detailed in the following sections.

scrape_bigg_xls()

This function is the first step in the BiGG_SABIO workflow, where a Selenium WebDriver is directed through the advanced search options of SABIO-RK and downloads all of the search results that match annotations from the BiGG model. At the end of the scraping process, these numerous XLS files are concatenated into a single spreadsheet, with duplicate rows removed, to yield a complete CSV file of the SABIO kinetics data for the respective BiGG model. The identities and values of each parameter are subsequently scraped, assembled, and downloaded as a separate JSON file.
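The concatenation and deduplication at the end of this step can be sketched with pandas; the column names below are illustrative stand-ins, not the exact SABIO-RK export schema:

```python
import pandas as pd

# Two toy DataFrames standing in for downloaded SABIO XLS result files;
# the columns are illustrative, not the exact SABIO-RK export schema.
xls_1 = pd.DataFrame({
    "EnzymeName": ["hexokinase", "hexokinase"],
    "Parameter": ["Km", "kcat"],
    "Value": [0.1, 63.0],
})
xls_2 = pd.DataFrame({
    "EnzymeName": ["hexokinase", "pyruvate kinase"],
    "Parameter": ["Km", "Km"],
    "Value": [0.1, 0.3],
})

# Concatenate all downloads and drop duplicate rows, as scrape_bigg_xls()
# does before writing the combined CSV.
sabio_df = pd.concat([xls_1, xls_2], ignore_index=True).drop_duplicates()
sabio_df.to_csv("sabio_kinetics.csv", index=False)
print(len(sabio_df))  # the duplicated hexokinase Km row is removed
```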

to_fba()

This is the final step in the BiGG_SABIO workflow, where the complete assemblage of SABIO kinetics data is refined into a structure that is compatible with the dFBAy module.
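The exact output structure depends on the dFBAy module, but the general refinement can be imagined as grouping the scraped parameters by reaction; the table columns and output layout below are assumptions for illustration, not the library's actual format:

```python
import pandas as pd

# Toy kinetics table in the shape produced by the scraping step;
# the columns are assumptions for illustration.
sabio_df = pd.DataFrame({
    "Reaction": ["HEX1", "HEX1", "PYK"],
    "Parameter": ["Km", "kcat", "Km"],
    "Value": [0.1, 63.0, 0.3],
})

# Group the flat table into {reaction: {parameter: value}}, a nested
# structure that a dFBA solver could consume directly.
kinetics = {
    reaction: dict(zip(group["Parameter"], group["Value"]))
    for reaction, group in sabio_df.groupby("Reaction")
}
print(kinetics["HEX1"])  # -> {'Km': 0.1, 'kcat': 63.0}
```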

Executing the individual functions

The individual functions can be executed through the following sequence:

import bigg_sabio
bgsb = bigg_sabio.SABIO_scraping(bigg_model_path, bigg_model_name = None, export_model_content = False, verbose = False)
bgsb.scrape_bigg_xls()
bgsb.to_fba()

Accessible content

A multitude of values are stored within the SABIO_scraping object that can be subsequently referenced and used in a workflow. The complete list of content within the SABIO_scraping object can be printed through the built-in dir() function:

# Scrape data for a BiGG model
from bigg_sabio import SABIO_scraping
bgsb = SABIO_scraping(bigg_model_path, bigg_model_name = None, export_model_content = False, verbose = False)
print(dir(bgsb))

The following list highlights content that is stored in the SABIO_scraping object after a scraping run:

  • model & model_contents dict: The loaded BiGG model and a parsed form of the model, respectively, that are interpreted and guide the scraping of reaction enzymes.

  • sabio_df Pandas.DataFrame: A concatenated DataFrame that embodies all of the downloaded XLS files from the model enzymes.

  • paths, parameters, & variables dict: Dictionaries of 1) the essential paths from the scraping, which may be useful for locating and programmatically accessing each file; 2) the important parameter values that were applied; and 3) the variable values or files that were derived from the scraping, respectively.

  • bigg_to_sabio_metabolites, sabio_to_bigg_metabolites, & bigg_reactions dict: Comprehensive dictionaries of the ID codes of BiGG metabolites and reactions. The bigg_to_sabio_metabolites dictionary is indexed with keys of BiGG IDs and values of metabolite names that are recognized by SABIO and BiGG, whereas the sabio_to_bigg_metabolites dictionary is indexed with keys of SABIO metabolite names and values of the corresponding BiGG IDs.

  • driver & fp Selenium.Webdriver: The Firefox browser driver and profile, respectively, that Selenium functions use to programmatically access and navigate the SABIO-RK database website.

  • step_number int: An indication of the progression within the scraping workflow, which is enumerated in the main() function of the script.
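The paired metabolite dictionaries permit translation in either direction between BiGG IDs and SABIO names. The entries below are toy examples; the real dictionaries are assembled from the BiGG model's annotations:

```python
# Toy entries illustrating the paired metabolite dictionaries; the real
# dictionaries are built from the BiGG model's annotations.
bigg_to_sabio_metabolites = {
    "glc__D_c": "D-Glucose",
    "pyr_c": "Pyruvate",
}
# The reverse mapping, keyed by SABIO metabolite name.
sabio_to_bigg_metabolites = {
    name: bigg_id for bigg_id, name in bigg_to_sabio_metabolites.items()
}

print(bigg_to_sabio_metabolites["pyr_c"])      # -> Pyruvate
print(sabio_to_bigg_metabolites["D-Glucose"])  # -> glc__D_c
```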

Download files

Source Distribution

BiGG_SABIO-0.0.1.tar.gz (2.5 MB)
