Python Subnet Discovery for Systems Biology

SUBNET DISCOVERY FOR SBML MODELS

Motivation

Many advances in biomedical research are driven by structural analysis, the study of the interconnections between elements in biological systems (e.g., identifying drug targets and phylogenetic analyses). Structural analysis appeals because structural information is much easier to obtain than dynamical data such as species concentrations and reaction fluxes. Our focus is on subnet discovery in chemical reaction networks (CRNs); that is, discovering a subset of a target CRN that is structurally identical to a reference CRN. Applications of subnet discovery include the discovery of conserved chemical pathways and the elucidation of the structure of complex CRNs. Although there are theoretical results for finding subgraphs, we are unaware of tools for CRN subnet discovery. This is in part due to the special characteristics of CRN graphs: they are directed, bipartite hypergraphs.

pySubnetSB

pySubnetSB is an open source Python package for discovering subnets in models represented in the Systems Biology Markup Language (SBML) community standard. By subnet discovery we mean finding a specified reference CRN in a larger target CRN. This is an instance of the subgraph-finding problem in graph theory, which is computationally demanding (NP-hard). We exploit special characteristics of CRNs to reduce the computational complexity. We also use a combination of vectorization and process parallelism to achieve considerable speedups.

Below, we summarize the pySubnetSB API.

Single Reference, Single Target: Simple Case

Here we illustrate pySubnetSB for two small networks using default values in the API call. The reference and target models are:

reference_model = """
    R1: S2 -> S3; k2*S2
    R2: S1 -> S2; k1*S1
    
    S1 = 5
    S2 = 0
    k1 = 1
    k2 = 1
    """
    
target_model = """
    T1: A -> B; k1*A
    T2: B -> C; k2*B
    T3: B + C -> ; k3*B*C
    
    A = 5
    B = 0
    k1 = 1
    k2 = 1
    k3 = 0.2
    """

To use pySubnetSB, execute

!pip install pySubnetSB
from pySubnetSB.api import ModelSpecification, findReferenceInTarget, findReferencesInTargets, makeSerializationFile

The API call is

result = findReferenceInTarget(reference_model, target_model)

result.mapping_pairs describes how species and reactions in the target are mapped to the reference. This is a list with two elements.

Mapping pairs: [species: [1 2 0], reaction: [1 0]]

A mapping pair is a list of two lists. The first list is the species mapping. The i-th position in this list is for the i-th reference species. (Species and reactions are indexed by the sequence in which they are encountered in the model.) The first position in this list (Python index 0) contains a 1. This means that S2, the first reference species (index 0), is mapped to target species B (target species indexed as 1). The reaction list indicates that reaction R1 (index 0 in the reference) is mapped to reaction T2 (index 1 in the target).
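The index bookkeeping above can be spelled out with a few lines of plain Python. This is only an illustrative sketch: it does not call pySubnetSB, the mapping is copied by hand from the output shown above, and the name lists assume that species and reactions are indexed in the order they first appear in the reaction definitions. The helper `decode` is hypothetical and not part of the package.

```python
# Names in the order each model first mentions them (an assumption based on
# the indexing rule described above).
reference_species = ["S2", "S3", "S1"]
reference_reactions = ["R1", "R2"]
target_species = ["A", "B", "C"]
target_reactions = ["T1", "T2", "T3"]

# The mapping pair printed above, copied by hand as plain lists.
species_mapping = [1, 2, 0]
reaction_mapping = [1, 0]


def decode(reference_names, target_names, mapping):
    """Pair the i-th reference name with the target name at index mapping[i]."""
    return {ref: target_names[idx] for ref, idx in zip(reference_names, mapping)}


species_pairs = decode(reference_species, target_species, species_mapping)
reaction_pairs = decode(reference_reactions, target_reactions, reaction_mapping)
print(species_pairs)   # {'S2': 'B', 'S3': 'C', 'S1': 'A'}
print(reaction_pairs)  # {'R1': 'T2', 'R2': 'T1'}
```

Read this way, the mapping says that the reference chain S1 -> S2 -> S3 is found in the target as A -> B -> C.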

We can construct the inferred network using

result.makeInferredNetwork()

In this example, the inferred network is

T2: B -> C
T1: A -> B

Single Reference, Single Target: More Advanced

Here, we illustrate pySubnetSB for a more compute intensive example. The reference model is an oscillating network.

reference_model = """
    J1: $S3 -> S2;  S3*19.3591127845924;
    J2: S0 -> S4 + S0;  S0*10.3068257839885;
    J3: S4 + S2 -> S4;  S4*S2*13.8915863630362;
    J4: S2 -> S0 + S2;  S2*0.113616698747501;
    J5: S4 + S0 -> S4;  S4*S0*0.240788980014622;
    J6: S2 -> S2 + S2;  S2*1.36258363821544;
    J7: S2 + S4 -> S2;  S2*S4*1.37438814584166;
    
    S0 = 2; S1 = 5; S2 = 7; S3 = 10; S4 = 1;
"""

The target model is BioModels 695.

URL = "https://www.ebi.ac.uk/biomodels/services/download/get-files/MODEL1701090001/3/BIOMD0000000695_url.xml"
result = findReferenceInTarget(
        reference_model,
        ModelSpecification(URL, specification_type="sbmlurl"),
        max_num_mapping_pair=1e14,
        num_process=2,
        identity="weak")

As before, the first two arguments of the API call are the reference and target models. Since the target is a URL, ModelSpecification is used to convert the URL to a model string. max_num_mapping_pair is used to manage computational demands by limiting the number of mapping pairs that are considered. num_process specifies the number of processes (cores) that are used by pySubnetSB. By default, all cores are used. Finally, identity specifies the kind of subnet to discover: "weak" or "strong" (default).

The identity argument requires more explanation. A subnet of the target is weakly identical (cn.ID_WEAK) to the reference if they have the same stoichiometry matrix. A target subnet is strongly identical (default) if it is just a renaming of the species and reactions in the reference network.
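To make the definition of weak identity concrete, here is a naive brute-force sketch applied to the two small networks from the simple case. It is not how pySubnetSB works internally (the package uses vectorization and CRN-specific pruning); it simply tries every assignment of reference species and reactions to distinct target ones and keeps those where the selected part of the target stoichiometry matrix equals the reference matrix. The matrices are written by hand, and `find_weak_matches` is a hypothetical helper.

```python
from itertools import permutations

# Net stoichiometry matrices: rows are species, columns are reactions.
reference = [  # rows S2, S3, S1; columns R1, R2
    [-1, 1],
    [1, 0],
    [0, -1],
]
target = [  # rows A, B, C; columns T1, T2, T3
    [-1, 0, 0],
    [1, -1, -1],
    [0, 1, -1],
]


def find_weak_matches(reference, target):
    """Return every (species_map, reaction_map) whose target entries equal the reference."""
    n_ref_species, n_ref_reactions = len(reference), len(reference[0])
    n_tgt_species, n_tgt_reactions = len(target), len(target[0])
    matches = []
    for species_map in permutations(range(n_tgt_species), n_ref_species):
        for reaction_map in permutations(range(n_tgt_reactions), n_ref_reactions):
            if all(
                target[species_map[i]][reaction_map[j]] == reference[i][j]
                for i in range(n_ref_species)
                for j in range(n_ref_reactions)
            ):
                matches.append((list(species_map), list(reaction_map)))
    return matches


matches = find_weak_matches(reference, target)
print(matches)  # [([1, 2, 0], [1, 0])]
```

For these two networks the exhaustive search finds a single match, the same species and reaction indices as the mapping pair reported in the simple case.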

Running the foregoing code takes about 10 minutes on a two-core machine. As the command executes, a status bar indicates the number of mapping pairs processed.

mapping pairs: 100%|███████████████████████████████████████████████████████████████████████████████████| 483649090/483649090 [00:29<00:00, 16350750.64it/s]
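To get a feel for why the number of mapping pairs grows so quickly, and why a cap such as max_num_mapping_pair is useful, the sketch below computes a naive upper bound: the number of ways to assign reference species and reactions to distinct target species and reactions. This is an assumption-laden illustration, not the count pySubnetSB reports, since the package prunes candidates using structural constraints. The function name and the sizes in the second call are hypothetical.

```python
import math


def naive_candidate_count(n_ref_species, n_ref_reactions,
                          n_target_species, n_target_reactions):
    """Count ordered assignments of reference items to distinct target items."""
    species_choices = math.perm(n_target_species, n_ref_species)
    reaction_choices = math.perm(n_target_reactions, n_ref_reactions)
    return species_choices * reaction_choices


# Simple case: 3 reference species and 2 reference reactions,
# 3 target species and 3 target reactions.
small = naive_candidate_count(3, 2, 3, 3)
print(small)  # 36

# Hypothetical sizes: a 4-species, 7-reaction reference in a
# 10-species, 20-reaction target.
large = naive_candidate_count(4, 7, 10, 20)
print(f"{large:.2e}")
```

Even modest networks give a bound in the trillions, which is why exhaustive enumeration without pruning quickly becomes impractical.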

As before, result.mapping_pairs is a list of mapping pairs. You can display the inferred network from mapping pair 1 using result.makeInferredNetwork(1). There is some stochasticity to the order of the results.

result.makeInferredNetwork(1)

produces:

R_31: xFinal_2 -> xFinal_1
R_10:  -> xFinal_8
R_33: xFinal_1 + xFinal_8 + xFinal_2 -> xFinal_8 + xFinal_2
R_24: xFinal_2 -> xFinal_3 + xFinal_2
R_25: xFinal_3 -> 
R_32: xFinal_1 + xFinal_8 + xFinal_2 -> 2.0 xFinal_1 + xFinal_8 + xFinal_2
R_12: xFinal_8 -> 

This requires some elaboration. Note that although pySubnetSB matches R_10 in the target with J2 in the reference, the reactions look quite different. These reactions are:

R_10:  -> xFinal_8
J2: S0 -> S4 + S0

These reactions are weakly identical because the match is based on the stoichiometry matrices. Recall that the stoichiometry matrix contains the difference between the species in the products and those in the reactants. As such, S0 in the reactants is subtracted from S0 in the products, so J2 is equivalent to -> S4, which does look like R_10.
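The cancellation can be checked with a small standalone sketch. It does not use pySubnetSB; `net_stoichiometry` is a hypothetical helper that subtracts reactant counts from product counts and drops any species whose net change is zero.

```python
from collections import Counter


def net_stoichiometry(reactants, products):
    """Return {species: products minus reactants}, omitting species that cancel."""
    net = Counter(products)
    net.subtract(Counter(reactants))
    return {species: count for species, count in net.items() if count != 0}


# J2: S0 -> S4 + S0 (reference)
j2 = net_stoichiometry(reactants=["S0"], products=["S4", "S0"])
# R_10: -> xFinal_8 (target)
r_10 = net_stoichiometry(reactants=[], products=["xFinal_8"])

print(j2)    # {'S4': 1}
print(r_10)  # {'xFinal_8': 1}

# Ignoring species names, both reactions produce one unit of a single species.
same_net_effect = sorted(j2.values()) == sorted(r_10.values())
print(same_net_effect)  # True
```

Because S0 appears once on each side of J2, its net change is zero, and the reaction contributes the same stoichiometry column as R_10 once S4 is mapped to xFinal_8.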

Multiple References and Targets

pySubnetSB supports checking for multiple reference networks in multiple target networks. This can be done by having a directory of reference models and a directory of target models. pySubnetSB can serialize the structural characteristics of a model into a one line string. (See the discussion of the API call makeSerializationFile in the Jupyter notebook referenced in the Availability section.) This capability allows you to specify a serialization file instead of a directory, which is often more convenient.

reference_url = "http://raw.githubusercontent.com/ModelEngineering/pySubnetSB/main/examples/reference_serialized.txt"
target_url = "http://raw.githubusercontent.com/ModelEngineering/pySubnetSB/main/examples/target_serialized.txt"
result_df = findReferencesInTargets(reference_url, target_url)

The output of this API is a dataframe with information about the comparisons. Below is the output produced from this analysis for 3 columns in the dataframe.

print(f'Summary of results:\n{result_df[["reference_name", "target_name", "num_mapping_pair"]]}')

which produces:

Summary of results: 
        reference_name      target_name num_mapping_pair
    0   BIOMD0000000031  BIOMD0000000170               48
    1   BIOMD0000000031  BIOMD0000000228              240
    2   BIOMD0000000031  BIOMD0000000354               12
    3   BIOMD0000000031  BIOMD0000000960                 
    4   BIOMD0000000027  BIOMD0000000170               24
    5   BIOMD0000000027  BIOMD0000000228               60
    6   BIOMD0000000027  BIOMD0000000354               12
    7   BIOMD0000000027  BIOMD0000000960                 
    8   BIOMD0000000121  BIOMD0000000170                 
    9   BIOMD0000000121  BIOMD0000000228                 
    10  BIOMD0000000121  BIOMD0000000354                 
    11  BIOMD0000000121  BIOMD0000000960                6

Availability

pySubnetSB is installed using

pip install pySubnetSB

The package has been tested on Linux (Ubuntu 22.04), Windows (Windows 10), and macOS (14.7.6). For each, tests were run for Python 3.9, 3.10, 3.11, and 3.12.

https://github.com/ModelEngineering/pySubnetSB/blob/main/examples/api_basics.ipynb is a Jupyter notebook that demonstrates pySubnetSB capabilities. https://github.com/ModelEngineering/pySubnetSB/blob/main/examples/api_basics_programmatic.py contains much of the code in the notebook. You can test your install of pySubnetSB by downloading this script and executing it using

python api_basics_programmatic.py

Version History

  • 1.0.9 8/2/2025 Changed "induced" to "inferred" in docs and code; fixed bugs in code in README
  • 1.0.8 7/21/2025 Improved example for using pySubnetSB and revised github actions workflows.
  • 1.0.7 7/20/2025 Finalized code and documentation
  • 1.0.6 7/19/2025 Workflows for Ubuntu, Windows, Macos and python 3.9, 3.10, 3.11, 3.12
  • 1.0.5 7/19/2025 Fix install issues with missing modules
  • 1.0.2 4/10/2025. ModelSpecification API accepts many kinds of model inputs, Antimony, SBML, roadrunner.
  • 1.0.1 4/09/2025. Improved generation of networks with subnets. Use "mapping_pair" in API. Bug fixes.
  • 1.0.0 2/27/2025. First beta release.
