Python Subnet Discovery for Systems Biology

DETECTING STRUCTURALLY IDENTICAL REACTION NETWORKS

Two chemical reaction networks (CRNs) are structurally identical if they have the same stoichiometry for reactants (reactant stoichiometry matrix) and products (product stoichiometry matrix), regardless of their rate laws. Because chemical species and reactions may be renamed, testing for structural identity requires an element-wise comparison for every permutation of the rows and columns of the stoichiometry matrices. That is, two networks are structurally identical if some permutation of rows and columns makes their reactant stoichiometry matrices equal, and the same permutation also makes their product stoichiometry matrices equal. Clearly, this definition is unchanged if we exchange "reactant" and "product".
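The definition can be sketched as a brute-force check. This is only an illustration of the definition, not the pySubnetSB implementation; the function name is_structurally_identical and the example matrices are assumptions made for this sketch. Rows are species and columns are reactions.

```python
from itertools import permutations

import numpy as np


def is_structurally_identical(reactant1, product1, reactant2, product2):
    """Brute-force test of the definition (hypothetical helper, not the package API).

    Returns True if one permutation of rows and columns makes the reactant
    matrices equal and the same permutation makes the product matrices equal.
    """
    if reactant1.shape != reactant2.shape or product1.shape != product2.shape:
        return False
    num_species, num_reactions = reactant1.shape
    for row_perm in permutations(range(num_species)):
        for col_perm in permutations(range(num_reactions)):
            index = np.ix_(row_perm, col_perm)
            if (np.array_equal(reactant1[index], reactant2)
                    and np.array_equal(product1[index], product2)):
                return True
    return False


# Example network: S1 -> S2; S2 -> S3 (rows S1, S2, S3; one column per reaction)
reactant_a = np.array([[1, 0], [0, 1], [0, 0]])
product_a = np.array([[0, 0], [1, 0], [0, 1]])

# The same network with species and reactions renamed (rows and columns permuted)
renaming = np.ix_([2, 0, 1], [1, 0])
reactant_b = reactant_a[renaming]
product_b = product_a[renaming]

print(is_structurally_identical(reactant_a, product_a, reactant_b, product_b))
```

The nested loops visit every row permutation and every column permutation, which is the source of the factorial cost of this approach.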

Why do we use the above definition and not simply the stoichiometry matrix? This is best answered by showing an example. Consider the following two networks that consist of a single reaction:

// Network 1
S1 -> S1 + S2

// Network 2
S2 -> S2 + S2

These networks have the same stoichiometry matrix, which has a 0 for S1 and a 1 for S2. However, the reactant and product stoichiometry matrices are different for these two networks.
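A small numpy sketch makes the difference concrete. The variable names are chosen for this illustration only; each matrix has one row per species (S1, S2) and one column for the single reaction.

```python
import numpy as np

# Network 1: S1 -> S1 + S2
reactant_1 = np.array([[1], [0]])
product_1 = np.array([[1], [1]])

# Network 2: S2 -> S2 + S2
reactant_2 = np.array([[0], [1]])
product_2 = np.array([[0], [2]])

# The (net) stoichiometry matrix is product minus reactant.
stoichiometry_1 = product_1 - reactant_1
stoichiometry_2 = product_2 - reactant_2
same_stoichiometry = np.array_equal(stoichiometry_1, stoichiometry_2)

# Permuting rows or columns cannot change the multiset of entries, so
# comparing sorted entries shows whether any permutation could match.
products_could_match = sorted(product_1.ravel()) == sorted(product_2.ravel())

print(same_stoichiometry, products_could_match)
```

The stoichiometry matrices are equal, but the product matrices contain different entries (1, 1 versus 0, 2), so no renaming of species can make the two networks structurally identical.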

Problem Addressed

The above approach to finding structurally identical CRNs has a huge computational complexity. Let $N$ be the number of rows (species) in a stoichiometry matrix and $M$ be the number of columns (reactions). Then, the computational complexity of a single pair-wise comparison is $O(N!M!)$. If comparing one permutation takes 1 microsecond, then: (i) $N=8=M$ takes about half an hour; (ii) $N=10=M$ takes about five months; and (iii) $N=20=M$ takes about $10^{23}$ years, far longer than the current age of the Universe (about 14 billion years). In systems biology, $N=20=M$ is a modest-size CRN.
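The estimate can be reproduced with a few lines of Python. This sketch assumes that the cost is exactly $N!M!$ permutation checks at 1 microsecond each; the function name brute_force_seconds is made up for the illustration.

```python
from math import factorial

SECONDS_PER_CHECK = 1e-6  # assumed cost of comparing one permutation
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def brute_force_seconds(num_species, num_reactions):
    """Time to check all N! * M! permutations of rows and columns."""
    return factorial(num_species) * factorial(num_reactions) * SECONDS_PER_CHECK


for size in (8, 10, 20):
    seconds = brute_force_seconds(size, size)
    print(f"N = M = {size}: {seconds:.3g} seconds "
          f"({seconds / SECONDS_PER_YEAR:.3g} years)")
```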

Technical Approach

This project implements the DSIRN Algorithm, an efficient algorithm for detecting structurally identical CRNs. The key insight is to avoid considering a large number of permutations. This is achieved by finding an order-independent encoding (OIE) of the rows and columns of the stoichiometry matrix so that rows (columns) are compared only if they have the same OIE. The stoichiometry matrix of many CRNs is dominated by -1, 0, and 1 because unit stoichiometries are so prevalent. So, we use an OIE that consists of three counts:

  • number of elements < 0
  • number of elements = 0
  • number of elements > 0

By so doing, we partition the rows (columns) so that we only need to consider the permutations within each partition. Let $N_P$ be the number of distinct OIEs among the rows of two structurally identical matrices, and let $M_P$ be the same for the columns. Suppose that each partition contains the same number of elements: $\frac{N}{N_P}$ for species and $\frac{M}{M_P}$ for reactions. Then, the complexity of using partitions is $O\left( \left(\frac{N}{N_P}!\right)^{N_P} \left(\frac{M}{M_P}!\right)^{M_P} \right)$. When $N_P = 1 = M_P$, we get $O(N!M!)$. When $N_P = N$ and $M_P = M$, every partition has a single element and only one permutation needs to be checked.
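The partitioning idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the DSIRN implementation: the helper names oie, partition_by_oie and num_candidate_permutations are hypothetical, and the example matrix is the stoichiometry matrix of a made-up linear chain S1 -> S2 -> S3 -> S4.

```python
from collections import defaultdict
from math import factorial

import numpy as np


def oie(vector):
    """Order-independent encoding: counts of entries < 0, = 0, and > 0."""
    vector = np.asarray(vector)
    return (int((vector < 0).sum()),
            int((vector == 0).sum()),
            int((vector > 0).sum()))


def partition_by_oie(matrix, axis):
    """Group row indices (axis=0) or column indices (axis=1) by their OIE."""
    vectors = matrix if axis == 0 else matrix.T
    groups = defaultdict(list)
    for index, vector in enumerate(vectors):
        groups[oie(vector)].append(index)
    return dict(groups)


def num_candidate_permutations(matrix):
    """Permutations left to check when only same-OIE rows (columns) are exchanged."""
    count = 1
    for axis in (0, 1):
        for indices in partition_by_oie(matrix, axis).values():
            count *= factorial(len(indices))
    return count


# Rows are S1..S4; columns are the reactions S1 -> S2, S2 -> S3, S3 -> S4.
stoichiometry = np.array([
    [-1,  0,  0],
    [ 1, -1,  0],
    [ 0,  1, -1],
    [ 0,  0,  1],
])

print(partition_by_oie(stoichiometry, axis=0))
print(num_candidate_permutations(stoichiometry))
```

In this example the rows fall into three partitions of sizes 1, 2, and 1, and the columns into a single partition of size 3, so 2! x 3! = 12 permutations remain instead of 4! x 3! = 144.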

Design

  • A Matrix is a two-dimensional numpy array.
  • A Network represents a CRN. It has a reactant PMatrix that represents the reactant stoichiometry matrix, and a product PMatrix that represents the product stoichiometry matrix. A Network has a hash that is calculated from the reactant and product PMatrix.
  • A NetworkCollection is a collection of Network objects. NetworkCollection provides a way to discover subsets that are structurally identical. The boolean attribute NetworkCollection.is_structurally_identical indicates whether all Network objects in the collection are structurally identical.
  • ClusterBuilder clusters structurally identical networks (or subnetworks) in a NetworkCollection.

Structure of classes

Each major class has the following methods:

  • copy produces a replica of the object that is a deep copy
  • __repr__ provides a human-readable representation of the object's content
  • __eq__ tests for exact equality
  • isEquivalent tests whether two objects are identical except for the object name
  • serialize creates a JSON string representation of the object and its subobjects
  • deserialize reconstitutes an object (and its subobjects) from a JSON string serialization
