Python Subnet Discovery for Systems Biology

DETECTING STRUCTURALLY IDENTICAL REACTION NETWORKS

Two chemical reaction networks (CRNs) are structurally identical if they have the same reactant stoichiometry matrix and the same product stoichiometry matrix, regardless of their rate laws. Because chemical species and reactions may be named differently in the two networks, testing for structural identity requires an element-wise comparison for every permutation of the rows and columns of the stoichiometry matrices. That is, if two networks are structurally identical, then their reactant stoichiometry matrices are permutably identical, and their product stoichiometry matrices are identical under the same permutations that make the reactant matrices equal. Clearly, this definition is unchanged if we exchange "reactant" and "product".
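
The definition above can be checked directly by brute force. The following is a minimal sketch of that naive test, not the package's implementation: the function names are hypothetical, and plain nested lists stand in for matrices so that the example has no dependencies.

```python
from itertools import permutations


def permute(matrix, row_perm, col_perm):
    """Reorder the rows and columns of a matrix given as nested lists."""
    return [[matrix[i][j] for j in col_perm] for i in row_perm]


def is_structurally_identical_bruteforce(reactant_a, product_a, reactant_b, product_b):
    """Return True if a single row/column permutation of network B matches
    network A in both the reactant and the product matrix."""
    num_species = len(reactant_a)
    num_reactions = len(reactant_a[0])
    if (len(reactant_b), len(reactant_b[0])) != (num_species, num_reactions):
        return False
    for row_perm in permutations(range(num_species)):
        for col_perm in permutations(range(num_reactions)):
            # The same permutation must work for reactants and products.
            if (permute(reactant_b, row_perm, col_perm) == reactant_a
                    and permute(product_b, row_perm, col_perm) == product_a):
                return True
    return False


# Network A: S1 -> S2, S2 -> S3 (rows S1, S2, S3; columns R1, R2).
reactant_a = [[1, 0], [0, 1], [0, 0]]
product_a = [[0, 0], [1, 0], [0, 1]]
# Network B: the same network with rows ordered (S3, S1, S2) and columns (R2, R1).
reactant_b = [[0, 0], [0, 1], [1, 0]]
product_b = [[1, 0], [0, 0], [0, 1]]

print(is_structurally_identical_bruteforce(reactant_a, product_a, reactant_b, product_b))  # True
```

The two nested loops visit every combination of a row permutation and a column permutation, which is the source of the factorial cost.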

Why do we use the above definition and not simply the stoichiometry matrix? This is best answered by showing an example. Consider the following two networks that consist of a single reaction:

// Network 1
S1 -> S1 + S2

// Network 2
S2 -> S2 + S2

These networks have the same stoichiometry matrix, with a 0 for S1 and a 1 for S2. However, their product stoichiometry matrices differ (S1 + S2 versus 2 S2), and no renaming of species makes both the reactant and the product stoichiometry matrices agree.
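
The two networks can be written out as matrices to make the difference concrete. This is an illustrative sketch only: the variable and function names are hypothetical, rows are the species S1 and S2, and the single column is the one reaction.

```python
# Network 1: S1 -> S1 + S2
reactant_1 = [[1], [0]]
product_1 = [[1], [1]]
# Network 2: S2 -> S2 + S2
reactant_2 = [[0], [1]]
product_2 = [[0], [2]]


def net_stoichiometry(reactant, product):
    """Stoichiometry matrix: product minus reactant, element by element."""
    return [[p - r for r, p in zip(r_row, p_row)]
            for r_row, p_row in zip(reactant, product)]


stoich_1 = net_stoichiometry(reactant_1, product_1)
stoich_2 = net_stoichiometry(reactant_2, product_2)
print(stoich_1 == stoich_2)  # True: both are [[0], [1]]

# Swapping the two species rows of Network 2 aligns the reactant matrices...
print(reactant_2[::-1] == reactant_1)  # True
# ...but the same swap does not align the product matrices.
print(product_2[::-1] == product_1)  # False
```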

Problem Addressed

The above approach to finding structurally identical CRNs has a huge computational complexity. Let $N$ be the number of rows (species) in a stoichiometry matrix and $M$ be the number of columns (reactions). Then, the computational complexity of a single pair-wise comparison of networks is $O(N!M!)$. If checking each permutation takes 1 microsecond, then: (i) $N=8=M$ takes about half an hour; (ii) $N=10=M$ takes about five months; and (iii) $N=20=M$ takes far longer than the current age of the Universe (about 14 billion years). In systems biology, $N=20=M$ is a modest size CRN.
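
These estimates follow from simple arithmetic. The sketch below assumes, as in the text, that checking one permutation takes 1 microsecond; the function name and constants are hypothetical.

```python
import math

MICROSECONDS_PER_SECOND = 1_000_000
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def brute_force_seconds(num_species, num_reactions, microseconds_per_check=1.0):
    """Time to check all N! * M! row/column permutations."""
    num_permutations = math.factorial(num_species) * math.factorial(num_reactions)
    return num_permutations * microseconds_per_check / MICROSECONDS_PER_SECOND


for size in (8, 10, 20):
    seconds = brute_force_seconds(size, size)
    print(f"N = M = {size}: {seconds:.3g} s, "
          f"{seconds / SECONDS_PER_DAY:.3g} days, "
          f"{seconds / SECONDS_PER_YEAR:.3g} years")
```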

Technical Approach

This project implements the DSIRN Algorithm, an efficient algorithm for detecting structurally identical CRNs. The key insight of the algorithm is that most permutations never need to be considered. This is achieved by finding an order independent encoding (OIE) of the rows and columns of the stoichiometry matrix so that rows (columns) are only compared if they have the same OIE. The stoichiometry matrix of many CRNs is dominated by -1, 0, 1 because of the wide prevalence of unit stoichiometries. So, for each row (column), we use an OIE that consists of three counts:

  • number of elements < 0
  • number of elements = 0
  • number of elements > 0

By so doing, we partition the rows (columns) so that we only need to consider the permutations within each partition. Let $N_P$ be the number of distinct OIEs among the rows of two structurally identical matrices, and let $M_P$ be the same for the columns. Suppose that each partition contains the same number of elements: $\frac{N}{N_P}$ for species and $\frac{M}{M_P}$ for reactions. Then, the complexity when using partitions is $O\left( \left(\left(\frac{N}{N_P}\right)!\right)^{N_P} \left(\left(\frac{M}{M_P}\right)!\right)^{M_P} \right)$. When $N_P = 1 = M_P$, we get $O(N!M!)$. When $N_P = N$ and $M_P = M$, we get $O(1)$.
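
The effect of partitioning can be seen on a small example. The following is a sketch under stated assumptions, not the package's code: the function names are hypothetical, and the example network (S1 -> S2, S2 -> S3, S3 -> S4) is made up for illustration. Partitions need not be of equal size here, so the count is the product of the factorials of the partition sizes.

```python
import math
from collections import Counter


def oie(vector):
    """Order independent encoding: counts of negative, zero, and positive entries."""
    negatives = sum(1 for value in vector if value < 0)
    zeros = sum(1 for value in vector if value == 0)
    positives = sum(1 for value in vector if value > 0)
    return (negatives, zeros, positives)


def num_partitioned_permutations(vectors):
    """Number of permutations that keep every vector inside its OIE partition."""
    partition_sizes = Counter(oie(vector) for vector in vectors)
    return math.prod(math.factorial(size) for size in partition_sizes.values())


# Stoichiometry matrix for S1 -> S2, S2 -> S3, S3 -> S4 (rows S1..S4, columns R1..R3).
rows = [
    [-1, 0, 0],
    [1, -1, 0],
    [0, 1, -1],
    [0, 0, 1],
]
columns = [list(column) for column in zip(*rows)]

naive = math.factorial(len(rows)) * math.factorial(len(columns))
partitioned = num_partitioned_permutations(rows) * num_partitioned_permutations(columns)
print(naive, partitioned)  # 144 12
```

In this example only S2 and S3 share an OIE, so the row permutations drop from 24 to 2, while all three reactions share one OIE and the column permutations stay at 6.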

Design

  • A Matrix is a two-dimensional numpy array.
  • A Network represents a CRN. It has a reactant PMatrix that represents the reactant stoichiometry matrix, and a product PMatrix that represents the product stoichiometry matrix. A Network has a hash that is calculated from the reactant and product PMatrix.
  • A NetworkCollection is a collection of Network objects. NetworkCollection provides a way to discover subsets that are structurally identical. The boolean attribute NetworkCollection.is_structurally_identical indicates whether all Network objects in the collection are structurally identical.
  • ClusterBuilder clusters identical (or subnetwork) networks in a NetworkCollection.

Structure of classes

Each major class has the following methods:

  • copy produces a replica of the object that is a deep copy
  • __repr__ provides a human-readable representation of the object's content
  • __eq__ tests for exact equality
  • isEquivalent tests for identity except for the object name
  • serialize creates a JSON string representation of the object and its subobjects
  • deserialize reconstitutes an object (and its subobjects) from a JSON string serialization
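
The method conventions above can be illustrated with a toy class. This is only a sketch: TinyNetwork is a hypothetical stand-in and not a class from the package, its attributes are invented, and making deserialize a class method is an assumption about how such an interface might look.

```python
import copy
import json


class TinyNetwork:
    """Hypothetical stand-in used only to illustrate the method conventions."""

    def __init__(self, name, reactant_rows, product_rows):
        self.name = name
        self.reactant_rows = [list(row) for row in reactant_rows]
        self.product_rows = [list(row) for row in product_rows]

    def copy(self):
        # A deep copy shares no mutable state with the original.
        return copy.deepcopy(self)

    def __repr__(self):
        return f"TinyNetwork({self.name!r}, {self.reactant_rows}, {self.product_rows})"

    def isEquivalent(self, other):
        # Same content; the object name is ignored.
        return (self.reactant_rows == other.reactant_rows
                and self.product_rows == other.product_rows)

    def __eq__(self, other):
        # Exact equality: same content and same name.
        if not isinstance(other, TinyNetwork):
            return NotImplemented
        return self.name == other.name and self.isEquivalent(other)

    def serialize(self):
        return json.dumps({
            "name": self.name,
            "reactant_rows": self.reactant_rows,
            "product_rows": self.product_rows,
        })

    @classmethod
    def deserialize(cls, serialization):
        data = json.loads(serialization)
        return cls(data["name"], data["reactant_rows"], data["product_rows"])


network = TinyNetwork("net1", [[1], [0]], [[1], [1]])
restored = TinyNetwork.deserialize(network.serialize())
renamed = network.copy()
renamed.name = "net2"
print(restored == network, renamed == network, renamed.isEquivalent(network))  # True False True
```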
