


BpForms: unambiguous representation of modified DNA, RNA, and proteins

BpForms is a set of tools for unambiguously representing the structures of modified forms of biopolymers such as DNA, RNA, and protein.

  • The BpForms notation can unambiguously represent the structure of modified forms of biopolymers. For example, the following represents a modified DNA molecule that contains a deoxyinosine monomer at the fourth position:

        ACG[id: "dI" | structure: InChI=1S/C10H12N4O4/c15-2-6-5(16)1-7(18-6)14-4-13-8-9(14)11-3-12-10(8)17/h3-7,15-16H,1-2H2,(H,11,12,17)/t5-,6+,7+/m0/s1]T

  • This concrete representation of modified biopolymers enables the BpForms software tools to calculate the chemical formulae, molecular weights, and charges of biopolymers, as well as to automatically calculate the major protonation and tautomerization state of biopolymers at specific pHs.
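To make the link between the notation and the calculated properties concrete, here is a minimal, self-contained sketch in plain Python. It does not use the BpForms package or its grammar: the helper names (`split_monomers`, `inchi_formula`, `molecular_weight`) are hypothetical, the tokenizer is a simplification that assumes no `]` appears inside a bracketed block, and the example string is the one above written without whitespace between the InChI layers. The weight it reports is that of the standalone structure given in the InChI; it makes no attempt to account for atoms gained or lost when monomers are linked into a polymer.

```python
import re

# Standard atomic weights (g/mol), rounded; only the elements needed here.
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def split_monomers(form):
    """Split a BpForms-like string into monomer tokens.

    A token is either a single letter or a whole bracketed block.
    Simplification: assumes no ']' occurs inside a bracketed block.
    """
    return re.findall(r"\[[^\]]*\]|[A-Za-z]", form)


def inchi_formula(inchi):
    """Return the formula layer (second '/'-separated field) of an InChI."""
    return inchi.split("/")[1].strip()


def molecular_weight(formula):
    """Sum atomic weights for a simple formula such as 'C10H12N4O4'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHTS[element] * (int(count) if count else 1)
    return total


form = ('ACG[id: "dI" | structure: InChI=1S/C10H12N4O4'
        '/c15-2-6-5(16)1-7(18-6)14-4-13-8-9(14)11-3-12-10(8)17'
        '/h3-7,15-16H,1-2H2,(H,11,12,17)/t5-,6+,7+/m0/s1]T')

tokens = split_monomers(form)

# 1-based positions of the monomers written as bracketed blocks.
modified_positions = [i + 1 for i, t in enumerate(tokens) if t.startswith("[")]

# Pull the InChI out of the first modified monomer and read its formula layer.
block = tokens[modified_positions[0] - 1]
structure = block.split("structure:")[1].rstrip("]").strip()
formula = inchi_formula(structure)
weight = molecular_weight(formula)

print(modified_positions)  # [4]
print(formula)             # C10H12N4O4
print(f"{weight:.2f}")     # 252.23
```

Because the structure is carried inside the notation itself, the formula and weight follow from the string alone, with no lookup in an external modification database.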

BpForms encompasses five tools:

BpForms was motivated by the need to concretely represent the biochemistry of DNA modification, DNA repair, post-transcriptional processing, and post-translational processing in whole-cell computational models. In addition, BpForms is a valuable tool for experimental proteomics. In particular, we developed BpForms because there were no notations, schemas, data models, or file formats for concretely representing modified forms of biopolymers, despite the existence of several databases and ontologies of DNA, RNA, and protein modifications and the ProForma Proteoform Notation.

The BpForms syntax was inspired by the ProForma Proteoform Notation. BpForms improves upon this syntax in several ways:

  • BpForms separates the representation of modified biopolymers from the chemical processes which generate them.

  • BpForms clarifies the representation of multiply modified monomers. This is necessary to represent the combinatorial complexity of modified DNA, RNA, and proteins.

  • BpForms can be customized to represent any modification and, therefore, is not limited to previously enumerated modifications. This is also necessary to represent the combinatorial complexity of modified DNA, RNA, and proteins.

  • BpForms supports two additional types of uncertainty in the structures of biopolymers: uncertainty in the position of a modified nucleotide/amino acid within the polymer sequence, and uncertainty in the chemical identity of a modified nucleotide/amino acid, expressed as a deviation from its expected mass or charge.

  • BpForms has a concrete grammar. This enables error checking, as well as the calculation of chemical formulae, masses, and charges, which is essential for modeling.
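As an illustration of the kind of error checking a concrete grammar makes possible, the sketch below validates a BpForms-like string with a few hand-written rules. It is not the BpForms grammar or parser: `check_form` is a hypothetical helper, the set of known attributes contains only the two names that appear in the example in this description (`id` and `structure`), and splitting attributes on `|` assumes that character never occurs inside a value.

```python
import re

# Attribute names taken from the example in this description. The real
# grammar defines more; this short set is an illustrative assumption.
KNOWN_ATTRIBUTES = {"id", "structure"}


def check_form(form):
    """Return a list of error messages for a BpForms-like string.

    An empty list means no problem was found by these simplified rules.
    """
    errors = []

    def check_block(match):
        # Each bracketed block is a '|'-separated list of 'name: value' pairs.
        for attribute in match.group(1).split("|"):
            name, sep, value = attribute.partition(":")
            name = name.strip()
            if not sep or not value.strip():
                errors.append(f"attribute {name!r} has no value")
            elif name not in KNOWN_ATTRIBUTES:
                errors.append(f"unknown attribute {name!r}")
        return ""  # drop the block so only plain monomer codes remain

    rest = re.sub(r"\[([^\[\]]*)\]", check_block, form)

    # Whatever remains should be single-letter monomer codes or whitespace.
    for ch in rest:
        if ch in "[]":
            errors.append("unbalanced bracket")
            break
        if not (ch.isalpha() or ch.isspace()):
            errors.append(f"unexpected character {ch!r}")
    return errors


print(check_form('ACG[id: "dI"]T'))      # []
print(check_form('ACG[id: "dI"T'))       # ['unbalanced bracket']
print(check_form('AC[colour: "red"]G'))  # ["unknown attribute 'colour'"]
```

A notation without a defined grammar would leave each of these cases open to interpretation; with one, malformed input can be rejected before any formula or mass is calculated.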

Installation

  1. Install the third-party dependencies listed below. Detailed installation instructions are available in An Introduction to Whole-Cell Modeling.

  2. To use Marvin to calculate major protonation and tautomerization states, set JAVA_HOME to the path to your Java virtual machine (JVM):

        export JAVA_HOME=/usr/lib/jvm/default-java

  3. To use Marvin to calculate major protonation and tautomerization states, add Marvin to the Java class path:

        export CLASSPATH=$CLASSPATH:/opt/chemaxon/marvinsuite/lib/MarvinBeans.jar

  4. Install this package

    • Install the latest release from PyPI. For most environments, the --process-dependency-links option is needed to install some of the dependencies from GitHub:

          pip install --process-dependency-links bpforms[all]

    • Install the latest revision from GitHub. For most environments, the --process-dependency-links option is needed to install some of the dependencies from GitHub:

          pip install --process-dependency-links git+git://github.com/KarrLab/bpforms#egg=bpforms[all]
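After working through the steps above, a short script can confirm that the environment looks as expected. The following is a sketch using only the Python standard library; `check_setup` is a hypothetical helper, not part of BpForms, and the JAVA_HOME and CLASSPATH checks only matter if you intend to use Marvin.

```python
import importlib.util
import os


def check_setup(environ=os.environ):
    """Return a list of human-readable problems with the local setup."""
    problems = []

    # Step 2: JAVA_HOME should point at a JVM directory (needed for Marvin).
    java_home = environ.get("JAVA_HOME", "")
    if not java_home:
        problems.append("JAVA_HOME is not set")
    elif not os.path.isdir(java_home):
        problems.append(f"JAVA_HOME is not a directory: {java_home}")

    # Step 3: MarvinBeans.jar should be on the Java class path.
    classpath = environ.get("CLASSPATH", "")
    jars = [p for p in classpath.split(os.pathsep)
            if p.endswith("MarvinBeans.jar")]
    if not jars:
        problems.append("MarvinBeans.jar is not on CLASSPATH")
    elif not any(os.path.isfile(p) for p in jars):
        problems.append("MarvinBeans.jar is on CLASSPATH but does not exist")

    # Step 4: the package itself should be importable.
    if importlib.util.find_spec("bpforms") is None:
        problems.append("the bpforms package is not importable")

    return problems


for problem in check_setup():
    print("WARNING:", problem)
```

The function takes the environment as a parameter so that it can be exercised with a plain dictionary, for example `check_setup({})`, without changing the real environment.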

Examples, tutorial, and documentation

Please see the documentation. An interactive tutorial is also available in the whole-cell modeling sandbox.

License

The package is released under the MIT license.

Development team

This package was developed by the Karr Lab at the Icahn School of Medicine at Mount Sinai in New York, USA.

Questions and comments

Please contact the Karr Lab with any questions or comments.

