Tools to parse the BRENDA database

A Python parser for the BRENDA database

This project provides Python classes and functions to parse the text file containing the entire BRENDA enzyme database (https://www.brenda-enzymes.org).

Due to BRENDA's license, the parser cannot download the database directly. Instead, the user must download the database as a text file from the BRENDA website after accepting the usage conditions.
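
Because the file has to be downloaded by hand, it can help to confirm it is in place before handing it to the parser. The following is a minimal sketch using only the standard library; the helper name check_brenda_file is hypothetical and is not part of brendapyrser.

```python
from pathlib import Path


def check_brenda_file(path):
    """Return `path` as a Path if it points to a non-empty file.

    Hypothetical helper (not part of brendapyrser): it only checks that
    the manually downloaded text file exists and is not empty.
    """
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(
            f"BRENDA text file not found at '{p}'. Download it from the "
            "BRENDA website after accepting the usage conditions."
        )
    if p.stat().st_size == 0:
        raise ValueError(f"BRENDA text file at '{p}' is empty.")
    return p


# Intended use (left as a comment because it needs the downloaded file):
# dataFile = check_brenda_file('data/brenda_download.txt')
```

The returned path can then be passed on to the parser in the usual way.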

This is an ongoing project!

Installation

  1. pip install brendapyrser

or

  1. Clone the project to a local directory with Git.

  2. In a terminal, navigate to that directory and run: python setup.py install

import numpy as np
from matplotlib import pyplot as plt
from brendapyrser import BRENDA

dataFile = 'data/brenda_download.txt'

1. Parsing BRENDA

# Let's load the database
brenda = BRENDA(dataFile)
brenda
Number of Enzymes: 7609
BRENDA copyright: Copyrighted by Dietmar Schomburg, Techn. University Braunschweig, GERMANY. Distributed under the License as stated at http:/www.brenda-enzymes.org
Parser version: 0.0.1
Author: Semidán Robaina Estévez, 2020
# Plot all Km values in the database
BRENDA_KMs = np.array([v for r in brenda.reactions 
                       for v in r.KMvalues.get_values()])
values = BRENDA_KMs[(BRENDA_KMs < 1000) & (BRENDA_KMs >= 0)]
plt.hist(values)
plt.title(f'Median KM value: {np.median(values)}')
plt.xlabel('KM (mM)')
plt.show()
print(f'Minimum and maximum values in database: {values.min()} mM, {values.max()} mM')

[Figure: histogram of KM values (mM) in the database]

Minimum and maximum values in database: 0.0 mM, 997.0 mM
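
Since the retained KM values range from 0.0 to 997.0 mM, a linear histogram can pack most entries into its first bins. One way to look at the spread is to count values per power of ten. This is a minimal standard-library sketch; the function name count_per_decade and the example values are made up for illustration and do not come from BRENDA.

```python
import math
from collections import Counter


def count_per_decade(values):
    """Count strictly positive values per power-of-ten bin.

    A key k in the result means 10**k <= value < 10**(k + 1).
    Zeros and negative entries are skipped because log10 is undefined there.
    """
    counts = Counter()
    for v in values:
        if v > 0:
            counts[math.floor(math.log10(v))] += 1
    return dict(sorted(counts.items()))


# Made-up KM values in mM, only to show the shape of the result
example_kms = [0.0, 0.002, 0.05, 0.08, 0.3, 1.5, 20.0, 997.0]
print(count_per_decade(example_kms))
```

With real data, the list returned by get_values() could be passed in place of example_kms.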
# Plot all Kcat values in the database
BRENDA_Kcats = np.array([v for r in brenda.reactions 
                       for v in r.Kcatvalues.get_values()])
values = BRENDA_Kcats[(BRENDA_Kcats < 1000) & (BRENDA_Kcats >= 0)]
plt.hist(values)
plt.title(f'Median Kcat value: {np.median(values)}')
plt.xlabel('Kcat (1/s)')
plt.show()
print(f'Minimum and maximum values in database: {values.min()} 1/s, {values.max()} 1/s')

[Figure: histogram of Kcat values (1/s) in the database]

Minimum and maximum values in database: 5.83e-10 1/s, 997.0 1/s
# Plot all enzyme optimal temperature values in the database
BRENDA_TO = np.array([v for r in brenda.reactions 
                       for v in r.temperature.filter_by_condition(
                           'optimum').get_values()])
values = BRENDA_TO[(BRENDA_TO >= 0)]
plt.hist(values)
plt.title(f'Median Optimum Temperature: {np.median(values)}')
plt.xlabel('TO (${}^oC$)')
plt.show()
print(f'Minimum and maximum values in database: {values.min()} °C, {values.max()} °C')

[Figure: histogram of optimum temperatures (°C) in the database]

Minimum and maximum values in database: 0.0 °C, 125.0 °C

We see that the median optimal temperature for all enzymes in the BRENDA database is 37 °C! That's interesting... perhaps all organisms have agreed to prefer that temperature over others... or, more likely, the BRENDA database is biased towards mammals and microorganisms that live within mammals, such as human pathogens.

Let's filter the results for a particular group of organisms. Let's try a hyperthermophilic bacterial genus, Thermotoga.

# Plot all enzyme optimal temperature values for the selected genus
species = 'Thermotoga'
BRENDA_TO = np.array([v for r in brenda.reactions.filter_by_organism(species)
                       for v in r.temperature.filter_by_condition('optimum').filter_by_organism(species).get_values()])
values = BRENDA_TO[(BRENDA_TO >= 0)]
plt.hist(values)
plt.title(f'Median Optimum Temperature: {np.median(values)}')
plt.xlabel('TO (${}^oC$)')
plt.show()
print(f'Minimum and maximum values in database: {values.min()} °C, {values.max()} °C')

[Figure: histogram of optimum temperatures (°C) for Thermotoga]

Minimum and maximum values in database: 20.0 °C, 105.0 °C

We can see that the median optimal temperature among all enzymes in the genus, 80 °C, is much higher than for the entire database. That's consistent with the fact that Thermotoga are hyperthermophilic... alright!
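
When filtering by organism, the resulting list of values may be empty, for instance if the name is misspelled. In that case np.median returns nan and values.min() raises a ValueError. A guarded summary avoids both. This is a minimal sketch using only the standard library; the helper name summarize is hypothetical, and dropping negative entries simply mirrors the `>= 0` filter used in the examples above.

```python
import statistics


def summarize(values):
    """Return count, median, min and max of the non-negative values.

    Returns None when nothing is left after filtering, so the caller can
    decide what to report instead of hitting an error on an empty list.
    """
    kept = [v for v in values if v >= 0]
    if not kept:
        return None
    return {
        "n": len(kept),
        "median": statistics.median(kept),
        "min": min(kept),
        "max": max(kept),
    }


# Made-up optimum temperatures in °C, for illustration only
print(summarize([20.0, 80.0, 105.0, -999.0]))
print(summarize([]))
```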

2. Extracting data for Pyruvate kinase

# We can retrieve an enzyme entry by its EC number like this
r = brenda.reactions.get_by_id('2.7.1.40')
r
Enzyme identifier: 2.7.1.40
Name: Pyruvate kinase
Systematic name: ATP:pyruvate 2-O-phosphotransferase
Reaction type: Phospho group transfer
Reaction: ATP + pyruvate <=> ADP + phosphoenolpyruvate
# Here are all the KM values for phosphoenolpyruvate associated with this enzyme class
compound = 'phosphoenolpyruvate'
kms = r.KMvalues.filter_by_compound(compound).get_values()
plt.hist(kms)
plt.xlabel('KM (mM)')
plt.title(f'{r.name} ({compound})')
plt.show()

[Figure: histogram of KM values (mM) for pyruvate kinase with phosphoenolpyruvate]

# And further filtered by organism
r.KMvalues.filter_by_organism('Bos taurus').filter_by_compound('phosphoenolpyruvate').get_values()
[0.051500000000000004, 0.18]
# Here are all the Kcat values for phosphoenolpyruvate associated with this enzyme class
compound = 'phosphoenolpyruvate'
kcats = r.Kcatvalues.filter_by_compound(compound).get_values()
plt.hist(kcats)
plt.xlabel('Kcat ($s^{-1}$)')
plt.title(f'{r.name} ({compound})')
plt.show()

[Figure: histogram of Kcat values (1/s) for pyruvate kinase with phosphoenolpyruvate]

3. Finding all KM values for a given substrate and organism

Next, we will retrieve the KM values associated with a particular substrate for all enzymes in a given species. Will the KM values cluster in a narrow concentration range or spread over a wide one? Since the cytoplasmic concentration of a substrate is the same for every enzyme that uses it, it is plausible that all cytoplasmic enzymes utilizing that substrate have similar KM values. Let's test this idea with Escherichia coli and some common substrates participating in central carbon metabolism.

species, compound = 'Escherichia coli', 'NADH'
KMs = np.array([v for r in brenda.reactions.filter_by_organism(species)
                for v in r.KMvalues.filter_by_compound(compound).filter_by_organism(species).get_values()])

if len(KMs) > 0:
    plt.hist(KMs)
    plt.xlabel('KM (mM)')
    plt.title(f'{species} KMs ({compound}), median = {np.median((KMs))}')
    plt.show()
else:
    print('No KM values for compound')

[Figure: histogram of KM values (mM) for NADH in Escherichia coli]

That's interesting! Typical NADH concentrations in Escherichia coli are low; BioNumbers, for example, reports a value of 0.083 mM. As the plot above shows, the median KM value for NADH among all enzymes binding it is lower still. Hence, it looks like most enzymes are (nearly) saturated with NADH, so their fluxes are largely independent of NADH concentration.
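
The saturation argument can be made concrete with the Michaelis-Menten rate law, where the fraction of the maximal rate is v/Vmax = [S] / (KM + [S]). The sketch below evaluates it at the NADH concentration of 0.083 mM quoted above. The KM values are hypothetical and chosen only to show the trend; they are not taken from BRENDA.

```python
def saturation(substrate_mM, km_mM):
    """Fraction of the maximal rate under Michaelis-Menten kinetics.

    v / Vmax = [S] / (KM + [S]), with both quantities in the same unit (mM).
    """
    return substrate_mM / (km_mM + substrate_mM)


nadh_mM = 0.083  # NADH concentration in E. coli quoted in the text

# Hypothetical KM values (mM): the lower the KM, the closer to saturation
for km in (0.005, 0.01, 0.083, 0.5):
    print(f"KM = {km} mM -> v/Vmax = {saturation(nadh_mM, km):.2f}")
```

When KM equals the substrate concentration the enzyme runs at half its maximal rate, and the fraction approaches 1 as KM falls below the concentration. In that regime, changes in concentration have little effect on the rate.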

4. Filtering reactions by specific compound

We can also filter reactions in BRENDA by a specific compound, acting as a substrate, as a product, or as either of the two. To exemplify this feature, let's filter the reactions that contain phosphoenolpyruvate as a substrate, and then those that contain it as either a substrate or a product.

substrate_rxns = brenda.reactions.filter_by_substrate("phosphoenolpyruvate")
substrate_rxns[2]
Enzyme identifier: 2.5.1.19
Name: 3-phosphoshikimate 1-carboxyvinyltransferase
Systematic name: phosphoenolpyruvate:3-phosphoshikimate 5-O-(1-carboxyvinyl)-transferase
Reaction type: Enolpyruvate group transfer (#3,52,55# induced-fit mechanism, formation
Reaction: phosphoenolpyruvate + 3-phosphoshikimate <=> phosphate +5-O-
compound_rxns = brenda.reactions.filter_by_compound("phosphoenolpyruvate")
compound_rxns[7]
Enzyme identifier: 2.5.1.7
Name: Udp-n-acetylglucosamine 1-carboxyvinyltransferase
Systematic name: phosphoenolpyruvate:UDP-N-acetyl-D-glucosamine
Reaction type: Carboxyvinyl group transfer
Reaction: phosphoenolpyruvate + UDP-N-acetyl-alpha-D-glucosamine <=> phosphate +UDP-N-acetyl-3-O-
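
To illustrate the difference between the substrate, product and compound roles, here is a minimal sketch that works on a plain reaction string written in the "A + B <=> C + D" form shown in the outputs above. It is not how brendapyrser implements its filters; the function names split_reaction and has_compound are hypothetical.

```python
def split_reaction(equation):
    """Split 'A + B <=> C + D' into (substrates, products) name lists."""
    left, right = equation.split("<=>")
    substrates = [name.strip() for name in left.split(" + ")]
    products = [name.strip() for name in right.split(" + ")]
    return substrates, products


def has_compound(equation, compound, role="either"):
    """Check whether `compound` takes part in the reaction in the given role.

    Names are compared exactly, so 'pyruvate' does not match
    'phosphoenolpyruvate'.
    """
    substrates, products = split_reaction(equation)
    if role == "substrate":
        return compound in substrates
    if role == "product":
        return compound in products
    if role == "either":
        return compound in substrates or compound in products
    raise ValueError("role must be 'substrate', 'product' or 'either'")


pyruvate_kinase = "ATP + pyruvate <=> ADP + phosphoenolpyruvate"
print(has_compound(pyruvate_kinase, "phosphoenolpyruvate", role="substrate"))
print(has_compound(pyruvate_kinase, "phosphoenolpyruvate", role="either"))
```

In the reaction as written for pyruvate kinase, phosphoenolpyruvate appears on the product side. A substrate-only check therefore misses it, while a check on either side finds it.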
