
Extract, retrieve, and predict kcat values for a metabolic model to run enzyme-constrained metabolic pipelines.


WILDkCAT


WILDkCAT is a set of scripts designed to extract, retrieve, and predict enzyme turnover numbers (kcat) for genome-scale metabolic models.


WILDkCAT produces a .tsv file with the retrieved and predicted kcat values for each enzyme-substrate combination in your genome-scale metabolic model. Each stage of the workflow (extraction, retrieval, prediction) also generates an HTML report automatically. The report details how the values were obtained, to support transparency and reproducibility, and can be opened directly in any web browser.
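As a rough illustration of consuming the output, the sketch below loads a tab-separated file into a list of dictionaries using only the standard library. The function name `load_kcat_table` is hypothetical, and no column names are assumed; the keys come from whatever header row the file contains.

```python
import csv
from pathlib import Path


def load_kcat_table(path):
    """Read a tab-separated file and return one dict per data row.

    Keys are taken from the header row, so nothing is assumed about
    the column names WILDkCAT actually writes.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))
```

All values are returned as strings; convert the kcat column to `float` once you know its name in your output file.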

WILDkCAT Report Demo

Access the report here

Installation

Install WILDkCAT directly from PyPI:

pip install wildkcat

Environment Setup

Provide your BRENDA login credentials and Entrez API email address to query the BRENDA enzyme database and the NCBI database.

Create a file named .env in the root of your project with the following content:

ENTREZ_EMAIL=your_registered_email@example.com
BRENDA_EMAIL=your_registered_email@example.com
BRENDA_PASSWORD=your_password

[!IMPORTANT]

  • Replace the placeholders with the credentials from the account you created on the BRENDA website.
  • Ensure this file is not shared publicly (e.g., add .env to your .gitignore) since it contains sensitive information.
  • The scripts will automatically read these environment variables to authenticate and retrieve kcat values.
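WILDkCAT reads these variables itself, so no extra code is required. If you want to confirm that the file is complete before starting a long run, a minimal check along these lines may help. It is a sketch, not part of WILDkCAT: the helper names `read_env_file` and `missing_credentials` are hypothetical, and the parser only handles plain KEY=VALUE lines.

```python
from pathlib import Path

# The three variables listed in the .env example above.
REQUIRED_KEYS = ("ENTREZ_EMAIL", "BRENDA_EMAIL", "BRENDA_PASSWORD")


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_credentials(values):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

Calling `missing_credentials(read_env_file())` from the project root returns an empty list when all three variables are set.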

Usage

WILDkCAT can be used from Python scripts or via the CLI.

Command-Line Interface (CLI)

After installation, you can use the WILDkCAT CLI:

wildkcat --help

Example Workflow:

# Extract kcat data
wildkcat extraction \
    path/to/my_model.json \
    path/to/folder_output

# Retrieve kcat values from databases
# Arguments: output folder, organism name, temperature range (20 30), pH range (6.5 8.5)
wildkcat retrieval \
    path/to/folder_output \
    'Organism name' \
    20 30 \
    6.5 8.5

# Generate input for CataPro (9 is the limit penalty score)
wildkcat prediction-part1 \
    path/to/folder_output \
    9

# Integrate CataPro prediction (9 is the limit penalty score)
wildkcat prediction-part2 \
    path/to/folder_output \
    prediction_output.csv \
    9

# Generate summary report
wildkcat report \
    path/to/my_model.json \
    path/to/folder_output
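
The same workflow can be driven from Python by calling the CLI through `subprocess`. The sketch below assumes only the positional argument order shown in the example workflow; the helper names `build_commands` and `run_step` are hypothetical and not part of WILDkCAT.

```python
import subprocess


def build_commands(model, out_dir, organism, temp_range, ph_range,
                   penalty_limit, catapro_csv):
    """Return the five CLI invocations as argument lists, in workflow order."""
    t_min, t_max = temp_range
    ph_min, ph_max = ph_range
    return [
        ["wildkcat", "extraction", model, out_dir],
        ["wildkcat", "retrieval", out_dir, organism,
         str(t_min), str(t_max), str(ph_min), str(ph_max)],
        ["wildkcat", "prediction-part1", out_dir, str(penalty_limit)],
        # CataPro has to be run separately on the part-1 output to
        # produce catapro_csv before this step can succeed.
        ["wildkcat", "prediction-part2", out_dir, catapro_csv,
         str(penalty_limit)],
        ["wildkcat", "report", model, out_dir],
    ]


def run_step(command):
    """Run one CLI step and raise CalledProcessError if it fails."""
    subprocess.run(command, check=True)
```

Passing arguments as a list avoids shell quoting problems with organism names that contain spaces.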

[!WARNING]
The SABIO-RK database often experiences server overload, and queries can be very slow, especially for large models. In these cases, it is recommended to use only the 'brenda' database in the retrieval command.


Programmatic Access

from wildkcat import run_extraction, run_retrieval, run_prediction_part1, run_prediction_part2, generate_summary_report

Example: E. coli Core Model

A ready-to-run example is available here. It demonstrates a full extraction, retrieval, and prediction workflow on the E. coli core model.


Key scripts

extract_kcat.py

  • Identifies all enzyme-reaction combinations in the model.
  • Verifies that EC numbers are valid (not incomplete, and not transferred according to KEGG).
  • If multiple enzymes are provided, searches UniProt for catalytic activity.
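
To illustrate the completeness part of the EC check, the sketch below tests whether an EC number has all four numeric fields. It is an assumption about what "incomplete" means here, not the package's own code; the name `is_complete_ec` is hypothetical, and detecting transferred entries would additionally require a KEGG lookup that is not shown.

```python
import re

# Four dot-separated integer fields, e.g. "1.1.1.1".
_COMPLETE_EC = re.compile(r"\d+\.\d+\.\d+\.\d+")


def is_complete_ec(ec_number):
    """Return True if all four EC fields are numeric.

    Partially specified numbers such as "1.1.1.-" return False.
    Preliminary entries with an "n" serial (e.g. "1.1.1.n1") are also
    treated as incomplete in this sketch.
    """
    return _COMPLETE_EC.fullmatch(ec_number.strip()) is not None
```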

retrieve_kcat.py

  • If the same enzyme is not found, computes identity percentages relative to the identified catalytic enzyme.
  • Applies Arrhenius correction to values within the appropriate pH range.
  • For rows with multiple scores, selects:
    • The best score
    • The highest identity percentage
    • The closest organism (if sequence is not available)
    • The highest kcat value
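
The last two steps can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: the activation energy default, the row field names (`penalty`, `identity`, `organism_distance`, `kcat`), and the convention that a lower penalty score is better are all hypothetical.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K)


def arrhenius_correct(kcat, temp_measured_c, temp_target_c,
                      activation_energy=50_000.0):
    """Rescale kcat from the measured to the target temperature.

    Uses k(T2) = k(T1) * exp(Ea/R * (1/T1 - 1/T2)) with temperatures
    in kelvin. The default Ea (J/mol) is a placeholder assumption.
    """
    t_meas = temp_measured_c + 273.15
    t_target = temp_target_c + 273.15
    exponent = activation_energy / GAS_CONSTANT * (1.0 / t_meas - 1.0 / t_target)
    return kcat * math.exp(exponent)


def select_best(candidates):
    """Pick one row using the tie-break order listed above.

    Order: lowest penalty, then highest identity, then smallest
    organism distance, then highest kcat. Missing identity counts as
    0 and missing organism distance as infinity.
    """
    def sort_key(row):
        identity = row.get("identity") or 0.0
        distance = row.get("organism_distance", math.inf)
        return (row["penalty"], -identity, distance, -row["kcat"])

    return min(candidates, key=sort_key)
```

Encoding the priorities as a single tuple key keeps the tie-break order explicit and easy to change.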

predict_kcat.py

  • Predicts kcat values that could not be retrieved from experimental databases, using machine learning (CataPro).

Refer to the documentation for a more detailed explanation.


Feedback & Improvements

Contributions, suggestions, and feedback are very welcome! If you encounter any issues, have ideas for new features, or notice room for improvement, feel free to open an issue or submit a pull request.
