cazy-parser

A way to extract specific information from the Carbohydrate-Active enZYmes (CAZy) database.

Make sure to visit and cite the CAZy website!

  • http://www.cazy.org/
  • Lombard V, Golaconda Ramulu H, Drula E, Coutinho PM, Henrissat B (2014) The Carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res 42:D490–D495. [PMID: 24270786].

License: GNU GPLv3

RV Honorato. CAZy-parser a way to extract information from the Carbohydrate-Active enZYmes Database. The Journal of Open Source Software, 1(8), Dec 2016. doi:10.21105/joss.00053

Introduction

cazy-parser is a tool that extracts information from CAZy in a more usable and readable format. First, a script reads the HTML structure and creates a mirror of the database as a tab-delimited file. Second, information is extracted from that mirror according to user-supplied parameters and presented to the user as a set of accession codes.
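
The second step can be pictured as a filter over a tab-delimited table. The sketch below is an illustration only, not cazy-parser's own code: the column names (`family`, `subfamily`, `accession`), the helper `select_accessions`, and the sample rows (including the accession codes) are all made up for the example.

```python
import csv
import io

# Hypothetical mirror contents: column names and accession codes are invented
# for illustration and do not come from cazy-parser or CAZy.
SAMPLE_MIRROR = (
    "family\tsubfamily\taccession\n"
    "GH43\t1\tABC00001.1\n"
    "GH43\t2\tABC00002.1\n"
    "GH5\t1\tABC00003.1\n"
)


def select_accessions(handle, family, subfamily=None):
    """Return accession codes of rows matching family (and subfamily, if given)."""
    reader = csv.DictReader(handle, delimiter="\t")
    selected = []
    for row in reader:
        if row["family"] != family:
            continue
        if subfamily is not None and row["subfamily"] != str(subfamily):
            continue
        selected.append(row["accession"])
    return selected


accessions = select_accessions(io.StringIO(SAMPLE_MIRROR), "GH43", subfamily=1)
print(accessions)
```

Leaving `subfamily` unset keeps every row of the family, which mirrors how the `-s` option is optional on the command line.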

Install / Upgrade

pip install --upgrade cazy-parser

Usage (internet connection required)

cazy-parser -h
usage: cazy-parser [-h] [-f FAMILY] [-s SUBFAMILY] [-c CHARACTERIZED] [-v] {GH,GT,PL,CA,AA}

positional arguments:
  {GH,GT,PL,CA,AA}

optional arguments:
  -h, --help            show this help message and exit
  -f FAMILY, --family FAMILY
  -s SUBFAMILY, --subfamily SUBFAMILY
  -c CHARACTERIZED, --characterized CHARACTERIZED
  -v, --version         show version
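
For scripted use, the command line above can be assembled and launched from Python. This is a minimal sketch that only relies on the positional class argument and the `-f` and `-s` options listed in the help text; the helper names `build_command` and `run_cazy_parser` are hypothetical and not part of cazy-parser.

```python
import shutil
import subprocess


def build_command(enzyme_class, family=None, subfamily=None):
    """Assemble a cazy-parser command line as a list of arguments."""
    cmd = ["cazy-parser", enzyme_class]
    if family is not None:
        cmd += ["-f", str(family)]
    if subfamily is not None:
        cmd += ["-s", str(subfamily)]
    return cmd


def run_cazy_parser(cmd):
    """Run the command if the cazy-parser executable is on PATH (needs internet)."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("cazy-parser is not installed or not on PATH")
    return subprocess.run(cmd, check=True)


cmd = build_command("GH", family=43, subfamily=1)
print(" ".join(cmd))
```

Calling `run_cazy_parser(cmd)` would then start the download; it is left out of the sketch because it needs an installed cazy-parser and a network connection.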

Example

Extract all FASTA sequences from subfamily 1 of Glycoside Hydrolase family 43

$ cazy-parser GH -f 43 -s 1
 [2022-05-26 16:39:21,511 91 INFO] ------------------------------------------
 [2022-05-26 16:39:21,511 92 INFO]
 [2022-05-26 16:39:21,511 93 INFO] ┌─┐┌─┐┌─┐┬ ┬   ┌─┐┌─┐┬─┐┌─┐┌─┐┬─┐
 [2022-05-26 16:39:21,511 94 INFO] │  ├─┤┌─┘└┬┘───├─┘├─┤├┬┘└─┐├┤ ├┬┘
 [2022-05-26 16:39:21,511 95 INFO] └─┘┴ ┴└─┘ ┴    ┴  ┴ ┴┴└─└─┘└─┘┴└─ v2.0.1
 [2022-05-26 16:39:21,511 96 INFO]
 [2022-05-26 16:39:21,511 97 INFO] ------------------------------------------
 [2022-05-26 16:39:21,511 183 INFO] Fetching links for Glycoside-Hydrolases, url: http://www.cazy.org/Glycoside-Hydrolases.html
 [2022-05-26 16:39:22,454 189 INFO] Only using links of family 43 subfamily 1
 [2022-05-26 16:39:23,029 26 INFO] Dowloading 1415 fasta sequences...
 [2022-05-26 16:40:32,187 51 INFO] Dumping fasta sequences to file GH43_1_26052022.fasta

This generates the file GH43_1_DDMMYYYY.fasta, which contains the FASTA sequences.
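
As a rough sketch of working with that output, the snippet below rebuilds the file name seen in the log for this example and parses FASTA text with the standard library only. The helpers `expected_filename` and `read_fasta` and the sample records are hypothetical; the naming pattern is inferred from the single example above and may not hold for other option combinations.

```python
from datetime import date


def expected_filename(enzyme_class, family, subfamily, day):
    """Rebuild the name pattern seen in the example log (class, family, subfamily, DDMMYYYY)."""
    return f"{enzyme_class}{family}_{subfamily}_{day.strftime('%d%m%Y')}.fasta"


def read_fasta(lines):
    """Parse FASTA lines into a dict mapping each header to its sequence."""
    records = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Made-up records standing in for the downloaded file's contents.
sample = [">seq_one", "MKTAYIAK", "QRQISFVK", ">seq_two", "MSHHWGYG"]
records = read_fasta(sample)
filename = expected_filename("GH", 43, 1, date(2022, 5, 26))
print(filename, len(records))
```

With a real run, `read_fasta(open(filename))` would give the number of downloaded sequences via `len(...)`, which can be compared against the count reported in the log.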

To-do and how to contribute

Please refer to CONTRIBUTING 🤓
