A way to extract specific information from CAZy
cazy-parser
A way to extract specific information from the Carbohydrate-Active enZYmes (CAZy) database.
Make sure to visit and cite the CAZy website!
- http://www.cazy.org/
- Lombard V, Golaconda Ramulu H, Drula E, Coutinho PM, Henrissat B (2014) The Carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res 42:D490–D495. [PMID: 24270786].
License: GNU GPLv3
RV Honorato. CAZy-parser: a way to extract information from the Carbohydrate-Active enZYmes Database. The Journal of Open Source Software, 1(8), Dec 2016. doi:10.21105/joss.00053
Introduction
cazy-parser is a tool that extracts information from CAZy in a more usable and readable format. First, a script reads the HTML structure of the site and creates a mirror of the database as a tab-delimited file. Second, information is extracted from that mirror according to user-supplied parameters and presented to the user as a set of accession codes.
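The second step described above can be pictured as a filter over a tab-delimited table. The following is only a minimal sketch of that idea, not cazy-parser's own code: the column names (`family`, `subfamily`, `accession`), the sample rows, and the helper `select_accessions` are all hypothetical and chosen purely for illustration.

```python
import csv
import io

# Made-up sample data. The column names and accession codes are placeholders,
# not the layout cazy-parser actually writes.
SAMPLE = (
    "family\tsubfamily\taccession\n"
    "GH43\t1\tACC0001\n"
    "GH43\t2\tACC0002\n"
    "GH5\t\tACC0003\n"
)


def select_accessions(handle, family, subfamily=None):
    """Return the set of accession codes matching a family (and subfamily)."""
    reader = csv.DictReader(handle, delimiter="\t")
    selected = set()
    for row in reader:
        if row["family"] != family:
            continue
        # Compare as strings because every field in the file is text.
        if subfamily is not None and row["subfamily"] != str(subfamily):
            continue
        selected.add(row["accession"])
    return selected


print(sorted(select_accessions(io.StringIO(SAMPLE), "GH43", subfamily=1)))
```

Returning a set mirrors the "set of accession codes" wording and drops duplicates if the same entry appears on several pages.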
Install / Upgrade
pip install --upgrade cazy-parser
Usage (internet connection required)
$ cazy-parser -h
usage: cazy-parser [-h] [-f FAMILY] [-s SUBFAMILY] [-c CHARACTERIZED] [-v] {GH,GT,PL,CA,AA}

positional arguments:
  {GH,GT,PL,CA,AA}

optional arguments:
  -h, --help            show this help message and exit
  -f FAMILY, --family FAMILY
  -s SUBFAMILY, --subfamily SUBFAMILY
  -c CHARACTERIZED, --characterized CHARACTERIZED
  -v, --version         show version
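If you want to drive the command line from a Python script, one possible approach is to assemble the argument list and hand it to `subprocess`. This is a sketch under assumptions: `build_command` and `run_cazy_parser` are hypothetical helper names, only the flags listed in the help text above are used, and `-c` is left out because its expected value is not documented here.

```python
import shlex
import subprocess

# Enzyme classes accepted by the positional argument, copied from the help text.
ENZYME_CLASSES = ("GH", "GT", "PL", "CA", "AA")


def build_command(enzyme_class, family=None, subfamily=None):
    """Build the argument list for a cazy-parser call (hypothetical helper)."""
    if enzyme_class not in ENZYME_CLASSES:
        raise ValueError(f"unknown enzyme class: {enzyme_class!r}")
    cmd = ["cazy-parser", enzyme_class]
    if family is not None:
        cmd += ["-f", str(family)]
    if subfamily is not None:
        cmd += ["-s", str(subfamily)]
    return cmd


def run_cazy_parser(enzyme_class, family=None, subfamily=None):
    """Run the CLI. Requires cazy-parser to be installed and a network connection."""
    return subprocess.run(build_command(enzyme_class, family, subfamily), check=True)


# Show the command line that would be executed, without running it.
print(shlex.join(build_command("GH", family=43, subfamily=1)))
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` raises an exception if the tool exits with an error.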
Example
Extract all FASTA sequences from subfamily 1 of Glycoside Hydrolase family 43
$ cazy-parser GH -f 43 -s 1
[2022-05-26 16:39:21,511 91 INFO] ------------------------------------------
[2022-05-26 16:39:21,511 92 INFO]
[2022-05-26 16:39:21,511 93 INFO] ┌─┐┌─┐┌─┐┬ ┬ ┌─┐┌─┐┬─┐┌─┐┌─┐┬─┐
[2022-05-26 16:39:21,511 94 INFO] │ ├─┤┌─┘└┬┘───├─┘├─┤├┬┘└─┐├┤ ├┬┘
[2022-05-26 16:39:21,511 95 INFO] └─┘┴ ┴└─┘ ┴ ┴ ┴ ┴┴└─└─┘└─┘┴└─ v2.0.1
[2022-05-26 16:39:21,511 96 INFO]
[2022-05-26 16:39:21,511 97 INFO] ------------------------------------------
[2022-05-26 16:39:21,511 183 INFO] Fetching links for Glycoside-Hydrolases, url: http://www.cazy.org/Glycoside-Hydrolases.html
[2022-05-26 16:39:22,454 189 INFO] Only using links of family 43 subfamily 1
[2022-05-26 16:39:23,029 26 INFO] Dowloading 1415 fasta sequences...
[2022-05-26 16:40:32,187 51 INFO] Dumping fasta sequences to file GH43_1_26052022.fasta
This will generate the file GH43_1_DDMMYYYY.fasta, where DDMMYYYY is the date of the run, containing the FASTA sequences.
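To pick up the result in a script, you can reconstruct the expected file name and read the records with the standard library. The sketch below is illustrative only: `expected_filename` and `read_fasta` are hypothetical helpers, and the name pattern is inferred from the single log line above, so the form used when no subfamily is given is an assumption.

```python
from datetime import date


def expected_filename(enzyme_class, family, subfamily=None, day=None):
    """Guess the output file name, following the pattern seen in the log above."""
    day = day or date.today()
    stem = f"{enzyme_class}{family}"
    # Assumption: the subfamily part is simply omitted when none is requested.
    if subfamily is not None:
        stem += f"_{subfamily}"
    return f"{stem}_{day.strftime('%d%m%Y')}.fasta"


def read_fasta(lines):
    """Parse FASTA text (an iterable of lines) into a {header: sequence} dict."""
    records = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    # Join the wrapped sequence lines of each record into a single string.
    return {name: "".join(parts) for name, parts in records.items()}


print(expected_filename("GH", 43, 1, date(2022, 5, 26)))
```

With a real output file, `read_fasta(open(expected_filename("GH", 43, 1)))` would return one entry per downloaded sequence, provided the run happened on the same day.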
To-do and how to contribute
Please refer to CONTRIBUTING 🤓