UPFP, a package to parse UniProt FASTA files.
uniprot_fasta_parser
UniProt FASTA parser written in pure Python.
Development setup
Create a venv:
python -m venv venv
Activate it:
source venv/bin/activate
Install dependencies:
pip install -r requirements.txt
Install the package in editable mode:
pip install -e .
Install the Jupyter playground:
pip install jupyter
ipython kernel install --user --name=uniprot_fasta_parser
Tutorial on converting FASTA sequences into CSV format
Get the latest UniProt Swiss-Prot FASTA:
wget ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.fasta.gz
Use the script upfp-fasta-to-csv (installed with upfp) for the conversion:
upfp-fasta-to-csv -h
usage: upfp-fasta-to-csv [-h] [-g] [-c CHUNK_SIZE] fasta_filepath csv_filepath
positional arguments:
fasta_filepath path to the FASTA file.
csv_filepath path where to store the CSV file.
optional arguments:
-h, --help show this help message and exit
-g, --gzipped flag to indicate whether the FASTA is gzipped.
Defaults to False.
-c CHUNK_SIZE, --chunk_size CHUNK_SIZE
size of the chunks used when writing the CSV file.
Defaults to 10000.
Provide as input the downloaded gzipped FASTA file and convert it to CSV:
upfp-fasta-to-csv uniprot_sprot.fasta.gz /path/to/file.csv -g
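For readers who want to see what such a conversion involves, here is a minimal standard-library sketch. It is not upfp's implementation: the function names and the CSV columns (db, accession, entry_name, description, sequence) are assumptions made for illustration, and the CSV written by upfp-fasta-to-csv may be laid out differently. It relies only on the UniProt header convention `>db|accession|entry_name description` and writes rows in chunks, mirroring the idea behind the `-c` option.

```python
import csv
import gzip
from itertools import islice


def read_fasta(path, gzipped=False):
    """Yield (header, sequence) tuples from a plain or gzipped FASTA file."""
    opener = gzip.open if gzipped else open
    with opener(path, "rt") as handle:
        header, parts = None, []
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    yield header, "".join(parts)
                header, parts = line[1:], []
            elif header is not None:
                parts.append(line)
        if header is not None:
            yield header, "".join(parts)


def split_header(header):
    """Split 'db|accession|entry_name description' into a 4-tuple."""
    identifier, _, description = header.partition(" ")
    fields = identifier.split("|")
    if len(fields) != 3:
        # Not a UniProt-style identifier: keep it whole as the accession.
        return "", identifier, "", description
    db, accession, entry_name = fields
    return db, accession, entry_name, description


def fasta_to_csv(fasta_path, csv_path, gzipped=False, chunk_size=10000):
    """Write one CSV row per FASTA record, chunk_size rows at a time."""
    records = read_fasta(fasta_path, gzipped=gzipped)
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        # Hypothetical column layout, chosen for this sketch only.
        writer.writerow(["db", "accession", "entry_name", "description", "sequence"])
        while True:
            chunk = list(islice(records, chunk_size))
            if not chunk:
                break
            writer.writerows(split_header(h) + (s,) for h, s in chunk)
```

Calling `fasta_to_csv("uniprot_sprot.fasta.gz", "/path/to/file.csv", gzipped=True)` would process the downloaded file without loading it fully into memory, since records are read lazily and only one chunk is held at a time.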
Convert CSV back to FASTA
To recreate a FASTA file from a CSV produced by upfp, use the script upfp-csv-to-fasta:
upfp-csv-to-fasta -h
usage: upfp-csv-to-fasta [-h] [-g] [-c CHUNK_SIZE] csv_filepath fasta_filepath
positional arguments:
csv_filepath path to the CSV file or SMI file.
fasta_filepath path where to store the FASTA file
optional arguments:
-h, --help show this help message and exit
-g, --gzipped flag to indicate whether the FASTA should be gzipped.
Defaults to False.
-c CHUNK_SIZE, --chunk_size CHUNK_SIZE
size of the chunks used when writing the FASTA file.
Defaults to 10000.
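The reverse direction can be sketched with the standard library as well. This is an illustration under stated assumptions, not upfp's code: the function name is hypothetical, and the CSV is assumed to have a header row with the columns db, accession, entry_name, description and sequence, which may not match the layout upfp actually writes. Sequences are wrapped at 60 characters per line, a common FASTA convention.

```python
import csv
import gzip


def csv_to_fasta(csv_path, fasta_path, gzipped=False, width=60):
    """Write one FASTA record per CSV row, optionally gzip-compressed."""
    opener = gzip.open if gzipped else open
    with open(csv_path, newline="") as src, opener(fasta_path, "wt") as dst:
        # Column names below are assumptions made for this sketch.
        for row in csv.DictReader(src):
            identifier = "|".join((row["db"], row["accession"], row["entry_name"]))
            dst.write(f">{identifier} {row['description']}\n")
            sequence = row["sequence"]
            # Emit the sequence in fixed-width lines.
            for start in range(0, len(sequence), width):
                dst.write(sequence[start:start + width] + "\n")
```

Rows are streamed one at a time, so memory use stays flat regardless of the CSV size.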