UPFP, a package to parse UniProt FASTA files.
uniprot_fasta_parser
UniProt FASTA parser written in pure Python.
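For orientation, a UniProt FASTA header line has the general form ">db|UniqueIdentifier|EntryName ProteinName OS=... OX=... [GN=...] PE=... SV=...". The following is a minimal sketch of how such a line could be split into fields using only the standard library. The function name parse_header and the returned dictionary keys are hypothetical and chosen for illustration; this is not upfp's own API or implementation.

```python
import re

# Hypothetical helper for illustration only; not part of upfp.
# Matches ">db|accession|entry_name rest-of-line".
_HEADER_RE = re.compile(
    r"^>(?P<db>[^|]+)\|(?P<accession>[^|]+)\|(?P<entry_name>\S+)\s*(?P<rest>.*)$"
)
# Key=value tags that follow the protein name in a UniProt header.
_FIELD_RE = re.compile(r"\b(OS|OX|GN|PE|SV)=")


def parse_header(line):
    """Split one UniProt FASTA header line into a dictionary of fields."""
    match = _HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a UniProt FASTA header: {line!r}")
    record = {
        "db": match.group("db"),
        "accession": match.group("accession"),
        "entry_name": match.group("entry_name"),
    }
    # Splitting on a capturing group keeps the tag names in the result:
    # [protein_name, "OS", value, "OX", value, ...]
    parts = _FIELD_RE.split(match.group("rest"))
    record["protein_name"] = parts[0].strip()
    for key, value in zip(parts[1::2], parts[2::2]):
        record[key] = value.strip()
    return record


if __name__ == "__main__":
    # Illustrative header line.
    example = (
        ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha "
        "OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2"
    )
    print(parse_header(example))
```

Optional tags such as GN are simply absent from the dictionary when the header does not carry them.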
Development setup
Create a venv:
python -m venv venv
Activate it:
source venv/bin/activate
Install dependencies:
pip install -r requirements.txt
Install the package in editable mode:
pip install -e .
Install the Jupyter playground:
pip install jupyter
ipython kernel install --user --name=uniprot_fasta_parser
Tutorial on converting FASTA sequences into CSV format
Get the latest Swiss-Prot FASTA file from UniProt:
wget ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.fasta.gz
The script upfp-fasta-to-csv (installed with upfp) can be used.
user@host $ upfp-fasta-to-csv -h
usage: upfp-fasta-to-csv [-h] [-g] [-c CHUNK_SIZE] fasta_filepath csv_filepath

positional arguments:
  fasta_filepath        path to the FASTA file
  csv_filepath          path where to store the CSV file

optional arguments:
  -h, --help            show this help message and exit
  -g, --gzipped         flag to indicate whether the FASTA is gzipped.
                        Defaults to False.
  -c CHUNK_SIZE, --chunk_size CHUNK_SIZE
                        size of the chunks us
Provide as input the downloaded gzipped FASTA file and convert it to CSV:
upfp-fasta-to-csv uniprot_sprot.fasta.gz /path/to/file.csv -g
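To make the conversion step concrete, here is a rough standard-library sketch of what a FASTA-to-CSV conversion involves: stream the records from a plain or gzipped file and write one CSV row per record. The function names iter_fasta and fasta_to_csv, the two-column layout (header, sequence), and the file names in the usage example are assumptions for illustration. The columns that upfp-fasta-to-csv actually writes, and how it uses its chunk size, are not described above and are not reproduced here.

```python
import csv
import gzip


def iter_fasta(handle):
    """Yield (header, sequence) pairs from an open text-mode FASTA handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def fasta_to_csv(fasta_filepath, csv_filepath, gzipped=False):
    """Write one CSV row per FASTA record; return the number of records."""
    # Assumed layout: two columns, the raw header and the full sequence.
    opener = gzip.open if gzipped else open
    count = 0
    with opener(fasta_filepath, "rt") as fasta, \
            open(csv_filepath, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["header", "sequence"])
        for header, sequence in iter_fasta(fasta):
            writer.writerow([header, sequence])
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical file names; adjust to the downloaded file.
    n = fasta_to_csv("uniprot_sprot.fasta.gz", "uniprot_sprot.csv", gzipped=True)
    print(f"wrote {n} records")
```

Because records are yielded one at a time, the sketch never holds the whole FASTA file in memory, which matters for a file the size of the full Swiss-Prot release.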