A very simple FASTA file parser.
FastaFrames
This Python package provides a set of functions to work with FASTA files. It allows you to read FASTA files, convert them to pandas dataframes, manipulate data, and write data back to FASTA files.
Features
- Read FASTA files into pandas DataFrames
- Write FASTA files from pandas DataFrames
Usage
To install fastaframes, use pip:
pip install fastaframes
Reading FASTA files
To read a FASTA file and convert it to a pandas DataFrame:
from fastaframes import to_df
# IO input
with open('example.fasta', 'r') as fasta_io:
    fasta_df = to_df(fasta_data=fasta_io)
# or
# File input
fasta_df = to_df(fasta_data='example.fasta')
print(fasta_df.head())
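To give a feel for what reading a FASTA file involves, here is a minimal pure-Python sketch that splits a UniProt-style header into fields named like the DataFrame columns (db, unique_identifier, entry_name, and so on). This is not the fastaframes implementation: `HEADER_RE`, `parse_header`, and `parse_fasta` are hypothetical names used only for illustration, the header layout is assumed to follow the UniProt convention, and the example record is reconstructed by hand.

```python
import re

# Assumed UniProt-style header layout:
# >db|UniqueIdentifier|EntryName ProteinName OS=Organism OX=TaxId [GN=Gene] PE=n SV=n
HEADER_RE = re.compile(
    r"^>(?P<db>[^|]+)\|(?P<unique_identifier>[^|]+)\|(?P<entry_name>\S+)\s+"
    r"(?P<protein_name>.+?)\s+OS=(?P<organism_name>.+?)\s+OX=(?P<organism_identifier>\d+)"
    r"(?:\s+GN=(?P<gene_name>\S+))?"
    r"(?:\s+PE=(?P<protein_existence>\d+))?"
    r"(?:\s+SV=(?P<sequence_version>\d+))?\s*$"
)


def parse_header(line):
    """Split one header line into a dict of string fields (hypothetical helper)."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"Unrecognized FASTA header: {line!r}")
    return match.groupdict()


def parse_fasta(text):
    """Collect one dict per record, joining wrapped sequence lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            record = parse_header(line)
            record["protein_sequence"] = ""
            records.append(record)
        elif records:
            # Sequence lines belong to the most recent header.
            records[-1]["protein_sequence"] += line
    return records


# Hand-written example record; the sequence is wrapped over two lines on purpose.
example = (
    ">sp|A0A0C5B5G6|MOTSC_HUMAN Mitochondrial-derived peptide MOTS-c "
    "OS=Homo sapiens OX=9606 GN=MT-RNR1 PE=1 SV=1\n"
    "MRWQEMGY\n"
    "IFYPRKLR\n"
)
records = parse_fasta(example)
print(records[0]["entry_name"], records[0]["protein_sequence"])
```

Each resulting dict corresponds to one row; passing the list to `pandas.DataFrame(records)` would give a table with similar columns, although all values stay strings in this sketch.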
Writing FASTA files
To write a pandas DataFrame to a FASTA file:
from fastaframes import to_fasta
# Write StringIO to file
fasta_io = to_fasta(fasta_data=fasta_df) # outputs StringIO if file=None
with open('output.fasta', 'w') as output_file:
    output_file.write(fasta_io.getvalue())
# or
# Write directly to file
to_fasta(fasta_data=fasta_df, file='output.fasta')
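For intuition about the reverse direction, the sketch below turns one record (a plain dict with the same field names as the DataFrame columns) back into FASTA text. It is an illustration only, not how `to_fasta` is implemented: `format_record` and its `line_width` parameter are hypothetical, the header layout is assumed to follow the UniProt convention, and the 60-character wrap is just a common default.

```python
def format_record(record, line_width=60):
    """Render one record dict as FASTA text (hypothetical helper)."""
    header = (
        f">{record['db']}|{record['unique_identifier']}|{record['entry_name']} "
        f"{record['protein_name']} OS={record['organism_name']} "
        f"OX={record['organism_identifier']}"
    )
    # The gene name is optional in UniProt headers, so only emit it when present.
    if record.get("gene_name"):
        header += f" GN={record['gene_name']}"
    header += f" PE={record['protein_existence']} SV={record['sequence_version']}"

    # Wrap the sequence into fixed-width chunks.
    sequence = record["protein_sequence"]
    chunks = [sequence[i:i + line_width] for i in range(0, len(sequence), line_width)]
    return "\n".join([header, *chunks]) + "\n"


# Hand-written example record for illustration.
record = {
    "db": "sp",
    "unique_identifier": "A0A0C5B5G6",
    "entry_name": "MOTSC_HUMAN",
    "protein_name": "Mitochondrial-derived peptide MOTS-c",
    "organism_name": "Homo sapiens",
    "organism_identifier": 9606,
    "gene_name": "MT-RNR1",
    "protein_existence": 1,
    "sequence_version": 1,
    "protein_sequence": "MRWQEMGYIFYPRKLR",
}
fasta_text = format_record(record, line_width=10)
print(fasta_text)
```

Note that numeric columns read back as floats (for example 9606.0) would need converting to integers before formatting if the header should match the UniProt style exactly.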
Example FASTA DataFrame:
| | db | unique_identifier | entry_name | protein_name | organism_name | organism_identifier | gene_name | protein_existence | sequence_version | protein_sequence |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | sp | A0A087X1C5 | CP2D7_HUMAN | Putative cytochrome P450 2D7 | Homo sapiens | 9606.0 | CYP2D7 | 5.0 | 1.0 | MGLEALVPLAMIVAIFLLLVDLMHRHQRWAARYPPGPLPLPGLGNLLHVDFQNTPYCFDQ |
| 1 | sp | A0A0B4J2F2 | SIK1B_HUMAN | Putative serine/threonine-protein kinase SIK1B | Homo sapiens | 9606.0 | SIK1B | 5.0 | 1.0 | MVIMSEFSADPAGQGQGQQKPLRVGFYDIERTLGKGNFAVVKLARHRVTKTQVAIKIIDKLVQ |
| 2 | sp | A0A0C5B5G6 | MOTSC_HUMAN | Mitochondrial-derived peptide MOTS-c | Homo sapiens | 9606.0 | MT-RNR1 | 1.0 | 1.0 | MRWQEMGYIFYPRKLR |
| 3 | sp | A0A0K2S4Q6 | CD3CH_HUMAN | Protein CD300H | Homo sapiens | 9606.0 | CD300H | 1.0 | 1.0 | MTQRAGAAMLPSALLLLCVPGCLTVSGPSTVMGAVGESLSVQCRYEEKYKTFNKYWCRQP |
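Once the data is in a DataFrame, ordinary pandas operations apply. The sketch below rebuilds a few columns of the example table by hand (so it runs without a FASTA file) and then adds a sequence-length column and filters on protein_existence. The `sequence_length` column and the `confirmed` variable are names chosen for this illustration, and the sequences are copied as displayed in the table, which may not be the full-length proteins.

```python
import pandas as pd

# Hand-built subset of the example table; a real workflow would get this from to_df.
fasta_df = pd.DataFrame(
    {
        "db": ["sp", "sp", "sp", "sp"],
        "unique_identifier": ["A0A087X1C5", "A0A0B4J2F2", "A0A0C5B5G6", "A0A0K2S4Q6"],
        "entry_name": ["CP2D7_HUMAN", "SIK1B_HUMAN", "MOTSC_HUMAN", "CD3CH_HUMAN"],
        "gene_name": ["CYP2D7", "SIK1B", "MT-RNR1", "CD300H"],
        "protein_existence": [5.0, 5.0, 1.0, 1.0],
        "protein_sequence": [
            "MGLEALVPLAMIVAIFLLLVDLMHRHQRWAARYPPGPLPLPGLGNLLHVDFQNTPYCFDQ",
            "MVIMSEFSADPAGQGQGQQKPLRVGFYDIERTLGKGNFAVVKLARHRVTKTQVAIKIIDKLVQ",
            "MRWQEMGYIFYPRKLR",
            "MTQRAGAAMLPSALLLLCVPGCLTVSGPSTVMGAVGESLSVQCRYEEKYKTFNKYWCRQP",
        ],
    }
)

# Derive a new column from the sequences.
fasta_df["sequence_length"] = fasta_df["protein_sequence"].str.len()

# Keep only entries whose protein_existence value is 1.0.
confirmed = fasta_df[fasta_df["protein_existence"] == 1.0]

print(confirmed[["entry_name", "gene_name", "sequence_length"]])
```

A filtered DataFrame like `confirmed` keeps the original column names, so it should be usable as input when writing back to FASTA, provided the remaining columns are also present.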