seqpandas

Read bioinformatics sequence formats into a Pandas DataFrame

Development Status
- 2 - Pre-Alpha
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Natural Language
- English

Project description

SeqPandas

Import genomic data into a custom class that combines Pandas and Biopython, with shortcuts that simplify preprocessing for machine learning.

Free software: MIT license
Documentation: https://seqpandas.readthedocs.io.

Installation

pip install seqpandas

Usage

import seqpandas as spd

# Direct File Path
df = spd.read_seq('file.fasta', format='fasta')
df = spd.read_seq('file.sam', format='sam')
df = spd.read_vcf('file.vcf', format='vcf')
df = spd.read_bed('file.bed', format='bed')

# Just need BioPython Seqs? No problem!
seqrecords = spd.read('file.fasta', format='fasta')

# Already Opened BioPython Handle
from Bio import SeqIO
seqrecords = SeqIO.parse('file.fasta', format='fasta')
df = spd.BioDataFrame.from_seqrecords(seqrecords)
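
To make the idea of "a sequence file as a table" concrete, here is a minimal sketch that parses FASTA text into one row per record using only the standard library. It is not how seqpandas is implemented; the function name `fasta_to_rows` and the column names `id`, `description`, `sequence`, and `length` are assumptions chosen for illustration. The pandas conversion at the end is optional and skipped if pandas is not installed.

```python
import io


def fasta_to_rows(handle):
    """Parse FASTA text from a file-like object into a list of row dicts.

    Hypothetical helper for illustration; not part of seqpandas.
    """
    rows = []
    seq_id, description, chunks = None, "", []

    def flush():
        # Emit the record collected so far, if any.
        if seq_id is not None:
            sequence = "".join(chunks)
            rows.append({
                "id": seq_id,
                "description": description,
                "sequence": sequence,
                "length": len(sequence),
            })

    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            flush()
            # Header: first token is the id, the rest is free-text description.
            header = line[1:].split(None, 1)
            seq_id = header[0] if header else ""
            description = header[1] if len(header) > 1 else ""
            chunks = []
        elif seq_id is not None:
            # Sequence lines may wrap; collect them until the next header.
            chunks.append(line)
    flush()
    return rows


example = io.StringIO(">seq1 first record\nACGT\nAC\n>seq2\nGGCC\n")
rows = fasta_to_rows(example)

try:
    import pandas as pd  # optional: only needed for the DataFrame view
    df = pd.DataFrame(rows)
except ImportError:
    df = None
```

Each FASTA record becomes one row, so filtering by `length` or joining on `id` works the same way as with any other tabular data.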

Tutorial

For a complete walkthrough, including use in a machine learning pipeline, see the tutorial notebook.

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

History

0.0.1 (2022-02-17)

First release on PyPI.

Release history

0.0.2 (2022-02-17), this version

Download files


Source Distribution

seqpandas-0.0.2.tar.gz (11.4 MB), uploaded Feb 17, 2022 (source)

Built Distribution

seqpandas-0.0.2-py2.py3-none-any.whl (10.1 kB), uploaded Feb 17, 2022 (Python 2, Python 3)

Hashes for seqpandas-0.0.2.tar.gz
Algorithm	Hash digest
SHA256	`1d733298a2cb2faceedea23f4f7f9d99f6c99ed8577a21308ac36d93672db00e`
MD5	`66e5dd70f2c5fd98cdcd8e4066541237`
BLAKE2b-256	`bfa0f05beb99cc51fe3e6341ac8c221d6c90298ccdf0e608ce9b126783939106`

Hashes for seqpandas-0.0.2-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`f0552d8adeb3eca96a62a0928c368f1b6e75817bb7f9882ffca5c5e6be9dab43`
MD5	`62618cb48c01eaf233e0cfaa4deddce6`
BLAKE2b-256	`7816cede1a50e99a45f0d9735245d848bbc68f99d57b191d9a16f0efecf368b8`
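
The digests listed above can be compared against a downloaded file. The following is a minimal sketch using only `hashlib` from the standard library; the helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib

# SHA256 digest published above for the source distribution.
EXPECTED_SDIST_SHA256 = (
    "1d733298a2cb2faceedea23f4f7f9d99f6c99ed8577a21308ac36d93672db00e"
)


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 with a published hex digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest("seqpandas-0.0.2.tar.gz", EXPECTED_SDIST_SHA256)` should return `True` for an unmodified copy of the source distribution saved in the current directory.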