seqann

Sequence Annotation

Development Status
- 2 - Pre-Alpha
Intended Audience
- Developers
License
- OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)
Natural Language
- English
Programming Language
- Python :: 3.6

Project description

Python package for annotating gene features

Free software: LGPL 3.0
Documentation: https://seqann.readthedocs.io.
Jupyter Notebook

Overview

The seqann package allows users to annotate gene features in consensus sequences. Annotations are created by passing consensus sequences to the annotate method of the BioSeqAnn class. No parameters are required when initializing a BioSeqAnn object, but annotations can be created significantly faster when using a BioSQL database. When a BioSQL database is not provided, the latest hla.dat file is downloaded and parsed. A BioSQL database containing all of IPD-IMGT/HLA is available on DockerHub and can be run on any machine that has Docker installed.

Parameters

Below is the list of parameters and the default values used when initializing a BioSeqAnn object.

Parameter	Type	Default	Description
server	BioSeqDatabase	None	A BioSQL database containing all of the sequence data from IPD-IMGT/HLA.
dbversion	str	Latest	The IPD-IMGT/HLA or KIR database release.
datfile	str	None	The IPD-IMGT/HLA or KIR dat file to use in place of the server parameter.
kir	bool	False	Flag for indicating the input sequences are from the KIR gene system.
align	bool	False	Flag for producing the alignments along with the annotations.
verbose	bool	False	Flag for running in verbose mode.
verbosity	int	None	Numerical value to indicate how verbose the output will be in verbose mode.
debug	Dict	None	A dictionary with process names as keys and verbosity levels as values.
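
As a minimal sketch of how the table can be used, the snippet below collects keyword arguments for BioSeqAnn and rejects names that are not listed above. The helper names build_kwargs and make_annotator are hypothetical and not part of seqann; only the parameter names are taken from the table.

```python
# Parameter names come from the table above; the helper names are hypothetical.
KNOWN_PARAMETERS = frozenset(
    ["server", "dbversion", "datfile", "kir", "align", "verbose", "verbosity", "debug"]
)


def build_kwargs(**overrides):
    """Return keyword arguments for BioSeqAnn, rejecting names not in the table."""
    unknown = sorted(set(overrides) - KNOWN_PARAMETERS)
    if unknown:
        raise TypeError("unknown BioSeqAnn parameter(s): " + ", ".join(unknown))
    return dict(overrides)


def make_annotator(**overrides):
    """Create a BioSeqAnn object; parameters that are not given keep their defaults."""
    from seqann import BioSeqAnn  # lazy import: build_kwargs works without seqann

    return BioSeqAnn(**build_kwargs(**overrides))


# Example: KIR input with alignments and verbose output requested.
kwargs = build_kwargs(kir=True, align=True, verbose=True, verbosity=1)
print(sorted(kwargs))
```

Calling make_annotator(**kwargs) would then construct the object, provided seqann and its dependencies are installed.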

Usage

To annotate a sequence, initialize a new BioSeqAnn object and then pass the sequence to the annotate method. The sequence must be a Biopython Seq. The locus of the sequence is not required, but providing it improves the accuracy of the annotation.

from seqann import BioSeqAnn

# `sequence` must already hold the consensus sequence as a Biopython Seq
seqann = BioSeqAnn()
ann = seqann.annotate(sequence, "HLA-A")

Sequences can be annotated with or without providing a BioSeqDatabase. To use a BioSQL database, initialize a BioSeqDatabase with the parameters that match the database you have running. If you are running the imgt_biosqldb from DockerHub, the following parameters will be the same.

from seqann import BioSeqAnn
from BioSQL import BioSeqDatabase
server = BioSeqDatabase.open_database(driver="pymysql", user="root",
                                      passwd="my-secret-pw", host="localhost",
                                      db="bioseqdb", port=3306)
seqann = BioSeqAnn(server=server)
ann = seqann.annotate(sequence, "HLA-A")
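
If the database runs somewhere other than localhost, the connection values can be kept out of the code. The following is a sketch only: the BIOSQL_* environment variable names and both helper functions are assumptions made for illustration, while the default values are the ones shown above.

```python
import os

# Defaults mirror the connection values shown above for the imgt_biosqldb container.
DEFAULTS = {
    "driver": "pymysql",
    "user": "root",
    "passwd": "my-secret-pw",
    "host": "localhost",
    "db": "bioseqdb",
    "port": 3306,
}


def biosql_settings(env=None):
    """Build open_database keyword arguments, overriding defaults from BIOSQL_* variables."""
    env = os.environ if env is None else env
    settings = dict(DEFAULTS)
    for key in DEFAULTS:
        value = env.get("BIOSQL_" + key.upper())  # hypothetical variable names
        if value is not None:
            # The port must be an integer; every other setting stays a string.
            settings[key] = int(value) if key == "port" else value
    return settings


def open_server(env=None):
    """Open the BioSQL database; needs Biopython, pymysql and a running database."""
    from BioSQL import BioSeqDatabase  # lazy import keeps biosql_settings usable alone

    return BioSeqDatabase.open_database(**biosql_settings(env))
```

The object returned by open_server() can be passed to BioSeqAnn through the server parameter, as in the example above.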

Annotations

{
     'complete_annotation': True,
     'annotation': {'exon_1': SeqRecord(seq=Seq('AGAGACTCTCCCG', SingleLetterAlphabet()), id='HLA:HLA00630', name='HLA:HLA00630', description='HLA:HLA00630 DQB1*03:04:01 597 bp', dbxrefs=[]),
                    'exon_2': SeqRecord(seq=Seq('AGGATTTCGTGTACCAGTTTAAGGCCATGTGCTACTTCACCAACGGGACGGAGC...GAG', SingleLetterAlphabet()), id='HLA:HLA00630', name='HLA:HLA00630', description='HLA:HLA00630 DQB1*03:04:01 597 bp', dbxrefs=[]),
                    'exon_3': SeqRecord(seq=Seq('TGGAGCCCACAGTGACCATCTCCCCATCCAGGACAGAGGCCCTCAACCACCACA...ATG', SingleLetterAlphabet()), id='HLA:HLA00630', name='<unknown name>', description='HLA:HLA00630', dbxrefs=[])},
     'features': {'exon_1': SeqFeature(FeatureLocation(ExactPosition(0), ExactPosition(13), strand=1), type='exon_1'),
                  'exon_2': SeqFeature(FeatureLocation(ExactPosition(13), ExactPosition(283), strand=1), type='exon_2'),
                  'exon_3': SeqFeature(FeatureLocation(ExactPosition(283), ExactPosition(503), strand=1), type='exon_3')},
     'method': 'nt_search and clustalo',
     'gfe': 'HLA-Aw2-1-1-1-1-1-1-1-1-1-1-1-1-1-1-1-4',
     'seq': SeqRecord(seq=Seq('AGAGACTCTCCCGAGGATTTCGTGTACCAGTTTAAGGCCATGTGCTACTTCACC...ATG', SingleLetterAlphabet()), id='HLA:HLA00630', name='HLA:HLA00630', description='HLA:HLA00630 DQB1*03:04:01 597 bp', dbxrefs=[])
}
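
Biopython feature locations use zero-based, half-open coordinates, so in the sample above exon_1 spans positions 0 to 13 and its sequence is 13 bases long. The sketch below checks that a set of features sits back to back and computes their lengths. It works on plain (start, end) tuples copied from the sample; is_contiguous is a hypothetical helper, not a seqann function. With real SeqFeature objects the tuples could be built from int(feature.location.start) and int(feature.location.end).

```python
def is_contiguous(coords):
    """Return True if each half-open (start, end) interval begins where the previous one ends."""
    ordered = sorted(coords.values())
    if any(end <= start for start, end in ordered):
        return False  # empty or reversed interval
    return all(
        prev_end == next_start
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    )


# Coordinates copied from the 'features' entry of the sample annotation.
sample = {"exon_1": (0, 13), "exon_2": (13, 283), "exon_3": (283, 503)}
lengths = {name: end - start for name, (start, end) in sample.items()}

print(is_contiguous(sample))
print(lengths)
```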

Once a sequence has been annotated, the gene features and their corresponding sequences are available in the returned Annotation object. If a full annotation cannot be produced, nothing is returned. The example below shows how the features can be accessed and printed.

ann = seqann.annotate(sequence, "HLA-A")
for feat in ann.annotation:
    print(feat, ann.gfe, str(ann.annotation[feat].seq), sep="\t")
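
Because nothing is returned when a full annotation cannot be produced, code that loops over the features may want to guard against that case first. The sketch below assumes "nothing" means None, which the text does not state explicitly. The function feature_rows and the stand-in object are hypothetical; the stand-in only mimics the gfe and annotation attributes used in the loop above.

```python
from types import SimpleNamespace


def feature_rows(ann):
    """Return one tab-separated line per feature, or an empty list if nothing was returned."""
    if ann is None:  # assumption: a failed annotation is signalled by None
        return []
    return [
        "\t".join([feat, ann.gfe, str(ann.annotation[feat].seq)])
        for feat in ann.annotation
    ]


# Stand-in with the same attribute names as the returned object; values are placeholders.
fake = SimpleNamespace(
    gfe="GFE-PLACEHOLDER",
    annotation={"exon_1": SimpleNamespace(seq="AGAGACTCTCCCG")},
)

for row in feature_rows(fake):
    print(row)
```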

Install

pip install seqann

Dependencies

Clustal Omega 1.2.0 or higher
Python 3.6
blastn

Release history

Version	Date
1.0.0	Nov 16, 2018
0.0.45	Oct 7, 2018
0.0.44	Oct 6, 2018
0.0.43	Oct 6, 2018
0.0.42	Sep 26, 2018
0.0.41	Sep 26, 2018
0.0.40	Sep 26, 2018
0.0.39	Sep 25, 2018
0.0.38	Sep 25, 2018
0.0.37	Sep 25, 2018
0.0.36	Sep 10, 2018
0.0.35	Sep 10, 2018
0.0.34	Sep 10, 2018
0.0.33	Aug 6, 2018
0.0.32	Jul 26, 2018
0.0.31	Jul 16, 2018
0.0.30	Jun 28, 2018
0.0.29	Jun 14, 2018
0.0.28	Jun 14, 2018
0.0.27	Jun 8, 2018
0.0.26	Jun 6, 2018
0.0.25	Jun 4, 2018
0.0.24	May 31, 2018
0.0.23	May 30, 2018
0.0.22	May 29, 2018
0.0.20	May 17, 2018
0.0.19	May 17, 2018
0.0.18	May 16, 2018
0.0.17	May 16, 2018
0.0.16	May 16, 2018
0.0.15	May 9, 2018
0.0.14	Apr 6, 2018
0.0.13	Apr 5, 2018
0.0.12	Apr 5, 2018
0.0.11	Apr 3, 2018
0.0.10	Mar 26, 2018
0.0.9	Mar 26, 2018
0.0.8	Mar 26, 2018
0.0.7	Nov 14, 2017
0.0.6	Nov 14, 2017
0.0.5	Nov 13, 2017
0.0.4	Nov 13, 2017
0.0.3	Nov 13, 2017
0.0.2	Nov 12, 2017

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

seqann-1.0.0-py2.py3-none-any.whl (22.3 MB)

Uploaded Nov 16, 2018; Python 2, Python 3

File details

Details for the file seqann-1.0.0-py2.py3-none-any.whl.

File metadata

Download URL: seqann-1.0.0-py2.py3-none-any.whl
Upload date: Nov 16, 2018
Size: 22.3 MB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.9.1 pkginfo/1.4.1 requests/2.18.4 setuptools/36.6.0 requests-toolbelt/0.8.0 tqdm/4.19.4 CPython/3.6.0

File hashes

Hashes for seqann-1.0.0-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`5431e35b85a54d4756069d11f0214730e8be082a9627a959ca7774da990c4795`
MD5	`a48ecc3688a526f198cbc361e361fc6e`
BLAKE2b-256	`569eaf959abb3af397cd9cb11ea05ff0e5a95c7be237ead6a215913ab1baad04`
