Calculate allele frequency from a sequence multialignment.
allfreqs
Calculate allele frequencies from a sequence multialignment.
Free software: MIT license
Documentation: https://allfreqs.readthedocs.io
GitHub repo: https://github.com/robertopreste/allfreqs
Features
Calculate allele frequencies from a nucleotide multialignment in fasta or csv format.
Allele frequencies will be returned as a table in which each row is a nucleotide position (based on the provided reference sequence) and columns are A, C, G, T frequencies as well as gaps and other non-canonical nucleotides.
For example, given the following multialignment:
ID | Sequence |
---|---|
ref | ACGTACGT |
seq1 | A-GTAGGN |
seq2 | ACCAGCGT |
the resulting allele frequencies will be:
position | A | C | G | T | gap | oth |
---|---|---|---|---|---|---|
1.0_A | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
2.0_C | 0.0 | 0.5 | 0.0 | 0.0 | 0.5 | 0.0 |
3.0_G | 0.0 | 0.5 | 0.5 | 0.0 | 0.0 | 0.0 |
4.0_T | 0.5 | 0.0 | 0.0 | 0.5 | 0.0 | 0.0 |
5.0_A | 0.5 | 0.0 | 0.5 | 0.0 | 0.0 | 0.0 |
6.0_C | 0.0 | 0.5 | 0.5 | 0.0 | 0.0 | 0.0 |
7.0_G | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
8.0_T | 0.0 | 0.0 | 0.0 | 0.5 | 0.0 | 0.5 |
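The table above can be reproduced with a few lines of plain Python. The sketch below is not the allfreqs implementation: `allele_frequencies` is a hypothetical helper written for this example, and it assumes that the reference sequence only labels the rows and is not counted, as the example values suggest.

```python
from collections import Counter

CANONICAL = ("A", "C", "G", "T")


def allele_frequencies(reference, sequences):
    """Return one dict of frequencies per alignment column.

    Hypothetical helper for illustration only, not part of allfreqs.
    The reference labels each row and is excluded from the counts.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    if any(len(seq) != len(reference) for seq in sequences):
        raise ValueError("all sequences must match the reference length")
    total = len(sequences)
    rows = []
    for pos, ref_base in enumerate(reference, start=1):
        # Count the symbols observed in this column across all sequences.
        counts = Counter(seq[pos - 1].upper() for seq in sequences)
        row = {"position": pos, "ref": ref_base}
        for base in CANONICAL:
            row[base] = counts[base] / total
        row["gap"] = counts["-"] / total
        # Anything that is neither canonical nor a gap is pooled into "oth".
        known = sum(counts[s] for s in CANONICAL + ("-",))
        row["oth"] = (total - known) / total
        rows.append(row)
    return rows


table = allele_frequencies("ACGTACGT", ["A-GTAGGN", "ACCAGCGT"])
for row in table:
    print(row)
```

Each row sums to 1.0 across the A, C, G, T, gap and oth columns, which is a quick sanity check on any frequency table of this kind.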
By default, frequencies of non-canonical (ambiguous) nucleotides are pooled into the oth column; an optional flag shows each of them in its own column instead.
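To illustrate the difference between the two modes, the sketch below counts a single alignment column both ways. `column_frequencies` and its `ambiguous` parameter are hypothetical names chosen for this example; they are not taken from the allfreqs API, so check the Usage documentation for the actual flag.

```python
from collections import Counter

CANONICAL = "ACGT"


def column_frequencies(column, ambiguous=False):
    """Frequencies for one alignment column (hypothetical helper)."""
    if not column:
        raise ValueError("column must not be empty")
    counts = Counter(symbol.upper() for symbol in column)
    total = len(column)
    freqs = {base: counts[base] / total for base in CANONICAL}
    freqs["gap"] = counts["-"] / total
    # Symbols that are neither canonical nucleotides nor gaps.
    extra = {s: n for s, n in counts.items() if s not in CANONICAL and s != "-"}
    if ambiguous:
        # One entry per ambiguous symbol actually observed (e.g. N, R).
        for symbol in sorted(extra):
            freqs[symbol] = extra[symbol] / total
    else:
        # Default mode: pool every ambiguous symbol into "oth".
        freqs["oth"] = sum(extra.values()) / total
    return freqs


column = ["N", "T", "R", "T"]
pooled = column_frequencies(column)
separate = column_frequencies(column, ambiguous=True)
print(pooled)
print(separate)
```

In the pooled result the N and R counts appear together under oth, while the separate result reports one entry for N and one for R and has no oth key.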
allfreqs can be used either as a command line tool or through its Python API.
For more information, please refer to the Usage section of the documentation.
Installation
PLEASE NOTE: allfreqs only supports Python >= 3.6!
The preferred installation method for allfreqs is using pip:
$ pip install allfreqs
For more information, please refer to the Installation section of the documentation.
Credits
This package was created with Cookiecutter and the cc-pypackage project template.
History
0.1.0 (2019-07-08)
First release.
0.1.1 (2019-08-08)
Read and process multialignments from fasta and csv files (Python module only).
0.1.2 (2019-10-17)
Add tests with and without reference included in multialignments;
Add tests with real datasets (coming from haplogroup-specific multialignments).
0.1.3 (2019-10-18)
Add more detailed tests for real datasets;
Implement more efficient frequency calculation;
Add dunder methods and sanity checks;
Fix requirements and testing framework;
Clean code.
0.2.0 (2020-03-07)
Remove numpy and pandas from requirements as they are installed by scikit-bio;
Move tests module inside allfreqs;
Add ci module for internal management;
Clean code.
0.3.0 (2020-04-02)
Add option to show ambiguous nucleotides separately.
File details
Details for the file allfreqs-0.3.0.tar.gz.
File metadata
- Download URL: allfreqs-0.3.0.tar.gz
- Upload date:
- Size: 13.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.3
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 1de866f53c3f78ce9629ab23d5d2c7076316e3581b24f5829d5696a1ab0d6105 |
MD5 | 09d8a73fead791a4421ff7a482515ab9 |
BLAKE2b-256 | 2410a7ea8ed77ef4996d452e3de6b03e7bd4da3e51a60c6c95e23d748bff51db |
File details
Details for the file allfreqs-0.3.0-py2.py3-none-any.whl.
File metadata
- Download URL: allfreqs-0.3.0-py2.py3-none-any.whl
- Upload date:
- Size: 8.8 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.3
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | bf196005366b685c302df43010d7c5cee12146dad0bd93cab22cf898e6897889 |
MD5 | e07ba7a721eb4ca757ba61d8bc7ef81b |
BLAKE2b-256 | 2ec3d62b125d47f20ead0b258a4198f7f277fb991fd32d3020d65e1a927ca47b |