Takes SeqRecordExpanded objects and creates datasets for phylogenetic software
Dataset creator for phylogenetic software
Free software: BSD license
Installation
pip install dataset_creator
Usage
The list of SeqRecordExpanded objects should be sorted by gene_code first, then by voucher_code.
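A minimal sketch of that ordering follows. It uses a hypothetical stand-in class, `Record`, so that it runs without the library installed, and it assumes each record exposes `gene_code` and `voucher_code` attributes named after the constructor arguments used in this README.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical stand-in for a SeqRecordExpanded object; only the two
# fields used for sorting are modelled here.
Record = namedtuple("Record", ["gene_code", "voucher_code"])

records = [
    Record("wingless", "CP100-11"),
    Record("RpS5", "CP100-11"),
    Record("wingless", "CP100-10"),
    Record("RpS5", "CP100-10"),
]

# Sort by gene_code first, then by voucher_code within each gene.
sorted_records = sorted(records, key=attrgetter("gene_code", "voucher_code"))

for record in sorted_records:
    print(record.gene_code, record.voucher_code)
```

The same `key` should work on real record objects if they carry these two attributes.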
>>> from seqrecord_expanded import SeqRecord
>>> from dataset_creator import Dataset
>>>
>>> # `table` is the Translation Table code based on NCBI
>>> seq_record1 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='RpS5',
... table=1, voucher_code='CP100-10',
... taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_record2 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='RpS5',
... table=1, voucher_code='CP100-10',
... taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_record3 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='wingless',
... table=1, voucher_code='CP100-10',
... taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_record4 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='wingless',
... table=1, voucher_code='CP100-10',
... taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_records = [
... seq_record1, seq_record2, seq_record3, seq_record4,
... ]
>>> # codon positions can be 1st, 2nd, 3rd, 1st-2nd, ALL (default)
>>> dataset = Dataset(seq_records, format='NEXUS', partitioning='by gene',
... codon_positions='1st',
... )
>>> print(dataset.dataset_str)
"""#NEXUS
blah blah
"""
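To illustrate what the `codon_positions` option selects, here is a rough sketch that splits a sequence into its 1st, 2nd and 3rd codon positions. It is not the library's implementation: `split_codon_positions` is a hypothetical helper, and it assumes `reading_frame` is the 1-based position of the base where the first complete codon starts.

```python
def split_codon_positions(sequence, reading_frame=1):
    """Return the 1st, 2nd and 3rd codon positions of a nucleotide string.

    Assumption for this sketch: reading_frame (1, 2 or 3) is the 1-based
    position of the first base of the first complete codon.
    """
    if reading_frame not in (1, 2, 3):
        raise ValueError("reading_frame must be 1, 2 or 3")

    # Drop the bases that precede the first complete codon.
    in_frame = sequence[reading_frame - 1:]

    # Every third base, starting at offsets 0, 1 and 2.
    return {
        "1st": in_frame[0::3],
        "2nd": in_frame[1::3],
        "3rd": in_frame[2::3],
    }


positions = split_codon_positions("ACTACCTA", reading_frame=2)
print(positions)
```

Under that assumption, a dataset built with `codon_positions='1st'` would keep only the bases in the `"1st"` entry for each sequence.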
Development
To run all the tests, run:
tox
Changelog
0.3.6 (2015-10-30)
Fixed 3rd codon positions bug that returned FASTA datasets with 3rd codon positions even if they were not needed.
0.3.5 (2015-10-29)
If the user provides an outgroup, TNT datasets will place its sequences first in the dataset blocks.
0.3.4 (2015-10-02)
Fixed bug that did not show DATATYPE=PROTEIN in NEXUS files when amino acid sequences were requested by the user.
0.3.3 (2015-10-02)
Fixed bug that raised an exception when SeqRecordExpanded objects did not have data in the taxonomy field.
0.3.2 (2015-10-01)
Fixed bug that raised an exception when the user wanted a dataset partitioned as 1st-2nd and 3rd codon positions of only one codon.
0.3.1 (2015-10-01)
Fixed bug that raised an exception when the user wanted a dataset partitioned by codon positions of only one codon.
0.3.0 (2015-10-01)
Accepts a voucher code as a string, which is used to generate the outgroup string needed for NEXUS and TNT files.
0.2.0 (2015-09-30)
Creates datasets as degenerated sequences using the method by Zwick et al.
0.1.1 (2015-09-30)
Errors about unspecified reading frames are issued only when reading frames are strictly necessary to build the dataset (that is, when the dataset needs to be divided by codon positions).
Added documentation using sphinx-doc
Creates datasets as amino acid sequences.
0.1.0 (2015-09-23)
Creates NEXUS, TNT, FASTA, PHYLIP and MEGA dataset formats.
0.0.1 (2015-06-10)
First release on PyPI.