Takes SeqRecordExpanded objects and creates datasets for phylogenetic software
Dataset creator for phylogenetic software
Free software: BSD license
Installation
pip install dataset_creator
Usage
The list of SeqRecordExpanded objects should be sorted by gene_code first then by voucher_code.
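One way to get that ordering is Python's built-in sorted with a two-field key. The sketch below is only an illustration: Record is a hypothetical stand-in class so the snippet runs without the library, and it assumes the real records expose gene_code and voucher_code attributes matching the constructor keywords used in the example that follows.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical stand-in for the real record class; only the two
# fields used for sorting are modelled here.
Record = namedtuple("Record", ["gene_code", "voucher_code"])

records = [
    Record("wingless", "CP100-11"),
    Record("RpS5", "CP100-11"),
    Record("wingless", "CP100-10"),
    Record("RpS5", "CP100-10"),
]

# Sort by gene_code first, then by voucher_code within each gene.
sorted_records = sorted(records, key=attrgetter("gene_code", "voucher_code"))

for record in sorted_records:
    print(record.gene_code, record.voucher_code)
```

The comparison is a plain case-sensitive string comparison, so gene codes that differ only in capitalisation will not be grouped together.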
>>> from seqrecord_expanded import SeqRecord
>>> from dataset_creator import Dataset
>>>
>>> # `table` is the Translation Table code based on NCBI
>>> seq_record1 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='RpS5',
... table=1, voucher_code='CP100-10',
... taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_record2 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='RpS5',
... table=1, voucher_code='CP100-10',
... taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_record3 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='wingless',
... table=1, voucher_code='CP100-10',
... taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_record4 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='wingless',
... table=1, voucher_code='CP100-10',
... taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_records = [
... seq_record1, seq_record2, seq_record3, seq_record4,
... ]
>>> # codon positions can be 1st, 2nd, 3rd, 1st-2nd, ALL (default)
>>> dataset = Dataset(seq_records, format='NEXUS', partitioning='by gene',
... codon_positions='1st',
... )
>>> print(dataset.dataset_str)
"""#NEXUS
blah blah
"""
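To make the codon_positions option more concrete, here is a minimal sketch of how bases at a given codon position could be picked out of a sequence. It is not the library's implementation. The helper name bases_at_codon_position is hypothetical, and the sketch assumes that reading_frame=n means the first complete codon starts at the n-th base of the sequence; the library's own convention may differ.

```python
def bases_at_codon_position(seq, reading_frame, position):
    """Return the bases of seq that sit at codon position 1, 2 or 3.

    Assumption (not taken from the library): reading_frame n means the
    first complete codon starts at the n-th base of seq.
    """
    if reading_frame not in (1, 2, 3):
        raise ValueError("reading_frame must be 1, 2 or 3")
    if position not in (1, 2, 3):
        raise ValueError("position must be 1, 2 or 3")
    # Index of the first base in seq that falls on the requested position.
    start = (reading_frame - 1 + position - 1) % 3
    return seq[start::3]


seq = "ACTACCTA"
first = bases_at_codon_position(seq, reading_frame=2, position=1)
second = bases_at_codon_position(seq, reading_frame=2, position=2)
third = bases_at_codon_position(seq, reading_frame=2, position=3)
print(first, second, third)
```

Under this assumption, a '1st-2nd' selection would keep the bases from positions 1 and 2 in their original order and drop the rest.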
Development
To run all the tests, run:
tox
Changelog
0.3.3 (2015-10-02)
Fixed a bug that raised an exception when SeqRecordExpanded objects did not have data in the taxonomy field.
0.3.2 (2015-10-01)
Fixed a bug that raised an exception when the user requested a dataset partitioned into 1st-2nd and 3rd codon positions of only one codon.
0.3.1 (2015-10-01)
Fixed a bug that raised an exception when the user requested a dataset partitioned by codon positions of only one codon.
0.3.0 (2015-10-01)
Accepts a voucher code as a string, which is used to generate the outgroup string needed for NEXUS and TNT files.
0.2.0 (2015-09-30)
Creates datasets as degenerated sequences using the method by Zwick et al.
0.1.1 (2015-09-30)
Issues errors for missing reading frames only when they are strictly necessary to build the dataset (that is, when the dataset must be divided by codon positions).
Added documentation using sphinx-doc.
Creates datasets as amino acid sequences.
0.1.0 (2015-09-23)
Creates datasets in NEXUS, TNT, FASTA, PHYLIP and MEGA formats.
0.0.1 (2015-06-10)
First release on PyPI.