Takes SeqRecordExpanded objects and creates datasets for phylogenetic software
Dataset creator for phylogenetic software
Free software: BSD license
Installation
pip install dataset_creator
Usage
The list of SeqRecordExpanded objects must be sorted by gene_code first, then by voucher_code.
>>> from seqrecord_expanded import SeqRecord
>>> from dataset_creator import Dataset
>>>
>>> # `table` is the Translation Table code based on NCBI
>>> seq_record1 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='RpS5',
... table=1, voucher_code='CP100-10',
... taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_record2 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='RpS5',
... table=1, voucher_code='CP100-10',
... taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_record3 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='wingless',
... table=1, voucher_code='CP100-10',
... taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_record4 = SeqRecord('ACTACCTA', reading_frame=2, gene_code='wingless',
... table=1, voucher_code='CP100-10',
... taxonomy={'genus': 'Aus', 'species': 'bus'})
>>>
>>> seq_records = [
... seq_record1, seq_record2, seq_record3, seq_record4,
... ]
>>> # codon positions can be 1st, 2nd, 3rd, 1st-2nd, ALL (default)
>>> dataset = Dataset(seq_records, format='NEXUS', partitioning='by gene',
... codon_positions='1st',
... )
>>> print(dataset.dataset_str)
"""#NEXUS
blah blah
"""
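The sorting requirement stated at the top of this section can be met with a two-key sort before the records are passed to Dataset. The sketch below is only an illustration: `Record` is a hypothetical stand-in for the real record class, and it assumes that records expose `gene_code` and `voucher_code` as attributes, which this README does not confirm.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical stand-in for a sequence record; it carries only the two
# attributes that matter for ordering (assumed attribute names).
Record = namedtuple("Record", ["gene_code", "voucher_code"])

records = [
    Record("wingless", "CP100-11"),
    Record("RpS5", "CP100-11"),
    Record("wingless", "CP100-10"),
    Record("RpS5", "CP100-10"),
]

# Sort by gene_code first, then by voucher_code within each gene.
# String comparison is case-sensitive.
sorted_records = sorted(records, key=attrgetter("gene_code", "voucher_code"))

for record in sorted_records:
    print(record.gene_code, record.voucher_code)
```

Sorting on the tuple of both keys keeps all records of one gene together, with vouchers in a consistent order inside each gene.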
Development
To run all the tests, run:
tox
Changelog
0.3.0 (2015-10-01)
Accepts a voucher code as a string, which is used to generate the outgroup string needed for NEXUS and TNT files.
0.2.0 (2015-09-30)
Creates datasets as degenerated sequences using the method by Zwick et al.
0.1.1 (2015-09-30)
Issues errors for missing reading frames only when they are strictly necessary to build the dataset (that is, when the dataset needs to be divided by codon positions).
Added documentation using sphinx-doc.
Creates datasets as amino acid sequences.
0.1.0 (2015-09-23)
Creates datasets in NEXUS, TNT, FASTA, PHYLIP and MEGA formats.
0.0.1 (2015-06-10)
First release on PyPI.