
A nexus (phylogenetics) file reader (.nex, .trees)


python-nexus

A generic phylogenetic nexus format (.nex, .trees) reader and writer for Python.

Description

python-nexus provides simple nexus file-format reading/writing tools, and a small collection of nexus manipulation scripts.

Please note that this library works with the phylogenetics data format (e.g. https://en.wikipedia.org/wiki/Nexus_file) and not the physics data format (e.g. https://manual.nexusformat.org/).

Note: Due to a name clash with another python package, this package must be installed as pip install python-nexus but imported as import nexus.

Usage

CLI

python-nexus installs a nexus command for CLI use. You can view its help with:

nexus -h

Python API

Reading a Nexus:

>>> from nexus import NexusReader
>>> n = NexusReader.from_file('tests/examples/example.nex')

You can also load from a string:

>>> n = NexusReader.from_string('#NEXUS\n\nbegin foo; ... end;')

NexusReader will load each of the nexus blocks it identifies using specific handlers.

>>> n.blocks
{'foo': <nexus.handlers.GenericHandler object at 0x7f55d94140f0>}
>>> n = NexusReader('tests/examples/example.nex')
>>> n.blocks
{'data': <NexusDataBlock: 2 characters from 4 taxa>}

A dictionary mapping blocks to handlers is available as nexus.reader.HANDLERS:

>>> from nexus.reader import HANDLERS
>>> HANDLERS
{
    'trees': <class 'nexus.handlers.tree.TreeHandler'>, 
    'taxa': <class 'nexus.handlers.taxa.TaxaHandler'>, 
    'characters': <class 'nexus.handlers.data.CharacterHandler'>, 
    'data': <class 'nexus.handlers.data.DataHandler'>
}

Any blocks that aren't in this dictionary will be parsed using nexus.handlers.GenericHandler.
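The lookup-with-fallback behaviour described above can be sketched in plain Python. This is only an illustration of the pattern, not the library's code: the GenericBlock and DataBlock classes, the REGISTRY dictionary and the pick_handler function are hypothetical names, and lower-casing the block name is an assumption.

```python
class GenericBlock:
    """Stand-in for a catch-all handler: keeps the raw lines of a block."""

    def __init__(self, name=None, data=None):
        self.name = name
        self.block = list(data or [])


class DataBlock(GenericBlock):
    """Stand-in for a specialised handler (hypothetical, for illustration)."""


# Hypothetical registry, analogous in spirit to HANDLERS above.
REGISTRY = {"data": DataBlock}


def pick_handler(block_name):
    """Return the registered class for a block, else the generic fallback."""
    # Assumption: block names are matched case-insensitively.
    return REGISTRY.get(block_name.lower(), GenericBlock)


handlers = {name: pick_handler(name).__name__ for name in ("data", "foo")}
print(handlers)
```

A plain dict.get with a default keeps the fallback rule in one place, so registering a new block type only requires adding an entry to the registry.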

NexusReader can then write the nexus to a string using NexusReader.write or to another file using NexusReader.write_to_file:

>>> output = n.write()
>>> n.write_to_file("mynewnexus.nex")

Note: if you want more fine-grained control over generating nexus files, then try NexusWriter discussed below.

Block Handlers:

There are specific "Handlers" to parse certain known nexus blocks, including the common 'data', 'trees', and 'taxa' blocks. Any blocks that are unknown will be parsed with GenericHandler.

All handlers extend the GenericHandler class and have the following methods:

  • __init__(self, name=None, data=None): called by NexusReader to parse the contents of the block (passed in data) appropriately.

  • write(self): called by NexusReader to write the contents of a block to a string (i.e. to regenerate the nexus format when saving a file to disk).

generic block handler

The generic block handler simply stores each line of the block in .block:

>>> n.blockname.block
['line1', 'line2', ... ]
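As a rough sketch of the idea (not the library's GenericHandler itself), a handler that stores lines verbatim and writes them back could look like this. The RawBlock class is a hypothetical stand-in, and whether the stored lines include the begin/end statements is an assumption made only for this example.

```python
class RawBlock:
    """Hypothetical stand-in for a generic handler; not the library's class."""

    def __init__(self, name=None, data=None):
        self.name = name
        # Keep every line of the block verbatim, in order.
        self.block = list(data or [])

    def write(self):
        # Regenerate the block text by joining the stored lines.
        return "\n".join(self.block)


# Assumption for this sketch: the begin/end lines are part of the stored data.
lines = ["begin foo;", "    something = 1;", "end;"]
raw = RawBlock(name="foo", data=lines)
print(raw.write())
```

Because nothing is interpreted, an unknown block survives a read/write round trip unchanged.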

data block handler

Data blocks are the main blocks encountered in nexus files, and they contain the data matrix.

So, given the following nexus file with a data block:

#NEXUS 

Begin data;
Dimensions ntax=4 nchar=2;
Format datatype=standard symbols="01" gap=-;
    Matrix
Harry              00
Simon              01
Betty              10
Louise             11
    ;
End;

begin trees;
    tree A = ((Harry:0.1,Simon:0.2),Betty:0.2)Louise:0.1;
    tree B = ((Simon:0.1,Harry:0.2),Betty:0.2)Louise:0.1;
end;

You can do the following:

Find out how many characters:

>>> n.data.nchar
2

Ask about how many taxa:

>>> n.data.ntaxa
4

Get the taxa names:

>>> n.data.taxa
['Harry', 'Simon', 'Betty', 'Louise']

Get the format info:

>>> n.data.format
{'datatype': 'standard', 'symbols': '01', 'gap': '-'}

The actual data matrix is a dictionary, which you can get to in .matrix:

>>> n.data.matrix
defaultdict(<class 'list'>, {'Harry': ['0', '0'], 'Simon': ['0', '1'], 'Betty': ['1', '0'], 'Louise': ['1', '1']})

Or, you could access the data matrix via taxon:

>>> n.data.matrix['Simon']
['0', '1']

Or even loop over it like this:

>>> for taxon, characters in n.data:
...     print(taxon, characters)
...     
Harry ['0', '0']
Simon ['0', '1']
Betty ['1', '0']
Louise ['1', '1']

You can also iterate over the sites (rather than the taxa):

>>> for site, data in n.data.characters.items():
...     print(site, data)
...     
0 {'Harry': '0', 'Simon': '0', 'Betty': '1', 'Louise': '1'}
1 {'Harry': '0', 'Simon': '1', 'Betty': '0', 'Louise': '1'}

... or you can access the characters matrix directly:

>>> n.data.characters[0]
{'Harry': '0', 'Simon': '0', 'Betty': '1', 'Louise': '1'}

Note that sites are zero-indexed!
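The relationship between the taxon-wise matrix and the site-wise characters view is a transposition. A minimal standard-library sketch is shown here; the by_site function is a hypothetical helper written for this illustration and says nothing about how the library builds .characters internally.

```python
matrix = {
    "Harry": ["0", "0"],
    "Simon": ["0", "1"],
    "Betty": ["1", "0"],
    "Louise": ["1", "1"],
}


def by_site(matrix):
    """Transpose {taxon: [states]} into {site_index: {taxon: state}}."""
    sites = {}
    for taxon, states in matrix.items():
        for index, state in enumerate(states):  # enumerate() starts at 0
            sites.setdefault(index, {})[taxon] = state
    return sites


characters = by_site(matrix)
print(characters[0])
```

Since enumerate starts counting at 0, the first column of the matrix ends up under key 0, which matches the zero-indexed sites noted above.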

trees block handler

If there's a trees block, then you can do the following.

You can get the number of trees:

>>> n.trees.ntrees
2

You can access the trees via the .trees list:

>>> n.trees.trees[0]
'tree A = ((Harry:0.1,Simon:0.2),Betty:0.2)Louise:0.1;'

Or loop over them:

>>> for tree in n.trees:
...     print(tree)
... 
tree A = ((Harry:0.1,Simon:0.2),Betty:0.2)Louise:0.1;
tree B = ((Simon:0.1,Harry:0.2),Betty:0.2)Louise:0.1;
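If you only need the tree name and its Newick string, a tree statement can be split with the standard library. This is a sketch under simplifying assumptions: the TREE_STATEMENT pattern and the split_tree function are hypothetical helpers, and the pattern does not treat rooting annotations such as [&R] or quoted tree names specially.

```python
import re

# Matches "tree <name> = <newick>;" (case-insensitive keyword).
TREE_STATEMENT = re.compile(
    r"^\s*tree\s+(?P<name>\S+)\s*=\s*(?P<newick>.+?;)\s*$",
    re.IGNORECASE,
)


def split_tree(statement):
    """Return (name, newick) for a single nexus tree statement."""
    match = TREE_STATEMENT.match(statement)
    if match is None:
        raise ValueError("not a tree statement: %r" % statement)
    return match.group("name"), match.group("newick")


name, newick = split_tree("tree A = ((Harry:0.1,Simon:0.2),Betty:0.2)Louise:0.1;")
print(name)
print(newick)
```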

For further inspection of trees via the newick package, you can retrieve a newick.Node object for a tree:

>>> print(n.trees.trees[0].newick_tree.ascii_art())
                  ┌─Harry
         ┌────────┤
──Louise─┤        └─Simon
         └─Betty

taxa block handler

Programs like SplitsTree understand "TAXA" blocks in Nexus files:

BEGIN Taxa;
DIMENSIONS ntax=4;
TAXLABELS
[1] 'John'
[2] 'Paul'
[3] 'George'
[4] 'Ringo'
;
END; [Taxa]

In a taxa block you can get the number of taxa and the taxa list:

>>> n.taxa.ntaxa
4
>>> n.taxa.taxa
['John', 'Paul', 'George', 'Ringo']
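To illustrate what reading such a block involves, here is a standard-library sketch that extracts the labels from the TAXLABELS statement. It is not the library's parser: the read_taxlabels function is a hypothetical helper, and it assumes comments in square brackets are never nested and labels are either single-quoted or free of whitespace.

```python
import re

block = """\
BEGIN Taxa;
DIMENSIONS ntax=4;
TAXLABELS
[1] 'John'
[2] 'Paul'
[3] 'George'
[4] 'Ringo'
;
END; [Taxa]
"""


def read_taxlabels(text):
    """Return the taxon labels listed in a TAXLABELS statement."""
    # Nexus comments are written in square brackets; drop them first.
    text = re.sub(r"\[[^\]]*\]", "", text)
    match = re.search(r"taxlabels\s+(.*?);", text, re.IGNORECASE | re.DOTALL)
    if match is None:
        return []
    # Labels are either 'single quoted' or bare whitespace-free tokens.
    labels = re.findall(r"'([^']*)'|(\S+)", match.group(1))
    return [quoted or bare for quoted, bare in labels]


taxa = read_taxlabels(block)
print(taxa)
```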

NOTE: with this alternate nexus format the Characters blocks should be parsed by DataHandler.

Writing a Nexus File using NexusWriter

NexusWriter provides more fine-grained control over writing nexus files, and is useful if you're programmatically generating a nexus file rather than loading a pre-existing one.

>>> from nexus import NexusWriter
>>> n = NexusWriter()
>>> #Add a comment to appear in the header of the file
>>> n.add_comment("I am a comment")

Data are added with the add method, which takes three arguments: a taxon, a character name, and a value.

>>> n.add('taxon1', 'Character1', 'A')
>>> n.data
{'Character1': {'taxon1': 'A'}}
>>> n.add('taxon2', 'Character1', 'C')
>>> n.add('taxon3', 'Character1', 'A')

Characters and values can be strings or integers (but you cannot mix string and integer characters).

>>> n.add('taxon1', 2, 1)
>>> n.add('taxon2', 2, 2)
>>> n.add('taxon3', 2, 3)

NexusWriter will fill in missing entries (i.e. taxon2 in this case):

>>> n.add('taxon1', "Char3", '4')
>>> n.add('taxon3', "Char3", '4')
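Conceptually, filling in missing entries means every taxon gets a value for every character. The sketch below shows one way to do that on a dictionary shaped like n.data above. The fill_missing function is a hypothetical helper, and the use of "?" as the placeholder is an assumption based on the usual nexus convention for missing data, not a statement about the library's output.

```python
# Same shape as n.data above: {character: {taxon: value}}.
data = {
    "Character1": {"taxon1": "A", "taxon2": "C", "taxon3": "A"},
    "Char3": {"taxon1": "4", "taxon3": "4"},
}

MISSING = "?"  # assumption: conventional nexus symbol for missing data


def fill_missing(data, missing=MISSING):
    """Build {taxon: [value per character]}, padding gaps with `missing`."""
    taxa = sorted({taxon for states in data.values() for taxon in states})
    return {
        taxon: [data[char].get(taxon, missing) for char in data]
        for taxon in taxa
    }


rows = fill_missing(data)
print(rows["taxon2"])
```

Here taxon2 has no value for Char3, so its row is padded with the placeholder and all rows end up the same length.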

When you're ready, you can generate the nexus using make_nexus or write_to_file:

>>> data = n.make_nexus(interleave=True, charblock=True, preserve_order=False)
>>> n.write_to_file("output.nex", interleave=True, charblock=True, preserve_order=False)

Set interleave to True to make an interleaved nexus, and set charblock to True to include a character block (for example, if you have character labels). Set preserve_order to True to keep taxa and characters in the order they were added; otherwise they will be sorted alphanumerically.

There is rudimentary support for handling trees e.g.:

>>> n.trees.append("tree tree1 = (a,b,c);")
>>> n.trees.append("tree tree2 = (a,b,c);")
