Skip to main content

No project description provided

Project description

Msprime is a reimplementation of Hudson’s classical ms program for modern datasets. It is under heavy development at the moment, and the present code should be considered beta quality. Bugs may be present, and the API might change considerably. We provide an ms-compatible command line interface mspms, which should be useful to fit into existing workflows. However, the Python API is the recommended way to use msprime to gain full advantage of its internal structures.

See the Examples section below for some quick examples on how to use the API. Full documentation will be forthcoming.

Installation

The simplest method of installation is to use PyPI and pip:

$ pip install msprime --user

will install msprime your user Python installation. This should work in most cases (but see the Requirements section).

To use the ms-compatible command line program, you must ensure that the ~/.local/bin/ directory is in your PATH.

Requirements

Msprime requires Python 2.7+ (Python 3 versions are fully supported from 3.1 onwards), and the GNU Scientific Library, which is available for all major platforms. For example, to install on Debian/Ubuntu use:

# apt-get install libgsl0-dev

Examples

To simulate a tree under the standard coalescent for a sample of 4 individuals, use:

>>> import msprime
>>> pi, tau = msprime.simulate_tree(4)

Here, pi is an oriented tree, and tau is the node time list. See the documentation for ercs for an explanation of oriented forests. The oriented forest is a simple list of integers describing the genealogy of the sample in a concise and elegant manner:

>>> pi
[-1, 6, 5, 6, 5, 7, 7, 0]

To simulate the history of the sample over multiple loci with recombination, use the simulate_trees(n, m, r) function. Here, n is the sample size, m is the number of loci and r is the scaled recombination rate 4 * Ne * rho and rho is the per-generation recombination rate (note that we do not scale the recombination rate by the number of loci as ms does). For example:

>>> for l, pi, tau in msprime.simulate_trees(4, 20, 0.1):
...     print(l, ":", pi, ":", tau)
...
5 : [-1, 5, 6, 5, 7, 6, 7, 0] : [-1, 0, 0, 0, 0, 0.008097067475318909, 0.29228582978248596, 1.4635169506072998]
2 : [-1, 5, 6, 5, 7, 6, 7, 0] : [-1, 0, 0, 0, 0, 0.008097067475318909, 0.29228582978248596, 1.1387423276901245]
1 : [-1, 5, 6, 5, 7, 6, 7, 0] : [-1, 0, 0, 0, 0, 0.008097067475318909, 0.29228582978248596, 1.4635169506072998]
4 : [-1, 5, 6, 5, 7, 6, 7, 0] : [-1, 0, 0, 0, 0, 0.008097067475318909, 0.29228582978248596, 1.4635169506072998]
3 : [-1, 5, 6, 5, 7, 6, 7, 0] : [-1, 0, 0, 0, 0, 0.008097067475318909, 0.29228582978248596, 1.2797138690948486]
3 : [-1, 5, 7, 5, 6, 6, 7, 0] : [-1, 0, 0, 0, 0, 0.008097067475318909, 0.23547497391700745, 0.29228582978248596]
2 : [-1, 5, 6, 5, 6, 7, 7, 0] : [-1, 0, 0, 0, 0, 0.008097067475318909, 0.10243768990039825, 0.23547497391700745]

The simulate_trees function returns an iterator over the trees generated by the coalescent with recombination (pi and tau) along with the length of each tree, l. For instance, the first tree in the simulation above is 5 loci long, and the length of the second is 2 loci.

Demography

Msprime supports demographic events in the history of the sample. These are described using PopulationModel objects. A population model describes the dynamics of a population starting from a given time point (as we look into the past), and may be either a fixed size or exponentially growing (or shrinking). The models (and parameters) correspond exactly to the -eN and -eG options of ms.

For example, to simulate a history in which the population size was twice its current size between 0.5 and 1.0 time units ago and half its size after that, we might use:

>>> models = [
...    msprime.ConstantPopulationModel(0.5, 2.0),
...    msprime.ConstantPopulationModel(1.0, 0.5)
...]
>>> pi, tau = msprime.simulate_tree(10, models)

Similarly, if we wanted to simulate a population currently growing exponentially, with growth rate alpha = 6, we might use:

>>> models = [msprime.ExponentialPopulationModel(0.0, 6.0)]
>>> pi, tau = msprime.simulate_tree(10, models)

These population models can be mixed freely, and should behave precisely the same way as in ms.

Project details


Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page