
Demographic inference from Ancestral Recombination Graphs.

mrpast

Infer demographic parameters from Ancestral Recombination Graphs (ARGs). See the preprint for details on the method and results.

Install

Python 3.8 or newer is supported.

Install from PyPI:

pip install mrpast

On Linux, this uses prebuilt binaries. On macOS, it triggers a build from source, which requires CMake and either gcc or clang with C++17 support.
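Before starting a source build, it can help to confirm that the required tools are on your PATH. The following is a minimal sketch, not part of mrpast: the function names are hypothetical, and it assumes the C++ compiler drivers are installed as g++ or clang++. It only checks that the tools exist, not that the compiler actually supports C++17.

```shell
#!/bin/sh
# Print the name of every given tool that is NOT found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# A source build needs CMake plus at least one C++ compiler.
# (Assumption: the compiler driver is called g++ or clang++.)
check_build_prereqs() {
    if [ -n "$(missing_tools cmake)" ]; then
        echo "cmake not found"
        return 1
    fi
    # Both compilers missing means two names were printed.
    if [ "$(missing_tools g++ clang++ | wc -l | tr -d ' ')" = "2" ]; then
        echo "no C++ compiler (g++ or clang++) found"
        return 1
    fi
    echo "build prerequisites look present"
}

# Example: check_build_prereqs && pip install mrpast
```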

You can also install the conda package from the bioconda channel (the channel must be enabled in your conda configuration): conda install mrpast.

Build/Install from repository

We recommend using a virtual environment; the commands below create and activate one:

python3 -m venv MyEnv
source MyEnv/bin/activate

Clone the repository, then build and install:

git clone --recursive https://github.com/aprilweilab/mrpast.git
pip install mrpast/

Alternative installation options

  1. Compile for the native CPU; this can speed up the numerical solver, but makes the resulting package less portable.
MRPAST_ENABLE_NATIVE=1 pip install mrpast/
  2. Build the solver in debug mode, so GDB can be attached.
MRPAST_DEBUG=1 pip install mrpast/

IMPORTANT: NEW MODEL FORMAT

If you used MrPast prior to August 14, 2025, you may have models in the "old" format. The new format attempts to be more user friendly. See the examples for the new format.

To convert an "old style" model, foo.yaml, to the new style, run:

mrpast init --from-old-mrpast foo.yaml > foo.new.yaml
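If you have several old-format models, the same command can be applied in a loop. This is a sketch under assumptions: the convert_models function and the DRY_RUN switch are hypothetical helpers introduced here, and the NAME.new.yaml output naming simply follows the example above. By default it only prints what it would run; set DRY_RUN=0 to perform the conversion.

```shell
#!/bin/sh
# DRY_RUN=1 (default) prints commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Convert every old-format *.yaml in DIR, writing NAME.new.yaml next to it.
# Usage: convert_models DIR
convert_models() {
    dir="$1"
    for old in "$dir"/*.yaml; do
        [ -e "$old" ] || continue                    # no match: skip the literal pattern
        case "$old" in *.new.yaml) continue ;; esac  # skip files already converted
        new="${old%.yaml}.new.yaml"
        echo "+ mrpast init --from-old-mrpast $old > $new"
        if [ "$DRY_RUN" = "0" ]; then
            mrpast init --from-old-mrpast "$old" > "$new" || return 1
        fi
    done
}

# Example: convert_models ./models
```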

The documentation has not yet been updated for the new model format, but the examples should be sufficient to explain the changes.

Usage

See the documentation for examples and details.

There are three primary subcommands to mrpast, and they are usually run in this order:

  1. mrpast simulate
  2. mrpast process
  3. mrpast solve

These steps describe the "Simulated ARG" workflow, where no ARG inference is performed. See the documentation for workflows making use of inferred ARGs.
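The three steps can be chained in a small script. The sketch below is not an official driver: the commands and flags are the ones used in the Simulation, Processing, and Solving sections, while the run helper, the DRY_RUN switch, and the mrpast_pipeline function are hypothetical names introduced here. It also assumes the solver input files are named after the model file, as in the Solving example. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# DRY_RUN=1 (default) prints commands; DRY_RUN=0 also executes them.
DRY_RUN="${DRY_RUN:-1}"

# Echo a command, then run it unless we are in dry-run mode.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Usage: mrpast_pipeline MODEL_YAML PREFIX REPLICATES JOBS SUFFIX
mrpast_pipeline() {
    model="$1"; prefix="$2"; reps="$3"; jobs="$4"; suffix="$5"
    base="$(basename "$model" .yaml)"

    # 1. Simulate ARGs (tree-sequence files named PREFIX*.trees).
    run mrpast simulate --replicates "$reps" --seq-len 100000 "$model" "$prefix" || return 1

    # 2. Extract coalescence information and write solver inputs.
    run mrpast process --jobs "$jobs" --replicates "$reps" --suffix "$suffix" \
        --bootstrap coalcounts "$model" "$prefix" || return 1

    # 3. Solve every generated input (assumed naming: BASE.*.solve_in.*.json).
    run mrpast solve --jobs "$jobs" "$base".*.solve_in.*.json || return 1
}

# Example: mrpast_pipeline examples/5deme1epoch.yaml 5de1 10 10 trial1
```

Printing each command before it runs makes it easy to review the exact invocation, or to copy a single step and rerun it by hand.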

Simulation

To test a demographic model, first simulate it and verify that mrpast can recover the model parameters with the accuracy you need. The simulation is run with msprime and produces an ancestral recombination graph (ARG) in the form of a tree-sequence file (.trees).

Example:

# Simulate the model 10 times, using a DNA sequence length of 100 kbp and the default recombination rate
mrpast simulate --replicates 10 --seq-len 100000 --debug-demo examples/5deme1epoch.yaml 5de1

This creates 10 tree-sequence files (ARGs) that are named like 5de1*.trees, using the given model.

Processing

Given an ARG in tree-sequence format, either from simulation (see above) or from ARG inference, the next step is to extract coalescence information from it.

Example:

# Use 10 CPU threads to process the data and produce 10 replicates (expanded models) to be solved later.
# `--bootstrap` creates 100 bootstrap samples by default; their average is used as input to the maximum
# likelihood function.
mrpast process --jobs 10 --replicates 10 --suffix trial1 --bootstrap coalcounts examples/5deme1epoch.yaml 5de1

See mrpast process --help for more options that control time discretization, distance between sampled trees, etc.

Optionally, pass --solve to run the solver as soon as processing completes. Otherwise, see the next section.

Solving

If you didn't pass --solve to mrpast process then you can run the solver via:

mrpast solve --jobs 10 5deme1epoch.*.solve_in.*.json

The resulting output files are listed, along with the best output (the one with the best likelihood). The output JSON files contain the parameter values, their bounds, their initial values, and (if present) their ground-truth values.
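To look at an output file, a generic JSON pretty-printer is enough. The sketch below uses Python's standard json.tool module; the show_json function is a hypothetical helper, and no particular key names are assumed. Pass it whichever output paths mrpast solve lists.

```shell
#!/bin/sh
# Pretty-print each JSON file given as an argument.
# Stops at the first file that is missing or not valid JSON.
show_json() {
    for f in "$@"; do
        echo "== $f"
        python3 -m json.tool "$f" || return 1
    done
}

# Example: show_json path/to/solver_output.json
```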

Other workflows

Simulated Data, Inferred ARG

The simulated data, inferred ARG workflow is:

  1. mrpast simulate: Simulate your model with some ground-truth parameter values.
  2. mrpast sim2vcf -p: Convert all .trees files with the given prefix to VCF files, and emit the corresponding .popmap.json files (which map each sample to a population).
  3. mrpast arginfer: Infer ARGs from the VCF files, and then attach the population IDs to the ARGs (.trees files) using the .popmap.json files.
  4. mrpast process: Process and solve the inferred ARGs

Real Data, Inferred ARG

The real data, inferred ARG workflow is:

  1. Manually create a .popmap.json file for your VCF dataset. See the documentation for more details.
  2. mrpast arginfer: Infer ARGs from the VCF files, and then attach the population IDs to the ARGs (.trees files) using the .popmap.json file.
  3. mrpast process: Process and solve the inferred ARGs

Modeling

The demographic model is specified via YAML. See the examples directory for example models. See the documentation for details on model syntax and behavior.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • mrpast-0.3.tar.gz (5.0 MB): Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

  • mrpast-0.3-cp313-cp313-manylinux_2_24_x86_64.whl (865.8 kB): CPython 3.13, manylinux: glibc 2.24+ x86-64
  • mrpast-0.3-cp312-cp312-manylinux_2_24_x86_64.whl (865.8 kB): CPython 3.12, manylinux: glibc 2.24+ x86-64
  • mrpast-0.3-cp311-cp311-manylinux_2_24_x86_64.whl (865.8 kB): CPython 3.11, manylinux: glibc 2.24+ x86-64
  • mrpast-0.3-cp310-cp310-manylinux_2_24_x86_64.whl (865.8 kB): CPython 3.10, manylinux: glibc 2.24+ x86-64
  • mrpast-0.3-cp39-cp39-manylinux_2_24_x86_64.whl (865.8 kB): CPython 3.9, manylinux: glibc 2.24+ x86-64
  • mrpast-0.3-cp38-cp38-manylinux_2_24_x86_64.whl (865.6 kB): CPython 3.8, manylinux: glibc 2.24+ x86-64

File details

Details for the file mrpast-0.3.tar.gz.

File metadata

  • Download URL: mrpast-0.3.tar.gz
  • Upload date:
  • Size: 5.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for mrpast-0.3.tar.gz
Algorithm Hash digest
SHA256 823a0394b07b5d9af6ca547700159c329e8abc87341e3420f7d05867f4ca4027
MD5 ead6c97f73a584f623559738d079e101
BLAKE2b-256 e91445d5b0fe4fbe392aa2e699807d0c9c09db6183fa5e9c53786c333cc4580c

See more details on using hashes here.
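A downloaded file can be checked against the SHA256 digest listed above. This is a minimal sketch: verify_sha256 is a hypothetical helper, and it assumes the GNU coreutils sha256sum tool is available (on macOS, shasum -a 256 prints the same "digest filename" layout and could be substituted).

```shell
#!/bin/sh
# Succeed only if FILE's SHA256 digest equals EXPECTED (lowercase hex).
# Usage: verify_sha256 FILE EXPECTED
verify_sha256() {
    file="$1"; expected="$2"
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
    if [ -n "$actual" ] && [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}

# Example, using the digest listed for the source distribution:
# verify_sha256 mrpast-0.3.tar.gz 823a0394b07b5d9af6ca547700159c329e8abc87341e3420f7d05867f4ca4027
```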

File details

Details for the file mrpast-0.3-cp313-cp313-manylinux_2_24_x86_64.whl.

File hashes

Hashes for mrpast-0.3-cp313-cp313-manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 4ee29fb837a114941077ad773ba8519451031d55d3e59d5719c96e33b3d86e87
MD5 97bbe7b0aaec2773a2e2ade63f83d261
BLAKE2b-256 3451e5e7323f5a4f66affae9009fe641eb9215e57d491ca00b28c21cc0e3a304

File details

Details for the file mrpast-0.3-cp312-cp312-manylinux_2_24_x86_64.whl.

File hashes

Hashes for mrpast-0.3-cp312-cp312-manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 5a82995ed73317b210c50399d5906f92279c82ab7112910df4d0c6ae498b0eee
MD5 294471e9d292c920d65cbf0d684e9860
BLAKE2b-256 49437eda46cbba5cf6184e64a6f4d1cfaa8701752b0f83242ede1156e72f9ed3

File details

Details for the file mrpast-0.3-cp311-cp311-manylinux_2_24_x86_64.whl.

File hashes

Hashes for mrpast-0.3-cp311-cp311-manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 f14e7bd4c83a39a86e16d5dc81156189d9e57a2fab1acd66959bd7010f4e7e21
MD5 c35649455cc2bf60f57d4e455bde2474
BLAKE2b-256 833715d38ce5ec5e1fc382d0c9e1ed696d85d81107bd23ab47a375dfcf5b219b

File details

Details for the file mrpast-0.3-cp310-cp310-manylinux_2_24_x86_64.whl.

File hashes

Hashes for mrpast-0.3-cp310-cp310-manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 3e38b76b6612324495849cc7c51c5cf4083700af41be00462bf10a87580e590f
MD5 e9f74165c51f386ae7a246236ce2acd3
BLAKE2b-256 2234197aef6ee191bdb1cac6cb20e07cc413e4ac1f23b0b2cc6d2eb9f7a4fd6e

File details

Details for the file mrpast-0.3-cp39-cp39-manylinux_2_24_x86_64.whl.

File metadata

  • Download URL: mrpast-0.3-cp39-cp39-manylinux_2_24_x86_64.whl
  • Upload date:
  • Size: 865.8 kB
  • Tags: CPython 3.9, manylinux: glibc 2.24+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for mrpast-0.3-cp39-cp39-manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 b45e9e9e422f053ab7e68cccf053a230cd1091c44cff903c0b4a871a5b343ccf
MD5 a809529c039dddc4c7588e5cc6a57825
BLAKE2b-256 932ca4a231d7d384fd824158a054032a626940e0c8b8cd31bdeccd284be8577b

File details

Details for the file mrpast-0.3-cp38-cp38-manylinux_2_24_x86_64.whl.

File metadata

  • Download URL: mrpast-0.3-cp38-cp38-manylinux_2_24_x86_64.whl
  • Upload date:
  • Size: 865.6 kB
  • Tags: CPython 3.8, manylinux: glibc 2.24+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for mrpast-0.3-cp38-cp38-manylinux_2_24_x86_64.whl
Algorithm Hash digest
SHA256 7dd8fb4cc1ffebe1c40e63782ea9db3e1b4ba05654a69dfdf448a85b2d89ab4c
MD5 8daa62948ae21b8a54cc136636464986
BLAKE2b-256 4c3148a4cf7814eaf45010f8b37d082ca7f9d9070c5b2820afe1701e62c1e6d8
