Demographic inference from Ancestral Recombination Graphs.

mrpast

Infer demographic parameters from Ancestral Recombination Graphs (ARGs). See the preprint for details on the method and results.

Install

Python 3.8 or newer is supported.

Install from PyPI:

pip install mrpast

On Linux, this uses prebuilt binaries. On macOS, it triggers a source build, which requires CMake and either gcc or clang with C++17 support.
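
Before starting a source build, it can help to confirm that the required tools are on the PATH. The following is a minimal sketch and not part of mrpast: check_build_tools is a hypothetical helper name, and it only checks that cmake and a compiler can be found, not that the compiler supports C++17.

```shell
# Hypothetical helper (not part of mrpast): report missing build tools.
# Returns 0 when cmake and at least one of gcc/clang are found, 1 otherwise.
check_build_tools() {
    missing=0
    if ! command -v cmake >/dev/null 2>&1; then
        echo "cmake not found" >&2
        missing=1
    fi
    if ! command -v gcc >/dev/null 2>&1 && ! command -v clang >/dev/null 2>&1; then
        echo "neither gcc nor clang found" >&2
        missing=1
    fi
    return "$missing"
}

# Usage: only attempt the install when the tools are present.
#   check_build_tools && pip install mrpast
```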

Build/Install from repository

Using a virtual environment is recommended. The commands below create and activate one:

python3 -m venv MyEnv
source MyEnv/bin/activate

Clone the repository, then build and install:

git clone --recursive https://github.com/aprilweilab/mrpast.git
pip install mrpast/

Alternative installation options

  1. Compile for the native CPU; this can speed up the numerical solver, but makes the resulting package less portable.
MRPAST_ENABLE_NATIVE=1 pip install mrpast/
  2. Build the solver in debug mode, so GDB can be attached.
MRPAST_DEBUG=1 pip install mrpast/

IMPORTANT: NEW MODEL FORMAT

If you used mrpast prior to August 14, 2025, you may have models in the "old" format. The new format is intended to be more user-friendly. See the examples for the new format.

To convert an "old style" model foo.yaml to the new style, run:

mrpast init --from-old-mrpast foo.yaml > foo.new.yaml
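
If you have several old-style models, the same command can be applied in a loop. This is a minimal sketch: new_model_name and convert_models are hypothetical helper names, and the output naming simply follows the foo.yaml to foo.new.yaml convention shown above.

```shell
# Hypothetical helper: derive the new-style file name from an old-style one,
# e.g. foo.yaml -> foo.new.yaml.
new_model_name() {
    printf '%s\n' "${1%.yaml}.new.yaml"
}

# Hypothetical helper: convert every old-style model given as an argument,
# using the same mrpast invocation as the single-file example.
convert_models() {
    for old in "$@"; do
        mrpast init --from-old-mrpast "$old" > "$(new_model_name "$old")"
    done
}

# Usage (old_models/ is an assumed directory name):
#   convert_models old_models/*.yaml
```

Note that a glob such as *.yaml also matches previously converted *.new.yaml files, so keeping old-style models in their own directory avoids converting them twice.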

The documentation has not yet been updated for the new model format, but the examples should be sufficient to explain the changes.

Usage

See the documentation for examples and details.

There are three primary subcommands to mrpast, and they are usually run in this order:

  1. mrpast simulate
  2. mrpast process
  3. mrpast solve

These steps describe the "Simulated ARG" workflow, where no ARG inference is performed. See the documentation for workflows making use of inferred ARGs.

Simulation

Before relying on a demographic model, start by simulating it and verifying that mrpast can recover the model parameters with the accuracy you need. The simulation is run via msprime and produces an ancestral recombination graph (ARG) in the form of a tree-sequence file (.trees).

Example:

# Simulate the model 10 times, using a DNA sequence length of 100 kbp and the default recombination rate
mrpast simulate --replicates 10 --seq-len 100000 --debug-demo examples/5deme1epoch.yaml 5de1

This creates 10 tree-sequence files (ARGs) that are named like 5de1*.trees, using the given model.
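
A quick check that the simulation produced the expected number of files can be sketched as follows. count_trees is a hypothetical helper name, not an mrpast command; it relies only on the 5de1*.trees naming described above.

```shell
# Hypothetical helper: count the .trees files whose names start with a prefix.
count_trees() {
    n=0
    for f in "$1"*.trees; do
        # The pattern is left unexpanded when nothing matches, so test each path.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Usage: the count should equal the number of replicates requested
# (10 in the example above), assuming no older files share the prefix.
#   count_trees 5de1
```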

Processing

Given an ARG in tree-sequence format, either from simulation (see above) or from ARG inference, we then extract coalescence information.

Example:

# Use 10 CPU threads to process the data and produce 10 replicates (expanded models) to be solved later.
# `--bootstrap` creates 100 bootstrap samples by default, the average of which is used as input to the
# maximum likelihood function.
mrpast process --jobs 10 --replicates 10 --suffix trial1 --bootstrap coalcounts examples/5deme1epoch.yaml 5de1

See mrpast process --help for more options that control time discretization, distance between sampled trees, etc.

Optionally, pass --solve to mrpast process to run the solver as soon as processing completes. Otherwise, see the next section.
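
As a sketch, the processing example can be wrapped with --solve added. The function name process_and_solve is hypothetical, and the position of --solve among the other options is an assumption (it is treated here as a plain on/off flag); check mrpast process --help for the exact usage.

```shell
# Hypothetical wrapper: process the ARGs for a model and prefix, then solve.
# $1 is the model YAML file, $2 is the prefix of the .trees files.
# Assumption: --solve is a simple switch that takes no value.
process_and_solve() {
    mrpast process --solve --jobs 10 --replicates 10 --suffix trial1 \
        --bootstrap coalcounts "$1" "$2"
}

# Usage, with the model and prefix from the example above:
#   process_and_solve examples/5deme1epoch.yaml 5de1
```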

Solving

If you didn't pass --solve to mrpast process, you can run the solver via:

mrpast solve --jobs 10 5deme1epoch.*.solve_in.*.json

The resulting output files are listed, along with the best output (the one with the best likelihood). The output JSON files contain the parameter values, their bounds, their initial values, and (if present) their ground-truth values.
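
To inspect one of these JSON files by hand, it can be pretty-printed with Python's standard json.tool module. This is a minimal sketch: show_output is a hypothetical helper name, and the file path is whatever mrpast solve listed, since the output file naming and JSON layout are not described here.

```shell
# Hypothetical helper: pretty-print one solver output file for inspection.
# Uses Python's standard json.tool module, so no extra packages are needed.
show_output() {
    python3 -m json.tool "$1"
}

# Usage: pass the path of an output file reported by `mrpast solve`.
#   show_output path/to/output.json
```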

Other workflows

Simulated Data, Inferred ARG

The simulated data, inferred ARG workflow is:

  1. mrpast simulate: Simulate your model with some ground-truth parameter values.
  2. mrpast sim2vcf -p: Convert all .trees files with the given prefix to VCF files, and emit the corresponding .popmap.json files (which map each sample to a population).
  3. mrpast arginfer: Infer ARGs from the VCF files, and then attach the population IDs to the ARGs (.trees files) using the .popmap.json files.
  4. mrpast process: Process and solve the inferred ARGs.

Real Data, Inferred ARG

The real data workflow is:

  1. Manually create a .popmap.json file for your VCF dataset. See the documentation for more details.
  2. mrpast arginfer: Infer ARGs from the VCF files, and then attach the population IDs to the ARGs (.trees files) using the .popmap.json file.
  3. mrpast process: Process and solve the inferred ARGs.

Modeling

The demographic model is specified via YAML. See the examples directory for example models. See the documentation for details on model syntax and behavior.
