Demographic inference from Ancestral Recombination Graphs.

mrpast

Infer demographic parameters from Ancestral Recombination Graphs (ARGs).

Install

Not yet published to PyPI.

Build/Install from repository

Requires Python 3.9 or newer, CMake, and a version of gcc or clang that supports C++17.

We recommend using a virtual environment; the commands below create and activate one:

python3 -m venv MyEnv
source MyEnv/bin/activate

Clone the repository, then build and install:

git clone --recursive https://github.com/aprilweilab/mrpast.git
pip install mrpast/

Alternative installation options

  1. Compile for the native CPU; this can speed up the numerical solver, but makes the resulting package less portable.
MRPAST_ENABLE_NATIVE=1 pip install mrpast/
  2. Build the solver in debug mode, so GDB can be attached.
MRPAST_DEBUG=1 pip install mrpast/

IMPORTANT: NEW MODEL FORMAT

If you used MrPast prior to August 14, 2025, you may have models in the "old" format. The new format attempts to be more user friendly. See the examples for the new format.

To convert an "old style" model foo.yaml to the new style, run:

mrpast init --from-old-mrpast foo.yaml > foo.new.yaml
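
If several old-format models need converting, the command above can be scripted. The following is a minimal sketch, not part of mrpast: the helper names `new_model_path`, `convert_command`, and `convert_all` are hypothetical, and it assumes the converted model is written to standard output, as the redirect in the example above suggests.

```python
from pathlib import Path
import subprocess


def new_model_path(old_path):
    """Map foo.yaml -> foo.new.yaml, mirroring the naming used in the example."""
    p = Path(old_path)
    return p.with_name(p.stem + ".new" + p.suffix)


def convert_command(old_path):
    """Build the argv for converting one old-style model."""
    return ["mrpast", "init", "--from-old-mrpast", str(old_path)]


def convert_all(old_paths):
    """Convert each model, redirecting stdout into the new file."""
    for old in old_paths:
        with open(new_model_path(old), "w") as out:
            subprocess.run(convert_command(old), stdout=out, check=True)
```

Calling `convert_all(["foo.yaml", "bar.yaml"])` would then write foo.new.yaml and bar.new.yaml next to the originals.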

The documentation has not yet been updated for the new model format, but the examples should be sufficient to explain the changes.

Usage

There are three primary subcommands to mrpast, and they are usually run in this order:

  1. mrpast simulate
  2. mrpast process
  3. mrpast solve

These steps describe the "Simulated ARG" workflow, where no ARG inference is performed. See the documentation for workflows making use of inferred ARGs.
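
The three subcommands can also be chained from a script. The sketch below only assembles the same command lines that appear as examples later in this README; the function names are hypothetical, and the glob pattern for the solver inputs is an assumption inferred from the `mrpast solve` example (model file stem followed by `.*.solve_in.*.json`).

```python
import glob
import shlex
import subprocess
from pathlib import Path


def workflow_commands(model, prefix, replicates=10, seq_len=100000,
                      jobs=10, suffix="trial1"):
    """Return the simulate and process command lines, in run order."""
    simulate = ["mrpast", "simulate", "--replicates", str(replicates),
                "--seq-len", str(seq_len), "--debug-demo", model, prefix]
    process = ["mrpast", "process", "--jobs", str(jobs),
               "--replicates", str(replicates), "--suffix", suffix,
               "--bootstrap", "coalcounts", model, prefix]
    return [simulate, process]


def solve_command(model, jobs=10):
    """Build the solve command from the solver inputs found on disk.

    Assumption: inputs are named <model stem>.*.solve_in.*.json.
    """
    stem = Path(model).stem
    inputs = sorted(glob.glob(f"{stem}.*.solve_in.*.json"))
    return ["mrpast", "solve", "--jobs", str(jobs), *inputs]


def run_workflow(model, prefix, jobs=10):
    """Run simulate, process, and solve, stopping at the first failure."""
    for cmd in workflow_commands(model, prefix, jobs=jobs):
        print("+", shlex.join(cmd))
        subprocess.run(cmd, check=True)
    solve = solve_command(model, jobs=jobs)
    print("+", shlex.join(solve))
    subprocess.run(solve, check=True)
```

The solve command is built only after processing has finished, because its input files do not exist before then.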

Simulation

To test a demographic model, start by simulating it and verifying that mrpast can recover the model parameters with the accuracy you need. Simulation is done via msprime and produces an ancestral recombination graph (ARG) in the form of a tree-sequence file (.trees).

Example:

# Simulate the model 10 times, using a DNA sequence length of 100 kbp and the default recombination rate
mrpast simulate --replicates 10 --seq-len 100000 --debug-demo examples/5deme1epoch.yaml 5de1

This creates 10 tree-sequence files (ARGs) that are named like 5de1*.trees, using the given model.
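
To confirm that the expected number of files was written before moving on, the output files can be listed by prefix. This is a small sketch using only the Python standard library; `find_tree_files` is a hypothetical helper, and it assumes the files sit in the current directory and match `<prefix>*.trees` as described above.

```python
from pathlib import Path


def find_tree_files(prefix, directory="."):
    """Return sorted paths matching <prefix>*.trees in the directory."""
    return sorted(Path(directory).glob(f"{prefix}*.trees"))


if __name__ == "__main__":
    files = find_tree_files("5de1")
    print(f"found {len(files)} tree-sequence files")
    for path in files:
        print(" ", path)
```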

Processing

Given an ARG in tree-sequence format, either from simulation (see above) or from ARG inference, the next step is to extract coalescence information from it.

Example:

# Use 10 CPU threads to process the data and produce 10 replicates (expanded models) to be solved later.
# `--bootstrap` creates 100 bootstrap samples by default; their average is used as input to the maximum
# likelihood function.
mrpast process --jobs 10 --replicates 10 --suffix trial1 --bootstrap coalcounts examples/5deme1epoch.yaml 5de1
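
As a generic illustration of what averaging over bootstrap samples means, the sketch below resamples rows of counts with replacement and averages the per-column means across resamples. It is not mrpast's implementation: the input layout (one row of counts per sampled tree) and the function name `bootstrap_average` are assumptions made for the example.

```python
import random


def bootstrap_average(rows, num_samples=100, seed=0):
    """Average the column means over `num_samples` bootstrap resamples.

    rows: list of equal-length numeric lists (hypothetical per-tree counts).
    """
    rng = random.Random(seed)
    n = len(rows)
    width = len(rows[0])
    totals = [0.0] * width
    for _ in range(num_samples):
        # Draw n rows with replacement to form one bootstrap sample.
        resample = [rows[rng.randrange(n)] for _ in range(n)]
        for j in range(width):
            totals[j] += sum(row[j] for row in resample) / n
    # Mean of the per-sample column means.
    return [t / num_samples for t in totals]
```

The spread of the per-sample means, which this sketch discards, is what a bootstrap is usually used to estimate; only the average is computed here because that is the quantity the comment above refers to.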

See mrpast process --help for more options that control time discretization, distance between sampled trees, etc.

Optionally, pass --solve to run the solver as soon as processing completes. Otherwise, see the next section.

Solving

If you didn't pass --solve to mrpast process then you can run the solver via:

mrpast solve --jobs 10 5deme1epoch.*.solve_in.*.json

The resulting output files will be listed, along with the best output (the one with the best likelihood). The output JSON files contain the parameter values, their bounds, their initialized values, and (if present) their ground truth values.
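
The exact layout of the output JSON is not described here, so a reasonable first step is to look at its top-level structure before writing any analysis code. The helper below is a hypothetical sketch that makes no assumption about field names; it only reports each top-level key and the type of its value.

```python
import json


def describe_json(path):
    """Return {top-level key: type name} for a JSON file."""
    with open(path) as f:
        data = json.load(f)
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    # Non-object roots (e.g. a list) are reported under a placeholder key.
    return {"<root>": type(data).__name__}
```

Running `describe_json` on one of the files listed by `mrpast solve` shows which keys hold the parameters, bounds, and ground-truth values.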

Other workflows

Simulated Data, Inferred ARG

The simulated data, inferred ARG workflow is:

  1. mrpast simulate: Simulate your model with some ground-truth parameter values.
  2. mrpast sim2vcf -p: Convert all .trees files with the given prefix to VCF files, and emit the corresponding .popmap.json files (which map each sample to a population).
  3. mrpast arginfer: Infer ARGs from the VCF files, and then attach the population IDs to the ARGs (.trees files) using the .popmap.json files.
  4. mrpast process: Process and solve the inferred ARGs

Real Data, Inferred ARG

The real data workflow is:

  1. Manually create a .popmap.json file for your VCF dataset. See the documentation for more details.
  2. mrpast arginfer: Infer ARGs from the VCF files, and then attach the population IDs to the ARGs (.trees files) using the .popmap.json file.
  3. mrpast process: Process and solve the inferred ARGs

Modeling

The demographic model is specified via YAML. See the examples directory for example models. See the documentation for details on model syntax and behavior.
