Highly scalable inference of ancestral recombination graphs (ARGs)

Threads

Installation

pip install threads_arg

Usage

ARG inference

You will need

  • genotypes in pgen format
  • list of variants in bim or pvar format (with the same prefix as the pgen)
  • genetic map with 4 columns: Chromosome, SNP, cM, bp
  • demography file with two columns: generations in the past, effective population size in haploids
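As a rough illustration of the two text inputs listed above, the sketch below writes a toy demography file and a toy genetic map, then checks that each line has the expected number of columns. Every value is made up, the file names are hypothetical, and the whitespace-separated, header-free layout is an assumption; the provided example files are the authoritative reference for the exact format.

```shell
#!/bin/sh
# Toy input files; all values below are invented for illustration only.
workdir=$(mktemp -d)

# Demography: generations in the past, effective population size (haploids).
# Assumption: whitespace-separated columns, no header line.
cat > "$workdir/toy.demo" <<'EOF'
0 20000
100 20000
1000 10000
EOF

# Genetic map: Chromosome, SNP, cM, bp (same layout assumption as above).
cat > "$workdir/toy.map" <<'EOF'
1 snp1 0.00 10000
1 snp2 0.01 20000
1 snp3 0.03 45000
EOF

# Count the lines whose number of fields differs from the expected one.
bad_demo=$(awk 'NF != 2 {n++} END {print n+0}' "$workdir/toy.demo")
bad_map=$(awk 'NF != 4 {n++} END {print n+0}' "$workdir/toy.map")

echo "demography lines with wrong column count: $bad_demo"
echo "map lines with wrong column count: $bad_map"
```

The same two awk checks can be pointed at real input files before starting a long inference run.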

Minimal usage using the provided example data:

threads infer \
    --pgen example/example_data.pgen \
    --map_gz example/example_data.map \
    --demography example/Ne10000.demo \
    --out example/example_data.threads

threads convert \
    --threads example/example_data.threads \
    --argn example/example_data.argn

This will write a .threads file to example/example_data.threads and convert it to an .argn file at example/example_data.argn.

threads infer accepts more options:

threads infer \
    --pgen path/to/input.pgen \
    --map_gz path/to/genetic_map.gz \
    --demography path/to/demography \
    --out path/to/output.threads \
    --modality [wgs|array] (default: wgs) \
    --query_interval (default: 0.01) \
    --match_group_interval (default: 0.5) \
    --max_sample_batch_size (default: None) \
    --mutation_rate (default: 1.4e-8) \
    --region 1234-56789 (default: whole region, end-inclusive) \
    --num_threads 8 (default: 1)

Set --modality array for inference from genotyping array data. The default, wgs, is for whole-genome sequencing data.

--query_interval and --match_group_interval can be raised to save memory for inference over long genomic regions. This will have little impact on accuracy, especially for sparse variants.

The HMM mutation rate can be set with --mutation_rate. This defaults to a realistic human rate of 1.4e-8 per site per generation.

Specifying a --region start-end means the output ARG is truncated to those base-pair coordinates (end-inclusive). The whole input set will still be used for inference.
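To make the end-inclusive convention concrete, here is a small sketch that splits a start-end string and counts the base-pair positions it covers. The variable names are illustrative only and this is not how Threads parses the option internally.

```shell
#!/bin/sh
# Region string in the start-end form used by --region.
region="1234-56789"

# Split on the hyphen with POSIX parameter expansion.
start=${region%-*}
end=${region#*-}

# End-inclusive: both endpoints count, hence the +1.
span=$((end - start + 1))

echo "start=$start end=$end positions=$span"
```

For 1234-56789 this gives 55556 positions, one more than the plain difference of the two coordinates.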

Parallelism can be enabled by specifying --num_threads.

ARG conversion

.threads files can be converted to .argn and .tsz using

threads convert \
    --threads arg.threads \
    --argn arg.argn

and

threads convert \
    --threads arg.threads \
    --tsz arg.tsz
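Inference and both conversions can be chained in one script. The sketch below derives all output paths from a single prefix and only runs if the threads command and the example .pgen file are found. The prefix variable and the guard are illustrative choices; the commands and flags are the ones shown above.

```shell
#!/bin/sh
# Derive every path from one prefix so the outputs stay together.
prefix="example/example_data"
threads_file="$prefix.threads"
argn_file="$prefix.argn"
tsz_file="$prefix.tsz"

# Run only when the CLI and the example genotypes are available.
if command -v threads >/dev/null 2>&1 && [ -f "$prefix.pgen" ]; then
    threads infer \
        --pgen "$prefix.pgen" \
        --map_gz "$prefix.map" \
        --demography example/Ne10000.demo \
        --out "$threads_file" &&
    threads convert --threads "$threads_file" --argn "$argn_file" &&
    threads convert --threads "$threads_file" --tsz "$tsz_file"
else
    echo "threads or the example data not found; nothing was run"
fi
```

Chaining with && means a conversion is attempted only if the preceding step succeeded.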

Phasing/imputation/variant mapping

These functions are in an experimental stage and will be released later.
