Tools for converting VCF to IGD files and processing them.
igdtools
igdtools can convert from .vcf(.gz) to IGD, and once you have an IGD file it can perform
various operations such as filtering, computing basic statistics, and generally transforming
IGD files.
Run igdtools --help for more information on commands.
For more general reading and modification of IGD files, see pyigd.
Installation
igdtools is a C++ binary, with a small Python wrapper to make installation easier. You can install
via:
pip install igdtools
which will install prebuilt binaries for most Linux systems and fall back to building from the source distribution on other systems (such as macOS). Building from source requires CMake 3.10 or newer, the zlib development headers, and a version of Clang or GCC that supports C++11.
Usage Examples
Convert .vcf(.gz) to IGD
Conversion will copy the variant identifiers ("ID" column in VCF) and individual identifiers (the
sample column names in VCF) to the IGD file, unless --no-var-ids and --no-indiv-ids flags
are specified (respectively).
igdtools input.vcf.gz -o output.igd
Convert .vcf(.gz) to IGD and export metadata
igdtools can export metadata fields as simple text files, each of which can be
loaded by numpy.loadtxt().
You can use --export-metadata to export this metadata during conversion:
igdtools input.vcf.gz -o output.igd --export-metadata qual,filter,info
The metadata types you can export are qual, filter, chrom, and info. You can also
specify all to export every type without listing them. By default,
no metadata is exported during VCF to IGD conversion.
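When conversions are scripted, it can help to check the metadata list before invoking the tool. The following is a minimal sketch and not part of igdtools: the helper name build_convert_command is hypothetical, and it only assembles an argument list from the flags described above.

```python
import subprocess

# Metadata types listed in this README; "all" selects every one of them.
VALID_METADATA = {"qual", "filter", "chrom", "info", "all"}


def build_convert_command(vcf_path, igd_path, metadata=()):
    """Assemble the argv list for a VCF-to-IGD conversion (hypothetical helper)."""
    unknown = [m for m in metadata if m not in VALID_METADATA]
    if unknown:
        raise ValueError(f"unsupported metadata types: {unknown}")
    cmd = ["igdtools", vcf_path, "-o", igd_path]
    if metadata:
        # The types are passed as a single comma-separated argument.
        cmd += ["--export-metadata", ",".join(metadata)]
    return cmd


cmd = build_convert_command("input.vcf.gz", "output.igd", ["qual", "filter", "info"])
print(" ".join(cmd))
# To actually run the conversion (requires igdtools on PATH):
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.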
Just export .vcf(.gz) metadata
If you already have an IGD file and want to go back and export the metadata from the corresponding VCF, you can run the export on its own:
igdtools input.vcf.gz --export-metadata qual,filter,info
Note that the output file names differ. When only exporting metadata, the files
are named input.meta.*.txt, whereas the previous example (converting to IGD
and exporting metadata) names them output.meta.*.txt.
IGD file header info
To examine the header information in the IGD file:
igdtools -i test.igd
which will print out something like
Variants: 329556
Individuals: 1000
Ploidy: 2
Phased?: true
Source: true_data/simulation-source-1000-100mb.vcf
Genome range: 115-99999629
Has individual IDs? Yes
Has variant IDs? No
Similarly, some simple statistics can be emitted by specifying -s, which scans
the entire IGD file (and is therefore slower than -i).
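The header printout shown above is plain "key, separator, value" text, so it can be captured and parsed in a script. This is a rough sketch under the assumption that the output looks like the sample in this README; the function name parse_igd_header is hypothetical, and the real output format may differ between igdtools versions.

```python
import re

# Sample text copied from the example output in this README.
SAMPLE = """\
Variants: 329556
Individuals: 1000
Ploidy: 2
Phased?: true
Source: true_data/simulation-source-1000-100mb.vcf
Genome range: 115-99999629
Has individual IDs? Yes
Has variant IDs? No
"""

# A key is followed by "?:", "?" or ":" and then whitespace and the value.
LINE_PATTERN = re.compile(r"^(.+?)(?:\?:|\?|:)\s+(.*)$")


def parse_igd_header(text):
    """Parse header-style output into a dict of strings (hypothetical helper)."""
    info = {}
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:
            info[match.group(1)] = match.group(2)
    return info


header = parse_igd_header(SAMPLE)
print(header["Variants"], header["Ploidy"], header["Phased"])
```

Values are kept as strings so the caller decides how to convert them, for example int(header["Variants"]).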
Copy variants within range
Create a copy of an IGD file, but only keep variants in a particular base-pair
range. Here we show the range [50000, 150000] (50 kbp to 150 kbp, inclusive):
igdtools input.igd -o output.igd -r 50000-150000
Copy variants with frequency
Create a copy of an IGD file, but only keep variants whose allele frequency
falls within a particular range. Here we show the range [0.1, 0.4) (inclusive, exclusive):
igdtools input.igd -o output.igd -f 0.1-0.4
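The two filters use different interval conventions: the base-pair range is closed on both ends, while the frequency range is half-open. The sketch below only illustrates that difference. The function names are hypothetical, and computing frequency as the alternate-allele count divided by the number of haplotypes is an assumption, not a statement about how igdtools handles cases such as missing data.

```python
def in_bp_range(position, low, high):
    """Closed interval [low, high]: both endpoints are kept."""
    return low <= position <= high


def in_frequency_range(alt_count, total_haplotypes, low, high):
    """Half-open interval [low, high): the upper endpoint is dropped."""
    frequency = alt_count / total_haplotypes
    return low <= frequency < high


# Position filter, as in -r 50000-150000
print(in_bp_range(150000, 50000, 150000))

# Frequency filter, as in -f 0.1-0.4, with 2000 haplotypes (1000 diploid individuals)
print(in_frequency_range(200, 2000, 0.1, 0.4))  # frequency exactly 0.1
print(in_frequency_range(800, 2000, 0.1, 0.4))  # frequency exactly 0.4
```

Under these assumptions, a variant at position 150000 is kept, a variant at frequency 0.1 is kept, and a variant at frequency 0.4 is dropped.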
Copy to unphased data
Sometimes it is useful to perform "unphased" calculations. For example, when computing runs of homozygosity (ROH) it is easier to work with unphased diploid data that tracks the number of copies each individual has of an allele (0, 1, or 2). Create a copy of an IGD file, but store it unphased:
igdtools test.igd -o test.unphased.igd --force-unphased
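To make the idea of copy counts concrete, here is a small sketch of collapsing phased haplotype data into per-individual counts. It does not describe how igdtools stores unphased data internally. The function name is hypothetical, and the mapping of haplotype index h to individual h // ploidy is an assumption made for this example.

```python
from collections import Counter


def unphased_copy_counts(haplotype_samples, ploidy=2):
    """Map each individual index to its number of copies of the allele.

    haplotype_samples lists the haplotype indices that carry the allele.
    Assumes haplotype index h belongs to individual h // ploidy.
    Individuals that do not appear in the result have 0 copies.
    """
    return dict(Counter(h // ploidy for h in haplotype_samples))


# Haplotypes 0 and 1 both belong to individual 0 (2 copies);
# haplotype 5 belongs to individual 2 (1 copy).
counts = unphased_copy_counts([0, 1, 5])
print(counts)  # {0: 2, 2: 1}
```

In a ROH-style calculation, an individual with 0 or 2 copies at a site is homozygous there, and an individual with 1 copy is heterozygous.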