Skip to main content

pleasingly pythonic pedigree manipulation

Project description

tools for pedigree files
------------------------

[![PyPI version](https://badge.fury.io/py/peddy.svg)](http://badge.fury.io/py/peddy)
[![Build Status](https://travis-ci.org/brentp/peddy.svg?branch=master)](https://travis-ci.org/brentp/peddy)
[![Documentation Status](https://readthedocs.org/projects/peddy/badge/?version=latest)](http://peddy.readthedocs.org/en/latest/?badge=latest)


Quickstart
----------

Most users will only need to run as a command-line tool with a ped and VCF, e.g:

```
python -m peddy --plot --prefix ceph-1463 ceph1463.vcf.gz ceph1463.ped
```

That will create 3 QC files and 3 QC plots where `_error` columns will
indicate:
+ discrepancies between reported and inferred relations
+ discrepancies between reported and inferred sex
+ higher levels of HET calls or more variance in allele frequencies for het calls.

It will also create **ceph-1463.html** which you can open in any browser to
interactively explore your data.

Overview
--------


**NOTE** this module used to be named to "pedagree".

`peddy` is a python library for querying, QC'ing, and manipulating pedigree files.

It currently makes it simple to extract things like:

+ parent-child pairs
+ trios
+ sibs
+ stats on number of males/females/trios/affecteds/unaffecteds
+ families.
+ families with at least N members
+ families with at least N children
+ [not yet] families with at least N generations
+ coefficient of relatedness given relation defined in the pedigree.

Also, given a pedigree file and a VCF file peddy provides tools to:

+ find likely sample mixups (or PED errors)
- sex mixups on X-Chrom
- family mixups by inferring relatedness with VCF

+ find mendelian errors


Usage
-----

```Python
>>> from peddy import Ped, SEX, PHENOTYPE

>>> p = Ped('my.ped')
# not yet.
#>>> p.dot() # draw the pedigree with graphviz

# not yet
# find any obvious issues (3 parents, mom as male, etc).
>>> p.validate()

# number of affecteds, un, males, females, etc. (contingency table?)
>>> p.summary()

# iterable
>>> p.samples()

>>> p.samples(phenotype=PHENOTYPE.AFFECTED, sex=SEX.MALE)

# sample object
>>> s = next(p.samples())

>>> s.phenotype

>>> s.sex

>>> s.mom

>>> s.dad

>>> s.siblings

>>> s.kids
```

Quality Control
---------------

If cyvcf2 is installed, then, given a ped-file and a VCF, we can look for cases where the relationships
defined in the ped file do not match the relationships derived from the genotypes in the VCF.

```Python
>>> from peddy import Ped
>>> p = Ped('cohort.ped')
>>> df = p.ped_check('cohort.vcf.gz')
>>> df[df.error] # show pairs of samples where the inferred differs from the reported.

```

[![relplot](http://peddy.readthedocs.org/en/latest/_images/ped-check.png)](http://github.com/brentp/cyvcf2/)

We don't see any obvious errors in this pedigree. An obvious error would be when a red colored dot clusters with blue dots.
The *outlined dots* have a very low IBS0 rate, indicating that they are likely parent-child pairs.

By looking for the frequency of heterozygotes in the non-PAR regions of
the X chromosome, we can determine sex from a VCF:

```Python
>>> from peddy import Ped
>>> p = Ped('cohort.ped')
>>> p.sex_check('cohort.vcf.gz', plot=True)
... List of all samples with number of HETs, HOMREF, HOMALT on X
```
This will also create an image like this one where we can
see a clear sample mixup.

[![sex_plot](https://raw.githubusercontent.com/brentp/peddy/master/images/sex_check.png)](http://github.com/brentp/cyvcf2/)


On creating a pedigree object (via Ped('some.ped'). `peddy` will print warnings to STDERR as appropriate like:

```
pedigree warning: '101811-101811' is dad but has female sex
pedigree warning: '101897-101897' is dad but has female sex
pedigree warning: '101896-101896' is mom of self
pedigree warning: '102110-102110' is mom but has male sex
pedigree warning: '102110-102110' is mom of self
pedigree warning: '101381-101381' is dad but has female sex
pedigree warning: '101393-101393' is mom but has male sex

unknown sample: 102498-102498 in family: K34175
unknown sample: 11509-11509 in family: K567331
unknown sample: 5180-5180 in family: K8565
```

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

peddy-0.1.1.tar.gz (37.1 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page