Python package to download HPO annotations
How do I install this package?
As usual, just install it using pip:
pip install hpo_downloader
Tests Coverage
Since coverage tools sometimes report slightly different results, here are three of them:
Usage examples
The library mainly offers the following methods:
Map HPO ids to Uniprot ids
To map the available HPO IDs to Uniprot IDs, use the following method (the mapping is partial, since not all gene IDs used in HPO map to Uniprot IDs):
from hpo_downloader import map_phenotype_to_uniprot
phenotype_to_uniprot = map_phenotype_to_uniprot(
    cafa4_only=False  # Keep all Uniprot IDs, not only those present in CAFA4
)
phenotype_to_uniprot_cafa4_only = map_phenotype_to_uniprot(
    cafa4_only=True  # Keep only the Uniprot IDs present in CAFA4
)
The resulting dataframe will look like this:
| Uniprot_ID | HPO-Term-ID |
|---|---|
| A2MG_HUMAN | HP:0410054 |
| A2MG_HUMAN | HP:0001425 |
| A2MG_HUMAN | HP:0001300 |
| A2MG_HUMAN | HP:0000006 |
| A2MG_HUMAN | HP:0000726 |
| A2MG_HUMAN | HP:0002423 |
| A2MG_HUMAN | HP:0002185 |
The latest version of the complete mapping is available here in tab-separated format. Similarly, the CAFA4-only mapping is available here in tab-separated format.
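As a minimal sketch of how such a tab-separated mapping could be consumed without any extra dependency, the snippet below groups HPO terms by Uniprot ID using only the standard library. It assumes the file carries a header row with the column names shown in the table above (`Uniprot_ID`, `HPO-Term-ID`); the helper name `group_terms_by_protein` is hypothetical and not part of `hpo_downloader`.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the table above. A real file would be opened with
# open(path, newline="") instead of io.StringIO.
SAMPLE_TSV = (
    "Uniprot_ID\tHPO-Term-ID\n"
    "A2MG_HUMAN\tHP:0410054\n"
    "A2MG_HUMAN\tHP:0001425\n"
    "A2MG_HUMAN\tHP:0001300\n"
)


def group_terms_by_protein(handle):
    """Return {Uniprot_ID: [HPO-Term-ID, ...]} from a tab-separated file."""
    terms = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        terms[row["Uniprot_ID"]].append(row["HPO-Term-ID"])
    return dict(terms)


grouped = group_terms_by_protein(io.StringIO(SAMPLE_TSV))
print(grouped)
```

If the real file uses different header names, only the two dictionary keys inside the loop need to change.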
N.B.: Currently, 55 gene IDs (1.28% of the total) are not mapped by Uniprot to corresponding Uniprot IDs.
N.B.: Currently, 16187 CAFA4 Uniprot IDs (79.23% of the total) are not mapped by Uniprot to corresponding HPO IDs.
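Figures like the ones above are a count of unmapped IDs together with their share of the total. The following sketch shows one way such a figure could be computed; the function `unmapped_share` and the toy IDs are hypothetical illustrations, not the package's own code or data.

```python
def unmapped_share(all_ids, mapped_ids):
    """Return (count, percentage) of IDs in all_ids missing from mapped_ids."""
    all_ids = set(all_ids)
    if not all_ids:
        return 0, 0.0
    missing = all_ids - set(mapped_ids)
    return len(missing), 100.0 * len(missing) / len(all_ids)


# Hypothetical toy IDs, not real data.
gene_ids = ["g1", "g2", "g3", "g4"]
mapped = ["g1", "g2", "g4"]

count, percent = unmapped_share(gene_ids, mapped)
print(f"{count} IDs ({percent:.2f}% of total) are not mapped")
```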
Download HPO Phenotype annotations
To download the phenotype annotations you can use the following method:
from hpo_downloader import get_phenotype_annotations
annotations = get_phenotype_annotations()
The resulting dataframe will look like this:
| DB | DB_Object_ID | DB_Name | Qualifier | HPO_ID | DB_Reference | Evidence_Code | Onset modifier | Frequency | Sex | Modifier | Aspect | Date_Created | Assigned_By |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DECIPHER | 1 | Wolf-Hirschhorn Syndrome | | HP:0001249 | DECIPHER:1 | IEA | | | | | P | WOLF-HIRSCHHORN SYNDROME | HPO:skoehler[2013-05-29] |
| DECIPHER | 1 | Wolf-Hirschhorn Syndrome | | HP:0001250 | DECIPHER:1 | IEA | | | | | P | WOLF-HIRSCHHORN SYNDROME | HPO:skoehler[2013-05-29] |
| DECIPHER | 1 | Wolf-Hirschhorn Syndrome | | HP:0001252 | DECIPHER:1 | IEA | | | | | P | WOLF-HIRSCHHORN SYNDROME | HPO:skoehler[2013-05-29] |
| DECIPHER | 1 | Wolf-Hirschhorn Syndrome | | HP:0001518 | DECIPHER:1 | IEA | | | | | P | WOLF-HIRSCHHORN SYNDROME | HPO:skoehler[2013-05-29] |
| DECIPHER | 14 | Prader-Willi syndrome (Type 1) | | HP:0000135 | DECIPHER:14 | IEA | | | | | P | PRADER-WILLI SYNDROME (TYPE 1) | HPO:skoehler[2013-05-29] |
| DECIPHER | 14 | Prader-Willi syndrome (Type 1) | | HP:0001249 | DECIPHER:14 | IEA | | | | | P | PRADER-WILLI SYNDROME (TYPE 1) | HPO:skoehler[2013-05-29] |
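Once the annotations are loaded, a common next step is to invert them into an index from HPO term to the diseases annotated with it. The sketch below does this on plain dictionaries with the `DB_Name` and `HPO_ID` columns from the sample table. Assuming the returned object is a pandas DataFrame, such records could be obtained with `annotations.to_dict("records")`; the helper `diseases_by_term` is hypothetical.

```python
from collections import defaultdict

# Three rows taken from the sample table, reduced to the two columns used here.
records = [
    {"DB_Name": "Wolf-Hirschhorn Syndrome", "HPO_ID": "HP:0001249"},
    {"DB_Name": "Wolf-Hirschhorn Syndrome", "HPO_ID": "HP:0001250"},
    {"DB_Name": "Prader-Willi syndrome (Type 1)", "HPO_ID": "HP:0001249"},
]


def diseases_by_term(rows):
    """Return {HPO_ID: {disease name, ...}} for the given annotation rows."""
    index = defaultdict(set)
    for row in rows:
        index[row["HPO_ID"]].add(row["DB_Name"])
    return dict(index)


index = diseases_by_term(records)
# HPO terms annotated to more than one disease in the sample.
shared = sorted(term for term, names in index.items() if len(names) > 1)
print(shared)
```

Using sets as values means repeated annotations of the same disease with the same term are counted once.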
Download CAFA4 Ids
To download the mapping between CAFA4 IDs and Uniprot IDs, use the following method:
from hpo_downloader import load_cafa4_uniprot_ids
cafa_mapping = load_cafa4_uniprot_ids()
The resulting dataframe will look like this:
| CAFA_Id | Uniprot_Id |
|---|---|
| T96060000001 | 1433B_HUMAN |
| T96060000002 | 1433E_HUMAN |
| T96060000003 | 1433F_HUMAN |
| T96060000004 | 1433G_HUMAN |
| T96060000005 | 1433S_HUMAN |
| T96060000006 | 1433T_HUMAN |
| T96060000007 | 1433Z_HUMAN |
| T96060000008 | 1A01_HUMAN |
| T96060000009 | 1A02_HUMAN |
| T96060000010 | 1A03_HUMAN |
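A typical use of this mapping is translating Uniprot IDs into CAFA4 target IDs and spotting the ones that have no CAFA4 entry. The following sketch works on `(CAFA_Id, Uniprot_Id)` pairs taken from the sample table. Assuming the dataframe is a pandas DataFrame with the columns in that order, `cafa_mapping.itertuples(index=False, name=None)` would yield such pairs. The helper `to_cafa_ids` and the ID `NOT_IN_SAMPLE` are hypothetical.

```python
# First rows of the sample table as (CAFA_Id, Uniprot_Id) pairs.
cafa_rows = [
    ("T96060000001", "1433B_HUMAN"),
    ("T96060000002", "1433E_HUMAN"),
    ("T96060000003", "1433F_HUMAN"),
]


def to_cafa_ids(uniprot_ids, rows):
    """Split uniprot_ids into ({uniprot: cafa}, [ids without a CAFA entry])."""
    lookup = {uniprot: cafa for cafa, uniprot in rows}
    found = {u: lookup[u] for u in uniprot_ids if u in lookup}
    missing = [u for u in uniprot_ids if u not in lookup]
    return found, missing


# "NOT_IN_SAMPLE" is a made-up placeholder to show the missing case.
found, missing = to_cafa_ids(["1433E_HUMAN", "NOT_IN_SAMPLE"], cafa_rows)
print(found)
print(missing)
```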