tp53
Tools for programmatically annotating VCFs with the Seshat TP53 database.
Installation
The package can be installed with pip:
pip install tp53
Upload a VCF to the Seshat TP53 Annotation Server
Upload a VCF to the Seshat TP53 annotation server using a headless browser.
❯ python -m tp53.seshat.upload_vcf \
--input "input.vcf" \
--email "example@gmail.com"
INFO:tp53.seshat.upload_vcf:Uploading 0 %...
INFO:tp53.seshat.upload_vcf:Uploading 53%...
INFO:tp53.seshat.upload_vcf:Uploading 53%...
INFO:tp53.seshat.upload_vcf:Uploading 60%...
INFO:tp53.seshat.upload_vcf:Uploading 60%...
INFO:tp53.seshat.upload_vcf:Uploading 66%...
INFO:tp53.seshat.upload_vcf:Uploading 66%...
INFO:tp53.seshat.upload_vcf:Uploading 80%...
INFO:tp53.seshat.upload_vcf:Uploading 80%...
INFO:tp53.seshat.upload_vcf:Upload complete!
This tool programmatically configures and uploads batches of variants in VCF format to the Seshat annotation server. It works by launching a headless Chrome browser instance and interacting with the Seshat website directly through simulated key presses and mouse clicks. Seshat does not provide a native programmatic API, and one could not be reverse engineered. Seshat also uses custom JavaScript in its form processing, so a lightweight approach of simply interacting with the HTML form elements was not possible either.
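To call the uploader from Python code rather than from a shell, one option is to wrap the documented command line with `subprocess`. This is a minimal sketch, not part of the package: the helper names `build_upload_command` and `upload_vcf` are hypothetical, only the `--input` and `--email` flags are taken from the example above, and it assumes the tool signals failure through a non-zero exit status.

```python
import subprocess
import sys
from pathlib import Path


def build_upload_command(vcf: Path, email: str) -> list[str]:
    """Build the argument list for the documented upload command."""
    return [
        sys.executable, "-m", "tp53.seshat.upload_vcf",
        "--input", str(vcf),
        "--email", email,
    ]


def upload_vcf(vcf: Path, email: str) -> None:
    """Run the uploader once (hypothetical helper, not a tp53 API)."""
    # Fail early instead of starting a browser session for a missing file.
    if not vcf.is_file():
        raise FileNotFoundError(f"VCF does not exist: {vcf}")
    # Assumption: a failed upload produces a non-zero exit status, which
    # check=True turns into subprocess.CalledProcessError.
    subprocess.run(build_upload_command(vcf, email), check=True)
```

Calling `upload_vcf(Path("input.vcf"), "example@gmail.com")` would then run the same command as the shell example.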
VCF Input Requirements
Seshat does not report why a VCF fails to annotate, but it has been observed to fail to parse some of VarDictJava's structural variants (SVs) as valid variant records. One workaround that has worked in the past is to remove SVs. The following command excludes all variants with a non-empty SVTYPE INFO key:
❯ bcftools view in.vcf --exclude 'SVTYPE!="."' > out.noSV.vcf
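If bcftools is not available, the same idea can be approximated in plain Python. The sketch below is not a drop-in replacement for the command above: it handles only uncompressed text VCFs, the function names are hypothetical, and it treats an absent SVTYPE key or a value of `.` as "empty", which is an assumption about the intended behavior.

```python
from typing import Iterable, Iterator


def has_svtype(record: str) -> bool:
    """Return True if the record's INFO column has a non-empty SVTYPE."""
    fields = record.rstrip("\n").split("\t")
    if len(fields) < 8:  # INFO is the eighth tab-separated VCF column
        return False
    for entry in fields[7].split(";"):
        key, _, value = entry.partition("=")
        if key == "SVTYPE" and value not in ("", "."):
            return True
    return False


def drop_structural_variants(lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines and every record without a non-empty SVTYPE."""
    for line in lines:
        if line.startswith("#") or not has_svtype(line):
            yield line
```

To filter a file, open `in.vcf` for reading and write the lines yielded by `drop_structural_variants` to `out.noSV.vcf`.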
Automation
There are no terms and conditions posted on the Seshat annotation server's website, and there is no server-side robots.txt rule set.
In lieu of usage terms, we strongly encourage all users of this script to respect the Seshat resource by adhering to the following best practices:
- Minimize Load: Limit the rate of requests to the server
- Minimize Connections: Limit the number of concurrent requests
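One way to honor both practices is to upload files strictly one at a time with a pause between uploads. The following is a sketch under stated assumptions: `upload_serially` is a hypothetical helper, the 60-second default delay is an arbitrary placeholder rather than a limit published by Seshat, and the `run` and `sleep` parameters exist only so the function can be exercised without contacting the server.

```python
import subprocess
import sys
import time
from typing import Callable, Sequence


def upload_serially(
    vcfs: Sequence[str],
    email: str,
    delay_seconds: float = 60.0,  # assumed value, not a documented limit
    run: Callable[..., object] = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Upload VCFs one after another, pausing between uploads."""
    for index, vcf in enumerate(vcfs):
        if index > 0:
            # Minimize load: wait before each upload after the first.
            sleep(delay_seconds)
        # Minimize connections: each call blocks until the upload ends,
        # so only one browser session is open at any time.
        run(
            [sys.executable, "-m", "tp53.seshat.upload_vcf",
             "--input", vcf, "--email", email],
            check=True,
        )
```

Because `check=True` raises on the first failed command, a broken upload stops the batch instead of repeatedly hitting the server.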
If you need to batch process dozens or hundreds of VCF callsets, you may consider extending the underlying Python script to randomize the user agent and IP address of your headless browser session so that it is not labelled as a bot.
Environment Setup
This script relies on Google Chrome:
❯ brew install --cask google-chrome
Some versions of macOS may require you to authenticate the Chrome driver.
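Before running an upload, it can help to confirm that a Chrome binary is present. The sketch below only checks for the executable; how the package itself locates Chrome is not covered here. The function name `find_chrome`, the Linux executable names, and the macOS path (the usual default install location) are assumptions.

```python
import os
import shutil
from typing import Optional, Sequence

# Assumed locations: the default macOS app bundle path and common
# executable names on Linux. Adjust for your own system.
MACOS_CHROME = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
LINUX_NAMES = ("google-chrome", "google-chrome-stable")


def find_chrome(
    names: Sequence[str] = LINUX_NAMES,
    paths: Sequence[str] = (MACOS_CHROME,),
) -> Optional[str]:
    """Return the path to a Chrome executable, or None if none is found."""
    for name in names:
        found = shutil.which(name)  # searches the directories on PATH
        if found:
            return found
    for path in paths:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None
```

A return value of `None` suggests Chrome still needs to be installed, for example with the Homebrew command above.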
Development and Testing
See the contributing guide for more information.