Python package for genomic variant analysis
How to use?
pip install variant
🧬 variant motif
The motif subcommand fetches the motif sequence around a given site.
Usage: variant motif [OPTIONS]
Fetch genomic motif.
Options:
Option | Short | Type | Description |
---|---|---|---|
--input | -i | TEXT | Input position file. |
--output | -o | TEXT | Output annotation file. |
--fasta | -f | TEXT | reference fasta file. [required] |
--npad | -n | TEXT | Number of padding base to call motif. If you want to set different left and right pads, use comma to separate them. (eg. 2,3) |
--with-header | -H | | With header line in input file. |
--columns | -c | TEXT | Sets columns for site info. (Chrom,Pos,Strand) [default: 1,2,3] |
--to-upper | -u | | Convert motif to upper case. |
--wrap-site | -w | | Wrap motif site. |
--help | -h | | Show this message and exit. |
demo:
To get the 2 bases before and the 3 bases after each given site, with the site itself wrapped in brackets and the strand information taken into account,
use -n 2,3 -w
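The padding and wrapping behaviour described above can be sketched in plain Python. This is only an illustration, not the package's implementation: the names parse_npad and fetch_motif are hypothetical, and the use of square brackets for wrapping, 1-based positions, and the convention that the left pad counts upstream bases on the site's own strand are all assumptions.

```python
# Illustrative sketch only; NOT the variant package's own code.
# Assumptions: 1-based positions, square brackets for wrapping, and
# "left" pad meaning upstream bases on the strand of the site.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_npad(npad: str) -> tuple[int, int]:
    """Turn '2' into (2, 2) and '2,3' into (2, 3)."""
    parts = [int(p) for p in npad.split(",")]
    if len(parts) == 1:
        return parts[0], parts[0]
    if len(parts) == 2:
        return parts[0], parts[1]
    raise ValueError(f"expected 'N' or 'LEFT,RIGHT', got {npad!r}")


def fetch_motif(seq: str, pos: int, strand: str = "+",
                left: int = 2, right: int = 3, wrap: bool = False) -> str:
    """Return the bases around a 1-based position, read on the given strand."""
    i = pos - 1
    if not 0 <= i < len(seq):
        raise ValueError(f"position {pos} is outside the sequence")
    if strand == "-":
        # On the minus strand, upstream bases sit at higher coordinates.
        start, end = i - right, i + left
    else:
        start, end = i - left, i + right
    before = seq[max(start, 0):i]      # clamp at the sequence start
    site = seq[i]
    after = seq[i + 1:end + 1]         # slicing clamps at the sequence end
    if strand == "-":
        before, site, after = (reverse_complement(after),
                               reverse_complement(site),
                               reverse_complement(before))
    if wrap:
        site = f"[{site}]"
    return before + site + after


left, right = parse_npad("2,3")
print(fetch_motif("AACCGGTTAC", 5, "+", left, right, wrap=True))  # CC[G]GTT
print(fetch_motif("AACCGGTTAC", 5, "-", left, right, wrap=True))  # AC[C]GGT
```

Sites near the ends of a sequence simply yield a shorter motif in this sketch; the real tool may handle that case differently.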
🧫 variant effect
The effect subcommand infers the effect of a mutation.
Usage: variant effect [OPTIONS]
Annotation genomic variant effect.
Options:
Option | Short | Type | Description |
---|---|---|---|
--input | -i | TEXT | Input position file. |
--output | -o | TEXT | Output annotation file |
--reference | -r | TEXT | reference species |
--reference-gtf | | TEXT | Customized reference gtf file. |
--reference-transcript | | TEXT | Customized reference transcript fasta file. |
--reference-protein | | TEXT | Customized reference protein fasta file. |
--reference-mapping | | TEXT | Mapping file for chrom name, first column is chrom in the input, second column is chrom in the reference db (sep by tab) |
--release | -e | INTEGER | ensembl release |
--strandness | -s | | Use strand information or not? |
--pU-mode | -u | | Make rRNA, tRNA, snoRNA into top priority. |
--npad | -n | INTEGER | Number of padding base to call motif. |
--all-effects | -a | | Output all effects. |
--with-header | -H | | With header line in input file. |
--columns | -c | TEXT | Sets columns for site info. (Chrom,Pos,Strand,Ref,Alt) [default: 1,2,3,4,5] |
--help | -h | | Show this message and exit. |
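The --reference-mapping file format (two tab-separated columns: chromosome name in the input, then chromosome name in the reference db) can be illustrated with a short sketch. The function names load_chrom_mapping and map_chrom are hypothetical, the example names (chr1 to 1, chrM to MT) are made up for illustration, and passing unmapped names through unchanged is an assumption rather than documented behaviour.

```python
# Illustrative sketch only; NOT the variant package's own code.
import io


def load_chrom_mapping(handle) -> dict[str, str]:
    """Read 'input_chrom<TAB>reference_chrom' lines into a dict."""
    mapping = {}
    for line in handle:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"expected two tab-separated columns: {line!r}")
        mapping[fields[0]] = fields[1]
    return mapping


def map_chrom(chrom: str, mapping: dict[str, str]) -> str:
    """Translate a chromosome name; unmapped names pass through (assumption)."""
    return mapping.get(chrom, chrom)


# Example mapping file content, held in memory instead of on disk.
example = io.StringIO("chr1\t1\nchrM\tMT\n")
mapping = load_chrom_mapping(example)
print(map_chrom("chrM", mapping))  # MT
```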
demo:
Store the following table in a file (sites.tsv).
Chrom | Position | Strand | Ref | Alt |
---|---|---|---|---|
chr1 | 230703034 | - | C | T |
chr12 | 69353439 | + | A | T |
chr14 | 23645352 | + | G | T |
chr2 | 215361150 | - | A | T |
chr2 | 84906537 | + | C | T |
chr22 | 39319077 | - | T | A |
chr22 | 39319095 | - | T | A |
chr22 | 39319098 | - | T | A |
Run command:
variant-effect -i sites.tsv -H -r human -e 108 -t RNA -H -c 1,2,3
- -i specifies the input file;
- -H means the file has a header line, so the first row is skipped;
- -r selects the reference genome (default: human);
- -e specifies the Ensembl release version;
- -c selects which columns of the input file to use (default: the first 5 columns).
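How the -H and -c options shape the input can be sketched as follows. This is a guess at the general idea, not the tool's actual parser: the names parse_columns and read_sites are hypothetical, and the tab delimiter (suggested by the .tsv file name) and 1-based column numbers (suggested by the default 1,2,3,4,5) are assumptions.

```python
# Illustrative sketch only; NOT the variant package's own code.
import csv
import io


def parse_columns(spec: str) -> list[int]:
    """Turn a 1-based spec such as '1,2,3' into 0-based indices."""
    return [int(c) - 1 for c in spec.split(",")]


def read_sites(handle, columns: str = "1,2,3,4,5", with_header: bool = False):
    """Yield one tuple of the selected columns per input row."""
    idx = parse_columns(columns)
    reader = csv.reader(handle, delimiter="\t")
    if with_header:
        next(reader, None)  # skip the header row, as -H does
    for row in reader:
        if not row:
            continue  # ignore empty lines
        yield tuple(row[i] for i in idx)


text = (
    "Chrom\tPosition\tStrand\tRef\tAlt\n"
    "chr1\t230703034\t-\tC\tT\n"
    "chr12\t69353439\t+\tA\tT\n"
)
for site in read_sites(io.StringIO(text), with_header=True):
    print(site)
```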
You will get this output:
Chrom | Position | Strand | Ref | Alt | mut_type | gene_type | gene_name | gene_pos | transcript_name | transcript_pos | transcript_motif | coding_pos | codon_ref | aa_pos | aa_ref | distance2splice |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
chr1 | 230703034 | - | C | T | ThreePrimeUTR | protein_coding | ENSG00000135744(AGT) | 42543 | ENST00000680041(AGT-208) | 1753 | TGTGTCACCCCCAGTCTCCCA | None | None | None | None | 295 |
chr12 | 69353439 | + | A | T | ThreePrimeUTR | protein_coding | ENSG00000090382(LYZ) | 5059 | ENST00000261267(LYZ-201) | 695 | TAGAACTAATACTGGTGAAAA | None | None | None | None | 286 |
chr14 | 23645352 | + | G | T | ThreePrimeUTR | protein_coding | ENSG00000100867(DHRS2) | 15238 | ENST00000344777(DHRS2-202) | 1391 | CTGCCATTCTGCCAGACTAGC | None | None | None | None | 210 |
chr2 | 215361150 | - | A | T | ThreePrimeUTR | protein_coding | ENSG00000115414(FN1) | 74924 | ENST00000323926(FN1-201) | 8012 | GGCCCGCAATACTGTAGGAAC | None | None | None | None | 476 |
chr2 | 84906537 | + | C | T | ThreePrimeUTR | protein_coding | ENSG00000034510(TMSB10) | 882 | ENST00000233143(TMSB10-201) | 327 | CCTGGGCACTCCGCGCCGATG | None | None | None | None | 148 |
chr22 | 39319077 | - | T | A | Intronic | protein_coding | ENSG00000100316(RPL3) | 1313 | ENST00000216146(RPL3-201) | None | None | None | None | None | None | None |
chr22 | 39319095 | - | T | A | Intronic | protein_coding | ENSG00000100316(RPL3) | 1295 | ENST00000216146(RPL3-201) | None | None | None | None | None | None | None |
chr22 | 39319098 | - | T | A | Intronic | protein_coding | ENSG00000100316(RPL3) | 1292 | ENST00000216146(RPL3-201) | None | None | None | None | None | None | None |
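For downstream processing, the combined gene_name and transcript_name fields shown above (for example ENSG00000135744(AGT)) can be split into an ID and a name. The helper below is a hypothetical convenience, not part of the package, and it assumes missing values appear as the literal text None, as in the table.

```python
# Illustrative sketch only; NOT part of the variant package.
import re

# Matches "ID(NAME)", e.g. "ENSG00000135744(AGT)".
_ID_NAME = re.compile(r"^(?P<id>[^()]+)\((?P<name>[^()]*)\)$")


def split_id_name(field: str):
    """Split 'ID(NAME)' into (ID, NAME); return None for the text 'None'."""
    if field == "None":
        return None
    match = _ID_NAME.match(field)
    if match is None:
        return field, None  # no parenthesised name present
    return match.group("id"), match.group("name")


print(split_id_name("ENSG00000135744(AGT)"))      # ('ENSG00000135744', 'AGT')
print(split_id_name("ENST00000680041(AGT-208)"))  # ('ENST00000680041', 'AGT-208')
```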
⏳⏳⏳ More functions will be supported in the future.
TODO: