A novel way to find target genes in suppressor or forward genetic screens

Project description

G-angler

pip install Gangler

Please install pandas, matplotlib, and seaborn before use.
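One way to confirm the three dependencies are present before importing the package is a quick check with the standard library. This is only a sketch; the helper name missing_packages is made up for illustration and is not part of Gangler.

```python
from importlib.util import find_spec

# Packages the project description asks you to install beforehand.
REQUIRED = ("pandas", "matplotlib", "seaborn")


def missing_packages(names=REQUIRED):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install before using Gangler: pip install " + " ".join(missing))
    else:
        print("All required packages are available.")
```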

This package helps identify candidate genes in high-throughput mutagenesis and suppressor screening experiments without mapping. Before using it, call variants with freebayes and annotate them with snpEff. Any advice is welcome; please contact:
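The variant calling and annotation steps happen outside this package. A minimal sketch of how they might be scripted is given here, assuming freebayes and Java are on the PATH and a local snpEff.jar is available. The function names, file names, and the GENOME_DB placeholder are hypothetical; pick the snpEff database that matches your reference genome.

```python
import subprocess


def build_commands(reference_fasta, bam_file, raw_vcf, snpeff_db, snpeff_jar="snpEff.jar"):
    """Build the two command lines for one strain. Nothing is executed here."""
    # freebayes writes the called variants to stdout; redirect it to raw_vcf.
    call_cmd = ["freebayes", "-f", reference_fasta, bam_file]
    # snpEff reads raw_vcf and writes the annotated VCF to stdout.
    annotate_cmd = ["java", "-jar", snpeff_jar, snpeff_db, raw_vcf]
    return call_cmd, annotate_cmd


def run_to_file(cmd, out_path):
    """Run a command and save its standard output to out_path."""
    with open(out_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)


# Hypothetical usage for a strain called juz113:
# call_cmd, annotate_cmd = build_commands("ref.fa", "juz113.bam", "juz113.raw.vcf", "GENOME_DB")
# run_to_file(call_cmd, "juz113.raw.vcf")
# run_to_file(annotate_cmd, "juz113.vcf")
```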

[e-mail]:guozhengyang980525@yahoo.co.jp

genetorch.reader

a = Gangler.prepare.pool(filepath)

The folder at filepath must contain multiple VCF files, each renamed after its strain. Example:

filepath
|---juz113.vcf
|---juz114.vcf
|---juz115.vcf
|---juz116.vcf
|---juz117.vcf
|---juz118.vcf
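Before calling pool, it can help to check that the folder really has this layout. The sketch below uses only the standard library and does not call Gangler; list_strains is a hypothetical helper that reports the strain names it finds and any files without a .vcf extension.

```python
from pathlib import Path


def list_strains(filepath):
    """Return (strain names taken from *.vcf files, names of other files)."""
    folder = Path(filepath)
    vcf_files = sorted(folder.glob("*.vcf"))
    # Files with any other extension are reported so they can be fixed or removed.
    others = sorted(
        p.name for p in folder.iterdir() if p.is_file() and p.suffix != ".vcf"
    )
    return [p.stem for p in vcf_files], others


# Hypothetical usage:
# strains, others = list_strains("path/to/filepath")
# print(strains, others)
```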

a = Gangler.prepare.getpool(filepath,filename)

The folder at filepath must contain multiple subfolders, and each subfolder must contain a VCF file named filename. Each subfolder name must use '_' to separate the strain name from the WGS order name. Example:

filepath
|---juz113_20221011jxskaosdosh---filename.vcf
|---juz114_20221011jxskaosdosh---filename.vcf
|---juz115_20221011jxskaosdosh---filename.vcf
|---juz116_20221011jxskaosdosh---filename.vcf
|---juz117_20221011jxskaosdosh---filename.vcf
|---juz118_20221011jxskaosdosh---filename.vcf

A temp folder containing the renamed VCF files is created automatically in filepath:

filepath
|----temp
|     |---juz113.vcf
|     |---juz114.vcf
|     |---juz115.vcf
|     |---juz116.vcf
|     |---juz117.vcf
|     |---juz118.vcf
|---juz113_20221011jxskaosdosh---filename.vcf
|---juz114_20221011jxskaosdosh---filename.vcf
|---juz115_20221011jxskaosdosh---filename.vcf
|---juz116_20221011jxskaosdosh---filename.vcf
|---juz117_20221011jxskaosdosh---filename.vcf
|---juz118_20221011jxskaosdosh---filename.vcf
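The renaming described above can be pictured with the following sketch. It is not the package's own implementation, only an illustration of the folder convention: collect_vcfs is a hypothetical helper, and it assumes that filename is passed with its extension (for example "filename.vcf").

```python
import shutil
from pathlib import Path


def collect_vcfs(filepath, filename):
    """Copy <strain>_<order>/<filename> to temp/<strain>.vcf and list the copies."""
    root = Path(filepath)
    temp = root / "temp"
    temp.mkdir(exist_ok=True)
    copied = []
    for folder in sorted(root.iterdir()):
        # Skip plain files, the temp folder itself, and names without '_'.
        if not folder.is_dir() or folder.name == "temp" or "_" not in folder.name:
            continue
        # The text before the first '_' is the strain name; the rest is the WGS order name.
        strain, _wgs_order = folder.name.split("_", 1)
        source = folder / filename
        if source.is_file():
            target = temp / f"{strain}.vcf"
            shutil.copyfile(source, target)
            copied.append(target.name)
    return copied
```

Splitting on the first '_' only keeps the strain name intact even if the WGS order name itself contains underscores.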

a = Gangler.prepare.pool() or a = Gangler.prepare.getpool()

a.taglist: a list of DataFrames with the columns 'gene', 'ID', 'type', 'base', 'protein', and 'tag'. The 'tag' column is filled with the strain name.

Example, a.taglist[1]:

gene	ID	type	base	protein	tag
ttn-1	WBGenexxxx	missense	C<G	Asp666Asn	juz113
cla-1	WBGenexxxx	missense	C<G	Asp223Asn	juz114
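To show where columns like these could come from, here is a sketch that pulls the first snpEff ANN entry out of one VCF data line. It is an illustration, not the package's parser: parse_ann_record is a hypothetical helper, the sample line is made up, and mapping 'base' and 'protein' to the HGVS.c and HGVS.p subfields is an assumption.

```python
def parse_ann_record(vcf_line, strain):
    """Return a dict with the six columns, or None if the line has no ANN field."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = fields[7]  # INFO is the eighth VCF column
    ann = next(
        (item[4:] for item in info.split(";") if item.startswith("ANN=")), None
    )
    if ann is None:
        return None
    # Annotations are comma-separated; subfields within one are pipe-separated.
    first = ann.split(",")[0].split("|")
    return {
        "gene": first[3],      # Gene_Name
        "ID": first[4],        # Gene_ID
        "type": first[1],      # Annotation (effect), e.g. missense_variant
        "base": first[9],      # HGVS.c (assumed mapping)
        "protein": first[10],  # HGVS.p (assumed mapping)
        "tag": strain,
    }


# Made-up example line, for illustration only.
line = (
    "I\t12345\t.\tG\tA\t50\t.\t"
    "DP=20;ANN=A|missense_variant|MODERATE|geneX|GENE_ID_X|transcript|T1.1|"
    "protein_coding|3/10|c.100G>A|p.Asp34Asn|100/900|100/900|34/300||"
)
record = parse_ann_record(line, "juz113")
```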

a = Gangler.pool.snpool(poollist,targetlist)

a.result contains the full results. A small m_value indicates a high likelihood that the gene is the target gene in the screen. Details will be explained in the bioRxiv paper.

Examples


import Gangler as gl

# Build one pool per folder of renamed VCF files.
a = gl.prepare.pool(r"C:\Users\YOUNG\Desktop\geneA")
b = gl.prepare.pool(r"C:\Users\YOUNG\Desktop\geneB")
c = gl.prepare.pool(r"C:\Users\YOUNG\Desktop\geneC")
d = gl.prepare.pool(r"C:\Users\YOUNG\Desktop\geneD")
e = gl.prepare.pool(r"C:\Users\YOUNG\Desktop\geneE")
f = gl.prepare.pool(r"C:\Users\YOUNG\Desktop\geneF")
# Combine the pools; the second list gives one target name per pool.
j = gl.pool.snpool([a, b, c, d, e, f], ['geneA', 'geneB', 'geneC', 'geneD', 'geneE', 'geneF'])
# Save the result table as a CSV file.
j.result.to_csv(r"C:\Users\YOUNG\Desktop\temp.csv")
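Since a small m_value marks a likely target gene, the exported CSV can be ranked in ascending order. The sketch below uses only the standard library. The column names "gene" and "m_value" are assumptions about the layout of the result table, and rank_candidates is a hypothetical helper; adjust both to match your own file.

```python
import csv


def rank_candidates(csv_path, gene_column="gene", score_column="m_value", top=10):
    """Return up to `top` (gene, m_value) pairs, smallest m_value first."""
    with open(csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    # Values are read as text, so convert to float before sorting.
    rows.sort(key=lambda row: float(row[score_column]))
    return [(row[gene_column], float(row[score_column])) for row in rows[:top]]


# Hypothetical usage with the file written in the example above:
# for gene, m_value in rank_candidates(r"C:\Users\YOUNG\Desktop\temp.csv"):
#     print(gene, m_value)
```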

Project details

Release history

0.0.4

Jul 18, 2022

This version

0.0.3

Jul 4, 2022

0.0.2

Jul 4, 2022

0.0.1

May 25, 2022

Download files

Source Distribution

Gangler-0.0.3.tar.gz (9.1 kB)

Uploaded Jul 4, 2022 Source

Built Distribution

Gangler-0.0.3-py3-none-any.whl (9.6 kB)

Uploaded Jul 4, 2022 Python 3

Hashes for Gangler-0.0.3.tar.gz

Algorithm	Hash digest
SHA256	`e9202920ce745e36986a37bfd41c428469e06b32efdf3ad98b55930ae0392e88`
MD5	`bb934589ab9a5181403ad8beaf5d1147`
BLAKE2b-256	`2dc967b95f90a127591e9ffee7f48ae45e708c35ffa2816a62485319a84c7c85`

Hashes for Gangler-0.0.3-py3-none-any.whl

Algorithm	Hash digest
SHA256	`801f10e49ee481454414d8a06a9ca603af0f57682a06b75a3457297ce0ac74a9`
MD5	`d675adfc287015b71e02f639048fa47e`
BLAKE2b-256	`4970b6da0cb76ef2cc7b7e677d4b841b37c6238f6aef8ad283c7febedf92c8e4`