A novel way to find the target gene in suppressor or forward genetic screens
G-angler 0.0.1
pip install Gangler
PLEASE INSTALL: pandas, matplotlib, seaborn BEFORE USE
This package helps find candidate genes in high-throughput mutagenesis and suppressor screening experiments without mapping. Call variants with freebayes and annotate them with SnpEff before using this package. Any advice is welcome; please contact
[e-mail]:guozhengyang980525@yahoo.co.jp
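Because the package expects SnpEff-annotated input, it can help to confirm that a VCF file actually carries annotations before loading it. The following is a minimal sketch, not part of Gangler: the helper name `has_snpeff_annotation` is hypothetical, and it assumes a plain-text (uncompressed) VCF in which SnpEff has written its usual `ANN=` entry into the INFO column.

```python
def has_snpeff_annotation(vcf_path):
    """Return True if any variant record has an ANN= entry in its INFO column.

    Hypothetical helper for illustration; assumes an uncompressed VCF file.
    """
    with open(vcf_path, encoding="utf-8") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and column header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 8:
                continue  # not a complete VCF record
            info = fields[7]  # INFO is the 8th VCF column
            if any(entry.startswith("ANN=") for entry in info.split(";")):
                return True
    return False
```

A file for which this returns False was probably not annotated and should be run through SnpEff first.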
genetorch.reader
a = Gangler.prepare.pool(filepath)
The filepath must contain multiple renamed VCF files, one per strain. Example:
filepath
|---juz113.vcf
|---juz114.vcf
|---juz115.vcf
|---juz116.vcf
|---juz117.vcf
|---juz118.vcf
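To check that a folder matches this layout before calling the package, the files can be listed with the standard library. This is only a sketch under assumptions: `list_strain_vcfs` is a hypothetical helper, not a Gangler function, and it assumes that each file name (without the `.vcf` extension) is the strain name.

```python
from pathlib import Path

def list_strain_vcfs(filepath):
    """Map strain name -> path for every *.vcf file directly inside filepath.

    Hypothetical helper; the strain name is taken to be the file stem.
    """
    folder = Path(filepath)
    return {path.stem: path for path in sorted(folder.glob("*.vcf"))}
```

Files with a different extension are not matched by the `*.vcf` pattern, so a misnamed file simply does not appear in the returned mapping.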
a = Gangler.prepare.getpool(filepath,filename)
The filepath must contain multiple folders, and each folder must contain a VCF file named filename. Each folder name must contain '_', which separates the strain name from the WGS order name. Example:
filepath
|---juz113_20221011jxskaosdosh---filename.vcf
|---juz114_20221011jxskaosdosh---filename.vcf
|---juz115_20221011jxskaosdosh---filename.vcf
|---juz116_20221011jxskaosdosh---filename.vcf
|---juz117_20221011jxskaosdosh---filename.vcf
|---juz118_20221011jxskaosdosh---filename.vcf
A temp folder containing the renamed VCF files is created automatically in the filepath:
filepath
|----temp
| |---juz113.vcf
| |---juz114.vcf
| |---juz115.vcf
| |---juz116.vcf
| |---juz117.vcf
| |---juz118.vcf
|---juz113_20221011jxskaosdosh---filename.vcf
|---juz114_20221011jxskaosdosh---filename.vcf
|---juz115_20221011jxskaosdosh---filename.vcf
|---juz116_20221011jxskaosdosh---filename.vcf
|---juz117_20221011jxskaosdosh---filename.vcf
|---juz118_20221011jxskaosdosh---filename.vcf
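The renaming step described above can be sketched with the standard library. This is an illustration of the described behaviour, not Gangler's actual implementation: `collect_renamed_vcfs` is a hypothetical name, and the sketch assumes that `filename` is passed with its extension and that the strain name is everything before the first '_' in the folder name.

```python
import shutil
from pathlib import Path

def collect_renamed_vcfs(filepath, filename):
    """Copy <strain>_<order>/<filename> to temp/<strain>.vcf; return the copies.

    Hypothetical sketch; 'filename' is assumed to include the .vcf extension.
    """
    root = Path(filepath)
    temp = root / "temp"
    temp.mkdir(exist_ok=True)
    copied = []
    for sub in sorted(root.iterdir()):
        # Skip plain files, the temp folder itself, and folders without '_'.
        if not sub.is_dir() or sub.name == "temp" or "_" not in sub.name:
            continue
        strain = sub.name.split("_", 1)[0]  # text before the first '_'
        source = sub / filename
        if source.is_file():
            target = temp / f"{strain}.vcf"
            shutil.copy(source, target)
            copied.append(target)
    return copied
```

Copying rather than moving leaves the original sequencing folders untouched, which matches the layout shown above where both the temp folder and the original folders are present.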
a = Gangler.prepare.pool()
a = Gangler.prepare.getpool()
a.taglist: a list of DataFrames with the columns 'gene', 'ID', 'type', 'base', 'protein', and 'tag'. The 'tag' column is filled with the strain name.
Example: a.taglist[1]:
gene | ID | type | base | protein | tag |
---|---|---|---|---|---|
ttn-1 | WBGenexxxx | missense | C<G | Asp666Asn | juz113 |
cla-1 | WBGenexxxx | missense | C<G | Asp223Asn | juz114 |
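The 'tag' column is what allows rows from different strains to be combined while still recording which strain each mutation came from. As a rough illustration only, the sketch below uses plain dictionaries in place of DataFrame rows and groups strain names by gene. The function `strains_per_gene` is hypothetical and is not the package's m_value calculation, which is described separately.

```python
from collections import defaultdict

def strains_per_gene(tables):
    """Group strain tags by gene across several per-strain row lists.

    Hypothetical helper; each row is a dict with at least 'gene' and 'tag'.
    """
    hits = defaultdict(set)
    for rows in tables:
        for row in rows:
            hits[row["gene"]].add(row["tag"])
    return {gene: sorted(tags) for gene, tags in hits.items()}

# The two rows from the example table above, reduced to the relevant columns.
example_rows = [
    {"gene": "ttn-1", "type": "missense", "protein": "Asp666Asn", "tag": "juz113"},
    {"gene": "cla-1", "type": "missense", "protein": "Asp223Asn", "tag": "juz114"},
]
print(strains_per_gene([example_rows]))
# → {'ttn-1': ['juz113'], 'cla-1': ['juz114']}
```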
a = Gangler.pool.snpool(poollist,targetlist)
a.result contains the full results. A small m_value indicates a high probability that the gene is the target gene in this screen. Details will be explained in the bioRxiv paper.
Examples
import Gangler as gl
a = gl.prepare.pool(r"C:\Users\YOUNG\Desktop\geneA")
b = gl.prepare.pool(r"C:\Users\YOUNG\Desktop\geneB")
c = gl.prepare.pool(r"C:\Users\YOUNG\Desktop\geneC")
d = gl.prepare.pool(r"C:\Users\YOUNG\Desktop\geneD")
e = gl.prepare.pool(r"C:\Users\YOUNG\Desktop\geneE")
f = gl.prepare.pool(r"C:\Users\YOUNG\Desktop\geneF")
j = gl.pool.snpool([a,b,c,d,e,f],['geneA','geneB','geneC','geneD','geneE','geneF'])
j.result.to_csv(r"C:\Users\YOUNG\Desktop\temp.csv")