Xu Yuxing's personal comparative genomics tools

yxcompgen

Installation

pip install yxcompgen

Usage

Example for orthogroups analysis

Orthogroups TSV file: the format that OrthoFinder outputs (Orthogroups.tsv). Columns are separated by tabs (\t). The first column is the orthogroup ID, and each remaining column holds the gene IDs from one species, separated by a comma and a space (, ). The file has a header line whose first column is Orthogroup and whose remaining columns are species names.

Read an orthogroups file:

from yxcompgen import OrthoGroups
OGs = OrthoGroups(OG_tsv_file="/path/to/Orthogroups.tsv")
# get orthogroup information
OGs.get(('OG0000000', 'Ath'))
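To make the file layout described above concrete, here is a minimal standard-library sketch that parses an Orthogroups.tsv-style table into nested dictionaries. It does not use yxcompgen and is not how the package reads the file. The function name parse_orthogroups_tsv and the gene IDs in the example are made up for illustration.

```python
import csv
import io


def parse_orthogroups_tsv(handle):
    """Return {orthogroup_id: {species: [gene_id, ...]}} from an open TSV file."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    species = header[1:]  # the first header cell is "Orthogroup"
    orthogroups = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        og_id, cells = row[0], row[1:]
        # Pad short rows so every species gets a (possibly empty) cell.
        cells = cells + [""] * (len(species) - len(cells))
        orthogroups[og_id] = {
            sp: [gene.strip() for gene in cell.split(",") if gene.strip()]
            for sp, cell in zip(species, cells)
        }
    return orthogroups


# Made-up example content in the layout described above.
example = (
    "Orthogroup\tAth\tSly\n"
    "OG0000000\tAT1G01010, AT1G01020\tSolyc01g005000\n"
    "OG0000001\t\tSolyc02g000010, Solyc02g000020\n"
)
ogs = parse_orthogroups_tsv(io.StringIO(example))
print(ogs["OG0000000"]["Ath"])
```

A species with no genes in an orthogroup ends up with an empty list, which keeps downstream code free of special cases for missing cells.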

Species info file: an Excel file with the columns sp_id (species ID), taxon_id (taxon ID), species_name (species name), genome_file (path to the genome file), gff_file (path to the GFF file), pt_file (path to the protein sequence file), cDNA_file (path to the cDNA sequence file), and cds_file (path to the CDS sequence file).

Read a species info file:

from yxcompgen import read_species_info
ref_xlsx = '/path/to/species_info.xlsx'
sp_info_dict = read_species_info(ref_xlsx)
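As a rough illustration of the column layout listed above, the following sketch models one row of the species info table and checks a header for missing columns. It is independent of yxcompgen. SpeciesRecord, EXPECTED_COLUMNS and missing_columns are hypothetical names, the row values are placeholders, and taxon_id is assumed here to be an NCBI Taxonomy ID. Whether the package requires every column is not stated in this document.

```python
from dataclasses import dataclass, fields


@dataclass
class SpeciesRecord:
    """One row of the species info table (column names from the description above)."""
    sp_id: str
    taxon_id: str
    species_name: str
    genome_file: str
    gff_file: str
    pt_file: str
    cDNA_file: str
    cds_file: str


# Column names in the order they are listed in the description.
EXPECTED_COLUMNS = [f.name for f in fields(SpeciesRecord)]


def missing_columns(header):
    """Return the expected columns that are absent from a header row."""
    return [name for name in EXPECTED_COLUMNS if name not in header]


# Hypothetical row; all paths are placeholders.
ath = SpeciesRecord(
    sp_id="Ath",
    taxon_id="3702",
    species_name="Arabidopsis thaliana",
    genome_file="/path/to/Ath.genome.fasta",
    gff_file="/path/to/Ath.gff3",
    pt_file="/path/to/Ath.pt.fasta",
    cDNA_file="/path/to/Ath.cDNA.fasta",
    cds_file="/path/to/Ath.cds.fasta",
)
print(missing_columns(["sp_id", "species_name", "gff_file"]))
```

Saving such rows as an .xlsx file needs a spreadsheet library and is left out of the sketch.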

Example for synteny blocks building

Input: a GFF file and a gene pair file

The GFF file should be in GFF3 format. The gene pair file should be a tab-delimited file with two columns, where each row is a gene pair from the two species.

Cca_Gene1 Sly_Gene1
Cca_Gene2 Sly_Gene2
...

sp1_id = 'Cca'
sp1_gff = '/path/to/Cca.gff3'
sp2_id = 'Sly'
sp2_gff = '/path/to/Sly.gff3'
gene_pair_file = '/path/to/gene_pair.txt'
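Before handing a gene pair file to the package, it can help to confirm that every row really has two tab-separated columns. The sketch below is a stand-alone check under that assumption. read_gene_pairs is a hypothetical helper, not part of yxcompgen.

```python
import io


def read_gene_pairs(handle):
    """Read (gene_a, gene_b) tuples from a two-column, tab-delimited file."""
    pairs = []
    for line_no, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        cols = line.split("\t")
        if len(cols) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 tab-separated columns, got {len(cols)}"
            )
        pairs.append((cols[0].strip(), cols[1].strip()))
    return pairs


# Same rows as the example above, with an explicit tab between the columns.
example = "Cca_Gene1\tSly_Gene1\nCca_Gene2\tSly_Gene2\n"
pairs = read_gene_pairs(io.StringIO(example))
print(pairs)
```

Raising an error on a malformed row, rather than skipping it, surfaces files where tabs were silently converted to spaces.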

Build synteny blocks

from yxcompgen import GenomeSyntenyBlockJob
sb_job = GenomeSyntenyBlockJob(
    sp1_id, sp1_gff, sp2_id, sp2_gff, gene_pair_file)
sb_job.build_synteny_blocks()
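For readers new to the idea, the toy sketch below shows one simple way collinear gene pairs can be grouped into blocks: each gene pair becomes an anchor given by the gene ranks on the two chromosomes, and nearby anchors that keep a consistent direction are chained greedily. This is a deliberately simplified illustration, not the algorithm used by build_synteny_blocks. chain_anchors, max_gap and min_size are made-up names and parameters.

```python
from collections import defaultdict


def chain_anchors(anchors, max_gap=2, min_size=3):
    """Greedily chain anchors (chr1, rank1, chr2, rank2) into collinear blocks.

    Returns a list of ((chr1, chr2), [(rank1, rank2), ...]) tuples.
    """
    by_chrom_pair = defaultdict(list)
    for chr1, rank1, chr2, rank2 in anchors:
        by_chrom_pair[(chr1, chr2)].append((rank1, rank2))

    blocks = []
    for chrom_pair, points in sorted(by_chrom_pair.items()):
        points.sort()  # walk along the first genome
        current, direction = [], 0  # direction: +1 forward, -1 reversed, 0 unset
        for i, j in points:
            if current:
                di = i - current[-1][0]
                dj = j - current[-1][1]
                step = 1 if dj > 0 else -1
                close = 1 <= di <= max_gap and 1 <= abs(dj) <= max_gap
                if close and direction in (0, step):
                    current.append((i, j))
                    direction = step
                    continue
                # The chain is broken: keep it only if it is long enough.
                if len(current) >= min_size:
                    blocks.append((chrom_pair, current))
            current, direction = [(i, j)], 0
        if len(current) >= min_size:
            blocks.append((chrom_pair, current))
    return blocks


# Made-up anchors: a forward run, a reversed run, and an isolated pair.
anchors = [
    ("chr1", 1, "chrA", 1), ("chr1", 2, "chrA", 2), ("chr1", 4, "chrA", 3),
    ("chr1", 10, "chrA", 20), ("chr1", 11, "chrA", 19), ("chr1", 12, "chrA", 18),
    ("chr2", 5, "chrA", 7),
]
blocks = chain_anchors(anchors)
for chrom_pair, points in blocks:
    print(chrom_pair, points)
```

Because the chaining only compares consecutive anchors, interleaved blocks on the same chromosome pair would be split. Real tools typically use dynamic programming to avoid this.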

Write synteny blocks to a file

The output file is in MCScan format

mcscan_output_file = "/path/to/collinearity_output.txt"
sb_job.write_mcscan_output(mcscan_output_file)
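If the output needs to be inspected outside the package, a small reader can be sketched as follows. It assumes the file follows the layout of an MCScanX .collinearity file: each block starts with a "## Alignment N:" header and is followed by tab-separated gene pair lines. That assumption is not confirmed by this document, so the layout should be checked against a real output file first. read_collinearity is a hypothetical helper and the sample values are made up.

```python
import io
import re

ALIGNMENT_RE = re.compile(r"^## Alignment (\d+):")


def read_collinearity(handle):
    """Return {block_id: [(gene_a, gene_b), ...]} from a collinearity-style file."""
    blocks = {}
    current = None
    for line in handle:
        line = line.rstrip("\r\n")
        match = ALIGNMENT_RE.match(line)
        if match:
            current = int(match.group(1))
            blocks[current] = []
        elif line.startswith("#") or not line.strip():
            continue  # parameter/comment lines and blank lines
        elif current is not None:
            cols = line.split("\t")
            if len(cols) >= 3:
                # Column 0 is the pair index; columns 1 and 2 are the gene IDs.
                blocks[current].append((cols[1].strip(), cols[2].strip()))
    return blocks


# Made-up sample in the assumed layout.
sample = (
    "# MATCH_SCORE: 50\n"
    "## Alignment 0: score=150.0 e_value=1e-10 N=2 Cca1&Sly1 plus\n"
    "  0-  0:\tCca_Gene1\tSly_Gene1\t  1e-50\n"
    "  0-  1:\tCca_Gene2\tSly_Gene2\t  1e-40\n"
    "## Alignment 1: score=100.0 e_value=1e-5 N=1 Cca2&Sly3 minus\n"
    "  1-  0:\tCca_Gene9\tSly_Gene7\t  1e-30\n"
)
sb = read_collinearity(io.StringIO(sample))
print(sb[0])
```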

Alternatively, you can read synteny blocks from an existing file

sb_job = GenomeSyntenyBlockJob(
    sp1_id, sp1_gff, sp2_id, sp2_gff)
sb_job.read_mcscan_output(mcscan_output_file)

You can also work with only one genome

sb_job = GenomeSyntenyBlockJob(
    sp1_id, sp1_gff, gene_pair_file=gene_pair_file)

Example for synteny blocks plot

sb_job.plot()
highlight_sb_list = [65, 178, 237, 331]
sb_job.plot(mode='loci', reverse=True, highlight_synteny_blocks=highlight_sb_list)

Project details

Version 0.0.1, released Oct 20, 2024
