To get taxonomic information for sequences from the BOLD system
Extract CDS, rRNA, or tRNA gene sequences from a GenBank file.
Common command-line tools to be used in Python3 scripts. By Guanliang MENG, see https://github.com/linzhi2013/mglcmdtools.
To get taxonomic rank information from the NCBI Taxonomy database with ETE3.
To run commands line by line and check the exit status of each
To extract some codon positions (1st, 2nd, 3rd) from a CDS alignment.
To tar/compress files/directories in batch mode.
Extract some fastq reads (PE/SE) from the beginning of the files
To collect SGE job information with a daemon
To extract the sequence depth from a depth file.
To check for internal stop codons in a GenBank or FASTA file (CDS), then substitute each internal stop codon with NNN.
To get the coordinates of a given CIGAR string. By Guanliang MENG, see https://github.com/linzhi2013/cigar_coordinates.
To return the CIGAR strings of a multiple sequence alignment
To filter sequences by translating their protein-coding genes (PCGs) with the proper genetic code table: if any of the PCGs has an internal stop codon, the sequence is filtered out. See https://github.com/linzhi2013/breakSeqInNs_then_translate
To extract specific fasta sequences from a fasta file. By Guanliang MENG, see https://github.com/linzhi2013
To get the lines from the subject file that match the query IDs. Written by Guanliang MENG
To extract some sites from a multiple sequence alignment. By Guanliang MENG, go to https://github.com/linzhi2013 for more details.
To report the count and percentage of each base in a FASTA file
To convert multiple sequence alignments (MSA) to different formats
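
As an illustration of the codon-position extraction listed above, here is a minimal sketch in plain Python. It is not the code of the tool itself: the function name `extract_codon_positions` and the example alignment are hypothetical, and the sketch assumes every aligned CDS is in frame and starts at codon position 1.

```python
def extract_codon_positions(seq, positions=(1, 2)):
    """Return the bases at the given codon positions (1-based) of an in-frame CDS.

    Hypothetical helper for illustration only; assumes the sequence
    starts at the first position of a codon.
    """
    wanted = sorted(set(positions))
    if not wanted or any(p not in (1, 2, 3) for p in wanted):
        raise ValueError("positions must be drawn from 1, 2, 3")
    # Walk codon by codon; within each codon keep only the requested offsets.
    # The bounds check tolerates a trailing partial codon.
    return "".join(
        seq[i + p - 1]
        for i in range(0, len(seq), 3)
        for p in wanted
        if i + p - 1 < len(seq)
    )


# Made-up two-sequence CDS alignment; gaps are treated like any other character.
alignment = {"sp1": "ATGGCC---TAA", "sp2": "ATGGCTAAATAA"}
third_positions = {
    name: extract_codon_positions(seq, [3]) for name, seq in alignment.items()
}
for name, seq in third_positions.items():
    print(f">{name}\n{seq}")
```

Because the same columns are kept for every sequence, the result is still an alignment, provided all input sequences share the same length and reading frame.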