20 projects
gbseqextractor
Extract the DNA sequences of any CDS, rRNA, or tRNA genes from a GenBank file.
msaconverter
To convert a multiple sequence alignment (MSA) to a different format.
extract-specific-lines
To get the lines of a subject file that match the query IDs. Written by Guanliang MENG.
rawdatafilter
raw_data_filter wrapper
bold-identification
To get taxonomic information for sequences from the BOLD system.
mglcmdtools
Common cmd tools for use in Python 3 scripts. By Guanliang MENG, see https://github.com/linzhi2013/mglcmdtools.
taxonomy-ranks
To get taxonomic rank information from the NCBI Taxonomy database with ETE3.
batch-run-cmd
To run commands line by line and check the exit status of each.
extract-codon-alignment
To extract some codon positions (1st, 2nd, 3rd) from a CDS alignment.
batch-tar
To tar/compress files/directories in batch mode.
extractfq
Extract some fastq reads (PE/SE) from the beginning of the files
sgejob
To collect SGE job information with a daemon.
depth-stat
To extract the sequence depth from a depth file.
polish-genbank
To check for internal stop codons in a GenBank or FASTA file (CDS), then substitute each internal stop codon with NNN.
cigar-coordinates
To get the coordinates of a given CIGAR string. By Guanliang MENG, see https://github.com/linzhi2013/cigar_coordinates.
msa-cigars
To return the CIGAR strings of a multiple sequence alignment.
breakSeqInNs-then-translate
To filter sequences by translating their protein-coding genes (PCGs) with the proper genetic code table; if any PCG has an internal stop codon, the sequence is filtered out. See https://github.com/linzhi2013/breakSeqInNs_then_translate
extract-fasta-seq
To extract specific fasta sequences from a fasta file. By Guanliang MENG, see https://github.com/linzhi2013
extract-specific-sites-from-msa
To extract some sites from a multiple sequence alignment. By Guanliang MENG, go to https://github.com/linzhi2013 for more details.
atgcN-count
To report the count and percentage of each base in a FASTA file.
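
As an illustration of the kind of per-base statistic described for atgcN-count, here is a minimal Python sketch. It is not taken from the package: the function name `base_composition` and the in-memory example input are hypothetical, and the package's actual interface and output format may differ.

```python
from collections import Counter


def base_composition(fasta_text):
    """Return {base: (count, percentage)} over all sequence lines.

    Hypothetical helper for illustration only; not the atgcN-count API.
    """
    counts = Counter()
    for line in fasta_text.splitlines():
        line = line.strip()
        # Skip blank lines and FASTA header lines.
        if not line or line.startswith(">"):
            continue
        # Count bases case-insensitively.
        counts.update(line.upper())
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {base: (n, 100.0 * n / total) for base, n in sorted(counts.items())}


# Made-up two-record input, used only to show the call.
example = ">seq1\nATGCN\n>seq2\natgca\n"
composition = base_composition(example)
for base, (n, pct) in composition.items():
    print(f"{base}\t{n}\t{pct:.1f}")
```

Counting is done over all records together; a per-record breakdown would need one `Counter` per header line.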