VTAM (Validation and Taxonomic Assignation of Metabarcoding Data) is a metabarcoding pipeline. The analysis starts from high-throughput sequencing (HTS) data of amplicons of one or several metabarcoding markers and produces an amplicon sequence variant (ASV) table of validated variants assigned to taxonomic groups.
Project description
VTAM is a metabarcoding package with commands to process high-throughput sequencing (HTS) data of amplicons of one or several metabarcoding markers in FASTQ format and to produce a table of amplicon sequence variants (ASVs) assigned to taxonomic groups. If you use VTAM in scientific work, please cite the following article:
González, A., Dubut, V., Corse, E., Mekdad, R., Dechatre, T. and Meglécz, E. VTAM: A robust pipeline for processing metabarcoding data using internal controls. bioRxiv: 10.1101/2020.11.06.371187v1.
Commands for a quick installation:
conda create --name vtam python=3.9 -y
conda activate vtam
Then install the dependencies and VTAM itself:
python3 -m pip install cutadapt
conda install -c bioconda blast -y
conda install -c bioconda vsearch -y
python3 -m pip install vtam
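Before running the example, it can help to confirm that the tools installed above are reachable from the active conda environment. The following is a minimal Python sketch, not part of VTAM. The helper name `missing_tools` is hypothetical, and the executable names (`vtam`, `cutadapt`, `vsearch`, and `blastn` for the BLAST package) are assumptions about how the installed tools appear on the PATH.

```python
import shutil

# Assumed executable names for the tools installed above.
REQUIRED_TOOLS = ("vtam", "cutadapt", "vsearch", "blastn")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

If a tool is reported as missing, check that the `vtam` conda environment is activated before rerunning the installation commands.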
Commands for a quick working example:
vtam example
cd example
snakemake --printshellcmds --resources db=1 --snakefile snakefile.yml --cores 4 --configfile asper1/user_input/snakeconfig_mfzr.yml --until asvtable_taxa
The table of amplicon sequence variants (ASVs) is here:
(vtam) user@host:~/vtam/example$ head -n4 asper1/run1_mfzr/asvtable_default_taxa.tsv
run marker variant sequence_length read_count tpos1_run1 tnegtag_run1 14ben01 14ben02 clusterid clustersize chimera_borderline ltg_tax_id ltg_tax_name ltg_rank identity blast_db phylum class order family genus species sequence
run1 MFZR 25 181 478 478 0 0 0 25 1 False 131567 cellular organisms no rank 80 coi_blast_db_20200420 ACTATACCTTATCTTCGCAGTATTCTCAGGAATGCTAGGAACTGCTTTTAGTGTTCTTATTCGAATGGAACTAACATCTCCAGGTGTACAATACCTACAGGGAAACCACCAACTTTACAATGTAATCATTACAGCTCACGCATTCCTAATGATCTTTTTCATGGTTATGCCAGGACTTGTT
run1 MFZR 51 181 165 0 0 0 165 51 1 False coi_blast_db_20200420 ACTATATTTAATTTTTGCTGCAATTTCTGGTGTAGCAGGAACTACGCTTTCATTGTTTATTAGAGCTACATTAGCGACACCAAATTCTGGTGTTTTAGATTATAATTACCATTTGTATAATGTTATAGTTACGGGTCATGCTTTTTTGATGATCTTTTTTTTAGTAATGCCTGCTTTATTG
run1 MFZR 88 175 640 640 0 0 0 88 1 False 1592914 Caenis pusilla species 100 coi_blast_db_20200420 Arthropoda Insecta Ephemeroptera Caenidae Caenis Caenis pusilla ACTATATTTTATTTTTGGGGCTTGATCCGGAATGCTGGGCACCTCTCTAAGCCTTCTAATTCGTGCCGAGCTGGGGCACCCGGGTTCTTTAATTGGCGACGATCAAATTTACAATGTAATCGTCACAGCCCATGCTTTTATTATGATTTTTTTCATGGTTATGCCTATTATAATC
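The ASV table is a tab-separated file, so it can be read with Python's standard `csv` module. The sketch below is an illustration, not part of VTAM: the helper names `read_asv_table` and `with_taxon` are hypothetical, and the column names are taken from the header shown above. The in-memory sample is a reduced, four-column version of two rows from that output.

```python
import csv
import io


def read_asv_table(handle):
    """Parse a tab-separated ASV table into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


def with_taxon(rows):
    """Keep only the variants that have a non-empty ltg_tax_id."""
    return [row for row in rows if (row.get("ltg_tax_id") or "").strip()]


if __name__ == "__main__":
    # Reduced sample: variant 51 has no taxonomic assignment, variant 88 has one.
    sample = (
        "variant\tread_count\tltg_tax_id\tltg_tax_name\n"
        "51\t165\t\t\n"
        "88\t640\t1592914\tCaenis pusilla\n"
    )
    for row in with_taxon(read_asv_table(io.StringIO(sample))):
        print(row["variant"], row["ltg_tax_name"], row["read_count"])
```

To apply it to the example output, open `asper1/run1_mfzr/asvtable_default_taxa.tsv` with `newline=""` and pass the file handle to `read_asv_table`.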
The database of intermediate data is here:
(vtam) user@host:~/vtam/example$ sqlite3 asper1/db.sqlite '.tables'
FilterChimera Sample
FilterChimeraBorderline SampleInformation
FilterCodonStop SortedReadFile
FilterIndel TaxAssign
FilterLFN Variant
FilterLFNreference VariantReadCount
FilterMinReplicateNumber wom_Execution
FilterMinReplicateNumber2 wom_FileInputOutputInformation
FilterMinReplicateNumber3 wom_Option
FilterPCRerror wom_TableInputOutputInformation
FilterRenkonen wom_TableModificationTime
Marker wom_ToolWrapper
ReadCountAverageOverReplicates wom_TypeInputOrOutput
Run
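The same database can also be inspected from Python with the standard `sqlite3` module. The following is a minimal sketch, not part of VTAM: the helper names `list_tables` and `count_rows` are hypothetical, and nothing is assumed about the columns of any table. Only the table names are read from SQLite's own `sqlite_master` catalog.

```python
import os
import sqlite3


def list_tables(connection):
    """Return the names of all tables in the database, sorted by name."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def count_rows(connection, table):
    """Return the number of rows in `table`, which must be an existing table."""
    if table not in list_tables(connection):
        raise ValueError(f"Unknown table: {table}")
    # The name was checked against the catalog above, so quoting it is safe.
    (count,) = connection.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
    return count


if __name__ == "__main__":
    db_path = "asper1/db.sqlite"  # path used in the example above
    if os.path.isfile(db_path):
        connection = sqlite3.connect(db_path)
        try:
            for table in list_tables(connection):
                print(table, count_rows(connection, table))
        finally:
            connection.close()
    else:
        print("Database not found:", db_path)
```

Run from the `example` directory, this prints each table name with its row count. The file-existence check is there because `sqlite3.connect` would otherwise create an empty database at a mistyped path.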
The VTAM documentation is hosted at ReadTheDocs.
VTAM is maintained by Aitor González (aitor dot gonzalez at univ-amu dot fr) and Emese Meglécz (emese dot meglecz at univ-amu dot fr).