decOM: K-mer method for aOral metagenome decontamination
decOM: Microbial source tracking for contamination assessment of ancient oral samples using k-mer-based methods
decOM is a high-accuracy microbial source tracking method suited to contamination quantification in paleogenomics, namely the analysis of collections of possibly contaminated ancient oral metagenomic data sets. In simple terms, if you want to know how contaminated your ancient oral metagenomic sample is, this tool will help :)
System requirements
decOM has been developed and tested under a Linux environment.
It requires certain packages/tools in order to be installed/used:
Installation
Install decOM through conda:
conda install -c camiladuitama decom
To make the decOM command available, it is advised to include the absolute path of decOM in your PATH environment variable by adding the following line to your ~/.bashrc file:
export PATH=/absolute/path/to/decOM:${PATH}
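After editing ~/.bashrc and opening a new shell, you may want to confirm that the command can actually be found. The following is a minimal sketch using only the Python standard library; the helper name find_on_path is hypothetical and not part of decOM.

```python
import shutil


def find_on_path(command, path=None):
    """Return the full path of `command` if it is found on PATH, else None."""
    return shutil.which(command, path=path)


if __name__ == "__main__":
    location = find_on_path("decOM")
    if location is None:
        print("decOM not found: check the export line in ~/.bashrc and open a new shell")
    else:
        print(f"decOM found at {location}")
```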
Before running decOM
Users of decOM can represent their own metagenomic sample as a presence/absence vector of k-mers using kmtricks, and compare this new sink against the collection of sources we have put together. This means that before running decOM you must first download the archive decOM_sources.tar.gz and decompress it:
wget https://zenodo.org/record/6513520/files/decOM_sources.tar.gz
tar -xf decOM_sources.tar.gz
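To make the idea of a presence/absence vector of k-mers more concrete, here is a toy illustration in plain Python. It is only a sketch of the concept: the reference k-mer list and the read are made up, reverse complements are ignored, and this is not how kmtricks or decOM build their matrices internally.

```python
def kmers(sequence, k):
    """Return the set of all k-mers (substrings of length k) in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def presence_absence(sample_kmers, reference_kmers):
    """Return a 0/1 vector with one entry per reference k-mer, in order."""
    return [1 if kmer in sample_kmers else 0 for kmer in reference_kmers]


# Toy data: a hypothetical reference k-mer list and a hypothetical sink read.
reference = ["ACG", "CGT", "GTA", "TTT"]
sink_read = "ACGTA"

vector = presence_absence(kmers(sink_read, 3), reference)
print(vector)  # [1, 1, 1, 0]
```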
Test
You can test whether decOM is working by using one of the aOral samples present in the test/sample/ folder, e.g. SRR13355787:
decOM -s SRR13355787 -p_sources decOM_sources/ -k SRR13355787_key.fof -mem 10GB -t 5 -o decOM_output/
Note: The total memory allocated for each run of decOM is the value you pass to -mem multiplied by the number of cores (the value of -t). The previous run therefore used 10GB * 5 = 50 GB.
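The arithmetic in the note can be checked before launching a run. The helper below is a small sketch, assuming the memory value is written as a number followed by a unit (as in 10GB); the function name total_memory is hypothetical and not part of decOM.

```python
import re


def total_memory(mem_per_core, cores):
    """Split a value such as '10GB' into number and unit, then scale by cores."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", mem_per_core)
    if match is None:
        raise ValueError(f"cannot parse memory value: {mem_per_core!r}")
    amount = float(match.group(1))
    unit = match.group(2)
    return amount * cores, unit


amount, unit = total_memory("10GB", 5)
print(f"{amount:g} {unit}")  # 50 GB
```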
Output files
decOM will output one .csv file with the k-mer counts and proportions, a folder with the vector representing the sink, and a barplot if requested by the user:
decOM_output/
├──{sink}_OM_output.csv
├──result_plot_{sink}.pdf
├──{sink}_vector/
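If you script several runs, it can be handy to check that the files listed above were actually written. The sketch below only builds the three paths from the layout shown and reports which ones are missing; the function names are hypothetical, and nothing is assumed about the contents of the .csv file.

```python
from pathlib import Path


def expected_outputs(output_dir, sink):
    """Build the output paths for a sink, following the layout shown above."""
    out = Path(output_dir)
    return {
        "table": out / f"{sink}_OM_output.csv",
        "plot": out / f"result_plot_{sink}.pdf",
        "vector": out / f"{sink}_vector",
    }


def missing_outputs(output_dir, sink, expect_plot=True):
    """Return the expected paths that do not exist on disk."""
    paths = expected_outputs(output_dir, sink)
    if not expect_plot:
        # The plot is only produced when the user asks for it.
        del paths["plot"]
    return [str(p) for p in paths.values() if not p.exists()]


if __name__ == "__main__":
    for path in missing_outputs("decOM_output/", "SRR13355787"):
        print(f"missing: {path}")
```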
Example from an input fastq/fasta file
You can use as input a fastq/fasta file from your own experiment, or download an ancient oral sample of interest from the AncientMetagenomeDir or from the SRA.
Once you have downloaded the archive with the matrix of sources, decOM_sources.tar.gz, and your fastq file(s) of interest (from now on called the sink), you have to create one key.fof file per sink.
The key.fof file has a single line of text, whose form depends on your type of data:
-Paired-end:
s : path/to/file/s_1.fastq.gz; path/to/file/s_2.fastq.gz
-Single-end:
s : path/to/file/s_1.fastq.gz
Note: As decOM relies on kmtricks, your input may be in FASTA or FASTQ format, gzipped or not; adjust the file names in the key.fof file accordingly.
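The key.fof line can also be generated from a script. The sketch below follows the layout shown above (sink name, a colon, then the read file paths separated by "; "); the helper names key_fof_line and write_key_fof are hypothetical and not part of decOM or kmtricks.

```python
from pathlib import Path


def key_fof_line(sink, read_files):
    """Format one key.fof line: '<sink> : <file1>; <file2>; ...'."""
    if not read_files:
        raise ValueError("at least one read file is required")
    joined = "; ".join(str(f) for f in read_files)
    return f"{sink} : {joined}"


def write_key_fof(sink, read_files, destination):
    """Write the single-line key.fof file for one sink."""
    Path(destination).write_text(key_fof_line(sink, read_files) + "\n")


print(key_fof_line("s", ["path/to/file/s_1.fastq.gz"]))
# s : path/to/file/s_1.fastq.gz
print(key_fof_line("s", ["path/to/file/s_1.fastq.gz", "path/to/file/s_2.fastq.gz"]))
# s : path/to/file/s_1.fastq.gz; path/to/file/s_2.fastq.gz
```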
Since you now have the fasta/fastq file of your sink, the folder with the matrix of sources and the key file, simply run decOM as follows:
decOM -s {SINK} -p_sources decOM_sources/ -k {KEY.FOF} -mem {MEMORY} -t {THREADS} -o {OUTPUT}
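When running decOM over many sinks, the command above can be assembled programmatically. This is a sketch that uses only the flags documented here; the function names are hypothetical, and run_decom will only work if decOM is installed and on your PATH.

```python
import shlex
import subprocess


def decom_command(sink, sources, key, memory, threads, output, plot=None):
    """Build the decOM argument list from the documented options."""
    cmd = [
        "decOM",
        "-s", sink,
        "-p_sources", sources,
        "-k", key,
        "-mem", memory,
        "-t", str(threads),
        "-o", output,
    ]
    if plot is not None:
        # -p takes True or False, as described in the command line options.
        cmd += ["-p", str(plot)]
    return cmd


def run_decom(cmd):
    """Launch decOM and raise an error if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


cmd = decom_command("SRR13355787", "decOM_sources/", "SRR13355787_key.fof",
                    "10GB", 5, "decOM_output/")
print(shlex.join(cmd))
# decOM -s SRR13355787 -p_sources decOM_sources/ -k SRR13355787_key.fof -mem 10GB -t 5 -o decOM_output/
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.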
Command line options
usage: modules [-h] -s SINK -p_sources PATH_SOURCES -k KEY -mem MEMORY -t THREADS -o OUTPUT [-p PLOT] [-V]
Microbial source tracking for contamination assessment of ancient oral samples using k-mer-based methods
optional arguments:
-h, --help show this help message and exit
-s SINK, --sink SINK Write down the name of your sink
-p_sources PATH_SOURCES, --path_sources PATH_SOURCES
path to folder downloaded from https://zenodo.org/record/6385193#.Ym-wTy8RphA
-k KEY, --key KEY filtering key (a kmtricks fof with only one sample).
-mem MEMORY, --memory MEMORY
Write down how much memory you want to use for this process. Ex: 20GiB
-t THREADS, --threads THREADS
Number of threads to use. Ex: 5
-o OUTPUT, --output OUTPUT
Path to output folder, where you want decOM to write the results
-p PLOT, --plot PLOT True if you want a plot with the source proportions of the sink, else False
-V, --version Show version number and exit