eplacer

Machine learning platform for taxonomic classification

These details have not been verified by PyPI

License
- Public Domain
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

ePlacer

ePlacer is a taxonomic classification tool that uses deep-learning approaches to incorporate both sequence information and biogeographic information into taxonomic assignment of DNA sequences.

Why use ePlacer

ePlacer's machine learning architecture enables prediction beyond sequence-only classification tools (e.g., sequence alignment with BLAST or naive Bayes classifiers) by directly incorporating additional data into the probabilistic estimate of taxonomy. It was developed specifically for metabarcoding data. This novel application of deep learning is useful because there can be many cases in metabarcoding data where two reference species have 100% sequence overlap but distinct geographic ranges. ePlacer discriminates between these cases and provides additional data for downstream taxonomic curation. As a result, ePlacer improves interoperability between metabarcoding datasets.

Currently, ePlacer offers pre-trained models for two popular metabarcoding regions: the MiFish and the ecoPrimer, or Riaz, marker gene regions. For these two regions, ePlacer offers the following benefits:

Interoperability. ePlacer is trained on global datasets, allowing direct comparison between metabarcoding datasets regardless of geographic region.
Portability. ePlacer has pre-trained models for both the MiFish and Riaz marker gene regions, containerized and available for out-of-the-box use.
Interactive Visualization. ePlacer provides an interactive GUI and curation tool that allows
Increased Accuracy. The ePlacer model architecture provides increased accuracy, precision, and recall compared with BLAST, naive Bayes, or least common ancestor approaches.
Trainability. In addition to the two provided barcodes, this code repository provides tools for training new models.

For other barcode regions, training a new model is expected to offer significant advantages. If you are interested in training a new model for ePlacer, please do not hesitate to reach out!

Installation

Users can install the current version of ePlacer with conda.

conda install bioconda::eplacer

Using ePlacer for classification

The ePlacer taxonomic assignment tool can be run in two ways: natively (through the ePlacer CLI or API) or with a QIIME2 plugin. This documentation covers native usage. Details on the QIIME2 plugin can be found in the linked git repository.

ePlacer taxonomically classifies ASV sequences using two distinct types of information:

Sequence information (inferred from ASVs)
Biogeography (inferred from sample metadata and count tables)

Although not strictly required for assignment, BLAST results are also used as an automated curation step: they identify "solvable" taxonomic assignments and resolve them more accurately.

Using this information, ePlacer generates a raw confidence of presence across all possible taxonomic labels.
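
To make the idea concrete, here is a toy sketch of how a geographic signal can separate two references that a sequence-only score cannot. This is not ePlacer's model, which is a trained deep-learning network. The function names, the exponential distance decay, the 500 km scale, and the occurrence coordinates are all illustrative assumptions.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))

def toy_confidence(seq_scores, occurrences, sample, scale_km=500.0):
    """Weight each sequence score by distance to the nearest known occurrence, then normalize."""
    weighted = {}
    for species, score in seq_scores.items():
        nearest = min(
            haversine_km(sample[0], sample[1], lat, lon)
            for lat, lon in occurrences[species]
        )
        weighted[species] = score * math.exp(-nearest / scale_km)
    total = sum(weighted.values())
    return {species: value / total for species, value in weighted.items()}

# Two references with identical sequence scores but distinct (made-up) ranges.
seq_scores = {"SpeciesLabelA": 1.0, "SpeciesLabelB": 1.0}
occurrences = {
    "SpeciesLabelA": [(39.6, -71.7)],
    "SpeciesLabelB": [(-33.9, 18.4)],
}
conf = toy_confidence(seq_scores, occurrences, sample=(39.645946, -71.746641))
print(max(conf, key=conf.get))
```

With a sample taken close to the only recorded occurrence of SpeciesLabelA, nearly all of the normalized weight goes to that label even though the sequence scores are tied.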

To run classification with ePlacer, four data files are required. Properly formatted examples are shown below:

A fasta file of ASVs

>ASV1
CCGTAAACTTAGATAAATTAGTACAACAAATATCGGCCCGGGAACT
>ASV2
CGGTAAACTTAGATATATTAGTACAACAAATATCGGCCCGGGAACT
>ASV3
CGGTAAACTTAGATATATTAGTACAACAAATATCGGCCCGGGAACT

A geography metadata file

#SampleID	Latitude	Longitude
Sample1	39.645946	-71.746641
Sample2	39.645946	-71.746641

A count table

#OTU ID	Sample1	Sample2
ASV1	15	0
ASV2	5	22
ASV3	0	10
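
Before running classification, it can help to confirm that the three files above agree with each other. The sketch below is a hypothetical pre-flight check, not part of ePlacer: it assumes tab-delimited tables, and it assumes that ASV identifiers should match between the fasta file and the count table and that every counted sample should have coordinates. The sequences are shortened for brevity.

```python
import csv
import io

def fasta_ids(handle):
    """Return the identifiers (first word after '>') in a FASTA file."""
    return [line[1:].split()[0] for line in handle
            if line.startswith(">") and line[1:].strip()]

def tsv_rows(handle):
    """Return the non-empty rows of a tab-separated file."""
    return [row for row in csv.reader(handle, delimiter="\t") if row]

def check_inputs(fasta, counts, geo):
    """Collect human-readable problems found across the three inputs."""
    problems = []
    asvs = set(fasta_ids(fasta))
    count_rows = tsv_rows(counts)
    geo_rows = tsv_rows(geo)
    count_samples = set(count_rows[0][1:])          # header minus the '#OTU ID' cell
    counted_asvs = {row[0] for row in count_rows[1:]}
    geo_samples = {row[0] for row in geo_rows[1:]}

    for asv in sorted(asvs - counted_asvs):
        problems.append(f"{asv} is in the fasta file but not the count table")
    for asv in sorted(counted_asvs - asvs):
        problems.append(f"{asv} is in the count table but not the fasta file")
    for sample in sorted(count_samples - geo_samples):
        problems.append(f"{sample} has counts but no coordinates")
    for row in geo_rows[1:]:
        lat, lon = float(row[1]), float(row[2])
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            problems.append(f"{row[0]} has out-of-range coordinates")
    return problems

fasta = io.StringIO(">ASV1\nCCGT\n>ASV2\nCGGT\n>ASV3\nCGGT\n")
counts = io.StringIO("#OTU ID\tSample1\tSample2\nASV1\t15\t0\nASV2\t5\t22\nASV3\t0\t10\n")
geo = io.StringIO(
    "#SampleID\tLatitude\tLongitude\n"
    "Sample1\t39.645946\t-71.746641\n"
    "Sample2\t39.645946\t-71.746641\n"
)
problems = check_inputs(fasta, counts, geo)
print(problems or "inputs are consistent")
```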

BLAST output (generated with -outfmt "6 qseqid sseqid pident evalue length qlen slen qstart qend sstart send sseq")

ASV1	SubjectRef_A	100.00	1.45e-45	98	98	98	1	98	1	98	GCCGTAAACTTAGATAAATTAGTACAACAAATATCGGCCCGGGAACTACGAGCGCCAGCTTATAACCCAAAGGACTTGGCGCTGCTTCAGACCCCCCT
ASV2	SubjectRef_B	99.00	2.12e-42	98	98	98	1	98	1	98	GCGGTAAACTTAGATATATTAGTACAACAAATATCGGCCCGGGAACTACGAGCGCCTGCTTAAAACCCAAAGGTCTTGGCGGTGCTTCAGACCCCCCT
ASV3	SubjectRef_C	100.00	1.45e-45	98	98	98	1	98	1	98	GCGGTAAACTTAGATATATTAGTACAACAAATATCGGCCCGGGAACTACGAGCGCCTGCTTAAAACCCAAAGGTCTTGGCGGTGCTTCAGACCCCCCT
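
The twelve columns above follow the order given in the -outfmt string. As a rough sketch of how such a file could be inspected outside ePlacer, the code below reads the rows into named fields and lists, for each ASV, the subjects that tie at the highest percent identity. The helper names are hypothetical, SubjectRef_D is a made-up second hit added to show a tie, and the sequences are shortened.

```python
import csv
import io
from collections import defaultdict

COLUMNS = "qseqid sseqid pident evalue length qlen slen qstart qend sstart send sseq".split()
INT_COLUMNS = {"length", "qlen", "slen", "qstart", "qend", "sstart", "send"}
FLOAT_COLUMNS = {"pident", "evalue"}

def read_blast(handle):
    """Parse tab-separated BLAST rows into dicts keyed by the column names above."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(row)}")
        hit = dict(zip(COLUMNS, row))
        for name in INT_COLUMNS:
            hit[name] = int(hit[name])
        for name in FLOAT_COLUMNS:
            hit[name] = float(hit[name])
        hits.append(hit)
    return hits

def top_subjects(hits):
    """Map each query to the subjects sharing its highest percent identity."""
    by_query = defaultdict(list)
    for hit in hits:
        by_query[hit["qseqid"]].append(hit)
    result = {}
    for query, query_hits in by_query.items():
        best = max(h["pident"] for h in query_hits)
        result[query] = sorted({h["sseqid"] for h in query_hits if h["pident"] == best})
    return result

example = (
    "ASV1\tSubjectRef_A\t100.00\t1.45e-45\t98\t98\t98\t1\t98\t1\t98\tGCCG\n"
    "ASV1\tSubjectRef_D\t100.00\t1.45e-45\t98\t98\t98\t1\t98\t1\t98\tGCCG\n"
    "ASV2\tSubjectRef_B\t99.00\t2.12e-42\t98\t98\t98\t1\t98\t1\t98\tGCGG\n"
)
ties = top_subjects(read_blast(io.StringIO(example)))
print(ties)
```

Queries with more than one subject in this mapping are the ones where sequence identity alone cannot separate the candidates.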

Acquiring pre-trained models.

Pre-trained models can be acquired from Zenodo (doi:10.5281/zenodo.20820029). Currently, only models for the 12S-V5 ecoPrimer (Riaz) and MiFish primers are available, but others will be created and stored in the future. If you develop your own model, please don't hesitate to reach out.

Natively trained models are distributed as archives that unpack into model directories. They can be obtained as follows:

wget https://zenodo.org/records/20820029/files/mifish.tar.gz
tar -xzf mifish.tar.gz
wget https://zenodo.org/records/20820029/files/riaz.tar.gz
tar -xzf riaz.tar.gz
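
For scripted setups, the same download and extraction can be sketched with the Python standard library. The URLs are the ones used above; the function names are hypothetical, and the contents of the archives are not assumed beyond being gzip-compressed tar files. Only unpack archives from a source you trust.

```python
import tarfile
import urllib.request
from pathlib import Path

ZENODO_FILES = "https://zenodo.org/records/20820029/files"

def model_url(name):
    """Build the download URL for a model archive such as 'mifish' or 'riaz'."""
    return f"{ZENODO_FILES}/{name}.tar.gz"

def fetch_model(name, dest_dir="."):
    """Download <name>.tar.gz into dest_dir and return the archive path."""
    dest = Path(dest_dir) / f"{name}.tar.gz"
    urllib.request.urlretrieve(model_url(name), dest)
    return dest

def unpack_model(archive, dest_dir="."):
    """Extract a .tar.gz archive and return the sorted top-level names it contained."""
    with tarfile.open(archive, "r:gz") as tar:
        names = tar.getnames()
        try:
            # The 'data' filter rejects unsafe members on Python versions that support it.
            tar.extractall(dest_dir, filter="data")
        except TypeError:
            tar.extractall(dest_dir)
    return sorted({Path(n).parts[0] for n in names if Path(n).parts})
```

A call such as `unpack_model(fetch_model("mifish"))` would then mirror the wget and tar commands for the MiFish model.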

Note that we also provide pre-compiled *.qza models for use with QIIME2. These can be found in the same Zenodo repository.

Running Classification with Pre-trained models

To classify with a pre-trained model, or with a model you have generated yourself, use the following command:

eplacer run-model --fasta <fasta path> --counts <count matrix> --geoData <geoData path> --confidence <threshold> --model <model path> --maskrate 0
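
When classification is part of a larger pipeline, the command can be assembled in Python and passed to subprocess. This is a minimal sketch that uses only the flags shown above; the helper names, the file names, and the confidence value of 0.8 are placeholders, not recommended settings.

```python
import shlex
import subprocess

def build_run_command(fasta, counts, geo_data, model, confidence, maskrate=0):
    """Assemble the argument list for `eplacer run-model`."""
    return [
        "eplacer", "run-model",
        "--fasta", str(fasta),
        "--counts", str(counts),
        "--geoData", str(geo_data),
        "--confidence", str(confidence),
        "--model", str(model),
        "--maskrate", str(maskrate),
    ]

def run(cmd):
    """Execute the command and raise CalledProcessError if it exits non-zero."""
    return subprocess.run(cmd, check=True)

# Placeholder paths and threshold; replace with your own files and settings.
cmd = build_run_command("asvs.fasta", "counts.tsv", "geo.tsv", "mifish", confidence=0.8)
print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces; `run(cmd)` executes the command once ePlacer is installed.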

Training new ePlacer models

Training new ePlacer models is very simple! All that is required is an aligned fasta file for the barcode of interest (containing all available references of interest), a flat taxonomy file, and a reference file for biogeography (currently, ePlacer supports the OBIS CSV download).

ePlacer also supports custom references for biogeography, formatted as follows:

#Species	Latitude	Longitude
SpeciesLabelA	39.645946	-71.746641
SpeciesLabelB	39.645946	-71.746641
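
If your occurrence records live in another format, a custom reference like the one above can be produced with a few lines of Python. The sketch assumes the file is tab-delimited with the header shown, and it writes coordinates with six decimal places; the function name and the range check are illustrative additions.

```python
import csv
import io

def write_biogeography(records, handle):
    """Write (species, latitude, longitude) records as a tab-separated reference."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["#Species", "Latitude", "Longitude"])
    for species, lat, lon in records:
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError(f"coordinates out of range for {species}: {lat}, {lon}")
        writer.writerow([species, f"{lat:.6f}", f"{lon:.6f}"])

records = [
    ("SpeciesLabelA", 39.645946, -71.746641),
    ("SpeciesLabelB", 39.645946, -71.746641),
]
buffer = io.StringIO()
write_biogeography(records, buffer)
print(buffer.getvalue(), end="")
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")` in place of the in-memory buffer.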

To run the training, use the following:

eplacer train-model --fasta <alignment file> --taxa <taxonomy file> \
            --out <output directory> --taxlevel SPECIES \
            --geoData <obis data> --augments <several values should be tested here> \
            --maskrate <several values should be tested here> --threads 1
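
Since several values of --augments and --maskrate should be tried, one option is to generate a command per combination, each with its own output directory. The sketch below only builds the argument lists. The candidate values, file names, and directory naming scheme are placeholders, and it assumes both flags take a single value as in the command above.

```python
import itertools
from pathlib import Path

def sweep_commands(fasta, taxa, geo_data, out_root, augments, maskrates, threads=1):
    """Build one `eplacer train-model` argument list per (augments, maskrate) pair."""
    commands = []
    for aug, mask in itertools.product(augments, maskrates):
        out_dir = Path(out_root) / f"aug{aug}_mask{mask}"
        commands.append([
            "eplacer", "train-model",
            "--fasta", str(fasta),
            "--taxa", str(taxa),
            "--out", str(out_dir),
            "--taxlevel", "SPECIES",
            "--geoData", str(geo_data),
            "--augments", str(aug),
            "--maskrate", str(mask),
            "--threads", str(threads),
        ])
    return commands

# Placeholder inputs and candidate values; choose ranges suited to your barcode.
commands = sweep_commands(
    "refs.aln.fasta", "taxonomy.tsv", "obis.csv", "models",
    augments=[1, 5], maskrates=[0.0, 0.1],
)
for command in commands:
    print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)`, and the resulting models can then be compared on held-out references.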

==============================================================

This repository is a scientific product and is not official communication of the National Oceanic and Atmospheric Administration, or the United States Department of Commerce. All NOAA GitHub project code is provided on an ‘as is’ basis and the user assumes responsibility for its use. Any claims against the Department of Commerce or Department of Commerce bureaus stemming from the use of this GitHub project will be governed by all applicable Federal law. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation or favoring by the Department of Commerce. The Department of Commerce seal and logo, or the seal and logo of a DOC bureau, shall not be used in any manner to imply endorsement of any commercial product or activity by DOC or the United States Government.

Release history

0.1.1

Jun 23, 2026

This version

0.1.0

Jun 23, 2026

Download files

Source Distribution

eplacer-0.1.0.tar.gz (28.2 kB)

Uploaded Jun 23, 2026 Source

Built Distribution

eplacer-0.1.0-py3-none-any.whl (28.5 kB)

Uploaded Jun 23, 2026 Python 3

File details

Details for the file eplacer-0.1.0.tar.gz.

File metadata

Download URL: eplacer-0.1.0.tar.gz
Upload date: Jun 23, 2026
Size: 28.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for eplacer-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`91a17596dcf9d2e6e990ee255e3d55b76e35e9dfde0abb6a8963d588bb921b5f`
MD5	`c6c03c9dbbd1e8a0117174bb2ebe603a`
BLAKE2b-256	`8a01b2b16a01def6ae956545445942501e79fa1040910da2686c342721ef657f`

File details

Details for the file eplacer-0.1.0-py3-none-any.whl.

File metadata

Download URL: eplacer-0.1.0-py3-none-any.whl
Upload date: Jun 23, 2026
Size: 28.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.13

File hashes

Hashes for eplacer-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c0303ed7e772a011290cd5eca8b3d194407819895ca874376cf46a518e79ba78`
MD5	`79f383dcbace6219b6e347587cfebf17`
BLAKE2b-256	`aa65eb60de19afc757f2a788480478b864e6065909429c531630a772ad564c57`
