Advanced Pipeline for Simple yet Comprehensive AnaLysEs of DNA metabarcoding data

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

apscale

Introduction

Apscale is a metabarcoding pipeline that handles the most common tasks in metabarcoding data processing: paired-end merging, primer trimming, quality filtering and denoising, swarm and threshold-based clustering, as well as basic data handling operations such as replicate merging and the removal of reads found in the negative controls. It uses a simple command line interface and is configured via a single configuration file. To add metadata to the dataset, a simple, browser-based interface was introduced in version 4.0. Apscale automatically uses the available resources on the machine it runs on while still providing the option to use fewer if desired. All modules can be run on their own or as a comprehensive workflow.

For further details see the manual available here

A graphical user interface version for apscale is available here.

Programs used:

vsearch (PE merging, quality filtering, denoising, chimera removal, species grouping)
cutadapt (primer trimming)
swarm (swarm clustering, if enabled)

Input:

demultiplexed gzipped reads

Output:

log files, project report, ESV / OTU tables

Installation

Apscale can be installed on all common operating systems (Windows, Linux, macOS). It requires Python 3.11 or higher and can be installed via pip from any command line:

pip install apscale

To update apscale run:

pip install --upgrade apscale
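
The version requirement above can be checked before installing. The following is a minimal sketch using only the Python standard library; the function names are made up for this example and are not part of Apscale.

```python
import sys
from importlib import metadata

# Minimum Python version stated in the installation notes above.
MIN_PYTHON = (3, 11)


def python_is_supported(version_info=None):
    """Return True if the given (or the running) interpreter is at least 3.11."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(package="apscale"):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("apscale version:", installed_version())
```

If `installed_version()` returns None, the package is not installed in the interpreter that ran the check, which is a common cause of confusion when several Python installations coexist.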

Further dependencies - vsearch and swarm

Apscale calls vsearch as well as swarm in multiple modules. Both should be installed and be in PATH so that they can be executed from anywhere on the system.

Check the vsearch and swarm GitHub pages for further info:

https://github.com/torognes/vsearch
https://github.com/torognes/swarm

Support for compressed files with zlib is necessary. For Unix-based systems this is shipped with vsearch; for Windows the zlib.dll can be downloaded via:

zlib for Windows

The dll has to be in the same folder as the vsearch executable. If you need help with adding a folder to PATH in Windows, please take a look at the first answer to this Stack Overflow question:

How to add a folder to PATH Windows

To check if everything is correctly set up please type this into your command line:

vsearch --version
swarm --version

It should return a message similar to this:

vsearch v2.19.0_win_x86_64, 31.9GB RAM, 24 cores
https://github.com/torognes/vsearch

Rognes T, Flouri T, Nichols B, Quince C, Mahe F (2016)
VSEARCH: a versatile open source tool for metagenomics
PeerJ 4:e2584 doi: 10.7717/peerj.2584 https://doi.org/10.7717/peerj.2584

Compiled with support for gzip-compressed files, and the library is loaded.
zlib version 1.2.5, compile flags 65
Compiled with support for bzip2-compressed files, but the library was not found.

Swarm 3.1.5
Copyright (C) 2012-2024 Torbjorn Rognes and Frederic Mahe
https://github.com/torognes/swarm

Mahe F, Rognes T, Quince C, de Vargas C, Dunthorn M (2014)
Swarm: robust and fast clustering method for amplicon-based studies
PeerJ 2:e593 https://doi.org/10.7717/peerj.593

Mahe F, Rognes T, Quince C, de Vargas C, Dunthorn M (2015)
Swarm v2: highly-scalable and high-resolution amplicon clustering
PeerJ 3:e1420 https://doi.org/10.7717/peerj.1420

Mahe F, Czech L, Stamatakis A, Quince C, de Vargas C, Dunthorn M, Rognes T (2022)
Swarm v3: towards tera-scale amplicon clustering
Bioinformatics 38:1, 267-269 https://doi.org/10.1093/bioinformatics/btab493

Further dependencies - cutadapt

Apscale also calls cutadapt in the primer trimming module. Cutadapt should be downloaded and installed automatically with the Apscale installation. To check this, type:

cutadapt --version

and it should return the version number, for example:

5.1
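
The three checks above can also be done in one go from Python. This is a minimal sketch, assuming the executables are named exactly vsearch, swarm and cutadapt on your system; it only tests whether they are found in PATH and does not check versions or zlib support.

```python
import shutil

# External programs named in this README.
REQUIRED_TOOLS = ("vsearch", "swarm", "cutadapt")


def locate_tools(tools=REQUIRED_TOOLS):
    """Map each tool name to its resolved executable path, or None if not in PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of all tools that could not be found in PATH."""
    return [name for name, path in locate_tools(tools).items() if path is None]


if __name__ == "__main__":
    for name, path in locate_tools().items():
        print(f"{name}: {path or 'NOT FOUND in PATH'}")
```

An empty list from `missing_tools()` means all three programs can be called from anywhere on the system.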

How to use

Create a new apscale project

Apscale is organized in projects with the following structure:

C:\USERS\DOMINIK\DESKTOP\EXAMPLE_PROJECT
├───01_raw_data
│   └───data
├───02_demultiplexing
│   └───data
├───03_PE_merging
│   └───data
├───04_primer_trimming
│   └───data
├───05_quality_filtering
│   └───data
├───06_dereplication
│   └───data
├───07_denoising
│   └───data
├───08_swarm_clustering
│   └───data
├───09_replicate_merging
│   └───data
├───10_nc_removal
│   └───data
├───11_read_table
├───12_analyze
    └───data

A new project can be initialized with the command:

apscale --create_project NAME
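
After creating a project, its layout can be compared with the structure listed above. The sketch below hard-codes the folder names from that listing, so treat them as an assumption: other Apscale versions may create a different layout. The function name is made up for this example.

```python
from pathlib import Path

# Folder names copied from the project structure shown in this README.
STEP_FOLDERS = (
    "01_raw_data",
    "02_demultiplexing",
    "03_PE_merging",
    "04_primer_trimming",
    "05_quality_filtering",
    "06_dereplication",
    "07_denoising",
    "08_swarm_clustering",
    "09_replicate_merging",
    "10_nc_removal",
    "11_read_table",
    "12_analyze",
)
# In the listing, this folder is the only one without a "data" subfolder.
FOLDERS_WITHOUT_DATA = {"11_read_table"}


def missing_folders(project_dir):
    """Return the expected folders (relative paths) that do not exist in the project."""
    project = Path(project_dir)
    missing = []
    for name in STEP_FOLDERS:
        folder = project / name
        if name not in FOLDERS_WITHOUT_DATA:
            folder = folder / "data"
        if not folder.is_dir():
            missing.append(folder.relative_to(project).as_posix())
    return missing
```

Calling `missing_folders("EXAMPLE_PROJECT")` on an intact project with this layout returns an empty list.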

If you prefer to have all your data in one place, you can copy the raw data into 01_raw_data/data. Demultiplexing is not handled by Apscale because there are too many different tagging systems to implement in a single pipeline. If you are using inline barcodes, you can take a look at https://github.com/DominikBuchner/demultiplexer2. If you are starting with already demultiplexed data, please copy them into 02_demultiplexing/data.
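
Copying demultiplexed reads into the project can be scripted. This is a minimal sketch, assuming the reads end in .gz and that the target folder is named 02_demultiplexing/data as in the project structure above; the function name is made up for this example.

```python
import shutil
from pathlib import Path


def copy_reads(source_dir, project_dir, target="02_demultiplexing/data"):
    """Copy every *.gz file from source_dir into the project's input folder.

    Returns the sorted list of copied file names.
    """
    destination = Path(project_dir) / target
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for read_file in sorted(Path(source_dir).glob("*.gz")):
        # copy2 also preserves file timestamps where the platform allows it.
        shutil.copy2(read_file, destination / read_file.name)
        copied.append(read_file.name)
    return copied
```

Files without a .gz ending are skipped, which matches the requirement that the input consists of gzipped reads.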

Configuring the general settings

Along with every newly created project, Apscale generates an Excel file called "Settings.xlsx" in the project folder. It is divided into separate sheets for every module plus a 0_general_settings sheet. By default, 'cores to use' is set to the number of cores available on the system where the project is created minus 2; this can be lowered if capacity is needed for other processes on your computer. Apscale only works with compressed data: it takes compressed data as input and writes compressed data as output. The compression level can also be set in the general settings. Its default value is 6, which is the default of gzip. The higher the compression level, the longer Apscale takes to process the data, and vice versa. If runtime is an issue, lower the compression level; if disk space is a concern, the compression level can be raised to its maximum value of 9.
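
The two general settings can be illustrated with the Python standard library. This sketch is not Apscale's own code: the "available cores minus 2" rule comes from the text above, while the lower bound of 1 core is an assumption made here so that the result is never zero or negative.

```python
import gzip
import os


def default_cores(available=None):
    """Available cores minus 2, as described above (floor of 1 is an assumption)."""
    if available is None:
        available = os.cpu_count() or 1
    return max(1, available - 2)


def compressed_sizes(data, levels=(1, 6, 9)):
    """Return the gzip-compressed size in bytes of data for each compression level."""
    return {level: len(gzip.compress(data, compresslevel=level)) for level in levels}


if __name__ == "__main__":
    print("cores to use:", default_cores())
    example = b"ACGTTGCA" * 5000  # made-up, highly repetitive example data
    for level, size in compressed_sizes(example).items():
        print(f"level {level}: {size} bytes")
```

Running the script on your own data gives a rough feel for how much disk space a higher level saves; timing the calls shows the cost in runtime.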

Configuring the specific settings

Apscale provides default values for most of its settings. They can be changed if desired; please refer to the manuals of vsearch and cutadapt as well as the Apscale manual for further information. The only values Apscale needs from the user are the primers used and the expected length of the fragment excluding the primers, which is used for quality filtering. Version 4 added many new features. To reproduce the behaviour of apscale 3.0.1, the default values can be used. After these are set, Apscale is ready to run!

Running Apscale

Navigate to the project folder you would like to process. Apscale can also be run from anywhere on the system, but then needs the path to the project folder.

apscale -h

This will give help on the different functions of apscale. To run an all-in-one analysis on your dataset, run:

apscale --run_apscale [PATH]

The PATH argument is optional and only needs to be given if you are not located in the project folder. This will automatically perform PE merging, primer trimming, quality filtering and denoising of the data. The individual modules can also be run separately (see apscale -h for the respective commands). A project report will be saved in the project folder, as well as an individual report for each step of the pipeline. The reports contain information about the versions of the programs used, how many reads were used and passed the module, and a timestamp of when each file finished.
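
To process several projects in a row, the command above can be called from a script. This is a minimal sketch that only uses the --run_apscale option shown above with its optional PATH argument; the helper names are made up for this example.

```python
import subprocess


def build_run_command(project_path=None):
    """Build the argument list for an all-in-one run.

    The project path is appended only if given, mirroring the optional
    PATH argument of the command line interface.
    """
    command = ["apscale", "--run_apscale"]
    if project_path is not None:
        command.append(str(project_path))
    return command


def run_apscale(project_path=None):
    """Run the workflow; raises CalledProcessError if apscale exits with an error."""
    return subprocess.run(build_run_command(project_path), check=True)


if __name__ == "__main__":
    # Only prints the command; call run_apscale() to actually start a run.
    print(" ".join(build_run_command("EXAMPLE_PROJECT")))
```

Passing the arguments as a list avoids shell quoting problems with project paths that contain spaces.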

The main output of Apscale will be an ESV table as well as a .fasta file, which can be used for taxonomic assignment. For example, for COI sequences, BOLDigger3 (https://github.com/DominikBuchner/BOLDigger3) can be used directly with the output of Apscale to assign taxonomy to ESVs using the Barcode of Life Data system (BOLD) database. Furthermore, the ESV tables are compatible with TaxonTableTools (https://github.com/TillMacher/TaxonTableTools), which can be used for DNA metabarcoding specific analyses.

Release history

Version	Release date
4.0.5 (this version)	Jul 11, 2025
4.0.4	Jul 11, 2025
4.0.3	Jul 11, 2025
4.0.2	Jul 11, 2025
4.0.1	Jul 11, 2025
4.0.0	Jul 10, 2025
3.0.2	Mar 14, 2025
3.0.1	Mar 14, 2025
3.0.0	Feb 24, 2025
2.1.1	Dec 16, 2024
2.1.0	Dec 8, 2024
2.0.4	Dec 3, 2024
2.0.3	Oct 1, 2024
2.0.2	Aug 8, 2024
2.0.1	Aug 8, 2024
2.0.0	Aug 8, 2024
1.7.1	Apr 2, 2024
1.7.0	Apr 2, 2024
1.6.3	Feb 21, 2023
1.6.2	Jan 10, 2023
1.6.1	Jan 6, 2023
1.6.0	Jan 6, 2023
1.5.6	Dec 7, 2022
1.5.5	Apr 5, 2022
1.5.4	Feb 28, 2022
1.5.3	Feb 25, 2022
1.5.2	Feb 24, 2022
1.5.1	Feb 22, 2022
1.5.0	Feb 22, 2022
1.4.2	Feb 11, 2022
1.4.1	Feb 11, 2022
1.4.0	Feb 11, 2022
1.3.8	Feb 11, 2022
1.3.7	Feb 10, 2022
1.3.6	Feb 9, 2022
1.3.5	Feb 9, 2022
1.3.4	Feb 9, 2022
1.3.3	Feb 9, 2022
1.3.2	Feb 5, 2022
1.3.1	Feb 4, 2022
1.3.0	Feb 3, 2022
1.2.2	Feb 3, 2022
1.2.1	Feb 1, 2022
1.2.0	Jan 31, 2022
1.1.1	Jan 28, 2022
1.1.0	Jan 27, 2022
1.0.8	Jan 25, 2022
1.0.7	Jan 19, 2022
1.0.6	Jan 12, 2022
1.0.5	Jan 12, 2022
1.0.4	Jan 10, 2022
1.0.3	Jan 10, 2022
1.0.2	Jan 10, 2022
1.0.1	Jan 10, 2022
1.0.0	Jan 10, 2022

Download files

Source Distribution

apscale-4.0.5.tar.gz (46.9 kB)

Uploaded Jul 11, 2025 Source

Built Distribution

apscale-4.0.5-py3-none-any.whl (60.3 kB)

Uploaded Jul 11, 2025 Python 3

File details

Details for the file apscale-4.0.5.tar.gz.

File metadata

Download URL: apscale-4.0.5.tar.gz
Upload date: Jul 11, 2025
Size: 46.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.2

File hashes

Hashes for apscale-4.0.5.tar.gz
Algorithm	Hash digest
SHA256	`d034c0c3d43901570ecc3d0ad75556f1916ff44182ab9a9928f4af2876610316`
MD5	`3f4f4cbf1a6ed25e806592230e2f7ab2`
BLAKE2b-256	`9e2e6d9bdcac465571c2581f14302a984deb3e9d317cf246390e3d32740bb426`

File details

Details for the file apscale-4.0.5-py3-none-any.whl.

File metadata

Download URL: apscale-4.0.5-py3-none-any.whl
Upload date: Jul 11, 2025
Size: 60.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.2

File hashes

Hashes for apscale-4.0.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7c7d2b268a8d4037c02670b994981dbcba910a0a477e654f84e9ef13f21b74e8`
MD5	`241f84dbc51a9e179b4ba478d31bf8f2`
BLAKE2b-256	`d32797815d638c2ce6ac1b3efeef4f37117ee9ec18301f2e9886eb7378ce751a`
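
A downloaded file can be compared against the SHA256 digests listed above. This is a minimal sketch using Python's hashlib; the function names are made up for this example, and the file path in the usage note is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("apscale-4.0.5.tar.gz", "d034c0c3d43901570ecc3d0ad75556f1916ff44182ab9a9928f4af2876610316")` should return True for an intact copy of the source distribution saved in the current folder.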
