Divide amplicon sequences

Project description

Helper for joining and splitting FASTQ files.

Requires Python 3.6 or above.


# Linux
## split
python3 split -m fastq_file
## join
python3 join -f forward.fastq -r reverse.fastq
# Windows
## split
python split -m fastq_file
## join
python join -f forward.fastq -r reverse.fastq

Use -t to set the linker text; by default the program uses "JOINTEXT".
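As a rough illustration of what a linker text is for, the sketch below joins a forward and a reverse read with a marker string and splits them again. The function names are hypothetical, and the plain concatenation (with no reverse-complementing) is an assumption, not a description of what the program actually does.

```python
from typing import Tuple

# Default linker text named in this README.
LINKER = "JOINTEXT"


def join_reads(forward: str, reverse: str, linker: str = LINKER) -> str:
    """Concatenate two reads with the linker in between (assumed layout)."""
    return forward + linker + reverse


def split_read(joined: str, linker: str = LINKER) -> Tuple[str, str]:
    """Split a joined read at the first occurrence of the linker."""
    forward, sep, reverse = joined.partition(linker)
    if not sep:
        # The linker was not found, so this read was never joined with it.
        raise ValueError("linker text not found in sequence")
    return forward, reverse
```

Because DNA reads only contain base letters, a linker such as "JOINTEXT" cannot be confused with sequence data, which is presumably why a text marker is used.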

When splitting, "fastq_file" can match multiple files: use "*.fastq" (including the quotation marks) to select all ".fastq" files in the current folder.
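The quotation marks keep the shell from expanding the pattern, so the program has to expand it itself. A minimal sketch of such an expansion with the standard glob module is shown here; expand_inputs is a hypothetical helper, not a function of this package.

```python
import glob


def expand_inputs(patterns):
    """Expand wildcard patterns such as "*.fastq" into sorted file names."""
    files = []
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        # Keep the literal argument when nothing matches, so that a later
        # open() call reports the missing file clearly.
        files.extend(matches if matches else [pattern])
    return files
```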

Divide NGS data by barcode and primer.


  • Python 3.5 or above
  • Biopython
  • regex
  • vsearch (Optional)

To install Biopython and regex, run as administrator:

pip install biopython regex



Support ambiguous bases.


Extend vsearch options. Improve output.


Integrate vsearch.


Use regex instead of BLAST. Faster and easier.


Parallel version, use BLAST.


Single core version. Use BLAST.



Sequence structure

It can handle merged paired-end sequences like this:


Or just handle one direction:


Sequences are divided by barcode according to the given barcode file. If a barcode differs from the expected one by even a single base, the sequence is dropped.
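The exact-match rule can be sketched as a dictionary lookup on the first bases of each read. This is only an illustration: divide_by_barcode is a hypothetical function, and it assumes all barcodes have the same length and sit at the very start of the sequence.

```python
from collections import defaultdict


def divide_by_barcode(sequences, barcodes):
    """Group sequences by sample using an exact barcode match.

    barcodes maps a barcode string to a sample name (assumed layout).
    Returns (samples, dropped).
    """
    barcode_len = len(next(iter(barcodes)))
    samples = defaultdict(list)
    dropped = []
    for seq in sequences:
        sample = barcodes.get(seq[:barcode_len])
        if sample is None:
            # A single wrong base makes the lookup fail: drop the read.
            dropped.append(seq)
        else:
            samples[sample].append(seq)
    return dict(samples), dropped
```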


Some people add an extra sequence (an adapter) between the barcode and the primer. If your reads do not have one, set the adapter length to zero with "--adapter 0". The default value is 14.
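To show how the adapter length shifts where the primer is expected, here is a small sketch. It assumes the read is laid out as barcode, then adapter, then primer; primer_region is a hypothetical helper, and its defaults simply mirror the default values mentioned in this README (a 5*2 barcode, i.e. 10 bases, and a 14-base adapter).

```python
def primer_region(seq, barcode_len=10, adapter_len=14):
    """Return the part of the read that starts at the expected primer."""
    if barcode_len < 0 or adapter_len < 0:
        raise ValueError("lengths must not be negative")
    # Skip the barcode and the adapter that follows it.
    start = barcode_len + adapter_len
    return seq[start:]
```

With adapter_len=0 the primer is expected directly after the barcode.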

Barcode mode

Use "-m" to set the barcode mode. For example, "8*1" means a barcode of length 8 that is repeated only once. The default is "5*2", i.e., a 5-base barcode repeated twice.
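The mode string follows a "length*repeat" pattern, as the default "5*2" shows. A minimal sketch of parsing it is given below; parse_barcode_mode is a hypothetical name, and the program's own parsing may differ.

```python
def parse_barcode_mode(mode="5*2"):
    """Parse a mode such as "5*2" into (length, repeat)."""
    length_text, sep, repeat_text = mode.partition("*")
    if not sep:
        raise ValueError("mode must look like 'length*repeat'")
    length, repeat = int(length_text), int(repeat_text)
    if length <= 0 or repeat <= 0:
        raise ValueError("length and repeat must be positive")
    return length, repeat


# Total number of bases occupied by the barcode in the default mode.
length, repeat = parse_barcode_mode("5*2")
total_bases = length * repeat
```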

Note that the forward and reverse barcodes may have different sequences, but they SHOULD FOLLOW THE SAME MODE!

Strict option

Use "-s" or "--strict" to enable strict mode. If set, the program checks whether the barcodes at the head and tail are equal and whether the barcode at the tail (3') is correct. Otherwise, it only checks the barcode at the head (5') of the sequence.

Barcode file

Barcode file looks like this:






Here barcode-f is the barcode in the 5' direction and barcode-r is the barcode in the 3' direction. All sequences should be written in the forward direction.

If the forward and reverse barcodes are the same, you can omit the reverse barcode in the table.

To avoid potential errors, do not use spaces in the sample info.

Note that fields are separated by an English (ASCII) comma rather than a Chinese (full-width) comma.

Primer file

Primer file looks like this:









You can use Microsoft Excel to prepare these two files and save them in CSV format, or use any text editor you prefer.

Make sure you don't miss the first line.

If you use the PBS job submission system, you can use this script to submit the task. It covers the whole workflow, from merging the two read directions with FLASH to dividing the sequences.

Built Distribution

divide_seq-5.22-py3-none-any.whl (18.8 kB view hashes)
