Amplicon read simulator
Bygul: Amplicon Read Simulator
A tool for amplicon read simulation for wastewater sequencing or other applications. Users can easily simulate reads from multiple samples mixed at different proportions.
Usage
If you use this workflow in a paper, please credit the authors by citing the URL of the original repository: https://github.com/andersen-lab/Bygul.
Installation
Bygul is written in Python 3, but it requires the wgsim and mason simulators to simulate reads.
Local build from source
git clone https://github.com/andersen-lab/Bygul
cd Bygul
pip install -e .
Please note that pip does not install all the requirements. Some packages need to be installed via Conda or built from source.
Installing via Conda
pip install git+https://github.com/andersen-lab/Bygul
Create a conda environment named bygul and install the dependencies:
conda create -n bygul
conda activate bygul
conda env update --file environment.yml
Example commands
Run the tool using the following command.
bygul simulate-proportions [SAMPLE1.fasta,SAMPLE2.fasta,..] [primer.bed] [reference.fasta] --proportions [0.8,0.2,..] --outdir [output_directory]
Simulate reads from different samples without defining proportions, allowing up to 2 SNP mismatches in the primer regions. Proportions are assigned randomly and are written to results/sample_proportions.txt.
bygul simulate-proportions sample.fasta,sample2.fasta primer.bed reference.fasta --outdir results/ --maxmismatch 2
Simulate reads with user-defined proportions and a specific read simulator. By default, bygul uses wgsim as the simulator, but you can change it to mason.
bygul simulate-proportions sample.fasta,sample2.fasta primer.bed reference.fasta --proportions 0.2,0.8 --simulator mason
Simulate reads with user-defined proportions and number of reads per amplicon.
bygul simulate-proportions sample.fasta,sample2.fasta primer.bed reference.fasta --proportions 0.2,0.8 --readcnt 1000
Simulate reads with additional parameters such as base error rate, read length, and indel fraction.
bygul simulate-proportions sample.fasta,sample2.fasta primer.bed reference.fasta --proportions 0.2,0.8 --readcnt 1000 --error_rate 0.001 --read_length 400 --indel_fraction 0.001
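When many mixtures need to be simulated, it can help to assemble the command line programmatically. The following is a minimal sketch, not part of Bygul: the helper name build_bygul_command is hypothetical, and it only uses flags that appear in the examples above. It assumes one proportion per sample, in the same order as the sample files.

```python
from typing import List, Optional, Sequence


def build_bygul_command(
    samples: Sequence[str],
    primer_bed: str,
    reference: str,
    proportions: Optional[Sequence[float]] = None,
    outdir: Optional[str] = None,
    readcnt: Optional[int] = None,
    simulator: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for `bygul simulate-proportions` (hypothetical helper)."""
    if not samples:
        raise ValueError("at least one sample FASTA is required")
    # Positional arguments: comma-separated samples, primer BED, reference FASTA.
    cmd = ["bygul", "simulate-proportions", ",".join(samples), primer_bed, reference]
    if proportions is not None:
        # Assumption: one proportion per sample, in the same order.
        if len(proportions) != len(samples):
            raise ValueError("need exactly one proportion per sample")
        cmd += ["--proportions", ",".join(str(p) for p in proportions)]
    if outdir is not None:
        cmd += ["--outdir", outdir]
    if readcnt is not None:
        cmd += ["--readcnt", str(readcnt)]
    if simulator is not None:
        cmd += ["--simulator", simulator]
    return cmd


cmd = build_bygul_command(
    ["sample.fasta", "sample2.fasta"],
    "primer.bed",
    "reference.fasta",
    proportions=[0.2, 0.8],
    readcnt=1000,
)
print(" ".join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) once bygul and its dependencies are installed.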
Notes
Number of reads per amplicon
It is recommended to define the number of reads per amplicon to be greater than the number of contigs in your amplicon file. This is particularly important when your primers are designed for whole genome sequencing, where each amplicon may contain a substantial number of contigs. Setting too few reads per amplicon may result in empty read files for certain amplicons, leading to incomplete simulated reads.
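One way to pick a safe value is to count the contigs in an amplicon FASTA and compare that number with the intended reads per amplicon. The sketch below is an illustration only: the function names are hypothetical, the FASTA content is made up, and it assumes each contig starts with a ">" header line.

```python
import io
from typing import TextIO


def count_fasta_records(handle: TextIO) -> int:
    """Count sequences in a FASTA stream by counting '>' header lines."""
    return sum(1 for line in handle if line.startswith(">"))


def readcnt_is_sufficient(readcnt: int, n_contigs: int) -> bool:
    """Follow the recommendation above: reads per amplicon should exceed the contig count."""
    return readcnt > n_contigs


# Made-up amplicon FASTA with three contigs; replace with open("your_amplicons.fasta").
amplicon_fasta = io.StringIO(">contig_1\nACGT\n>contig_2\nACGA\n>contig_3\nACGG\n")
n_contigs = count_fasta_records(amplicon_fasta)

print(n_contigs, readcnt_is_sufficient(1000, n_contigs), readcnt_is_sufficient(2, n_contigs))
```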
Primer bed file
Please remember that the primer file must include a column containing the primer sequence. By default, at most 1 SNP mismatch is allowed per primer sequence. To change this number, use the --maxmismatch flag.
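To illustrate what a mismatch threshold means for primer matching, here is a minimal sketch that slides a primer along a sequence and accepts the first window whose number of substitutions is within the limit. This is not Bygul's implementation: the function names and sequences are hypothetical, and the sketch only searches the forward strand and ignores indels.

```python
from typing import Optional


def count_mismatches(primer: str, window: str) -> int:
    """Number of positions where two equal-length sequences differ (case-insensitive)."""
    if len(primer) != len(window):
        raise ValueError("primer and window must have the same length")
    return sum(a != b for a, b in zip(primer.upper(), window.upper()))


def find_primer(primer: str, genome: str, max_mismatch: int = 1) -> Optional[int]:
    """Return the 0-based start of the first window within max_mismatch, or None."""
    for start in range(len(genome) - len(primer) + 1):
        window = genome[start:start + len(primer)]
        if count_mismatches(primer, window) <= max_mismatch:
            return start
    return None


genome = "TTTTACGTACGTTTTT"   # made-up sequence
exact_primer = "ACGTACGT"     # matches the genome exactly
snp_primer = "ACGAACGT"       # differs from the genome at one position

print(find_primer(exact_primer, genome))               # 4
print(find_primer(snp_primer, genome, max_mismatch=1)) # 4
print(find_primer(snp_primer, genome, max_mismatch=0)) # None
```

With a threshold of 0 the primer carrying one SNP is not found, which is the situation that would show up as a dropout for that amplicon.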
Complete set of available parameters
To learn more about how to adjust other parameters, run bygul simulate-proportions --help.
Simulated reads output
Simulated reads from all samples are located in provided_output_path/reads.fastq
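A quick sanity check on the output is to count the reads in the FASTQ file. The sketch below assumes the standard four-lines-per-record FASTQ layout with no wrapped sequence lines; the function name and the example records are made up for illustration.

```python
import io
from typing import TextIO


def count_fastq_reads(handle: TextIO) -> int:
    """Count reads in a FASTQ stream, assuming four non-empty lines per record."""
    n_lines = sum(1 for line in handle if line.strip())
    if n_lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4; the file may be truncated")
    return n_lines // 4


# Two made-up records; replace with open("provided_output_path/reads.fastq").
fastq = io.StringIO(
    "@read_1\nACGT\n+\nIIII\n"
    "@read_2\nTTGA\n+\nIIII\n"
)
n_reads = count_fastq_reads(fastq)
print(n_reads)
```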
Information about amplicon dropouts
For details about amplicon dropouts, see provided_output_path/sample_name/amplicon_stats.csv. In this file, the left and right primer matching coordinates are set to zero when no match is found.
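Dropouts can then be listed by scanning the CSV for zero coordinates. The sketch below is an assumption-heavy illustration: the column names (amplicon, left_start, right_start) are hypothetical and must be replaced with the actual headers in amplicon_stats.csv, and the example rows are made up.

```python
import csv
import io
from typing import List, Sequence, TextIO


def find_dropouts(handle: TextIO, name_col: str, coord_cols: Sequence[str]) -> List[str]:
    """Return names of amplicons where any given primer coordinate column is zero."""
    dropouts = []
    for row in csv.DictReader(handle):
        if any(float(row[col]) == 0 for col in coord_cols):
            dropouts.append(row[name_col])
    return dropouts


# Made-up content with hypothetical column names; check the real file's header first.
stats = io.StringIO(
    "amplicon,left_start,right_start\n"
    "amp_1,10,400\n"
    "amp_2,0,0\n"
    "amp_3,800,0\n"
)
dropouts = find_dropouts(stats, name_col="amplicon", coord_cols=["left_start", "right_start"])
print(dropouts)
```

This flags an amplicon if either primer failed to match; use all() instead of any() to report only amplicons where both primers are missing.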
Download files
File details
Details for the file bygul-2025.4.tar.gz.
File metadata
- Download URL: bygul-2025.4.tar.gz
- Upload date:
- Size: 10.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 746c61bcc64c1ff0c1eeb06cfe267df0b66cd0db0c5be5c790a1518b99e7bad2 |
| MD5 | ff3b9a7669e7168f84e89e3e2e77682a |
| BLAKE2b-256 | fd8aa1d39fe30594cfd17556e219eb4d63ea150a5abc8dbc45e2c04ee35781f5 |
Provenance
The following attestation bundles were made for bygul-2025.4.tar.gz:
Publisher: github_actions.yml on andersen-lab/Bygul
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bygul-2025.4.tar.gz
- Subject digest: 746c61bcc64c1ff0c1eeb06cfe267df0b66cd0db0c5be5c790a1518b99e7bad2
- Sigstore transparency entry: 200299646
- Sigstore integration time:
- Permalink: andersen-lab/Bygul@309fce8f7c6a118b4a282b25e4504e55dd92594c
- Branch / Tag: refs/tags/V1.0.1
- Owner: https://github.com/andersen-lab
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: github_actions.yml@309fce8f7c6a118b4a282b25e4504e55dd92594c
- Trigger Event: release
File details
Details for the file bygul-2025.4-py3-none-any.whl.
File metadata
- Download URL: bygul-2025.4-py3-none-any.whl
- Upload date:
- Size: 9.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 600d88c522547fb83a4efb63d7da9422fd8cdb8fd0a85e21a1f79821b4d2b706 |
| MD5 | 05aa9845e2b07112be5c12d12ab85ed8 |
| BLAKE2b-256 | d87f692587b5b8d12061cf23435232b3ef648cdebc496eefcddeacdb34357488 |
Provenance
The following attestation bundles were made for bygul-2025.4-py3-none-any.whl:
Publisher: github_actions.yml on andersen-lab/Bygul
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bygul-2025.4-py3-none-any.whl
- Subject digest: 600d88c522547fb83a4efb63d7da9422fd8cdb8fd0a85e21a1f79821b4d2b706
- Sigstore transparency entry: 200299648
- Sigstore integration time:
- Permalink: andersen-lab/Bygul@309fce8f7c6a118b4a282b25e4504e55dd92594c
- Branch / Tag: refs/tags/V1.0.1
- Owner: https://github.com/andersen-lab
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: github_actions.yml@309fce8f7c6a118b4a282b25e4504e55dd92594c
- Trigger Event: release