Amplicon read simulator
Bygul: Amplicon & Metagenomics Read Simulator
Bygul is a Python 3 tool designed for simulating sequencing reads in wastewater surveillance and other metagenomic applications. It allows users to simulate complex multi-sample datasets with customizable proportions using industry-standard backends like wgsim and mason.
🏗 Installation
Bygul requires Python 3. Because it relies on external simulators (wgsim and mason), we recommend using Conda to manage dependencies. For more information on the wgsim and mason simulators, see their documentation.
Option 1: Via Conda (Recommended)
conda create -n bygul bioconda::bygul
Option 2: Via PyPI
pip install bygul
Note: Some binary dependencies (wgsim/mason) may need to be installed manually or built from source if using this method.
Option 3: Local Build from Source
git clone https://github.com/andersen-lab/Bygul
cd Bygul
pip install -e .
🧬 Usage: Amplicon Sequencing Mode
Use this mode when simulating specific genomic regions defined by a primer set.
Basic Command
bygul simulate-proportions [SAMPLE1.fasta,SAMPLE2.fasta] --primers [primer.bed] --reference [reference.fasta] --proportions [0.8,0.2] --outdir [output_dir]
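When scripting many mixtures, it can help to assemble this command programmatically and validate the inputs first. The following is a minimal sketch, not part of Bygul: `build_bygul_command` is a hypothetical helper that uses only the flags shown above, and it assumes there should be one proportion per sample and that the proportions should sum to 1.

```python
import math
import shlex


def build_bygul_command(samples, primers, reference, outdir, proportions=None):
    """Return an argv list for `bygul simulate-proportions` (hypothetical helper)."""
    if not samples:
        raise ValueError("at least one sample FASTA is required")
    cmd = [
        "bygul", "simulate-proportions", ",".join(samples),
        "--primers", primers,
        "--reference", reference,
        "--outdir", outdir,
    ]
    if proportions is not None:
        # Assumption: one proportion per sample, summing to 1.
        if len(proportions) != len(samples):
            raise ValueError("need exactly one proportion per sample")
        if not math.isclose(sum(proportions), 1.0, abs_tol=1e-6):
            raise ValueError("proportions should sum to 1")
        cmd += ["--proportions", ",".join(str(p) for p in proportions)]
    return cmd


cmd = build_bygul_command(
    ["sample1.fasta", "sample2.fasta"],
    primers="primer.bed",
    reference="reference.fasta",
    outdir="results/",
    proportions=[0.8, 0.2],
)
# Print the command for inspection instead of running it.
print(shlex.join(cmd))
```

To actually run the simulation, the returned list could be passed to `subprocess.run(cmd, check=True)`.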
Advanced Examples
- Random Proportions & Mismatches:
Simulate with random proportions and allow up to 2 SNPs in primer regions.
bygul simulate-proportions sample1.fasta,sample2.fasta --primers primer.bed --reference reference.fasta --outdir results/ --maxmismatch 2
- Switching Simulators:
Use mason instead of the default wgsim.
bygul simulate-proportions sample1.fasta,sample2.fasta --primers primer.bed --reference reference.fasta --simulator mason
- Custom Error Rates & Lengths:
Pass simulator-specific parameters (e.g. indel fraction -R) directly.
bygul simulate-proportions sample1.fasta,sample2.fasta --primers primer.bed --reference reference.fasta -R 0.01
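One way to read the mismatch threshold from the first example is as a limit on the number of substitutions between a primer and its target site. The sketch below illustrates that idea with a plain Hamming distance. It is an illustration only: `count_mismatches` and `primer_binds` are hypothetical names rather than Bygul functions, and how Bygul itself handles ambiguous bases or indels is not covered here.

```python
def count_mismatches(primer, site):
    """Count substitutions between a primer and an equally long target site."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(a != b for a, b in zip(primer.upper(), site.upper()))


def primer_binds(primer, site, max_mismatch=1):
    """Return True if the site is within max_mismatch substitutions of the primer."""
    return count_mismatches(primer, site) <= max_mismatch


primer = "ACGTTGCAAC"
site = "ACGATGCTAC"  # differs from the primer at two positions
print(count_mismatches(primer, site))
print(primer_binds(primer, site, max_mismatch=1))
print(primer_binds(primer, site, max_mismatch=2))
```

With two substitutions, this site would be rejected at a threshold of 1 and accepted at a threshold of 2.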
🌍 Usage: Metagenomics Mode
Simulate reads from entire samples without requiring a primer BED file or a reference sequence.
Basic Metagenomics Simulation
bygul simulate-proportions sample1.fasta,sample2.fasta --outdir results/ --simulation_mode metagenomics
Metagenomics with Specific Parameters
bygul simulate-proportions sample1.fasta,sample2.fasta --proportions 0.5,0.5 --outdir results/ --simulation_mode metagenomics --simulator mason --illumina-read-length 200
📝 Technical Notes
Parameter Handling
Bygul acts as a wrapper. While most flags are passed directly to the underlying simulators, the following are managed directly by Bygul for more realistic simulations (amplicon simulation mode only):
- --readcnt: Number of reads per amplicon.
- --wgsim_insert_size: Insert size for wgsim.
- --wgsim_read_length / --wgsim_error_rate
To see all available backend flags, run:
wgsim --help
mason_simulator --help
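The wrapper pattern described above, where a few flags are handled by the tool and the rest are forwarded to the backend, can be sketched with `argparse.parse_known_args`. This illustrates the general idea and is not Bygul's actual implementation. Only the flag names `--readcnt`, `--wgsim_insert_size`, and `-R` come from this document; `split_args` is a hypothetical helper.

```python
import argparse


def split_args(argv):
    """Separate wrapper-managed flags from flags forwarded to the backend."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--readcnt", type=int)
    parser.add_argument("--wgsim_insert_size", type=int)
    # Unrecognized arguments are returned untouched instead of raising an error.
    managed, passthrough = parser.parse_known_args(argv)
    return managed, passthrough


managed, passthrough = split_args(["--readcnt", "500", "-R", "0.01"])
print(managed.readcnt)
print(passthrough)
```

Here `--readcnt` is consumed by the wrapper, while `-R 0.01` is left in `passthrough` for the backend simulator.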
Best Practices
- Read Counts: Set --readcnt higher than the number of contigs in your amplicon file. Too few reads can result in empty files for certain amplicons.
- Primer Files: The BED file must include a column with the primer sequence. Bygul allows 1 SNP mismatch by default; use --maxmismatch to change this.
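Both points can be checked before running a simulation. The sketch below counts FASTA records (to help choose a --readcnt above that number) and flags BED rows that have no sequence-like column. It is a rough pre-flight check under stated assumptions: the helper names are hypothetical, the amplicon file is assumed to be FASTA, and because this document does not say which BED column holds the primer sequence, the code simply looks for any tab-separated field made of at least 10 A/C/G/T characters.

```python
import io
import re

# Assumption: a primer sequence is a field of 10 or more unambiguous bases.
SEQ_FIELD = re.compile(r"[ACGT]{10,}", re.IGNORECASE)


def count_fasta_records(handle):
    """Count sequences in a FASTA stream by counting header lines."""
    return sum(1 for line in handle if line.startswith(">"))


def rows_missing_sequence(handle):
    """Return 1-based line numbers of BED rows without a primer-like field."""
    missing = []
    for number, line in enumerate(handle, start=1):
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        if not any(SEQ_FIELD.fullmatch(field) for field in fields):
            missing.append(number)
    return missing


# In-memory stand-ins for an amplicon FASTA and a primer BED file.
fasta = io.StringIO(">amplicon_1\nACGT\n>amplicon_2\nGGCC\n")
bed = io.StringIO(
    "ref\t30\t54\tprimer_1_LEFT\t1\t+\tACCAACCAACTTTCGATCTCTTGT\n"
    "ref\t385\t410\tprimer_1_RIGHT\t1\t-\n"
)
print(count_fasta_records(fasta))
print(rows_missing_sequence(bed))
```

For real files, replace the `io.StringIO` objects with handles from `open(...)`.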
Output Files
- Consolidated Reads: Simulated reads from all samples are at outdir/reads.fastq.
- Proportions: Assigned proportions are recorded in results/sample_proportions.txt.
- Quality Metrics: Check outdir/[sample_name]/amplicon_stats.csv for information on amplicon dropouts, mismatches, and ambiguous bases.
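A quick sanity check after a run is to count the reads in the consolidated FASTQ. The sketch below assumes the standard four-line FASTQ record layout and uses an in-memory example. `count_fastq_reads` is a hypothetical helper; for a real run you would open outdir/reads.fastq instead.

```python
import io


def count_fastq_reads(handle):
    """Count records in a FASTQ stream, assuming four lines per record."""
    lines = sum(1 for line in handle if line.strip())
    if lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4; file may be truncated")
    return lines // 4


# Two records, each with header, sequence, separator, and quality lines.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n")
print(count_fastq_reads(example))
```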
🎓 Citation
If you use this workflow in a paper, please cite the original repository: https://github.com/andersen-lab/Bygul