
Synthetic VCF generator for structural variants (Manta, DELLY) with controlled variability and realistic artefact injection



Generate synthetic SV VCFs to stress-test your pipelines with confidence



svForge produces caller-shaped VCFs (Manta, DELLY) in VCF / VCF.gz / BCF format with fine-grained control over variability (HOMLEN, SVLEN, VAF) and realistic artefact injection (SVs in ENCODE blacklist regions, gnomAD germline SVs).

Designed to be modular, it is easy to adapt to your own use case. You can tune generation parameters, plug in new callers, and customize the workflow without reworking the whole tool.

Installation

pip install svforge

Or from source:

git clone https://github.com/pieetie/svforge
cd svforge
pip install -e ".[dev,test]"

Quick start

For ready-to-run command lines (single-sample gen, paired somatic gen-pair, validation, banks, and dev checks), see docs/ready-to-use.md.

Scope & limitations

svForge ships the full header of each target caller verbatim, but it does not yet populate every INFO/FORMAT field a real run would emit. The fields below are declared in the header for parser compatibility but left empty in records; support may land in a later release.

Manta — not yet populated: CIPOS, CIEND (alignment-derived breakpoint CIs), HOMSEQ (microhomology sequence; HOMLEN is populated), BND_DEPTH, MATE_BND_DEPTH (per-breakend coverage).

DELLY — not yet populated: RDRATIO (tumor/normal read-depth ratio), INSLEN, SRMAPQ, CONSENSUS, CONSBP (split-read consensus); also, records are currently always tagged PRECISE rather than alternating with IMPRECISE.

In the meantime, if your downstream pipeline hard-requires any of these fields, you can supply your own --header-template that drops the unused declarations, or post-process the VCF to fill them with sentinel values.
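As an illustration of the post-processing route, here is a minimal sketch that appends sentinel values for INFO keys missing from each record of an uncompressed, text-mode VCF. It is not part of svForge: the function name `fill_sentinels` and the sentinel values themselves (`0,0` for CIPOS and CIEND) are assumptions chosen for the example, so pick values your own downstream parser accepts.

```python
# Hypothetical sentinel values (an assumption, not svForge defaults).
SENTINELS = {"CIPOS": "0,0", "CIEND": "0,0"}


def fill_sentinels(vcf_lines, sentinels=SENTINELS):
    """Yield VCF lines, adding sentinel values for INFO keys that are absent.

    Header lines, blank lines, and lines with fewer than 8 columns are
    passed through unchanged.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            yield line
            continue
        ending = "\n" if line.endswith("\n") else ""
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            yield line
            continue
        info = fields[7]
        # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags.
        present = {item.split("=", 1)[0] for item in info.split(";") if item}
        extra = [f"{k}={v}" for k, v in sentinels.items() if k not in present]
        if extra:
            if info in ("", "."):
                fields[7] = ";".join(extra)
            else:
                fields[7] = info + ";" + ";".join(extra)
        yield "\t".join(fields) + ending
```

To use it on a file, iterate over `open(path)` and write the yielded lines to a new file. Compressed VCF.gz input can be read with `gzip.open(path, "rt")`; BCF would need converting to text first.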

Typical use cases

  • Validate downstream filters (for example, SVFORGE_SOURCE=gnomad records should disappear after your gnomAD filtering step).
  • Validate ENCODE blacklist annotation logic (for example, SVFORGE_SOURCE=blacklist records should receive your expected poor-mappability label).
  • Run reproducible CI regression tests with fixed seeds, without committing generated VCF files.
  • Smoke-test deployments and pipeline changes in seconds instead of rerunning full variant callers on BAM files.
  • Reproduce specific scenarios and edge cases (cross-chrom BNDs, contig-edge events, controlled VAF/HOMLEN ranges) for debugging and QA.
  • Demo or onboard safely with realistic SV VCFs and no patient data.
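The first two use cases can be turned into an automated check. The sketch below counts records per SVFORGE_SOURCE value in a text-mode VCF and fails if any record from a given source survives a filtering step. It assumes SVFORGE_SOURCE is carried as a KEY=VALUE entry in the INFO column; the helper names `count_by_source` and `assert_source_removed` are hypothetical and not part of svForge.

```python
from collections import Counter


def count_by_source(vcf_lines, key="SVFORGE_SOURCE"):
    """Count VCF records per value of `key` in the INFO column.

    Assumes `key` appears as KEY=VALUE in INFO (column 8).
    Records without the key are not counted.
    """
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue
        for item in fields[7].split(";"):
            name, _, value = item.partition("=")
            if name == key:
                counts[value] += 1
                break
    return counts


def assert_source_removed(vcf_lines, source):
    """Raise AssertionError if any record tagged with `source` remains."""
    remaining = count_by_source(vcf_lines)[source]
    if remaining:
        raise AssertionError(
            f"{remaining} record(s) with SVFORGE_SOURCE={source} survived filtering"
        )
```

In a CI test, you would run your filter on a generated VCF, then call `assert_source_removed(open(filtered_path), "gnomad")`. Comparing `count_by_source` before and after filtering also shows whether records from other sources were dropped unintentionally.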

CLI

svforge gen          # one VCF for one sample
svforge gen-pair     # one 2-sample somatic VCF (NORMAL + TUMOR)
svforge validate     # self-consistency check of injected SVs
svforge bank list    # list built-in banks
svforge bank show    # dump a bank as YAML
svforge callers      # list registered writers

Run svforge <cmd> --help for the full flag list.

Credits

  • Logo and visual identity - Elisa Perrin
  • Claude (Anthropic) - assisted with tests, documentation, refactoring, and release tooling (Ruff linting/formatting, CI cleanup)

License

Distributed under the MIT License.

