Random generation of genetic files

Genetic data files generator for testing purposes

biophony is a package for generating random genetic data files intended specifically for testing and validation. Real genetic data is often too large, too inflexible, or too sensitive from a privacy standpoint to support thorough testing. biophony makes it easier to exercise software across different scenarios without real data, keeping development and validation focused and efficient.

Installation

biophony requires Python 3.11 or later.

To install with pip, run:

pip install biophony
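
Before installing, it can help to confirm that the interpreter meets the stated minimum. The following is a minimal sketch using only the standard library; the helper name meets_minimum is hypothetical and not part of biophony.

```python
import sys

# Minimum interpreter version stated in the Installation section.
MINIMUM_PYTHON = (3, 11)


def meets_minimum(version=None, minimum=MINIMUM_PYTHON):
    """Return True when `version` (major, minor, ...) is at least `minimum`.

    Hypothetical helper for illustration; it is not provided by biophony.
    """
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "compatible" if meets_minimum() else "too old for biophony"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```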

Usage

Command Line Interfaces

biophony provides the following CLIs to generate data:

  • gen-cov: generates a BED file with custom depth,
  • gen-fasta: generates a FASTA file with a sequence of custom size,
  • gen-fastavar: generates a FASTA file with sequences of custom size, each carrying n variants, with control over the insertion, deletion and mutation rates,
  • gen-fastq: generates a FASTQ file with custom read count and read size,
  • gen-vcf: generates a VCF file from a FASTA file, with control over the insertion, deletion and mutation rates.

CLIs that read or write data use stdin and stdout by default, so operations can be chained with the pipe operator |.

For example, run the following command to generate a VCF with a 2% SNP rate, a 1% insertion (INS) rate and a 1% deletion (DEL) rate:

gen-fasta | gen-vcf --snp-rate 0.02 --ins-rate 0.01 --del-rate 0.01
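
Because the result arrives on stdout, a test script can sit at the end of the pipeline and check what was generated. The sketch below tallies variant types from VCF text by comparing REF and ALT allele lengths. It assumes the generated file follows the standard tab-separated VCF column order (CHROM, POS, ID, REF, ALT, ...), which this README does not spell out. The function names and the sample records are hypothetical, hand-written for illustration, and are not biophony output.

```python
from collections import Counter


def classify(ref, alt):
    """Classify one REF/ALT pair by comparing allele lengths."""
    # Missing, spanning-deletion and symbolic alleles are not sized here.
    if alt in (".", "*") or alt.startswith("<"):
        return "OTHER"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref):
        return "INS"
    if len(alt) < len(ref):
        return "DEL"
    return "OTHER"  # equal length above 1, e.g. a multi-nucleotide change


def tally_variants(lines):
    """Count variant types in an iterable of VCF text lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and header lines, which all start with '#'.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a complete data record
        ref, alts = fields[3], fields[4]
        # ALT may hold several comma-separated alleles.
        for alt in alts.split(","):
            counts[classify(ref, alt)] += 1
    return counts


# Hand-written sample records (assumption: standard VCF layout).
SAMPLE = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t10\t.\tA\tG\t.\t.\t.\n"
    "chr1\t25\t.\tC\tCTT\t.\t.\t.\n"
    "chr1\t40\t.\tGAT\tG\t.\t.\t.\n"
)

if __name__ == "__main__":
    for kind, count in sorted(tally_variants(SAMPLE.splitlines()).items()):
        print(f"{kind}\t{count}")
```

To use it on real output, pass sys.stdin to tally_variants instead of the sample and append the script to the pipe after gen-vcf.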

To save the generated content, either redirect stdout to a file with the > operator or use the dedicated -o option:

gen-fasta | gen-vcf --snp-rate 0.02 --ins-rate 0.01 --del-rate 0.01 > test.vcf  # redirect
gen-fasta | gen-vcf --snp-rate 0.02 --ins-rate 0.01 --del-rate 0.01 -o test.vcf  # dedicated option
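
The same pipeline can also be driven from a test script with the standard subprocess module. This is a minimal sketch that uses only the command names and options shown above (--snp-rate, --ins-rate, --del-rate, -o). The helper names build_vcf_command and run_pipeline are hypothetical, and the sketch does not use the biophony Python API.

```python
import shutil
import subprocess


def build_vcf_command(snp_rate, ins_rate, del_rate, output=None):
    """Build the gen-vcf argument list from the options shown in this README."""
    cmd = [
        "gen-vcf",
        "--snp-rate", str(snp_rate),
        "--ins-rate", str(ins_rate),
        "--del-rate", str(del_rate),
    ]
    if output is not None:
        cmd += ["-o", str(output)]  # write to a file instead of stdout
    return cmd


def run_pipeline(vcf_cmd):
    """Run `gen-fasta | <vcf_cmd>` and return gen-vcf's stdout as text."""
    fasta = subprocess.Popen(["gen-fasta"], stdout=subprocess.PIPE)
    try:
        vcf = subprocess.run(
            vcf_cmd,
            stdin=fasta.stdout,   # connect gen-fasta's stdout to gen-vcf's stdin
            capture_output=True,
            text=True,
            check=True,           # raise CalledProcessError on a non-zero exit
        )
    finally:
        fasta.stdout.close()
        fasta.wait()
    return vcf.stdout


if __name__ == "__main__":
    command = build_vcf_command(0.02, 0.01, 0.01, output="test.vcf")
    if shutil.which("gen-fasta") and shutil.which("gen-vcf"):
        run_pipeline(command)
    else:
        print("biophony CLIs not found on PATH; would run:", " ".join(command))
```

Keeping the argument list in its own function lets a test suite vary the rates per scenario without repeating the shell command.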

Python API

You can also use the Python API to generate random genetic data files in your scripts.

Link to the Python API documentation: https://cnrgh.gitlab.io/databases/biophony/.

