DNA sequence encryption and security using self-power decomposition

DNAsecure

Overview

DNAsecure is a Python package that provides tools for encrypting and decrypting DNA sequences using the self-power decomposition algorithm. It is built on top of the selfpowerdecomposer package and provides specialized functionality for working with DNA sequences and FASTA files.

The package splits the encrypted output into two files:

  1. The main encrypted data (stored in .spd files)
  2. A key file (stored in .key files)

Both files are required for decryption, so neither file alone is enough to recover the sequence.

Installation

pip install dnasecure

Dependencies

  • selfpowerdecomposer >= 0.1.1
  • numpy >= 1.19.0
  • biopython >= 1.79

Usage

Command Line Interface

DNAsecure provides a command-line interface for easy encryption and decryption of FASTA files:

# Encrypt a FASTA file
dnasecure encrypt input.fasta output.spd output.key --security-level 5

# Decrypt an SPD file back to FASTA
dnasecure decrypt output.spd output.key decrypted.fasta

# Show help
dnasecure --help

Parallel Processing

DNAsecure supports parallel processing for handling multiple sequences simultaneously, which can significantly improve performance when working with multiFASTA files:

# Encrypt a FASTA file using parallel processing
dnasecure encrypt input.fasta output.spd output.key --parallel --num-processes 4

# Decrypt an SPD file using parallel processing
dnasecure decrypt output.spd output.key decrypted.fasta --parallel --num-processes 4

You can also disable parallel processing if needed:

dnasecure encrypt input.fasta output.spd output.key --no-parallel

Performance Improvement

Parallel processing can provide a significant speedup when working with multiFASTA files. In our benchmarks with a 30 MB FASTA file containing 10 sequences:

  • Encryption: 5.37x speedup (42.18s → 7.86s)
  • Decryption: 5.95x speedup (111.53s → 18.75s)

The speedup scales with the number of sequences and available CPU cores.
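The speedup figures above are simply the serial time divided by the parallel time. As a quick check of that arithmetic, the sketch below recomputes them from the quoted timings. The `speedup` helper is a hypothetical name used only for illustration; it is not part of DNAsecure. The number of cores used in the benchmark is not stated, so parallel efficiency cannot be derived from these numbers.

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Return how many times faster the parallel run was than the serial run."""
    if parallel_seconds <= 0:
        raise ValueError("parallel_seconds must be positive")
    return serial_seconds / parallel_seconds


# Timings quoted in the benchmark above, in seconds.
encryption = speedup(42.18, 7.86)
decryption = speedup(111.53, 18.75)

print(f"Encryption: {encryption:.2f}x")  # Encryption: 5.37x
print(f"Decryption: {decryption:.2f}x")  # Decryption: 5.95x
```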

Optimized Implementation

DNAsecure includes an experimental optimized implementation aimed at speeding up the processing of large sequences:

  • Uses memory views for zero-copy slicing
  • Implements parallel processing for individual large sequences
  • Optimizes chunking and buffer management
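The first point refers to a standard Python technique: slicing a `memoryview` returns a view onto the same buffer instead of copying bytes. The sketch below shows the general idea only. `iter_chunks` is a hypothetical helper written for this illustration, and DNAsecure's internal chunking may work differently.

```python
def iter_chunks(data: bytes, chunk_size: int):
    """Yield consecutive chunks of `data` as memoryview slices (no bytes are copied)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    view = memoryview(data)
    for start in range(0, len(view), chunk_size):
        yield view[start:start + chunk_size]


sequence = b"ATGCATGCATGCATGC"
chunks = list(iter_chunks(sequence, 6))

# Converting to bytes here copies, but only for display.
print([bytes(c) for c in chunks])

# Every chunk still refers to the original buffer rather than a new object.
print(all(c.obj is sequence for c in chunks))
```

With plain `bytes` slicing, each chunk would be a fresh copy; for very long sequences the view-based approach avoids that extra memory traffic.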

The optimized implementation is disabled by default as it may not provide significant performance improvements in all scenarios. You can enable it if you want to experiment with it:

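import dnasecure.core
from dnasecure import encrypt_sequence

sequence = "ATGCATGCATGCATGC"

# Enable the optimized implementation by setting the module-level flag.
# Set it on the module itself: a name imported with
# "from dnasecure.core import USE_OPTIMIZED_IMPLEMENTATION" would be a copy.
dnasecure.core.USE_OPTIMIZED_IMPLEMENTATION = True

# Encrypt using the optimized implementation
encrypted_data, key = encrypt_sequence(sequence)

# Disable the optimized implementation again (the default)
dnasecure.core.USE_OPTIMIZED_IMPLEMENTATION = False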

You can benchmark the performance difference using the provided benchmark script:

# Compare original vs. optimized implementation for different sequence lengths
python examples/optimized_benchmark.py --test length

# Compare original vs. optimized implementation for different chunk sizes
python examples/optimized_benchmark.py --test chunk --sequence-length 100000

Python API

Basic Usage

from dnasecure import encrypt_sequence, decrypt_sequence

# Encrypt a DNA sequence
sequence = "ATGCATGCATGCATGC"
encrypted_data, key = encrypt_sequence(sequence)

# Decrypt the sequence
decrypted_sequence = decrypt_sequence(encrypted_data, key)
print(decrypted_sequence)  # Should print the original sequence

Working with FASTA Files

from dnasecure import encrypt_fasta, decrypt_fasta

# Encrypt a FASTA file
encrypt_fasta("input.fasta", "output.spd", "output.key")

# Decrypt an SPD file back to FASTA
decrypt_fasta("output.spd", "output.key", "decrypted.fasta")
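After decrypting, it can be useful to confirm that the round trip reproduced the input. The sketch below compares two FASTA files record by record using only the standard library. `read_fasta` and `same_records` are hypothetical helpers, not DNAsecure functions, and the demo writes its own small files instead of calling the package. The comparison is strict about headers and letter case; whether DNAsecure preserves both exactly is an assumption to verify on your own data. Since Biopython is already a dependency, `Bio.SeqIO.parse` could be used in place of the hand-written parser.

```python
from pathlib import Path
import tempfile


def read_fasta(path):
    """Parse a FASTA file into a list of (header, sequence) tuples."""
    records, header, parts = [], None, []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(parts)))
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def same_records(path_a, path_b):
    """Return True if both files hold the same headers and sequences, in order."""
    return read_fasta(path_a) == read_fasta(path_b)


# Demo with stand-in files: line wrapping differs, the records do not.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "input.fasta"
    restored = Path(tmp) / "decrypted.fasta"
    original.write_text(">seq1 demo\nATGC\nATGC\n>seq2\nGGCC\n")
    restored.write_text(">seq1 demo\nATGCATGC\n>seq2\nGGCC\n")
    round_trip_ok = same_records(original, restored)

print(round_trip_ok)
```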

Using Parallel Processing in Python

from dnasecure import encrypt_fasta, decrypt_fasta

# Encrypt a FASTA file with parallel processing
encrypt_fasta(
    "input.fasta", 
    "output.spd", 
    "output.key", 
    parallel=True, 
    num_processes=4  # Use 4 processes
)

# Decrypt an SPD file with parallel processing
decrypt_fasta(
    "output.spd", 
    "output.key", 
    "decrypted.fasta", 
    parallel=True, 
    num_processes=4  # Use 4 processes
)
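Because the parallel mode handles multiple sequences simultaneously, a worker count above the number of sequences or the number of CPU cores is unlikely to help. The sketch below picks a value for `num_processes` on that assumption. `pick_num_processes` is a hypothetical helper, not part of DNAsecure, and the package may already choose a sensible default when `num_processes` is omitted.

```python
import os


def pick_num_processes(num_sequences, max_processes=None):
    """Pick a worker count no larger than the CPU count or the sequence count."""
    cpus = os.cpu_count() or 1  # os.cpu_count() can return None
    limit = cpus if max_processes is None else min(cpus, max_processes)
    return max(1, min(num_sequences, limit))


# For a multiFASTA file with 10 sequences, capped at 4 workers.
print(pick_num_processes(10, max_processes=4))
```

The result can then be passed as `num_processes` to `encrypt_fasta` or `decrypt_fasta`.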

Security Features

DNAsecure's security design rests on four features:

  1. Two-part encryption: The encryption is split into two parts (data and key), both of which are required for decryption.
  2. Self-power decomposition: Uses a novel mathematical approach for compression and encryption.
  3. No placeholders: The encryption does not use placeholders for removed values, making it harder to identify what was removed.
  4. Resistance to brute force: The encryption is resistant to brute force attacks due to the large key space.
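To give a feel for the term "self-power decomposition", the sketch below shows one plausible reading of it: encode a DNA string as an integer, then write that integer greedily as a sum of self-powers k^k (1, 4, 27, 256, ...). This is an illustrative interpretation only. The encoding, the greedy strategy, and every function name here are assumptions, and the actual algorithm in selfpowerdecomposer, including how values are divided between the .spd and .key files, may differ.

```python
BASE4 = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed encoding, for illustration


def dna_to_int(seq):
    """Encode DNA as a base-4 integer; the leading 1 keeps leading 'A's."""
    value = 1
    for base in seq.upper():
        value = value * 4 + BASE4[base]
    return value


def int_to_dna(value):
    """Invert dna_to_int."""
    letters = "ACGT"
    out = []
    while value > 1:
        value, digit = divmod(value, 4)
        out.append(letters[digit])
    return "".join(reversed(out))


def self_power_terms(n):
    """Greedily write n as a sum of k**k terms; return the bases k, largest first."""
    terms = []
    while n > 0:
        k = 1
        while (k + 1) ** (k + 1) <= n:
            k += 1
        terms.append(k)
        n -= k ** k
    return terms


def from_terms(terms):
    """Rebuild the integer from its list of bases."""
    return sum(k ** k for k in terms)


number = dna_to_int("ATGC")
terms = self_power_terms(number)
print(number, terms)  # 313 [4, 3, 3, 1, 1, 1]
print(int_to_dna(from_terms(terms)))  # ATGC
```

Here 313 = 4^4 + 3^3 + 3^3 + 1 + 1 + 1, so the list of bases alone is enough to rebuild the number and, from it, the sequence.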

Examples

See the examples directory for more detailed usage examples.

License

MIT License


Source Distribution

dnasecure-0.0.4.tar.gz (18.6 kB)

Built Distribution

dnasecure-0.0.4-py3-none-any.whl (11.3 kB)
