DNAsecure
DNA sequence encryption and security using self-power decomposition.
Overview
DNAsecure is a Python package that provides tools for encrypting and decrypting DNA sequences using the self-power decomposition algorithm. It is built on top of the selfpowerdecomposer package and provides specialized functionality for working with DNA sequences and FASTA files.
The package uses a secure encryption approach that splits the encryption into two parts:
- The main encrypted data (stored in .spd files)
- A key file (stored in .key files)
Both parts are required to decrypt the data, providing a secure way to store and share DNA sequences.
Installation
pip install dnasecure
Dependencies
- selfpowerdecomposer >= 0.1.1
- numpy >= 1.19.0
- biopython >= 1.79
Usage
Command Line Interface
DNAsecure provides a command-line interface for easy encryption and decryption of FASTA files:
# Encrypt a FASTA file
dnasecure encrypt input.fasta output.spd output.key --security-level 5
# Decrypt an SPD file back to FASTA
dnasecure decrypt output.spd output.key decrypted.fasta
# Show help
dnasecure --help
Parallel Processing
DNAsecure supports parallel processing for handling multiple sequences simultaneously, which can significantly improve performance when working with multiFASTA files:
# Encrypt a FASTA file using parallel processing
dnasecure encrypt input.fasta output.spd output.key --parallel --num-processes 4
# Decrypt an SPD file using parallel processing
dnasecure decrypt output.spd output.key decrypted.fasta --parallel --num-processes 4
You can also disable parallel processing if needed:
dnasecure encrypt input.fasta output.spd output.key --no-parallel
Performance Improvement
In our benchmarks with a 30 MB FASTA file containing 10 sequences, parallel processing gave the following speedups:
- Encryption: 5.37x speedup (42.18s → 7.86s)
- Decryption: 5.95x speedup (111.53s → 18.75s)
The speedup scales with the number of sequences and available CPU cores.
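The speedup figures above are the serial time divided by the parallel time. A minimal check of that arithmetic, using only the timings quoted in the benchmark (the helper name speedup is illustrative and not part of DNAsecure):

```python
def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Return how many times faster the parallel run was."""
    return serial_seconds / parallel_seconds


# Timings taken from the benchmark figures listed above.
encryption_speedup = speedup(42.18, 7.86)
decryption_speedup = speedup(111.53, 18.75)

print(f"Encryption: {encryption_speedup:.2f}x")
print(f"Decryption: {decryption_speedup:.2f}x")
```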
Optimized Implementation
DNAsecure includes an experimental optimized implementation aimed at faster processing of large sequences. It:
- Uses memory views for zero-copy slicing
- Implements parallel processing for individual large sequences
- Optimizes chunking and buffer management
The optimized implementation is disabled by default as it may not provide significant performance improvements in all scenarios. You can enable it if you want to experiment with it:
import dnasecure.core
from dnasecure import encrypt_sequence
# Enable the optimized implementation by setting the flag on the module
# (importing the flag by name would only create a local copy)
dnasecure.core.USE_OPTIMIZED_IMPLEMENTATION = True
# Encrypt using the optimized implementation
sequence = "ATGCATGCATGCATGC"
encrypted_data, key = encrypt_sequence(sequence)
# Disable the optimized implementation again
dnasecure.core.USE_OPTIMIZED_IMPLEMENTATION = False
You can benchmark the performance difference using the provided benchmark script:
# Compare original vs. optimized implementation for different sequence lengths
python examples/optimized_benchmark.py --test length
# Compare original vs. optimized implementation for different chunk sizes
python examples/optimized_benchmark.py --test chunk --sequence-length 100000
Python API
Basic Usage
from dnasecure import encrypt_sequence, decrypt_sequence
# Encrypt a DNA sequence
sequence = "ATGCATGCATGCATGC"
encrypted_data, key = encrypt_sequence(sequence)
# Decrypt the sequence
decrypted_sequence = decrypt_sequence(encrypted_data, key)
print(decrypted_sequence) # Should print the original sequence
Working with FASTA Files
from dnasecure import encrypt_fasta, decrypt_fasta
# Encrypt a FASTA file
encrypt_fasta("input.fasta", "output.spd", "output.key")
# Decrypt an SPD file back to FASTA
decrypt_fasta("output.spd", "output.key", "decrypted.fasta")
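After a round trip it can be useful to confirm that the decrypted file matches the original. The sketch below is one way to do that with a small plain-Python FASTA reader; the helper names read_fasta and fasta_files_match are hypothetical and not part of DNAsecure, and the comparison assumes that record headers and record order are preserved by the round trip.

```python
def read_fasta(path):
    """Return a list of (header, sequence) tuples from a FASTA file."""
    records = []
    header, parts = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    records.append((header, "".join(parts)))
                header, parts = line[1:], []
            else:
                parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def fasta_files_match(path_a, path_b):
    """True if both files contain the same records in the same order.

    Sequence lines are joined before comparing, so differences in
    line wrapping are ignored.
    """
    return read_fasta(path_a) == read_fasta(path_b)
```

With the file names used above, the check would be fasta_files_match("input.fasta", "decrypted.fasta").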
Using Parallel Processing in Python
from dnasecure import encrypt_fasta, decrypt_fasta
# Encrypt a FASTA file with parallel processing
encrypt_fasta(
    "input.fasta",
    "output.spd",
    "output.key",
    parallel=True,
    num_processes=4  # Use 4 processes
)
# Decrypt an SPD file with parallel processing
decrypt_fasta(
    "output.spd",
    "output.key",
    "decrypted.fasta",
    parallel=True,
    num_processes=4  # Use 4 processes
)
Security Features
DNAsecure provides strong security for DNA sequences:
- Two-part encryption: The encryption is split into two parts (data and key), both of which are required for decryption.
- Self-power decomposition: Uses a novel mathematical approach for compression and encryption.
- No placeholders: The encryption does not use placeholders for removed values, making it harder to identify what was removed.
- Resistance to brute force: The encryption is resistant to brute force attacks due to the large key space.
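To make the two-part, no-placeholder idea concrete, here is a deliberately simplified toy sketch. It is not the self-power decomposition algorithm and offers no real security; it only shows how removing values from a list without leaving markers produces two parts that are both needed to rebuild the original. The function names split_sequence and join_sequence are made up for this example.

```python
def split_sequence(values, positions):
    """Toy split: remove the items at `positions` and return (data, key).

    `data` holds the remaining items with no placeholders;
    `key` records which positions were removed and what they held.
    """
    removed = set(positions)
    data = [v for i, v in enumerate(values) if i not in removed]
    key = [(i, values[i]) for i in sorted(removed)]
    return data, key


def join_sequence(data, key):
    """Rebuild the original list; this needs both `data` and `key`."""
    values = list(data)
    # Reinserting in ascending index order keeps every index valid.
    for index, value in key:
        values.insert(index, value)
    return values


original = [3, 1, 4, 1, 5, 9, 2, 6]
data, key = split_sequence(original, positions=[1, 4, 6])
restored = join_sequence(data, key)
print(data, key, restored == original)
```

Because data contains no gaps or markers, it does not by itself reveal where values were removed.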
Examples
See the examples directory for more detailed usage examples.
License
MIT License
File details
Details for the file dnasecure-0.0.4.tar.gz.
File metadata
- Download URL: dnasecure-0.0.4.tar.gz
- Upload date:
- Size: 18.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 13c52f425a251fa118c24eb5dcbc280398f740af24fe56b3f7b5d6ffe83fbb34 |
| MD5 | 5bed62aff5cff93d361501b098405b48 |
| BLAKE2b-256 | c045c86ac04a10135d1f509e88874088cf3740fbfa69b420a22dc1f68eb89d90 |
File details
Details for the file dnasecure-0.0.4-py3-none-any.whl.
File metadata
- Download URL: dnasecure-0.0.4-py3-none-any.whl
- Upload date:
- Size: 11.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3ebb64f9c70c60f3f987dccb9d0488bee0119515b18cdae5ee7e319fbc7097aa |
| MD5 | 1d0b2464a4e6a0a85d0427e32c38c541 |
| BLAKE2b-256 | 506c77682b0839b0ac3ec34d8d3c466a5b6ebc8753e4340b34d5e23dd82f817c |