A utility for shuffling biological sequences while preserving dinucleotide frequencies.

dinuc_shuf

This Python package provides a minimal and efficient implementation for performing dinucleotide shuffles on one-hot-encoded sequences.

Dinucleotide shuffling preserves the count of every dinucleotide (pair of adjacent nucleotides) in the input sequence while randomizing the order in which those pairs occur. This is particularly useful for generating random sequences that match the dinucleotide composition of the original input.
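To make the preserved quantity concrete, the short sketch below counts overlapping dinucleotides in a sequence and in one of its shuffles. The helper name `dinuc_counts` is illustrative and is not part of this package; the two strings are the input and example output from the Usage section.

```python
from collections import Counter

def dinuc_counts(sequence):
    """Count every overlapping pair of adjacent nucleotides in a string."""
    return Counter(a + b for a, b in zip(sequence, sequence[1:]))

original = "ACCCACGATGATG"
shuffled = "ACATGATGACCCG"  # example output shown in the Usage section

# A valid dinucleotide shuffle leaves these counts unchanged.
print(dinuc_counts(original) == dinuc_counts(shuffled))  # True
```

Because the pairs overlap, preserving their counts also preserves the single-nucleotide counts and the first and last nucleotides of the sequence.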

To sample uniformly from all possible shuffles, the algorithm uses the rank-one-update Kirchhoff matrix method described by Colbourn et al. for sampling random arborescences, combined with a random Eulerian walk on the dinucleotide transition graph. The core algorithm is implemented in Rust for performance, with Python bindings for easy integration.
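The two-stage idea (pick a random arborescence, then take a random Eulerian walk that respects it) can be sketched in pure Python. This is not the package's implementation: for brevity it swaps in Wilson's loop-erased random-walk algorithm to sample the arborescence instead of the Kirchhoff matrix method, and the function name `dinuc_shuffle_sketch` is hypothetical. It works on plain strings rather than one-hot arrays.

```python
import random
from collections import Counter, defaultdict

def dinuc_shuffle_sketch(seq, rng=None):
    """Illustrative dinucleotide shuffle of a string (not the package's code)."""
    if rng is None:
        rng = random.Random()
    if len(seq) < 3:
        return seq

    # Transition multigraph: succ[a] lists every nucleotide that follows a.
    succ = defaultdict(list)
    for a, b in zip(seq, seq[1:]):
        succ[a].append(b)

    # Stage 1: sample an arborescence directed toward the last nucleotide
    # using Wilson's algorithm (assumption: substituted for the Kirchhoff
    # matrix method). last_exit[v] is the index of v's chosen tree edge.
    root = seq[-1]
    in_tree = {root}
    last_exit = {}
    for start in list(succ):
        v = start
        while v not in in_tree:
            # Overwriting on revisits erases loops in the random walk.
            last_exit[v] = rng.randrange(len(succ[v]))
            v = succ[v][last_exit[v]]
        v = start
        while v not in in_tree:
            in_tree.add(v)
            v = succ[v][last_exit[v]]

    # Stage 2: at each vertex, shuffle the outgoing edges but keep the
    # tree edge last, so the walk cannot get stuck before using every edge.
    for v, edges in succ.items():
        if v in last_exit:
            tree_edge = edges.pop(last_exit[v])
            rng.shuffle(edges)
            edges.append(tree_edge)
        else:
            rng.shuffle(edges)

    # Follow the edges in their new order, starting at the first nucleotide.
    used = Counter()
    out = [seq[0]]
    v = seq[0]
    for _ in range(len(seq) - 1):
        nxt = succ[v][used[v]]
        used[v] += 1
        out.append(nxt)
        v = nxt
    return "".join(out)

print(dinuc_shuffle_sketch("ACCCACGATGATG", random.Random(0)))
```

Keeping each vertex's tree edge as its final exit is what guarantees that the walk consumes every transition and ends on the original last nucleotide.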

This package is lightweight: NumPy is its only dependency.

Installation

To install the package from PyPI, run:

pip install dinuc-shuf

Usage

import numpy as np
from dinuc_shuf import shuffle

# Column order of the one-hot encoding: A, C, G, T.
SEQ_ALPHABET = np.array(["A", "C", "G", "T"], dtype="S1")

def one_hot_encode(sequence, dtype=np.uint8):
    # Convert a DNA string to an array of shape (length, 4).
    sequence = sequence.upper()
    seq_chararray = np.frombuffer(sequence.encode("UTF-8"), dtype="S1")
    one_hot = (seq_chararray[:, None] == SEQ_ALPHABET[None, :]).astype(dtype)

    return one_hot

def one_hot_decode(one_hot):
    # Convert an array of shape (length, 4) back to a DNA string.
    return SEQ_ALPHABET[one_hot.argmax(axis=1)].tobytes().decode("UTF-8")

sequence = "ACCCACGATGATG"
one_hot_sequence = one_hot_encode(sequence)
# Add a leading batch axis, so the array passed to shuffle has shape (1, length, 4).
shuffled_one_hot = shuffle(one_hot_sequence[None, :, :])
shuffled = one_hot_decode(shuffled_one_hot[0, :, :])

print(shuffled)  # One possible output: "ACATGATGACCCG"
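When working directly with one-hot arrays, the result can be sanity-checked without decoding back to strings. The sketch below builds a 4x4 matrix of dinucleotide counts from an array of shape (length, 4). The helper `dinuc_count_matrix` is an illustrative name and not part of the package API, and the "shuffled" array here is simply the example output above re-encoded, so the snippet runs with NumPy alone.

```python
import numpy as np

SEQ_ALPHABET = np.array(["A", "C", "G", "T"], dtype="S1")

def one_hot_encode(sequence, dtype=np.uint8):
    # Same encoding as in the usage example: shape (length, 4).
    chars = np.frombuffer(sequence.upper().encode("UTF-8"), dtype="S1")
    return (chars[:, None] == SEQ_ALPHABET[None, :]).astype(dtype)

def dinuc_count_matrix(one_hot):
    # Entry [i, j] counts positions where base i is followed by base j.
    x = one_hot.astype(np.int64)  # avoid overflow in small integer dtypes
    return x[:-1].T @ x[1:]

original = one_hot_encode("ACCCACGATGATG")
shuffled = one_hot_encode("ACATGATGACCCG")  # stand-in for shuffle(...)[0]

# A valid dinucleotide shuffle leaves the count matrix unchanged.
print(np.array_equal(dinuc_count_matrix(original), dinuc_count_matrix(shuffled)))  # True
```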

API Reference

A full API reference is available here.

Download files

Source Distribution

  • dinuc_shuf-0.1.0b1.tar.gz (25.7 kB)

Built Distributions

All wheels target CPython 3.8+ (abi3).

  • dinuc_shuf-0.1.0b1-cp38-abi3-win_amd64.whl (220.3 kB): Windows x86-64
  • dinuc_shuf-0.1.0b1-cp38-abi3-win32.whl (204.2 kB): Windows x86
  • dinuc_shuf-0.1.0b1-cp38-abi3-musllinux_1_2_x86_64.whl (530.9 kB): musllinux, musl 1.2+, x86-64
  • dinuc_shuf-0.1.0b1-cp38-abi3-musllinux_1_2_i686.whl (552.6 kB): musllinux, musl 1.2+, i686
  • dinuc_shuf-0.1.0b1-cp38-abi3-musllinux_1_2_armv7l.whl (598.9 kB): musllinux, musl 1.2+, ARMv7l
  • dinuc_shuf-0.1.0b1-cp38-abi3-musllinux_1_2_aarch64.whl (511.3 kB): musllinux, musl 1.2+, ARM64
  • dinuc_shuf-0.1.0b1-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (368.6 kB): manylinux, glibc 2.17+, x86-64
  • dinuc_shuf-0.1.0b1-cp38-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl (395.0 kB): manylinux, glibc 2.17+, s390x
  • dinuc_shuf-0.1.0b1-cp38-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (432.9 kB): manylinux, glibc 2.17+, ppc64le
  • dinuc_shuf-0.1.0b1-cp38-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (345.0 kB): manylinux, glibc 2.17+, ARMv7l
  • dinuc_shuf-0.1.0b1-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (339.0 kB): manylinux, glibc 2.17+, ARM64
  • dinuc_shuf-0.1.0b1-cp38-abi3-manylinux_2_5_i686.manylinux1_i686.whl (381.7 kB): manylinux, glibc 2.5+, i686
  • dinuc_shuf-0.1.0b1-cp38-abi3-macosx_11_0_arm64.whl (305.4 kB): macOS 11.0+, ARM64
  • dinuc_shuf-0.1.0b1-cp38-abi3-macosx_10_12_x86_64.whl (337.6 kB): macOS 10.12+, x86-64
