quickdna

Quickdna is a simple, fast library for working with DNA sequences. It is up to 100x faster than Biopython for some translation tasks, in part because it uses a native Rust module (via PyO3) for the translation. Even so, it exposes an easy-to-use, type-annotated API that should feel familiar to Biopython users.

⚠ Quickdna is "pre-1.0" software. Its API is still evolving. For now, if you're interested in using quickdna, we suggest you depend on an exact version or git rev, so that new releases don't break your code.

# These are the two main library types. Unlike Biopython, DnaSequence and
# ProteinSequence are distinct, though they share a common BaseSequence base class
>>> from quickdna import DnaSequence, ProteinSequence

# Sequences can be constructed from strs or bytes, and are stored internally as
# ascii-encoded bytes.
>>> d = DnaSequence("taatcaagactattcaaccaa")

# Sequences can be sliced just like regular strings, and return new sequence instances.
>>> d[3:9]
DnaSequence(seq='tcaaga')

# Many other Python operations are supported on sequences as well: len, iter,
# ==, hash, concatenation with +, * a constant, etc. These operations are typed
# when appropriate and will not allow you to concatenate a ProteinSequence to a
# DnaSequence, for example.

# DNA sequences can be easily translated to protein sequences with `translate()`.
# If no table=... argument is given, NCBI table 1 will be used by default...
>>> d.translate()
ProteinSequence(seq='*SRLFNQ')

# ...but any of the NCBI tables can be specified. A ValueError will be raised
# for an invalid table.
>>> d.translate(table=22)
ProteinSequence(seq='**RLFNQ')

# Reverse complement is supported too. It's somewhat faster than Biopython, but
# not as dramatically as `translate()`.
>>> d[3:9].reverse_complement()
DnaSequence(seq='TCTTGA')

# This method will return a tuple of all (up to 6) possible translated reading frames:
# (seq[:], seq[1:], seq[2:], seq.reverse_complement()[:], ...)
>>> d.translate_all_frames()
(ProteinSequence(seq='*SRLFNQ'), ProteinSequence(seq='NQDYST'),
ProteinSequence(seq='IKTIQP'), ProteinSequence(seq='LVE*S*L'),
ProteinSequence(seq='WLNSLD'), ProteinSequence(seq='G*IVLI'))

# translate_all_frames will return fewer than 6 frames for sequences of len < 5
>>> len(DnaSequence("AAAA").translate_all_frames())
4
>>> len(DnaSequence("AA").translate_all_frames())
0

# There is a similar method, `translate_self_frames`, that only returns the
# (up to 3) translated frames for this direction, without the reverse complement

# The IUPAC ambiguity codes are supported as well.
# Codons with N will translate to a specific amino acid if it is unambiguous,
# such as GGN -> G, or the ambiguous amino acid code 'X' if there are multiple
# possible translations.
>>> DnaSequence("GGNATN").translate()
ProteinSequence(seq='GX')

# The fine-grained ambiguity codes like "R = A or G" are accepted too, and
# translation results are the same as Biopython. In the output, amino acid
# ambiguity code 'B' means "either asparagine or aspartic acid" (N or D).
>>> DnaSequence("RAT").translate()
ProteinSequence(seq='B')

# To disallow ambiguity codes in translation, try: `.translate(strict=True)`
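
The translation rules shown in the examples above can be sketched in plain Python. This is an illustrative sketch only, not quickdna's Rust implementation: the helpers `translate_codon`, `translate`, `reverse_complement`, and `all_frames` are hypothetical names, only NCBI table 1 is covered, and two behaviors are assumptions rather than documented quickdna behavior: trailing partial codons are dropped, and any codon that could yield a mixed set of results other than N/D, Q/E, or I/L is reported as 'X'.

```python
from itertools import product
from typing import Tuple

# NCBI translation table 1 (standard code); codons are ordered TCAG x TCAG x TCAG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide codes mapped to the concrete bases they may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Ambiguous amino acid codes for specific pairs; anything else becomes 'X'.
AMBIGUOUS_AA = {
    frozenset("ND"): "B",  # asparagine or aspartic acid
    frozenset("QE"): "Z",  # glutamine or glutamic acid
    frozenset("IL"): "J",  # isoleucine or leucine
}

# Complement map that also covers the ambiguity codes (R<->Y, K<->M, B<->V, D<->H).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def translate_codon(codon: str) -> str:
    """Translate one codon, expanding ambiguity codes into every concrete codon."""
    options = [IUPAC[base] for base in codon.upper()]
    results = {CODON_TABLE["".join(concrete)] for concrete in product(*options)}
    if len(results) == 1:
        return results.pop()  # e.g. GGN always encodes G
    return AMBIGUOUS_AA.get(frozenset(results), "X")


def translate(seq: str) -> str:
    """Translate a sequence codon by codon (assumption: a trailing partial codon is dropped)."""
    usable = len(seq) - len(seq) % 3
    return "".join(translate_codon(seq[i:i + 3]) for i in range(0, usable, 3))


def reverse_complement(seq: str) -> str:
    """Return the uppercase reverse complement of a sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def all_frames(seq: str) -> Tuple[str, ...]:
    """Translate up to 6 frames: 3 offsets forward, then 3 on the reverse complement."""
    frames = []
    for strand in (seq, reverse_complement(seq)):
        for offset in range(3):
            frame = strand[offset:]
            if len(frame) >= 3:  # skip frames too short to hold a codon
                frames.append(translate(frame))
    return tuple(frames)
```

With these helpers, `translate("GGNATN")` gives `'GX'` and `translate("RAT")` gives `'B'`, matching the outputs shown in the examples above.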

Benchmarks

For regular DNA translation tasks, quickdna is faster than Biopython (see benchmarks/bench.py for the source). Machines and workloads vary, however, so always benchmark!

task	time	time relative to quickdna
translate_quickdna(small_genome)	0.00306ms / iter
translate_biopython(small_genome)	0.05834ms / iter	1908.90%
translate_quickdna(covid_genome)	0.02959ms / iter
translate_biopython(covid_genome)	3.54413ms / iter	11979.10%
reverse_complement_quickdna(small_genome)	0.00238ms / iter
reverse_complement_biopython(small_genome)	0.00398ms / iter	167.24%
reverse_complement_quickdna(covid_genome)	0.02409ms / iter
reverse_complement_biopython(covid_genome)	0.02928ms / iter	121.55%
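
To benchmark on your own machine, a small harness built on the standard library's `timeit` is usually enough. The sketch below is not the project's benchmarks/bench.py; `ms_per_iter` and `percent_of` are hypothetical helper names, and the two pure-Python reverse-complement functions are stand-ins for whichever implementations you want to compare. It reports milliseconds per iteration and one time as a percentage of another, in the style of the table above.

```python
import timeit
from typing import Callable

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
PAIRS = {"A": "T", "C": "G", "G": "C", "T": "A",
         "a": "t", "c": "g", "g": "c", "t": "a"}


def reverse_complement_fast(seq: str) -> str:
    """Stand-in candidate: one C-level translate call plus a slice."""
    return seq.translate(COMPLEMENT)[::-1]


def reverse_complement_slow(seq: str) -> str:
    """Stand-in baseline: a per-character Python loop."""
    return "".join(PAIRS[base] for base in reversed(seq))


def ms_per_iter(func: Callable[[], object], iterations: int = 1000) -> float:
    """Average wall-clock time of one call to func, in milliseconds."""
    total_seconds = timeit.timeit(func, number=iterations)
    return total_seconds / iterations * 1000.0


def percent_of(candidate_ms: float, baseline_ms: float) -> float:
    """Express candidate_ms as a percentage of baseline_ms (200.0 means twice as slow)."""
    return candidate_ms / baseline_ms * 100.0


if __name__ == "__main__":
    genome = "taatcaagactattcaaccaa" * 1000  # synthetic input, not a real genome
    fast = ms_per_iter(lambda: reverse_complement_fast(genome), iterations=200)
    slow = ms_per_iter(lambda: reverse_complement_slow(genome), iterations=200)
    print(f"fast\t{fast:.5f}ms / iter")
    print(f"slow\t{slow:.5f}ms / iter\t{percent_of(slow, fast):.2f}%")
```

Timings from a harness like this depend heavily on input size and hardware, so compare implementations on data that resembles your real workload.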

Should you use quickdna?

Quickdna pros:
- It's quick!
- It's simple and small.
- It has type annotations, including a py.typed marker file for checkers like mypy or VS Code's Pyright.
- It makes a type distinction between DNA and protein sequences, preventing confusion.
Quickdna cons:
- It's newer and less battle-tested than Biopython.
- It's not yet 1.0 -- the API is liable to change in the future.
- It doesn't support reading FASTA files or many of the other tasks Biopython can do, so you'll probably end up still using Biopython or something else to do those tasks.

Installation

Quickdna has prebuilt wheels for Linux (manylinux), macOS, and Windows available on PyPI.

Development

Quickdna uses PyO3 and maturin to build and upload the wheels, and poetry for handling dependencies. This is handled via a Justfile, which requires Just, a command-runner similar to make.

Poetry

You can install poetry from https://python-poetry.org, and it will handle the other Python dependencies.

Just

You can install Just with `cargo install just`, and then run `just` in the project directory to get a list of commands.

Flamegraphs

The `just profile` command requires cargo-flamegraph; see the cargo-flamegraph repository for installation instructions.

Project details

Release history

version	release date
0.5.0 (this version)	Jan 9, 2023
0.2.0	Jan 14, 2022
0.1.3	Jan 13, 2022
0.1.2	Jan 13, 2022
0.1.1	Jan 12, 2022

Download files


Source Distributions

No source distribution files are available for this release.

Built Distributions


quickdna-0.5.0-cp311-none-win_amd64.whl (115.4 kB)
Uploaded Jan 9, 2023. CPython 3.11, Windows x86-64

quickdna-0.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.0 MB)
Uploaded Jan 9, 2023. CPython 3.10, manylinux: glibc 2.17+ x86-64

quickdna-0.5.0-cp310-cp310-macosx_10_7_x86_64.whl (236.0 kB)
Uploaded Jan 9, 2023. CPython 3.10, macOS 10.7+ x86-64

quickdna-0.5.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
Uploaded Jan 9, 2023. CPython 3.9, manylinux: glibc 2.17+ x86-64

quickdna-0.5.0-cp39-cp39-macosx_10_7_x86_64.whl (236.3 kB)
Uploaded Jan 9, 2023. CPython 3.9, macOS 10.7+ x86-64

quickdna-0.5.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
Uploaded Jan 9, 2023. CPython 3.8, manylinux: glibc 2.17+ x86-64

quickdna-0.5.0-cp38-cp38-macosx_10_7_x86_64.whl (235.9 kB)
Uploaded Jan 9, 2023. CPython 3.8, macOS 10.7+ x86-64

File details

Details for the file quickdna-0.5.0-cp311-none-win_amd64.whl.

File metadata

Download URL: quickdna-0.5.0-cp311-none-win_amd64.whl
Upload date: Jan 9, 2023
Size: 115.4 kB
Tags: CPython 3.11, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/0.14.8

File hashes

Hashes for quickdna-0.5.0-cp311-none-win_amd64.whl
Algorithm	Hash digest
SHA256	`a86f736e08169511abf950bd4f20ce4738d3615f80476ab43756dc214a123635`
MD5	`f4832ebf2b96d4041eb4e4cac2950088`
BLAKE2b-256	`52720dde2c3d53ee06a1138eb8244dd388cc28c52a5ae1da4cb3e157e1a63aa8`


File details

Details for the file quickdna-0.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: quickdna-0.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: Jan 9, 2023
Size: 1.0 MB
Tags: CPython 3.10, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/0.14.8

File hashes

Hashes for quickdna-0.5.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`ac589d30e650bfff0e98f954abdc79215939ddcb94376e502f2b4ae2f4566f40`
MD5	`30942440219d83b89d8d8dfe8843e100`
BLAKE2b-256	`3a58bae7d907b09f9e49be9e05b61848ae476e5d344648a4aed385f1be342e67`


File details

Details for the file quickdna-0.5.0-cp310-cp310-macosx_10_7_x86_64.whl.

File metadata

Download URL: quickdna-0.5.0-cp310-cp310-macosx_10_7_x86_64.whl
Upload date: Jan 9, 2023
Size: 236.0 kB
Tags: CPython 3.10, macOS 10.7+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/0.14.8

File hashes

Hashes for quickdna-0.5.0-cp310-cp310-macosx_10_7_x86_64.whl
Algorithm	Hash digest
SHA256	`3ca0faa62507be6ace2db9615c01e7580a7f8bafed85741518c2764423359a63`
MD5	`2ae0ec876f88cc9235a5e185e7886665`
BLAKE2b-256	`9aadea9daa3f6f33280b59ae26ae5e0d06fa726f8cd85e37c8d2a9974ce4edf0`


File details

Details for the file quickdna-0.5.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: quickdna-0.5.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: Jan 9, 2023
Size: 1.1 MB
Tags: CPython 3.9, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/0.14.8

File hashes

Hashes for quickdna-0.5.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`eeabbdc0928817f427b3f6ec03cea470a2ed44456aa647bcfc1dd7e67c01a2c5`
MD5	`51e958498322a540646224959829b87f`
BLAKE2b-256	`9b19500a94b7aceb91b76c7c463001c4720b69b8a77a8d0ccdfdd6dc5011f0e1`


File details

Details for the file quickdna-0.5.0-cp39-cp39-macosx_10_7_x86_64.whl.

File metadata

Download URL: quickdna-0.5.0-cp39-cp39-macosx_10_7_x86_64.whl
Upload date: Jan 9, 2023
Size: 236.3 kB
Tags: CPython 3.9, macOS 10.7+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/0.14.8

File hashes

Hashes for quickdna-0.5.0-cp39-cp39-macosx_10_7_x86_64.whl
Algorithm	Hash digest
SHA256	`ba1bfdfc9be9fced13b7f726a237ae2084192bdb3de72b989754c24a3791da2a`
MD5	`1ad5a89169e5ac2a5a8d78e86038e608`
BLAKE2b-256	`24383174c1d5994da34bf2bf1d7ae3608f1ef5058c273ea03e4449387b598c0f`


File details

Details for the file quickdna-0.5.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

Download URL: quickdna-0.5.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Upload date: Jan 9, 2023
Size: 1.1 MB
Tags: CPython 3.8, manylinux: glibc 2.17+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/0.14.8

File hashes

Hashes for quickdna-0.5.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm	Hash digest
SHA256	`1cbf77fb5600038424ad9a3db9e41fa6ade37fb58348726a9a62fce7dc367290`
MD5	`addddd412860745ec798d0fb4ab5ff55`
BLAKE2b-256	`d1ce3da31fb05f2060879df45b1fc6d92a20da5b021a10cf518f7614e987fbca`


File details

Details for the file quickdna-0.5.0-cp38-cp38-macosx_10_7_x86_64.whl.

File metadata

Download URL: quickdna-0.5.0-cp38-cp38-macosx_10_7_x86_64.whl
Upload date: Jan 9, 2023
Size: 235.9 kB
Tags: CPython 3.8, macOS 10.7+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/0.14.8

File hashes

Hashes for quickdna-0.5.0-cp38-cp38-macosx_10_7_x86_64.whl
Algorithm	Hash digest
SHA256	`5d6b8d531adaf361f0d6cbd4d52f5fd209e62874785063aca5abc4634da7b050`
MD5	`f8fdd836bef637f3a513e6d2ec67f9d7`
BLAKE2b-256	`9ba6a6e9c3c6f801915ac37d859bdc0516c9abbe746fb7f2d1610bd15e5ef4a5`

