A comprehensive library for computational molecular biology
Project description
Biotite is your Swiss army knife for bioinformatics. Whether you want to identify homologous sequence regions in a protein family or you would like to find disulfide bonds in a protein structure: Biotite has the right tool for you. This package bundles popular tasks in computational molecular biology into a uniform Python library. It can handle a major part of the typical workflow for sequence and biomolecular structure data:
Searching and fetching data from biological databases
Reading and writing popular sequence/structure file formats
Analyzing and editing sequence/structure data
Visualizing sequence/structure data
Interfacing external applications for further analysis
Biotite internally stores most of the data as NumPy ndarray objects, enabling
fast C-accelerated analysis,
intuitive usability through NumPy-like indexing syntax,
extensibility through direct access of the internal NumPy arrays.
As a result, users can skip writing code for basic functionality (like file parsers) and focus on what makes their code unique, from small analysis scripts to entire bioinformatics software packages.
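The idea behind the array-based storage can be illustrated with plain NumPy. The following is a minimal sketch, not Biotite's own code: the `ALPHABET` array and the `encode`/`decode` helpers are hypothetical names introduced only for this illustration. It shows how a sequence stored as integer symbol codes supports slicing, boolean masks and vectorized counting.

```python
import numpy as np

# Hypothetical alphabet for this sketch; each symbol maps to its index.
ALPHABET = np.array(list("ACGT"))


def encode(sequence):
    """Return the sequence as an array of integer symbol codes."""
    lookup = {symbol: code for code, symbol in enumerate(ALPHABET)}
    return np.array([lookup[s] for s in sequence], dtype=np.uint8)


def decode(codes):
    """Turn an array of symbol codes back into a string."""
    return "".join(ALPHABET[codes])


codes = encode("ACGTTAGC")

# NumPy-like indexing: slices and boolean masks work on the code array.
subsequence = decode(codes[2:5])
without_t = decode(codes[codes != 3])

# Vectorized analysis: count every symbol of the alphabet in one call.
counts = np.bincount(codes, minlength=len(ALPHABET))
```

Because the data is an ordinary array, any NumPy function can be applied to it directly, which is the extensibility point made in the list above.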
If you use Biotite in a scientific publication, please cite:
Installation
Biotite requires the following packages:
numpy
requests
msgpack
Some functions require additional packages:
mdtraj - Required for trajectory file I/O operations.
matplotlib - Required for plotting purposes.
Biotite can be installed via Conda…
$ conda install -c conda-forge biotite
… or pip
$ pip install biotite
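Whether the optional packages listed above are present can be checked from Python before calling functions that need them. This is a small sketch using only the standard library; the helper name `missing_packages` is an assumption made for this example and is not part of Biotite.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names that cannot be found as importable top-level modules."""
    return [name for name in names if find_spec(name) is None]


# Optional dependencies named in the installation notes
optional = ["mdtraj", "matplotlib"]
for name in missing_packages(optional):
    print(f"Optional package '{name}' is not installed")
```

The check only looks up the module and does not import it, so it stays fast even for large packages.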
Usage
Here is a small example that downloads two protein sequences from the NCBI Entrez database and aligns them:
import biotite.sequence.align as align
import biotite.sequence.io.fasta as fasta
import biotite.database.entrez as entrez
# Download FASTA file for the sequences of avidin and streptavidin
file_name = entrez.fetch_single_file(
    uids=["CAC34569", "ACL82594"], file_name="sequences.fasta",
    db_name="protein", ret_type="fasta"
)
# Parse the downloaded FASTA file
# and create 'ProteinSequence' objects from it
fasta_file = fasta.FastaFile.read(file_name)
avidin_seq, streptavidin_seq = fasta.get_sequences(fasta_file).values()
# Align sequences using the BLOSUM62 matrix with affine gap penalty
matrix = align.SubstitutionMatrix.std_protein_matrix()
alignments = align.align_optimal(
    avidin_seq, streptavidin_seq, matrix,
    gap_penalty=(-10, -1), terminal_penalty=False
)
print(alignments[0])
MVHATSPLLLLLLLSLALVAPGLSAR------KCSLTGKWDNDLGSNMTIGAVNSKGEFTGTYTTAV-TA
-------------------DPSKESKAQAAVAEAGITGTWYNQLGSTFIVTA-NPDGSLTGTYESAVGNA
TSNEIKESPLHGTQNTINKRTQPTFGFTVNWKFS----ESTTVFTGQCFIDRNGKEV-LKTMWLLRSSVN
ESRYVLTGRYDSTPATDGSGT--ALGWTVAWKNNYRNAHSATTWSGQYV---GGAEARINTQWLLTSGTT
DIGDDWKATRVGINIFTRLRTQKE---------------------
-AANAWKSTLVGHDTFTKVKPSAASIDAAKKAGVNNGNPLDAVQQ
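To make the printed alignment easier to interpret, the fraction of identical residues can be computed from two gapped rows. The following is a plain-Python sketch for illustration, not the method Biotite uses internally; the function name `sequence_identity` and the choice to ignore every column containing a gap are assumptions of this example. The two rows are a short excerpt of the first block of the output above.

```python
def sequence_identity(row_a, row_b, gap="-"):
    """Fraction of identical symbols among the columns without a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("Alignment rows must have equal length")
    # Keep only columns where both sequences have a residue.
    aligned = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    matches = sum(a == b for a, b in aligned)
    return matches / len(aligned)


# Excerpt of the avidin (top) and streptavidin (bottom) rows
row_a = "GAVNSKGEFTGTY"
row_b = "TA-NPDGSLTGTY"
print(f"Identity: {sequence_identity(row_a, row_b):.2f}")
```

Other definitions of identity divide by the full alignment length or by the length of the shorter sequence, so the value depends on the convention chosen.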
More documentation, including a tutorial, an example gallery and the API reference is available at https://www.biotite-python.org/.
Contribution
Interested in improving Biotite? Have a look at the contribution guidelines. Feel free to join our community chat on Discord.
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Built Distributions
Hashes for biotite-0.26.0-cp39-cp39-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 235b6e5fbd495013b50acaf98cd6483d2c137ff5c079003b7fc39e859ee8b24d
MD5 | 593ebcc60e1f1f4f089d87fc8efa011e
BLAKE2b-256 | 4b01b37a26150f8ed4d6c28afcab6a15786dc70e441a5b10370c7d53f6db42b5
Hashes for biotite-0.26.0-cp39-cp39-win32.whl
Algorithm | Hash digest
---|---
SHA256 | 53af5b4e81ec48438e5fc92e50e4a9cece63330409eb3ea2436e3b76c9e69906
MD5 | d0ae0fe4f927cae6dbba815fce8c7ebe
BLAKE2b-256 | 6ca3e7cf1040910008fc468f2499db6223dfd48e28f414f97f4b35c6e3cf01f2
Hashes for biotite-0.26.0-cp39-cp39-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | a3a027127e983e1bfa24f9ae945e58117a5d95c47b9a3f9bf110442e54169afc
MD5 | bcd70ce6dfb657a4b5510afb746ac55b
BLAKE2b-256 | 5579ca57aeb36df823b688441e8272a7868124517fae903dda563135676fc22e
Hashes for biotite-0.26.0-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 9b68f486bd718148f58ad4a5b65da5d1d4285c71338a5d558b5edd5b15ada806
MD5 | 1a19c13a1e18cdb3ae09630561b1fa28
BLAKE2b-256 | 274f9847a4fe3ac880772767ddbbb00f553dda497c9be397f29d7836ecc69c92
Hashes for biotite-0.26.0-cp38-cp38-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 8820d22174f739b60712f8b463456d761722c3d06ad2c46dfbd8fcf3f9c42a7b
MD5 | ca90badad22a8d5bdfddd097eef40480
BLAKE2b-256 | 2f88d5d6e95a0f483f7fe9e08cab4dfb456885aba16be64c6e592101696bfb3f
Hashes for biotite-0.26.0-cp38-cp38-win32.whl
Algorithm | Hash digest
---|---
SHA256 | 85a6da67400f2dd1c9ecef28dc453d85acb14b2dabf7e716b9de631b4363bcb5
MD5 | 2bdccdb28090265407a2b9048c3e0ff9
BLAKE2b-256 | ca6926e27c43c3b9ddaec8ee7b8a20db9059d777c55965d4c19bf88f56051a2a
Hashes for biotite-0.26.0-cp38-cp38-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 56305879fd730bfa847e74b8edd095ba07d321b61ad06768490ad1042ccac6e2
MD5 | ecda3c7b7069a6bbf9fdc5f624b80e4a
BLAKE2b-256 | 02d87a29cb7da84fd4a01619cf576b3f3e3a5af41a33b873bfbae26c8859f491
Hashes for biotite-0.26.0-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | fd82423bbbd38cf7e41120cc9dea1a6dc2d620f0e66f93f5b1ce3470c2ee68f0
MD5 | 2ef4f47b7930030630587c69ebac1fec
BLAKE2b-256 | fdae30fb69551d4f93020870510148b5e864fc115552059af39cc23b643c7d77
Hashes for biotite-0.26.0-cp37-cp37m-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 5af440ba0984b5da981a7930510de6727df5950eeea0918442a0f3a7f394c9e9
MD5 | 16247299deed534ee2aee59c657e121a
BLAKE2b-256 | 2dbc6fb6f65ca8126760d25331199fb263342c1f0c17437c7ba7daad9b03e224
Hashes for biotite-0.26.0-cp37-cp37m-win32.whl
Algorithm | Hash digest
---|---
SHA256 | c6f2a04c6a30bf82612645e0556b80f7808fe6339026f5f4996c912c59eaae63
MD5 | 7f237c187d04f089a1e6fec6ef4e980c
BLAKE2b-256 | 3bff44cfef6506eb1bedc7fa278ecc42ced04e7988e0338c35f8e6ae360deca3
Hashes for biotite-0.26.0-cp37-cp37m-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 39cd3c4450cfbb617be40f8bbd033a0ed73177ea622b168e0eb11b9970d70090
MD5 | 5e9c295c25743a78f0d1206ba7e3fb56
BLAKE2b-256 | 961a753abaf589edb6c02bc22cf603a0207967248bf97fcdda1ea4870bbab857
Hashes for biotite-0.26.0-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 45d2d79e4096284e42a5dd619be8edb87954fa651700e86ac24716072495d784
MD5 | 6b3ffd5c8c7aeff19900396b202b0003
BLAKE2b-256 | 6591fde9e415f8eaa754882c28a6ff1a6a2d0ba16c7b7951b5c2a324932d0f43
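A downloaded wheel can be checked against the SHA256 digests listed above before it is installed. The sketch below uses only the Python standard library; the helper names `sha256_of_file` and `matches_digest` are introduced for this example, and the local file path is an assumption about where the wheel was saved.

```python
import hashlib
import hmac
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected value."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())


# Assumed local path of a downloaded wheel; adjust to your download location.
wheel = "biotite-0.26.0-cp39-cp39-manylinux1_x86_64.whl"
expected = "a3a027127e983e1bfa24f9ae945e58117a5d95c47b9a3f9bf110442e54169afc"
if os.path.exists(wheel):
    print("OK" if matches_digest(wheel, expected) else "MISMATCH")
```

Reading the file in chunks keeps memory use constant regardless of the wheel's size.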