A comprehensive library for computational molecular biology
Biotite is your Swiss army knife for bioinformatics. Whether you want to identify homologous sequence regions in a protein family or find disulfide bonds in a protein structure, Biotite has the right tool for you. This package bundles popular tasks in computational molecular biology into a uniform Python library. It can handle a major part of the typical workflow for sequence and biomolecular structure data:
Searching and fetching data from biological databases
Reading and writing popular sequence/structure file formats
Analyzing and editing sequence/structure data
Visualizing sequence/structure data
Interfacing external applications for further analysis
Biotite internally stores most of the data as NumPy ndarray objects, enabling
fast C-accelerated analysis,
intuitive usability through NumPy-like indexing syntax,
extensibility through direct access of the internal NumPy arrays.
As a result, users can skip writing code for basic functionality (like file parsers) and focus on what makes their code unique - from small analysis scripts to entire bioinformatics software packages.
If you use Biotite in a scientific publication, please cite:
Installation
Biotite requires the following packages:
numpy
requests
msgpack
Some functions require additional packages:
mdtraj - Required for trajectory file I/O operations.
matplotlib - Required for plotting purposes.
Biotite can be installed via Conda…
$ conda install -c conda-forge biotite
… or pip
$ pip install biotite
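After installation, it can be useful to check which of the packages listed above are importable in the current environment. The following is a minimal sketch using only the Python standard library. The helper name check_packages is hypothetical and not part of Biotite.

```python
from importlib.util import find_spec

# Package names taken from the requirement lists above
REQUIRED = ["numpy", "requests", "msgpack"]
OPTIONAL = ["mdtraj", "matplotlib"]


def check_packages(names):
    """Map each top-level package name to True if it can be imported."""
    return {name: find_spec(name) is not None for name in names}


for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
    for name, available in check_packages(names).items():
        status = "found" if available else "missing"
        print(f"{label:>8}  {name:<12} {status}")
```

A missing optional package only affects the functions that depend on it, such as trajectory file I/O or plotting.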
Usage
Here is a small example that downloads two protein sequences from the NCBI Entrez database and aligns them:
import biotite.sequence.align as align
import biotite.sequence.io.fasta as fasta
import biotite.database.entrez as entrez

# Download FASTA file for the sequences of avidin and streptavidin
file_name = entrez.fetch_single_file(
    uids=["CAC34569", "ACL82594"], file_name="sequences.fasta",
    db_name="protein", ret_type="fasta"
)

# Parse the downloaded FASTA file
# and create 'ProteinSequence' objects from it
fasta_file = fasta.FastaFile.read(file_name)
avidin_seq, streptavidin_seq = fasta.get_sequences(fasta_file).values()

# Align sequences using the BLOSUM62 matrix with affine gap penalty
matrix = align.SubstitutionMatrix.std_protein_matrix()
alignments = align.align_optimal(
    avidin_seq, streptavidin_seq, matrix,
    gap_penalty=(-10, -1), terminal_penalty=False
)
print(alignments[0])
MVHATSPLLLLLLLSLALVAPGLSAR------KCSLTGKWDNDLGSNMTIGAVNSKGEFTGTYTTAV-TA
-------------------DPSKESKAQAAVAEAGITGTWYNQLGSTFIVTA-NPDGSLTGTYESAVGNA
TSNEIKESPLHGTQNTINKRTQPTFGFTVNWKFS----ESTTVFTGQCFIDRNGKEV-LKTMWLLRSSVN
ESRYVLTGRYDSTPATDGSGT--ALGWTVAWKNNYRNAHSATTWSGQYV---GGAEARINTQWLLTSGTT
DIGDDWKATRVGINIFTRLRTQKE---------------------
-AANAWKSTLVGHDTFTKVKPSAASIDAAKKAGVNNGNPLDAVQQ
More documentation, including a tutorial, an example gallery and the API reference, is available at https://www.biotite-python.org/.
Contribution
Interested in improving Biotite? Have a look at the contribution guidelines. Feel free to join our community chat on Discord.