A comprehensive library for computational molecular biology
Biotite project
Biotite is your Swiss army knife for bioinformatics. Whether you want to identify homologous sequence regions in a protein family or find disulfide bonds in a protein structure, Biotite has the right tool for you. This package bundles popular tasks in computational molecular biology into a uniform Python library. It can handle a major part of the typical workflow for sequence and biomolecular structure data:
Searching and fetching data from biological databases
Reading and writing popular sequence/structure file formats
Analyzing and editing sequence/structure data
Visualizing sequence/structure data
Interfacing external applications for further analysis
Biotite internally stores most of the data as NumPy ndarray objects, enabling
fast C-accelerated analysis,
intuitive usability through NumPy-like indexing syntax,
extensibility through direct access of the internal NumPy arrays.
As a result, the user can skip writing code for basic functionality (like file parsers) and can focus on what makes their code unique, from small analysis scripts to entire bioinformatics software packages.
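The idea of storing sequence data as an array of integer codes with array-like indexing can be sketched in a few lines. This is a toy illustration only, not Biotite's implementation: the class `EncodedSequence` and its attributes are hypothetical names, and the standard-library `array` module stands in for a NumPy `ndarray` so that the sketch runs without any dependencies.

```python
from array import array


class EncodedSequence:
    """Hypothetical toy class: stores symbols as small integer codes."""

    def __init__(self, symbols, alphabet):
        self.alphabet = list(alphabet)
        # Map each symbol to its position in the alphabet
        lookup = {symbol: code for code, symbol in enumerate(self.alphabet)}
        # Unsigned bytes stand in for the NumPy array used by a real library
        self.code = array("B", (lookup[symbol] for symbol in symbols))

    def __len__(self):
        return len(self.code)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing returns a new sequence that shares the alphabet
            sub = EncodedSequence("", self.alphabet)
            sub.code = self.code[index]
            return sub
        # Integer indexing decodes a single symbol
        return self.alphabet[self.code[index]]

    def __str__(self):
        return "".join(self.alphabet[code] for code in self.code)


dna = EncodedSequence("ACGTTG", "ACGT")
print(list(dna.code))  # the integer codes behind the sequence
print(dna[1:4])        # slicing works like on a list or array
```

Because the codes are exposed directly, custom analyses can operate on the integers instead of on strings, which is the kind of extensibility described above.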
If you use Biotite in a scientific publication, please cite:
Installation
Biotite requires the following packages:
numpy
requests
msgpack
Some functions require additional packages:
mdtraj - Required for trajectory file I/O operations.
matplotlib - Required for plotting purposes.
Biotite can be installed via Conda…
$ conda install -c conda-forge biotite
… or pip
$ pip install biotite
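After installation, it can be useful to check which of the packages listed above are present in the current environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is an assumption made for this example and is not part of Biotite.

```python
import importlib.util

# Package names taken from the requirement lists above
REQUIRED = ["numpy", "requests", "msgpack"]
OPTIONAL = ["mdtraj", "matplotlib"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing required packages:", missing_packages(REQUIRED))
print("Missing optional packages:", missing_packages(OPTIONAL))
```

An empty list for the required packages means the core functionality should be importable; missing optional packages only affect trajectory file I/O and plotting.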
Usage
Here is a small example that downloads two protein sequences from the NCBI Entrez database and aligns them:
import biotite.sequence.align as align
import biotite.sequence.io.fasta as fasta
import biotite.database.entrez as entrez
# Download FASTA file for the sequences of avidin and streptavidin
file_name = entrez.fetch_single_file(
    uids=["CAC34569", "ACL82594"], file_name="sequences.fasta",
    db_name="protein", ret_type="fasta"
)
# Parse the downloaded FASTA file
# and create 'ProteinSequence' objects from it
fasta_file = fasta.FastaFile.read(file_name)
avidin_seq, streptavidin_seq = fasta.get_sequences(fasta_file).values()
# Align sequences using the BLOSUM62 matrix with affine gap penalty
matrix = align.SubstitutionMatrix.std_protein_matrix()
alignments = align.align_optimal(
    avidin_seq, streptavidin_seq, matrix,
    gap_penalty=(-10, -1), terminal_penalty=False
)
print(alignments[0])
MVHATSPLLLLLLLSLALVAPGLSAR------KCSLTGKWDNDLGSNMTIGAVNSKGEFTGTYTTAV-TA
-------------------DPSKESKAQAAVAEAGITGTWYNQLGSTFIVTA-NPDGSLTGTYESAVGNA
TSNEIKESPLHGTQNTINKRTQPTFGFTVNWKFS----ESTTVFTGQCFIDRNGKEV-LKTMWLLRSSVN
ESRYVLTGRYDSTPATDGSGT--ALGWTVAWKNNYRNAHSATTWSGQYV---GGAEARINTQWLLTSGTT
DIGDDWKATRVGINIFTRLRTQKE---------------------
-AANAWKSTLVGHDTFTKVKPSAASIDAAKKAGVNNGNPLDAVQQ
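To give an intuition for what an optimal pairwise alignment computes, here is a deliberately simplified dynamic-programming sketch. It is not Biotite's implementation and differs from the call above in three ways: it uses a plain match/mismatch score instead of BLOSUM62, a linear instead of an affine gap penalty, and it penalizes terminal gaps. The function name `needleman_wunsch` and its parameters are chosen for this example only.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two strings; return (score, aligned_a, aligned_b)."""

    def substitution(x, y):
        return match if x == y else mismatch

    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(a[i - 1], b[j - 1]),
                score[i - 1][j] + gap,  # gap in b
                score[i][j - 1] + gap,  # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    aligned_a, aligned_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j]
                == score[i - 1][j - 1] + substitution(a[i - 1], b[j - 1])):
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


total, top, bottom = needleman_wunsch("ACGT", "AGT")
print(total)
print(top)
print(bottom)
```

A substitution matrix such as BLOSUM62 would replace the `substitution` helper with a table lookup, and an affine gap penalty needs additional score tables to distinguish gap opening from gap extension.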
More documentation, including a tutorial, an example gallery and the API reference, is available at https://www.biotite-python.org/.
Contribution
Interested in improving Biotite? Have a look at the contribution guidelines. Feel free to join our community chat on Discord.