A comprehensive library for computational molecular biology
Biotite is your Swiss army knife for bioinformatics. Whether you want to identify homologous sequence regions in a protein family or find disulfide bonds in a protein structure, Biotite has the right tool for you. This package bundles popular tasks in computational molecular biology into a uniform Python library. It can handle a major part of the typical workflow for sequence and biomolecular structure data:
Searching and fetching data from biological databases
Reading and writing popular sequence/structure file formats
Analyzing and editing sequence/structure data
Visualizing sequence/structure data
Interfacing external applications for further analysis
Biotite internally stores most of the data as NumPy ndarray objects, enabling
fast C-accelerated analysis,
intuitive usability through NumPy-like indexing syntax,
extensibility through direct access of the internal NumPy arrays.
As a result, users can skip writing code for basic functionality (like file parsers) and focus on what makes their code unique, from small analysis scripts to entire bioinformatics software packages.
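To make the benefit of array-backed storage concrete, here is a conceptual sketch that uses only NumPy. It does not call Biotite and does not claim to mirror its internals: the `ALPHABET` array and the `encode`/`decode` helpers are hypothetical names introduced for illustration. It shows how a sequence held as an integer code array can be sliced, masked, and analyzed without explicit Python loops.

```python
import numpy as np

# Hypothetical four-letter alphabet for illustration (not a Biotite object)
ALPHABET = np.array(list("ACGT"))
LOOKUP = {symbol: code for code, symbol in enumerate("ACGT")}


def encode(sequence):
    """Map each symbol to its integer index in ALPHABET."""
    return np.array([LOOKUP[s] for s in sequence], dtype=np.uint8)


def decode(code):
    """Map integer codes back to a string of symbols."""
    return "".join(ALPHABET[code])


code = encode("ACGTTGAA")

# NumPy-like indexing: slices and boolean masks work on the code array
first_half = code[:4]
is_gc = (code == LOOKUP["C"]) | (code == LOOKUP["G"])

# Vectorized analysis: fraction of positions that are C or G
gc_content = is_gc.mean()

# Complement via arithmetic on the codes (A<->T is 0<->3, C<->G is 1<->2),
# then reverse the array with a slice
reverse_complement = (3 - code)[::-1]

print(decode(first_half))          # ACGT
print(gc_content)                  # 0.375
print(decode(reverse_complement))  # TTCAACGT
```

Because the data is a plain array, any NumPy function can be applied to it directly, which is the kind of extensibility the list above refers to.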
If you use Biotite in a scientific publication, please cite:
Installation
Biotite requires the following packages:
numpy
requests
msgpack
networkx
Some functions require additional packages:
mdtraj - Required for trajectory file I/O operations.
matplotlib - Required for plotting purposes.
Biotite can be installed via Conda…
$ conda install -c conda-forge biotite
… or pip
$ pip install biotite
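To see which of the packages listed above are already present in an environment, a short check like the following can help. It is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of Biotite, and the sketch assumes that each package's import name matches the name listed above.

```python
from importlib.util import find_spec

# Package names as listed in this section
REQUIRED = ["numpy", "requests", "msgpack", "networkx"]
OPTIONAL = ["mdtraj", "matplotlib"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


print("Missing required packages:", missing_packages(REQUIRED))
print("Missing optional packages:", missing_packages(OPTIONAL))
```

An empty list for the required packages suggests the core dependencies are in place; missing optional packages only matter for trajectory I/O or plotting.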
Usage
Here is a small example that downloads two protein sequences from the NCBI Entrez database and aligns them:
import biotite.sequence.align as align
import biotite.sequence.io.fasta as fasta
import biotite.database.entrez as entrez
# Download FASTA file for the sequences of avidin and streptavidin
file_name = entrez.fetch_single_file(
    uids=["CAC34569", "ACL82594"], file_name="sequences.fasta",
    db_name="protein", ret_type="fasta"
)
# Parse the downloaded FASTA file
# and create 'ProteinSequence' objects from it
fasta_file = fasta.FastaFile.read(file_name)
avidin_seq, streptavidin_seq = fasta.get_sequences(fasta_file).values()
# Align sequences using the BLOSUM62 matrix with affine gap penalty
matrix = align.SubstitutionMatrix.std_protein_matrix()
alignments = align.align_optimal(
    avidin_seq, streptavidin_seq, matrix,
    gap_penalty=(-10, -1), terminal_penalty=False
)
print(alignments[0])
MVHATSPLLLLLLLSLALVAPGLSAR------KCSLTGKWDNDLGSNMTIGAVNSKGEFTGTYTTAV-TA
-------------------DPSKESKAQAAVAEAGITGTWYNQLGSTFIVTA-NPDGSLTGTYESAVGNA
TSNEIKESPLHGTQNTINKRTQPTFGFTVNWKFS----ESTTVFTGQCFIDRNGKEV-LKTMWLLRSSVN
ESRYVLTGRYDSTPATDGSGT--ALGWTVAWKNNYRNAHSATTWSGQYV---GGAEARINTQWLLTSGTT
DIGDDWKATRVGINIFTRLRTQKE---------------------
-AANAWKSTLVGHDTFTKVKPSAASIDAAKKAGVNNGNPLDAVQQ
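The effect of `gap_penalty=(-10, -1)` and `terminal_penalty=False` in the example can be illustrated with a small pure-Python sketch. This is not Biotite's implementation. It assumes that a gap of length n costs the opening penalty once plus the extension penalty for each of the remaining n - 1 positions, and that gaps at either end of the alignment are free when terminal penalties are disabled. A toy match/mismatch score stands in for BLOSUM62, and the function name `affine_alignment_score` is hypothetical.

```python
def first_symbol(row):
    """Index of the first non-gap character in a gapped row."""
    return next(i for i, symbol in enumerate(row) if symbol != "-")


def affine_alignment_score(row_a, row_b, gap_penalty=(-10, -1),
                           terminal_penalty=False, match=1, mismatch=-1):
    """Score two equally long gapped rows of a pairwise alignment."""
    if len(row_a) != len(row_b):
        raise ValueError("Both rows must have the same length")
    gap_open, gap_ext = gap_penalty
    start, end = 0, len(row_a)

    if not terminal_penalty:
        # Skip leading and trailing columns where one row has only gaps
        start = max(first_symbol(row_a), first_symbol(row_b))
        end -= max(first_symbol(row_a[::-1]), first_symbol(row_b[::-1]))

    score = 0
    previous_gap_row = None  # row (0 or 1) that carried a gap in the last column
    for a, b in zip(row_a[start:end], row_b[start:end]):
        if a == "-" or b == "-":
            gap_row = 0 if a == "-" else 1
            # Continuing a gap in the same row is cheaper than opening one
            score += gap_ext if gap_row == previous_gap_row else gap_open
            previous_gap_row = gap_row
        else:
            score += match if a == b else mismatch
            previous_gap_row = None
    return score


# Internal gap of length 2: 5 matches + (-10) + (-1)
print(affine_alignment_score("ACG--TA", "ACGTTTA"))  # -6
# The leading gap is free when terminal penalties are disabled
print(affine_alignment_score("--ACGT", "TTACGA"))    # 2
```

Under these assumptions, the long unpenalized end gaps in the avidin/streptavidin alignment above are what one would expect when the two sequences only partially overlap.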
More documentation, including a tutorial, an example gallery and the API reference, is available at https://www.biotite-python.org/.
Contribution
Interested in improving Biotite? Have a look at the contribution guidelines. Feel free to join our community chat on Discord.
Download files
Built Distributions
Hashes for biotite-0.38.0-cp311-cp311-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 1369e8ca52e5484e9ebc57bcf5f858b84eb000711ae83b0b6d8e220e529ff423
MD5 | 43f017876108f461c422f4b011d443e7
BLAKE2b-256 | 201475cc969c9ef9dd6147f3d52ff8520f7ba3c6609421fa176163d03e304a76
Hashes for biotite-0.38.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | b48e803d7a8a30dbf2072b0683eec464abb8d792b069a9ce47acb8205ecef7a7
MD5 | 5fe595df732c871893b4ddf9a4d7f07f
BLAKE2b-256 | 48f66a7deae3325034294e2ae1bf8242777c140fc3d2d47b02aaf1ea9be37504
Hashes for biotite-0.38.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | 11cc8bdeb34618eb88e5a7f3d183fe325cb9c12ddc69cdcaa81a3e90202df6cc
MD5 | 108f7f0c72bd5a75eb60d0ce0b6b271b
BLAKE2b-256 | c3a44966684c45ca63f2636450f80af177a602392fb8a65daacfce1212393ca1
Hashes for biotite-0.38.0-cp311-cp311-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | c2df0214ade0be19cb7025e8acedb18c882cad0f9a7bb99e459e6e3884813df3
MD5 | 9ab738dcd55c58363fa1b9224ef7a473
BLAKE2b-256 | e9b11375e204abc0779b2ce61c85f1c0ecbdbd4a4ea5dc3899ab72977e3c121a
Hashes for biotite-0.38.0-cp310-cp310-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 1683277036018f0bd026c4d537e197c2bde5c6950f7b7e08782f75bc718cee2c
MD5 | d81214d44d47a89152e7791dc7d5a4bc
BLAKE2b-256 | 22b60e3caab0c0d5981e0a6485d18b234f6236746fbf2e1e92a35d416c3a7911
Hashes for biotite-0.38.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | fa0d71b48f5ff1693d73fa3e7ef7bf2ef3a986a14073327fd53413b6e534ce68
MD5 | c831b0e332a268407df899601c5e3f23
BLAKE2b-256 | bcc1c7596357123595c5896308c40391d503200d6939facfea4770c8701d9ec0
Hashes for biotite-0.38.0-cp310-cp310-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | e3669b2794b3326e21bba2f566b9eb4585c7aed335c5d303dbe07c641f6c1cdc
MD5 | 0c5c62423823e6e42b6a14a6b53a55d7
BLAKE2b-256 | 802d3800d9d4e82f1b4e1369fa740c9410e726d99b5ae122a1da5d396d827862
Hashes for biotite-0.38.0-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 1b7c11684b3918d5a05d82ec19fb800fb46951cd1b9a549e1e59b7f23ea5de51
MD5 | bfbf1d6a45856d28b1c39d667c271167
BLAKE2b-256 | 6a475280a98c8c0ffe67289a0b114e5bca0731cb7d1202ba7adcc04e54f6d99f
Hashes for biotite-0.38.0-cp39-cp39-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | b8e1e380a70d8800537a9cb9310143c0a8d2117f44947c372e30ba59d8e6e881
MD5 | 88d10c5a862706c45aae01e0bf3bbe26
BLAKE2b-256 | f31b35b1bc60fe56aa9fe447a4ef2b6934eb5437bc0bbf31110400470fec4f33
Hashes for biotite-0.38.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | bb2ca85d3abfb7d42e97e4f0edd431f2ce7a4d5841f6e95b662da4cf332dde93
MD5 | a97edb229ec094fd10152043da258a5c
BLAKE2b-256 | 63fa57ef0eb3d64eff678ddcbe1f55570575bc3edec56881aa8eabc3987c8b35
Hashes for biotite-0.38.0-cp39-cp39-macosx_11_0_arm64.whl
Algorithm | Hash digest
---|---
SHA256 | 6ad94df70d10b945073459c56855b1fb4661bb44d534b804b8264629d57c2c0d
MD5 | 030d88f80b8fa71a733b7aaf87e7235c
BLAKE2b-256 | fb1b2a5d85d61f1e3d0b1e790276387fa2920541d48a4c42112666fc473a9065
Hashes for biotite-0.38.0-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 4465797916b7adcfa16704665f93ecf4d9c72c8a14c421841944c2e6ad6c87d8
MD5 | 8d8bb85b40ca58c60844b43f29cd5244
BLAKE2b-256 | 4e0d3fd6f06e2501ca7367eeee50e7638cf2382607e410eae169008fd48d9f12