A comprehensive library for computational molecular biology
Biotite is your Swiss army knife for bioinformatics. Whether you want to identify homologous sequence regions in a protein family or find disulfide bonds in a protein structure, Biotite has the right tool for you. This package bundles popular tasks in computational molecular biology into a uniform Python library. It can handle a major part of the typical workflow for sequence and biomolecular structure data:
Searching and fetching data from biological databases
Reading and writing popular sequence/structure file formats
Analyzing and editing sequence/structure data
Visualizing sequence/structure data
Interfacing external applications for further analysis
Biotite internally stores most of the data as NumPy ndarray objects, enabling
fast C-accelerated analysis,
intuitive usability through NumPy-like indexing syntax,
extensibility through direct access to the internal NumPy arrays.
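To make the benefit of array-backed storage concrete, here is a minimal sketch in plain NumPy. It is a simplified illustration of the idea, not Biotite's actual implementation: the toy `ALPHABET` and the `encode` and `decode` helpers are hypothetical names introduced only for this example.

```python
import numpy as np

# Hypothetical toy alphabet; each symbol is stored as its index in this string
ALPHABET = "ACGT"


def encode(sequence):
    """Convert a string into a compact array of integer symbol codes."""
    lookup = {symbol: code for code, symbol in enumerate(ALPHABET)}
    return np.array([lookup[symbol] for symbol in sequence], dtype=np.uint8)


def decode(codes):
    """Convert an array of symbol codes back into a string."""
    return "".join(ALPHABET[code] for code in codes)


codes = encode("ACGTTGCA")

# NumPy-like indexing: slices and boolean masks work directly on the codes
subsequence = decode(codes[2:6])
positions = np.where(codes == ALPHABET.index("T"))[0]

# Vectorized analysis: count every symbol in one call instead of a Python loop
counts = np.bincount(codes, minlength=len(ALPHABET))

print(subsequence)
print(positions)
print(dict(zip(ALPHABET, counts.tolist())))
```

Because the data is an ordinary array, any NumPy function can be applied to it without a conversion step.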
As a result, the user can skip writing code for basic functionality (like file parsers) and can focus on what makes their code unique, from small analysis scripts to entire bioinformatics software packages.
If you use Biotite in a scientific publication, please cite:
Installation
Biotite requires the following packages:
numpy
requests
msgpack
networkx
Some functions require additional packages:
mdtraj - Required for trajectory file I/O operations.
matplotlib - Required for plotting purposes.
matplotlib - Required for plotting purposes.
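As a quick check before installing, the following sketch reports which of the packages listed above are missing from the current environment. It uses only the standard library. The helper name `missing_packages` is hypothetical, and the check only tests whether a package can be found, not whether its version is suitable.

```python
from importlib.util import find_spec

# Package names taken from the lists above
REQUIRED = ("numpy", "requests", "msgpack", "networkx")
OPTIONAL = ("mdtraj", "matplotlib")


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if find_spec(name) is None]


print("Missing required packages:", missing_packages(REQUIRED))
print("Missing optional packages:", missing_packages(OPTIONAL))
```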
Biotite can be installed via Conda…
$ conda install -c conda-forge biotite
… or pip
$ pip install biotite
Usage
Here is a small example that downloads two protein sequences from the NCBI Entrez database and aligns them:
import biotite.sequence.align as align
import biotite.sequence.io.fasta as fasta
import biotite.database.entrez as entrez

# Download FASTA file for the sequences of avidin and streptavidin
file_name = entrez.fetch_single_file(
    uids=["CAC34569", "ACL82594"], file_name="sequences.fasta",
    db_name="protein", ret_type="fasta"
)

# Parse the downloaded FASTA file
# and create 'ProteinSequence' objects from it
fasta_file = fasta.FastaFile.read(file_name)
avidin_seq, streptavidin_seq = fasta.get_sequences(fasta_file).values()

# Align sequences using the BLOSUM62 matrix with affine gap penalty
matrix = align.SubstitutionMatrix.std_protein_matrix()
alignments = align.align_optimal(
    avidin_seq, streptavidin_seq, matrix,
    gap_penalty=(-10, -1), terminal_penalty=False
)
print(alignments[0])
MVHATSPLLLLLLLSLALVAPGLSAR------KCSLTGKWDNDLGSNMTIGAVNSKGEFTGTYTTAV-TA
-------------------DPSKESKAQAAVAEAGITGTWYNQLGSTFIVTA-NPDGSLTGTYESAVGNA
TSNEIKESPLHGTQNTINKRTQPTFGFTVNWKFS----ESTTVFTGQCFIDRNGKEV-LKTMWLLRSSVN
ESRYVLTGRYDSTPATDGSGT--ALGWTVAWKNNYRNAHSATTWSGQYV---GGAEARINTQWLLTSGTT
DIGDDWKATRVGINIFTRLRTQKE---------------------
-AANAWKSTLVGHDTFTKVKPSAASIDAAKKAGVNNGNPLDAVQQ
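The `gap_penalty=(-10, -1)` argument above selects an affine gap penalty, where opening a gap costs more than extending it. The sketch below shows how such a penalty scores a single gap. It assumes the common convention that the first gap position costs the opening penalty and each further position costs the extension penalty; whether Biotite applies exactly this convention should be checked against its API reference. The function name `affine_gap_score` is hypothetical.

```python
def affine_gap_score(gap_length, gap_open=-10, gap_extend=-1):
    """Score of one contiguous gap under an affine penalty.

    Assumed convention: the first gap position costs `gap_open`,
    every additional position costs `gap_extend`.
    """
    if gap_length < 1:
        raise ValueError("gap_length must be at least 1")
    return gap_open + (gap_length - 1) * gap_extend


# One long gap is penalized less than the same number of
# gap positions spread over several separate gaps
one_gap_of_six = affine_gap_score(6)
six_gaps_of_one = 6 * affine_gap_score(1)
print(one_gap_of_six, six_gaps_of_one)
```

Under this scheme the aligner prefers a few long gaps over many scattered ones, which matches the pattern visible in the alignment above.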
More documentation, including a tutorial, an example gallery and the API reference, is available at https://www.biotite-python.org/.
Contribution
Interested in improving Biotite? Have a look at the contribution guidelines. Feel free to join our community chat on Discord.
Built Distributions
Hashes for biotite-0.28.0-cp39-cp39-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | 0f662e13075b2afe2b4e549d8c829730b30e2bce3a6256f9e921fa0a0b26e399
MD5 | eb3f6524720b225471c37a7742281a57
BLAKE2b-256 | 537ea4be5712be353988d5df31740825d70f41dfc8cb07e6baaf167426c1bf81
Hashes for biotite-0.28.0-cp39-cp39-win32.whl
Algorithm | Hash digest
---|---
SHA256 | e4a6d77bd082fc0162e3c75328edd850b6a4d4bd1cdc4999f61079a022ba3d36
MD5 | ead6342544bc656ca47f91fd821a9fb0
BLAKE2b-256 | 95f9976bf57cc8995c655dae5f94e73b63f9867c81bbd76c6ff7ad062928bfd1
Hashes for biotite-0.28.0-cp39-cp39-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 84c64cc1a23a8b656f1dd8caf3d4cc5891454fec7f2b05e007345f287a496f4b
MD5 | d1973788b58dcb07cb58518041b47dfd
BLAKE2b-256 | 496086bd3934001d16cfe54ba24cb7ebb98e236da56497436913575c125a928a
Hashes for biotite-0.28.0-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | a11915d298df9e16661227fcb5a115fe39f30e798658c7fc7564c8131b8efd0a
MD5 | f581da268b6434a4b30b30656398ebad
BLAKE2b-256 | 9d8cd399c6cc269368b47ad24fee55912f7aaa86eda11b7446aee37b0d26e4d3
Hashes for biotite-0.28.0-cp38-cp38-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | f7c55ac55ef77a7306350001172822260cbd9168eff7f569f0cbeb8f6c5c033d
MD5 | f3977094143955e9b0696905aeb9913c
BLAKE2b-256 | 11bab02dc87bdabb23b43c8ebb67ae4539794fd1c00ae25ef985ca0049e66bed
Hashes for biotite-0.28.0-cp38-cp38-win32.whl
Algorithm | Hash digest
---|---
SHA256 | fad730d93ed686e42f207519dbdb4f20c68d7765881326601f3aaa76a827db1c
MD5 | 72aaad4183a6414140e1ff2826708331
BLAKE2b-256 | c5575c35234d1de82fc1d2dc4b4e6926a1bcbdb3777ceb6b3a07f7c80c11cd9c
Hashes for biotite-0.28.0-cp38-cp38-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 18202c83bf58e4dd0e1ba7b796c8c16d63a4e52b7a83aee98588cef5e109a684
MD5 | ec4b6881ec617d518a634533224910f8
BLAKE2b-256 | 87cf8c00e614d5a4dc2ec9d2f5ac1386fe02872550fd86aa4e73f908d932b520
Hashes for biotite-0.28.0-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | 7ba648911eb2d996fd38f7843c0b46f43514e95fc6c300af3534e84411450d6b
MD5 | 4dd608a9e81ae924530ff453dfc98d9a
BLAKE2b-256 | 8c5534c8cab565eba1047a430ee6c7794af07067e812b0b3485b396887e3acff
Hashes for biotite-0.28.0-cp37-cp37m-win_amd64.whl
Algorithm | Hash digest
---|---
SHA256 | ddaebac1c02892e4bfb94733136d166120ae6d76409fd16302aed63f74faf748
MD5 | 5a12635236f055bd5c30dc3307672297
BLAKE2b-256 | 77b566c630c1b55e6889991a1cfd7f15668623a0034fb6c8fa964ca2f57f396d
Hashes for biotite-0.28.0-cp37-cp37m-win32.whl
Algorithm | Hash digest
---|---
SHA256 | ee567f8223359c018df3e73cab711becc8d11b7792877b8f55baa30ba71b645d
MD5 | 118e3ec0e109a451110a27a2bdbace07
BLAKE2b-256 | 8750785e1f119e97185e3bf9128b99654b10854ccf9a3fa6ac058c39fee723c1
Hashes for biotite-0.28.0-cp37-cp37m-manylinux1_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | ae6b90996da45536d2f6efd43b6c925e43eb8c7fec0e63d8ba768ca23b057ad9
MD5 | 66cb4617a892b11fa63ddcad2ad658ed
BLAKE2b-256 | ab856d64888f8e3f76130019efe63b281eeed524a4cdece502370484573257df
Hashes for biotite-0.28.0-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm | Hash digest
---|---
SHA256 | a3841548e5cb24722eb2b20cd811ea5806a25144ea96d804f63ee2ef1121f964
MD5 | 97ad4e0413a9e579eec4d6d4e916d677
BLAKE2b-256 | 21692a805c8ea16809d31b5bf448c4cdd0cfd21020af8752148b004230ade303
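The published digests can be used to confirm that a downloaded wheel is intact. The following is a minimal sketch using the standard library's `hashlib`. The helper names `file_digest` and `matches_published_digest` are hypothetical, and the file path in the usage comment is an assumption about where the wheel was saved.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Compute the hex digest of a file, reading it in chunks."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as file:
        for chunk in iter(lambda: file.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published_digest(path, expected_hex, algorithm="sha256"):
    """Compare a file's digest with a published hex digest."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()


# Usage (path is hypothetical; take the expected digest from the
# SHA256 row of the table that belongs to the downloaded file):
# ok = matches_published_digest(
#     "biotite-0.28.0-cp39-cp39-manylinux1_x86_64.whl",
#     "84c64cc1a23a8b656f1dd8caf3d4cc5891454fec7f2b05e007345f287a496f4b",
# )
```

Passing `algorithm="md5"` checks against the MD5 row instead. SHA256 is the better choice when both are available.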