A comprehensive library for computational molecular biology
Biotite project
Biotite is your Swiss army knife for bioinformatics. Whether you want to identify homologous sequence regions in a protein family or find disulfide bonds in a protein structure, Biotite has the right tool for you. This package bundles popular tasks in computational molecular biology into a uniform Python library. It can handle a major part of the typical workflow for sequence and biomolecular structure data:
Searching and fetching data from biological databases
Reading and writing popular sequence/structure file formats
Analyzing and editing sequence/structure data
Visualizing sequence/structure data
Interfacing external applications for further analysis
Biotite internally stores most of the data as NumPy ndarray objects, enabling
fast C-accelerated analysis,
intuitive usability through NumPy-like indexing syntax,
extensibility through direct access to the internal NumPy arrays.
As a result, the user can skip writing code for basic functionality (like file parsers) and can focus on what makes their code unique - from small analysis scripts to entire bioinformatics software packages.
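The idea of storing sequence data as a NumPy array can be illustrated with a minimal sketch. It does not use Biotite's own classes: the alphabet and the helper functions `encode` and `decode` are hypothetical names chosen for this illustration, and Biotite's internal representation may differ in detail.

```python
import numpy as np

# Hypothetical alphabet for this sketch (not taken from Biotite)
ALPHABET = "ACGT"
SYMBOLS = np.array(list(ALPHABET))


def encode(sequence):
    """Map each symbol to its integer index in ALPHABET."""
    lookup = {symbol: code for code, symbol in enumerate(ALPHABET)}
    return np.array([lookup[symbol] for symbol in sequence], dtype=np.uint8)


def decode(code):
    """Map an array of integer codes back to a string."""
    return "".join(SYMBOLS[code])


code = encode("ACGTTGCA")

# NumPy-like indexing: slices and boolean masks work directly on the codes
first_half = decode(code[:4])
without_t = decode(code[code != ALPHABET.index("T")])

# Vectorized analysis: count each symbol without a Python loop
counts = np.bincount(code, minlength=len(ALPHABET))

print(first_half)  # ACGT
print(without_t)   # ACGGCA
print(counts)      # [2 2 2 2]
```

Because the data is a plain integer array, any NumPy function can operate on it directly, which is the kind of extensibility described above.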
If you use Biotite in a scientific publication, please cite:
Installation
Biotite requires the following packages:
numpy
requests
msgpack
Some functions require additional packages:
mdtraj - Required for trajectory file I/O operations.
matplotlib - Required for plotting purposes.
Biotite can be installed via Conda…
$ conda install -c conda-forge biotite
… or pip
$ pip install biotite
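To check which of the packages listed above can be imported in the current environment, a short sketch like the following may help. It relies only on the standard library; the helper name `missing_packages` is a hypothetical name chosen for this illustration and is not part of Biotite.

```python
from importlib.util import find_spec

# Package names as listed in this section
REQUIRED = ("numpy", "requests", "msgpack")
OPTIONAL = ("mdtraj", "matplotlib")


def missing_packages(names):
    """Return the names that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


print("Missing required packages:", missing_packages(REQUIRED))
print("Missing optional packages:", missing_packages(OPTIONAL))
```

An empty list for the required packages suggests that the dependencies are in place. A missing optional package only affects the functions that need it.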
Usage
Here is a small example that downloads two protein sequences from the NCBI Entrez database and aligns them:
import biotite.sequence.align as align
import biotite.sequence.io.fasta as fasta
import biotite.database.entrez as entrez
# Download FASTA file for the sequences of avidin and streptavidin
file_name = entrez.fetch_single_file(
    uids=["CAC34569", "ACL82594"], file_name="sequences.fasta",
    db_name="protein", ret_type="fasta"
)
# Parse the downloaded FASTA file
# and create 'ProteinSequence' objects from it
fasta_file = fasta.FastaFile.read(file_name)
avidin_seq, streptavidin_seq = fasta.get_sequences(fasta_file).values()
# Align sequences using the BLOSUM62 matrix with affine gap penalty
matrix = align.SubstitutionMatrix.std_protein_matrix()
alignments = align.align_optimal(
    avidin_seq, streptavidin_seq, matrix,
    gap_penalty=(-10, -1), terminal_penalty=False
)
print(alignments[0])
MVHATSPLLLLLLLSLALVAPGLSAR------KCSLTGKWDNDLGSNMTIGAVNSKGEFTGTYTTAV-TA
-------------------DPSKESKAQAAVAEAGITGTWYNQLGSTFIVTA-NPDGSLTGTYESAVGNA
TSNEIKESPLHGTQNTINKRTQPTFGFTVNWKFS----ESTTVFTGQCFIDRNGKEV-LKTMWLLRSSVN
ESRYVLTGRYDSTPATDGSGT--ALGWTVAWKNNYRNAHSATTWSGQYV---GGAEARINTQWLLTSGTT
DIGDDWKATRVGINIFTRLRTQKE---------------------
-AANAWKSTLVGHDTFTKVKPSAASIDAAKKAGVNNGNPLDAVQQ
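The example above passes `gap_penalty=(-10, -1)` and `terminal_penalty=False`. How such an affine gap penalty adds up can be sketched in plain Python. This is an illustration under an assumed convention (the first position of a gap costs the opening penalty, each further position costs the extension penalty) and is not Biotite's implementation; the function name `affine_gap_score` is hypothetical.

```python
import re


def affine_gap_score(gapped_row, gap_open=-10, gap_ext=-1,
                     terminal_penalty=False):
    """
    Sum the gap penalties for one row of a pairwise alignment,
    where '-' marks a gap position.

    Assumed convention: the first position of a gap costs `gap_open`,
    every further position of the same gap costs `gap_ext`.
    """
    score = 0
    for run in re.finditer(r"-+", gapped_row):
        is_terminal = run.start() == 0 or run.end() == len(gapped_row)
        if is_terminal and not terminal_penalty:
            # Gaps at either end of the row are not penalized
            continue
        gap_length = run.end() - run.start()
        score += gap_open + (gap_length - 1) * gap_ext
    return score


print(affine_gap_score("AC---GT"))                        # -12
print(affine_gap_score("--ACGT"))                         # 0
print(affine_gap_score("--ACGT", terminal_penalty=True))  # -11
```

Under this convention one long gap is cheaper than several short ones of the same total length, which tends to favor a few contiguous gaps. Leaving terminal gaps unpenalized is a reasonable choice when the sequences differ in length at their ends, as avidin and streptavidin do in the output above.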
More documentation, including a tutorial, an example gallery and the API reference is available at https://www.biotite-python.org/.
Contribution
Interested in improving Biotite? Have a look at the contribution guidelines. Feel free to join our community chat on Discord.