A comprehensive library for computational molecular biology
Biotite project
Biotite is your Swiss army knife for bioinformatics. Whether you want to identify homologous sequence regions in a protein family or find disulfide bonds in a protein structure, Biotite has the right tool for you. This package bundles popular tasks in computational molecular biology into a uniform Python library. It can handle a major part of the typical workflow for sequence and biomolecular structure data:
Searching and fetching data from biological databases
Reading and writing popular sequence/structure file formats
Analyzing and editing sequence/structure data
Visualizing sequence/structure data
Interfacing external applications for further analysis
Biotite internally stores most of the data as NumPy ndarray objects, enabling
fast C-accelerated analysis,
intuitive usability through NumPy-like indexing syntax,
extensibility through direct access to the internal NumPy arrays.
As a result, users can skip writing code for basic functionality (like file parsers) and focus on what makes their code unique, from small analysis scripts to entire bioinformatics software packages.
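The benefits of array-backed storage listed above can be illustrated with plain NumPy. The following is a minimal sketch, not Biotite's actual implementation: the names `ALPHABET`, `encode` and `decode` are hypothetical and chosen only for this illustration. It encodes a nucleotide sequence as an array of integer codes, so that slicing and vectorized operations work directly on the data.

```python
import numpy as np

# Hypothetical alphabet for illustration only;
# this is not Biotite's internal representation
ALPHABET = "ACGT"
SYMBOLS = np.array(list(ALPHABET))


def encode(sequence):
    """Map each symbol to its index in ALPHABET."""
    return np.array([ALPHABET.index(s) for s in sequence], dtype=np.uint8)


def decode(codes):
    """Map integer codes back to a string of symbols."""
    return "".join(SYMBOLS[codes])


codes = encode("ACGTTGCA")

# NumPy-like indexing: a slice of the code array is a subsequence
sub = codes[2:6]

# Vectorized analysis: fraction of G/C symbols (codes 1 and 2)
gc_fraction = np.isin(codes, [1, 2]).mean()

# With the order A, C, G, T the complement of code c is 3 - c,
# so the reverse complement is a subtraction followed by a reversal
rev_comp = (3 - codes)[::-1]

print(decode(sub), gc_fraction, decode(rev_comp))
```

Because the sequence is just an array of codes, any NumPy function can be applied to it without writing loops in Python.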
If you use Biotite in a scientific publication, please cite:
Installation
Biotite requires the following packages:
numpy
requests
msgpack
networkx
Some functions require additional packages:
mdtraj - Required for trajectory file I/O operations.
matplotlib - Required for plotting purposes.
Biotite can be installed via Conda…
$ conda install -c conda-forge biotite
… or pip
$ pip install biotite
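To check which of the packages listed above are importable in the current environment, a small script along these lines may help. It is a sketch that uses only the standard library. The helper name `find_missing` is hypothetical, and the script assumes that each package's import name matches the name given in the lists above.

```python
from importlib.util import find_spec

# Package names as listed in this section
REQUIRED = ("numpy", "requests", "msgpack", "networkx")
OPTIONAL = ("mdtraj", "matplotlib")


def find_missing(packages):
    """Return the names in `packages` that cannot be found for import."""
    return [name for name in packages if find_spec(name) is None]


missing_required = find_missing(REQUIRED)
missing_optional = find_missing(OPTIONAL)

if missing_required:
    print("Missing required packages:", ", ".join(missing_required))
if missing_optional:
    print("Missing optional packages:", ", ".join(missing_optional))
if not missing_required and not missing_optional:
    print("All listed packages are available.")
```

A missing optional package only affects the functions that depend on it, such as trajectory file I/O or plotting.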
Usage
Here is a small example that downloads two protein sequences from the NCBI Entrez database and aligns them:
import biotite.sequence.align as align
import biotite.sequence.io.fasta as fasta
import biotite.database.entrez as entrez

# Download FASTA file for the sequences of avidin and streptavidin
file_name = entrez.fetch_single_file(
    uids=["CAC34569", "ACL82594"], file_name="sequences.fasta",
    db_name="protein", ret_type="fasta"
)

# Parse the downloaded FASTA file
# and create 'ProteinSequence' objects from it
fasta_file = fasta.FastaFile.read(file_name)
avidin_seq, streptavidin_seq = fasta.get_sequences(fasta_file).values()

# Align sequences using the BLOSUM62 matrix with affine gap penalty
matrix = align.SubstitutionMatrix.std_protein_matrix()
alignments = align.align_optimal(
    avidin_seq, streptavidin_seq, matrix,
    gap_penalty=(-10, -1), terminal_penalty=False
)
print(alignments[0])
MVHATSPLLLLLLLSLALVAPGLSAR------KCSLTGKWDNDLGSNMTIGAVNSKGEFTGTYTTAV-TA
-------------------DPSKESKAQAAVAEAGITGTWYNQLGSTFIVTA-NPDGSLTGTYESAVGNA
TSNEIKESPLHGTQNTINKRTQPTFGFTVNWKFS----ESTTVFTGQCFIDRNGKEV-LKTMWLLRSSVN
ESRYVLTGRYDSTPATDGSGT--ALGWTVAWKNNYRNAHSATTWSGQYV---GGAEARINTQWLLTSGTT
DIGDDWKATRVGINIFTRLRTQKE---------------------
-AANAWKSTLVGHDTFTKVKPSAASIDAAKKAGVNNGNPLDAVQQ
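A printed alignment like the one above is often summarized by its sequence identity. The sketch below shows one way to compute it from two gapped rows of equal length, using plain Python. The function name `sequence_identity` is hypothetical and not part of Biotite. The definition used here (identical columns divided by columns without gaps) is only one of several in common use.

```python
def sequence_identity(row_a, row_b, gap="-"):
    """
    Fraction of identical symbols among the alignment columns
    in which neither row contains a gap.
    """
    if len(row_a) != len(row_b):
        raise ValueError("Alignment rows must have the same length")
    # Keep only columns where both sequences have a symbol
    aligned = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    matches = sum(a == b for a, b in aligned)
    return matches / len(aligned)


# Small made-up rows for illustration: 5 gap-free columns, 4 identical
identity = sequence_identity("MKT-AYI", "MRTWA-I")
print(identity)
```

The rows of each block in the output above can be concatenated per sequence and passed to this function in the same way.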
More documentation, including a tutorial, an example gallery and the API reference, is available at https://www.biotite-python.org/.
Contribution
Interested in improving Biotite? Have a look at the contribution guidelines. Feel free to join our community chat on Discord.