A comprehensive library for computational molecular biology
Biotite project
Biotite is your Swiss army knife for bioinformatics. Whether you want to identify homologous sequence regions in a protein family or find disulfide bonds in a protein structure, Biotite has the right tool for you. This package bundles popular tasks in computational molecular biology into a uniform Python library. It can handle a major part of the typical workflow for sequence and biomolecular structure data:
Searching and fetching data from biological databases
Reading and writing popular sequence/structure file formats
Analyzing and editing sequence/structure data
Visualizing sequence/structure data
Interfacing external applications for further analysis
Biotite internally stores most of the data as NumPy ndarray objects, enabling
fast C-accelerated analysis,
intuitive usability through NumPy-like indexing syntax,
extensibility through direct access of the internal NumPy arrays.
As a result, users can skip writing code for basic functionality (like file parsers) and focus on what makes their code unique - from small analysis scripts to entire bioinformatics software packages.
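The benefits of ndarray-based storage listed above can be illustrated with plain NumPy. The following is only a minimal sketch of the general idea (symbols stored as integer codes), not Biotite's actual implementation; the names `SYMBOLS`, `ALPHABET`, `encode` and `decode` are hypothetical and chosen for this example.

```python
import numpy as np

# Hypothetical four-letter alphabet for this sketch;
# Biotite defines its own alphabets internally.
SYMBOLS = "ACGT"
ALPHABET = np.array(list(SYMBOLS))

def encode(sequence):
    """Map each symbol to its integer index in SYMBOLS."""
    lookup = {symbol: code for code, symbol in enumerate(SYMBOLS)}
    return np.array([lookup[s] for s in sequence], dtype=np.uint8)

def decode(codes):
    """Map integer codes back to a string of symbols."""
    return "".join(ALPHABET[codes])

codes = encode("ACGTTAGC")

# NumPy-like indexing: slicing the code array slices the sequence
first_four = decode(codes[:4])

# Vectorized analysis: count all symbols in a single call
counts = np.bincount(codes, minlength=len(SYMBOLS))

# Direct access to the array: fraction of C (code 1) and G (code 2)
gc_content = np.isin(codes, [1, 2]).mean()

print(first_four, counts, gc_content)
```

Because the data is an ordinary array, any NumPy function can be applied to it without conversion, which is the extensibility point made in the list above.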
If you use Biotite in a scientific publication, please cite:
Installation
Biotite requires the following packages:
numpy
requests
msgpack
Some functions require additional packages:
mdtraj - Required for trajectory file I/O operations.
matplotlib - Required for plotting purposes.
Biotite can be installed via Conda…
$ conda install -c conda-forge biotite
… or pip
$ pip install biotite
Usage
Here is a small example that downloads two protein sequences from the NCBI Entrez database and aligns them:
import biotite.sequence.align as align
import biotite.sequence.io.fasta as fasta
import biotite.database.entrez as entrez
# Download FASTA file for the sequences of avidin and streptavidin
file_name = entrez.fetch_single_file(
    uids=["CAC34569", "ACL82594"], file_name="sequences.fasta",
    db_name="protein", ret_type="fasta"
)
# Parse the downloaded FASTA file
# and create 'ProteinSequence' objects from it
fasta_file = fasta.FastaFile.read(file_name)
avidin_seq, streptavidin_seq = fasta.get_sequences(fasta_file).values()
# Align sequences using the BLOSUM62 matrix with affine gap penalty
matrix = align.SubstitutionMatrix.std_protein_matrix()
alignments = align.align_optimal(
    avidin_seq, streptavidin_seq, matrix,
    gap_penalty=(-10, -1), terminal_penalty=False
)
print(alignments[0])
MVHATSPLLLLLLLSLALVAPGLSAR------KCSLTGKWDNDLGSNMTIGAVNSKGEFTGTYTTAV-TA
-------------------DPSKESKAQAAVAEAGITGTWYNQLGSTFIVTA-NPDGSLTGTYESAVGNA
TSNEIKESPLHGTQNTINKRTQPTFGFTVNWKFS----ESTTVFTGQCFIDRNGKEV-LKTMWLLRSSVN
ESRYVLTGRYDSTPATDGSGT--ALGWTVAWKNNYRNAHSATTWSGQYV---GGAEARINTQWLLTSGTT
DIGDDWKATRVGINIFTRLRTQKE---------------------
-AANAWKSTLVGHDTFTKVKPSAASIDAAKKAGVNNGNPLDAVQQ
More documentation, including a tutorial, an example gallery and the API reference, is available at https://www.biotite-python.org/.
Contribution
Interested in improving Biotite? Have a look at the contribution guidelines. Feel free to join our community chat on Discord.