Easily map epitope coordinates between sequences in an alignment
Epitope aligner
Easily map epitope coordinates between sequences in an alignment, regardless of which coordinate system you are using. This lets you combine epitopes from different sources and calculate things like epitope density in a set of proteins.
epitope_aligner is a Python package hosted on GitHub at BarinthusBio/epitope_aligner.
Full documentation is at barinthusbio.github.io/epitope_aligner.
If you have any suggestions or problems, please open an issue.
Install
Install directly from GitHub using one of:
pip install git+https://github.com/Vaccitech/epitope_aligner.git
pip install git+git@github.com:Vaccitech/epitope_aligner.git
Quickstart
The full quickstart example analyses and plots the epitopes from different strains of the influenza virus.
In the current minimal example we'll:
- convert epitope coordinates to an aligned antigen
- float the epitope sequences to match it
- calculate the number of epitopes at each position in the antigen
For the inverse of these aligning and floating operations see the cookbook.
Import functions from epitope_aligner modules and pandas to create an example dataset.
from epitope_aligner import map, stretch, utils
import pandas as pd
We'll define a short example antigen sequence, with an aligned and unaligned version.
aligned_seq = "ABC---DEFGH-IJK--LM"
seq = aligned_seq.replace("-","")
We'll define some example epitopes with positions in the unaligned antigen sequence.
epitopes = pd.DataFrame({
    "start": [2, 6, 9],
    "end": [4, 9, 12],
    "seq": ["BCD", "FGHI", "IJKL"],
    "length": [3, 4, 4]
})
epitopes
#    start  end   seq  length
# 0      2    4   BCD       3
# 1      6    9  FGHI       4
# 2      9   12  IJKL       4
Let's calculate the start positions of these epitopes in the aligned antigen sequence.
epitopes["newstart"] = map.align_coords(
    table=epitopes,
    aligned_parent_seq=aligned_seq,
    coord_col="start",
    index=1
)
epitopes
#    start  end   seq  length  newstart
# 0      2    4   BCD       3         2
# 1      6    9  FGHI       4         9
# 2      9   12  IJKL       4        13
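To make the coordinate mapping concrete, here is a minimal plain-Python sketch of the idea: walk along the aligned sequence, count the non-gap characters, and report the aligned position at which the requested residue is reached. The function name align_coord_sketch and its parameters are hypothetical and are not part of epitope_aligner; the package's own implementation may work differently.

```python
def align_coord_sketch(coord, aligned_parent_seq, index=1, gap="-"):
    """Map a position in the ungapped sequence to its position in the aligned one.

    Hypothetical helper for illustration only, not the epitope_aligner API.
    `index` is the number given to the first position (1 here, as in the example).
    """
    residues_seen = 0  # non-gap characters passed so far
    for i, char in enumerate(aligned_parent_seq):
        if char == gap:
            continue  # gaps do not advance the ungapped coordinate
        if residues_seen == coord - index:
            return i + index
        residues_seen += 1
    raise ValueError(f"coordinate {coord} is outside the ungapped sequence")


aligned_seq = "ABC---DEFGH-IJK--LM"
new_starts = [align_coord_sketch(start, aligned_seq) for start in [2, 6, 9]]
print(new_starts)  # [2, 9, 13]
```

These values match the newstart column shown for the example epitopes.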
Now we can "float" an epitope to line up with its antigen based on a start position and antigen sequence.
epitopes["float"] = map.float_epitopes(
    table=epitopes,
    parent_seq=aligned_seq,
    start_col="newstart",
    index=1,
)
epitopes
# Aligned antigen
# ABC---DEFGH-IJK--LM
# Aligned epitopes
# -BC---D
# --------FGH-I
# ------------IJK--L
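The floating step can be pictured as padding the epitope with leading gaps up to its aligned start, then copying a gap wherever the aligned antigen has one. The sketch below illustrates that idea in plain Python; float_epitope_sketch is a hypothetical name used only here, and the sketch is not taken from the epitope_aligner source.

```python
def float_epitope_sketch(epitope_seq, aligned_parent_seq, aligned_start,
                         index=1, gap="-"):
    """Insert gaps into an epitope so it lines up with its aligned parent.

    Hypothetical helper for illustration only, not the epitope_aligner API.
    """
    offset = aligned_start - index
    floated = [gap] * offset  # leading gaps up to the start position
    residues = iter(epitope_seq)
    remaining = len(epitope_seq)
    for char in aligned_parent_seq[offset:]:
        if remaining == 0:
            break  # whole epitope placed, no trailing gaps needed
        if char == gap:
            floated.append(gap)  # mirror the gap in the parent
        else:
            floated.append(next(residues))
            remaining -= 1
    if remaining:
        raise ValueError("epitope runs past the end of the parent sequence")
    return "".join(floated)


aligned_seq = "ABC---DEFGH-IJK--LM"
floated = [
    float_epitope_sketch(seq, aligned_seq, start)
    for seq, start in [("BCD", 2), ("FGHI", 9), ("IJKL", 13)]
]
print("\n".join(floated))
# -BC---D
# --------FGH-I
# ------------IJK--L
```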
We can easily count the number of epitopes overlapping each position by "stretching" them. For plotting, it is often helpful to add zeros for positions with no epitopes.
stretched_epitopes = stretch.stretch(epitopes)
positional_count = stretched_epitopes.groupby("position").size()
positional_count = stretch.add_empty_positions(
    positional_count,
    parent_seq_length=len(seq),
    index=1,
    empty_value=0
)
positional_count
# position
# 1 0.0
# 2 1.0
# 3 1.0
# 4 1.0
# 5 0.0
# 6 1.0
# 7 1.0
# 8 1.0
# 9 2.0
# 10 1.0
# 11 1.0
# 12 1.0
# 13 0.0
# dtype: float64
Read the cookbook for tips on calculating more interesting measures than counts.
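As a rough illustration of what the counting step computes, the sketch below reproduces the per-position counts with the standard library only, then derives one simple extra measure: the fraction of antigen positions covered by at least one epitope. It assumes inclusive end coordinates, as in the example table, and the name positional_counts_sketch is hypothetical rather than part of epitope_aligner.

```python
from collections import Counter


def positional_counts_sketch(starts, ends, parent_length, index=1):
    """Count how many epitopes overlap each position of the parent sequence.

    Hypothetical helper for illustration only; assumes `end` is inclusive.
    """
    counts = Counter()
    for start, end in zip(starts, ends):
        counts.update(range(start, end + 1))
    # Fill in zeros so every position of the parent is present.
    return {pos: counts[pos] for pos in range(index, parent_length + index)}


counts = positional_counts_sketch([2, 6, 9], [4, 9, 12], parent_length=13)
covered = sum(1 for n in counts.values() if n > 0)
coverage_fraction = covered / len(counts)
print(counts[9], covered)  # 2 10
```

Here position 9 is the only one shared by two epitopes, and 10 of the 13 positions are covered by at least one.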
Examples
A real-world example is demonstrated in the quickstart, which analyses and plots the epitopes from different strains of the influenza virus.
The cookbook provides a detailed description and example of all functions.
The full documentation includes function APIs for each submodule.
Dev
Details on testing, creating docs, and virtual environments.
Dev: Set up
- Create a virtual environment: python3 -m venv .venv
- Activate that environment: . .venv/bin/activate
- Install in editable mode: pip install -e .
- Deactivate it: deactivate
Dev: Nox
Linting, bandit, documentation, examples, and testing can all be run with nox, based on noxfile.py. This is also run by GitHub Actions.
Dev: Make docs
The full guide is docs/README.md, but in short: pdoc generates the API documentation and renders the README, the Jupyter notebook examples are converted to HTML, and the complete docs are hosted at barinthusbio.github.io/epitope_aligner/index.html.
Generating and hosting the docs is handled by GitHub Actions, but if you want to produce them locally just run nox.
Dev: Publish to PyPI
Uploading requires the build and twine packages, which can be installed with pip install --upgrade twine build.
- python -m build will create both the source distribution (sdist) and the wheel.
- twine check dist/* will check the package is ready for uploading.
- twine upload dist/* will actually upload to PyPI.