Python module for processing variable-bias-voltage nanopore sequencing data
varv
Repository for the code of Thijn Hoekstra's thesis. The goal of this project is to implement variable voltage in the nanopore measurement of DNA-peptide conjugates. The repository contains Python modules that provide a high-level interface for dealing with nanopore data in general, and variable-voltage nanopore data in particular.
The documentation is currently a work in progress, but HTML files can be found in the docs folder.
This code is licensed under the GNU General Public License.
Description
Currently, there does not seem to be a standard for handling nanopore data. Oxford Nanopore Technologies provides many packages for their platform, but these do not generalise well to the data created by custom nanopore setups like the one used in the Cees Dekker lab.
Another issue with nanopore data analysis is that many current methods use code written in MATLAB. The experience during this thesis has highlighted a couple of disadvantages. For one, MATLAB code is generally less broadly used and understood. In addition, this code has been harder to maintain and document and, most importantly, has issues with portability.
The goal of this repository is to start to address these two issues, first by porting and integrating nanopore analysis code in a single Python module. This module should also be easy to install and run on every computer.
This module is inspired by MNE, which is an open-source Python package for managing EEG and MEG data.
Installation
(Extra) On virtual environments
The recommended installation method for this package is to install it via the pip package installer into a
virtual environment. For more information on virtual environments, read
this section
in Programming Foundations, an open-access book on Python programming foundations [1].
If you are already familiar with creating a virtual environment, skip ahead to the next section. If you use conda,
you can skip this step and create a virtual environment using it instead. It is assumed that you
have installed Python and know how to open a terminal. To create a virtual
environment, open a terminal at a location of your choosing and run:
python -m venv .venv
This creates a virtual environment in which to install the Python packages. This virtual environment is stored in a
folder called .venv. To activate the virtual environment, run:
source .venv/bin/activate
For Windows, use:
.venv\Scripts\activate.bat
If activated correctly, (.venv) should appear on the left of your terminal line. Next, install the varv module
using the instructions in the next section.
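To double-check from within Python that the interpreter is running inside a virtual environment, a minimal sketch using only the standard library could look like this. The helper name `in_virtualenv` is an illustrative choice and not part of varv. It relies on `venv` setting `sys.prefix` to the environment folder while `sys.base_prefix` keeps pointing at the base installation; conda environments are not detected by this check.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when running inside an environment created by venv."""
    # In a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```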
Install using PyPI
varv is listed on PyPI and can be easily installed using the package manager
pip, simply using:
pip install varv
Note that there is no conda distribution for this package yet, so conda users should also install it via pip or
from source (described in the section below).
For extra Jupyter interactivity (not recommended), use:
pip install 'varv[jupyter]'
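After installing, one way to confirm that the package is visible to the active interpreter is to query its installed version through the standard library. The sketch below assumes nothing about varv itself; `installed_version` is a hypothetical helper that returns `None` when a distribution is not installed.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("varv version:", installed_version("varv"))
```

If this prints `None`, the package was most likely installed into a different environment than the one currently activated.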
Install from source
If you intend to develop or modify varv or try the latest version, install the package from source. Once again,
make sure you install into a virtual environment. To install from source, first open a terminal and navigate to a
directory of your choice, then clone the repository:
git clone https://gitlab.tudelft.nl/xiuqichen/varv.git
Note that this requires Git. For more information on version control using Git, check out this section in Programming Foundations [1].
Next, move into the varv directory.
cd varv
Finally, install an editable version of the package. The [dev] extra installs the packages needed for development,
like testing and documentation.
pip install -e '.[dev]'
How to use
varv provides some high-level objects for managing nanopore sequencing data. Each of the following sections
describes a useful class for handling such data.
Managing measurement metadata
Consider a nanopore measurement. It involves many parameters, for example a name specifying the sample and conditions, a bias voltage of a certain magnitude over the membrane, or the sampling rate it was measured at. These metadata can be neatly stored using the Info class.
For example, to store metadata for a measurement with a sampling rate of 5 kHz, a name Sample A, and a (constant) bias voltage of 180 mV, write:
from varv.base import Info
info = Info(5e3, "Sample A", 180)
This info can be displayed by calling print(info). More metadata can also be stored, like:
- Bias voltage amplitude (in the case of a variable voltage measurement setup)
- Bias voltage frequency (in the case of a variable voltage measurement setup)
- Open state current (in pA)
Managing data for experiments
To store data measured over an entire run of an experiment (typically with a duration of ~1000 s), use the Raw class,
which is created using an Info object and the data as a pandas DataFrame:
import numpy as np
import pandas as pd
from varv.base import Raw, Info
data = pd.DataFrame(
columns=["i", "v"],
data=np.random.random((100, 2)))
raw = Raw(Info(5e3), data)
print(raw)
# For multichannel data, you can specify a channel number
raw_2 = Raw(Info(5e3), data, channel=2)
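The columns `i` and `v` hold current and voltage, from which a conductance `g = i / v` can be derived per sample. The sketch below assumes current in pA and voltage in mV, so that the ratio is in nS. The function `conductance` and the synthetic sine-wave bias are hypothetical and not part of varv; plain lists keep the example dependency-free, and the same expression applies to the `i` and `v` columns of a DataFrame.

```python
import math


def conductance(i, v):
    """Return per-sample conductance g = i / v (pA / mV = nS)."""
    return [i_j / v_j for i_j, v_j in zip(i, v)]


# Synthetic example: a bias voltage oscillating around 180 mV (assumed values)
fs = 5e3  # sampling rate in Hz
n = 10    # number of samples
v = [180 + 50 * math.sin(2 * math.pi * 200 * j / fs) for j in range(n)]
# Pretend the pore behaves as an ideal 1 nS resistor: i = g * v
i = [1.0 * v_j for v_j in v]

g = conductance(i, v)
```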
In the Cees Dekker lab, the nanopore data is stored as a .dat file created by LabView. These can be converted to a
raw object:
from varv.io.labview import read_measurement_dat
raw = read_measurement_dat("experiment_1.dat")
Note that this function will downsample the data to a sampling rate of 5 kHz. This can be turned off for variable voltage data.
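For intuition, downsampling by an integer factor can be sketched as averaging consecutive blocks of samples. This is only an illustration, assuming an integer ratio between the original and target sampling rates; how read_measurement_dat downsamples internally is not described here, and the function `downsample_mean` is hypothetical.

```python
def downsample_mean(samples, fs_in, fs_out=5e3):
    """Downsample by averaging blocks of fs_in / fs_out samples."""
    factor = round(fs_in / fs_out)
    if factor < 1 or abs(fs_in / fs_out - factor) > 1e-9:
        raise ValueError("fs_in must be an integer multiple of fs_out")
    n_blocks = len(samples) // factor  # drop an incomplete trailing block
    return [
        sum(samples[k * factor:(k + 1) * factor]) / factor
        for k in range(n_blocks)
    ]


# 50 kHz data reduced to 5 kHz: every 10 samples become one
trace = list(range(25))
reduced = downsample_mean(trace, fs_in=50e3)
```

Block averaging also acts as a crude low-pass filter, which is one reason it may be undesirable when the fast voltage modulation itself carries the signal of interest.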
Custom pandas accessor
Finding reads
To find reads in the measurement, run:
from varv.events import Events
events = Events.from_raw(raw)
This returns an Events object containing the various reads found in the data. For finding events in
variable-voltage data, use the keyword arguments:
kwargs = {
"open_state_current": (170, 200),
"lowpass": 100,
"known_good_voltage": (90, 210),
}
eves = Events.from_raw(raw, **kwargs)
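To give an idea of what a parameter such as open_state_current might control, here is a deliberately simplified threshold detector. It treats samples at or above the lower bound of the open-state range as the open pore and reports each contiguous run of samples below that bound as a candidate event. This is an assumption made for illustration and not a description of how Events.from_raw works; `find_events` is a hypothetical name.

```python
def find_events(current, open_state_current=(170, 200)):
    """Return (start, stop) index pairs of runs below the open-state range.

    stop is exclusive, so current[start:stop] is the blocked segment.
    """
    lower, _upper = open_state_current
    events = []
    start = None
    for j, i_j in enumerate(current):
        blocked = i_j < lower
        if blocked and start is None:
            start = j  # event begins
        elif not blocked and start is not None:
            events.append((start, j))  # event ends
            start = None
    if start is not None:  # trace ends during an event
        events.append((start, len(current)))
    return events


# Current in pA: open pore near 180 pA with two blockades
trace = [181, 179, 90, 85, 88, 180, 182, 60, 62, 183]
print(find_events(trace))  # [(2, 5), (7, 9)]
```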
These events can be filtered by length, step rate, current ranges, and more. They can also be sliced to return single events for further analysis:
eve = eves[2]
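Filtering by length can be pictured as follows, assuming events are represented as `(start, stop)` sample-index pairs and the sampling rate is known. The representation and the name `filter_by_duration` are illustrative assumptions; the Events class provides its own filtering, which is not reproduced here.

```python
def filter_by_duration(events, sfreq, min_s=0.0, max_s=float("inf")):
    """Keep (start, stop) index pairs whose duration in seconds is in range."""
    kept = []
    for start, stop in events:
        duration = (stop - start) / sfreq  # samples -> seconds
        if min_s <= duration <= max_s:
            kept.append((start, stop))
    return kept


events = [(0, 10), (50, 5050), (6000, 6020)]
# At 5 kHz these last 2 ms, 1 s and 4 ms; keep those of at least 3 ms
long_events = filter_by_duration(events, sfreq=5e3, min_s=3e-3)
```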
Check out the notebooks folder for more examples.
Development
To install torch on macOS with Apple Silicon (which uses the GPU), run the command below. For more info, check the Apple website.
pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
Contribution guidelines
This project follows many principles from the Good Integration Practices from pytest. Important guidelines are:
- The Python Packaging User Guide by The Python Packaging Project for building the module.
- Good Integration Practices by pytest for testing.
- The Google Python Style Guide by Google for names of classes, variables, etc. Also for the creation of docstrings.
- Ruff for auto-formatting the code.
The symbols i, v, and g are reserved for current, voltage, and conductance. For indices, the use of j is preferred.
Finally, this project is inspired by MNE; consider checking out their contribution guidelines.
Testing code
To test the code for the entire module, make sure you are in the varv directory and simply run:
pytest
Tests for individual submodules can be found in the tests folder.
Formatting code
To automatically format the code, use:
ruff check --fix
Building
ruff check --fix
References
[1] Šoštarić, N. (2024). Programming Foundations. TU Delft OPEN Publishing.