ML for biomolecular binding
mubind
Model highlights
- MuBind is a deep learning model that learns DNA-sequence features predictive of cell transitions in single-cell genomics data, using graph representations and sequence activity across cells. The codebase is written in PyTorch.
- This package works with single-cell genomics data such as scATAC-seq. We have also tested it on bulk in vitro samples (HT-SELEX). See the documentation for examples.
- Complemented with velocity-driven graph representations, we learn sequence-to-activity transcriptional regulators linked with developmental processes. These predictions are biologically confirmed in several systems and reinforced by chromatin accessibility and orthogonal gene expression data across pseudotemporal order. Refer to the bioRxiv preprint for more details.
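To make the idea of a sequence feature concrete, the following is a minimal sketch of scoring a DNA sequence with a position weight matrix (PWM). It is a generic illustration, not mubind's implementation or API: the `PWM` values and the function names `window_scores` and `best_match` are hypothetical, and the sequence is assumed to contain only uppercase A, C, G and T.

```python
# Hypothetical 3-position PWM (log-odds style scores per base).
# These values are made up for illustration and do not come from mubind.
PWM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -0.5},
    {"A": -1.0, "C": 1.2, "G": -0.8, "T": -1.0},
    {"A": -0.5, "C": -1.0, "G": 1.1, "T": -1.0},
]


def window_scores(seq, pwm):
    """Return the summed PWM score of every window of len(pwm) in seq."""
    width = len(pwm)
    return [
        sum(pwm[i][base] for i, base in enumerate(seq[start:start + width]))
        for start in range(len(seq) - width + 1)
    ]


def best_match(seq, pwm):
    """Return (start position, score) of the highest-scoring window."""
    scores = window_scores(seq, pwm)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    position, score = best_match("TTACGTT", PWM)
    print(f"best window starts at {position} with score {score:.2f}")
```

In a learned model the matrix entries would be trainable parameters rather than fixed constants; here they are fixed only to keep the sketch self-contained.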
Workflow and model architecture
Other specifications
- Number of cells: The scalability of this method has been tested on single-cell datasets between 1,000 and 100,000 cells.
- Number of peaks: We have tested three times the number of features (peaks, promoters), selected either randomly or with EpiScanpy's variability score. In our experience, the highest test performance is obtained with randomly selected features. Using all features requires calibration of batch sizes and total GPU memory.
- Running time: Using a Graph Layer and PWMs in the Binding Layer, the running time with one GPU is about 50 min (5,000 cells, 15,000 features). For additional memory and scaling tips, please refer to the documentation.
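The random feature selection and the memory considerations above can be sketched as follows. This is an illustration under stated assumptions rather than part of mubind: `sample_feature_indices` and `dense_matrix_mib` are hypothetical helper names, the matrix is assumed to be dense with 4-byte (float32) values, and the estimate covers only the cells-by-features matrix, not model parameters, activations or gradients.

```python
import random


def sample_feature_indices(n_features, n_keep, seed=0):
    """Pick n_keep distinct feature (peak) indices uniformly at random.

    Hypothetical helper for illustration; a fixed seed keeps the choice reproducible.
    """
    if not 0 < n_keep <= n_features:
        raise ValueError("n_keep must be between 1 and n_features")
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_features), n_keep))


def dense_matrix_mib(n_cells, n_features, bytes_per_value=4):
    """Size in MiB of a dense cells x features matrix (float32 by default)."""
    return n_cells * n_features * bytes_per_value / 2**20


if __name__ == "__main__":
    # Keep 15,000 of 45,000 candidate features (both counts are example values).
    kept = sample_feature_indices(45_000, 15_000, seed=1)
    print(f"kept {len(kept)} features")
    print(f"dense matrix: {dense_matrix_mib(5_000, len(kept)):.1f} MiB")
```

An estimate like this gives only a lower bound on GPU memory use, so the batch size still needs to be tuned empirically.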
Resources
Please refer to the documentation.
Installation
There are several alternative options to install mubind:
pip
- Install the latest release of mubind from PyPI (https://pypi.org/project/mubind/):
pip install mubind
- Install the latest development version:
pip install git+https://github.com/theislab/mubind.git@main
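After either install route, one way to confirm that the package is visible to the current Python environment is to query its installed version through the standard library. This is a small sketch using `importlib.metadata` (Python 3.8+); `installed_version` is a hypothetical helper name, and the check does not import mubind itself.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="mubind"):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"mubind {found}" if found else "mubind is not installed")
```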
Release notes
See the changelog.
Preprint
If mubind is useful for your research, please consider citing as:
Ibarra I.L., Schneeberger J., Erdogan E., Redl L., Martens L., Klein D., Aliee H., and Theis F.J. Learning sequence-based regulatory dynamics in single-cell genomics. bioRxiv 2024.08.07.605876 (2024). doi:10.1101/2024.08.07.605876
Funding acknowledgments.
Issues
If you find a bug, please open an issue.
Project template created using scverse cookie template