To use Evo2 easily in HPC
EasyEvo2
A Python toolkit for easily using Evo2 models in bioinformatics workflows, particularly in HPC environments.
Description
EasyEvo2 provides a simplified interface to the Evo2 foundation models for sequence embedding. It lets biologists and bioinformaticians extract embeddings from DNA, RNA, or protein sequences without extensive deep learning expertise, and it is designed to run well in High-Performance Computing (HPC) environments.
Installation
# Install from PyPI
pip install easyevo2
# Or install from source
git clone https://github.com/ylab-hi/EasyEvo2.git
cd EasyEvo2
pip install .
Usage
Basic Usage
# Embed sequences from a FASTA/FASTQ file using the default model (evo2_7b)
easyevo2 embed input.fa
# Specify a different model and a specific layer
easyevo2 embed input.fa --model-type evo2_40b --layer-name blocks.28.mlp.l3
# Specify a different model and multiple layers
easyevo2 embed input.fa --model-type evo2_40b --layer-name blocks.28.mlp.l3 blocks.28.mlp.l2
# Save to a specific output file
easyevo2 embed input.fa --output my_embeddings
The output is a safetensors file containing the embeddings for each sequence in the input file.
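When many FASTA files need embedding, for example one per cluster job, the commands above can be assembled programmatically. The sketch below only builds the argument list and uses only the flags shown in this section (`--model-type`, `--layer-name`, `--output`). The helper name `build_embed_command` and the file names `sample_a.fa` and `sample_b.fa` are hypothetical and not part of EasyEvo2.

```python
import shlex
from pathlib import Path


def build_embed_command(fasta, model_type=None, layer_names=(), output=None):
    """Return the argv list for one `easyevo2 embed` call.

    Hypothetical helper: it only mirrors the CLI flags shown in this README.
    """
    cmd = ["easyevo2", "embed", str(fasta)]
    if model_type:
        cmd += ["--model-type", model_type]
    if layer_names:
        # Several layer names follow a single --layer-name flag, as in the example above.
        cmd += ["--layer-name", *layer_names]
    if output:
        cmd += ["--output", str(output)]
    return cmd


# Build (but do not run) one command per input file.
commands = [
    build_embed_command(
        fasta,
        model_type="evo2_40b",
        layer_names=["blocks.28.mlp.l3"],
        output=Path(fasta).stem + "_embeddings",
    )
    for fasta in ["sample_a.fa", "sample_b.fa"]
]

for cmd in commands:
    # shlex.join quotes arguments so the line can be pasted into a job script.
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` or written into a scheduler job script, depending on how the cluster is set up.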
We can load the embeddings using the load_tensor function:
from easyevo2 import load_tensor
embeddings = load_tensor("my_embeddings.mode.layer.safetensors")
print(embeddings)
# Output: {
# "seq1": torch.tensor([...]),
# "seq2": torch.tensor([...]),
# }
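After a long run it can be useful to confirm that every input record received an embedding. The sketch below assumes that the keys of the loaded mapping are the FASTA record IDs (the first token of each header line), which matches the `seq1`/`seq2` example above but is not confirmed here. The helpers `fasta_ids` and `missing_embeddings` are hypothetical, and a plain dict stands in for the result of `load_tensor` so the example runs without a GPU or model download.

```python
def fasta_ids(lines):
    """Yield the record ID (first token after '>') of each FASTA header line."""
    for line in lines:
        if line.startswith(">"):
            header = line[1:].strip()
            if header:
                yield header.split()[0]


def missing_embeddings(ids, embeddings):
    """Return the IDs that have no entry in the embeddings mapping, in input order."""
    return [seq_id for seq_id in ids if seq_id not in embeddings]


fasta_text = """>seq1 example record
ACGTACGT
>seq2
GGCCTTAA
>seq3
TTTT
"""

# Stand-in for the mapping returned by load_tensor; real values would be tensors.
embeddings = {"seq1": [0.1, 0.2], "seq2": [0.3, 0.4]}

missing = missing_embeddings(fasta_ids(fasta_text.splitlines()), embeddings)
print(missing)  # ['seq3']
```

With real data, `fasta_text.splitlines()` would be replaced by an open file handle and `embeddings` by the loaded mapping.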
Development
This project uses a Makefile to automate common development tasks:
# Show available commands
make help
# Run tests
make test
# Lint code
make lint
# Format code
make format
# Build package
make build
License
This project is licensed under the MIT License; see the LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
File details
Details for the file easyevo2-0.1.4.tar.gz.
File metadata
- Download URL: easyevo2-0.1.4.tar.gz
- Upload date:
- Size: 44.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e9ff3009d6892880454dba5e6d8b0c0e03465585cedd2b09ff0595926b3988f6 |
| MD5 | 214622eb54e27f5e3244f845bd7acf05 |
| BLAKE2b-256 | 9df80aeec6cb3fd3b881ad64da9a64ef420c4da0970b91eecb88d01b4e7a99c7 |
File details
Details for the file easyevo2-0.1.4-py3-none-any.whl.
File metadata
- Download URL: easyevo2-0.1.4-py3-none-any.whl
- Upload date:
- Size: 12.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ddfb29ac69ccb74578e625519b07d8afaf8e4b9f484dd9c6e4ef830fc48f434e |
| MD5 | 3ef05fffb7e9b23efd4cdf14cb85629d |
| BLAKE2b-256 | 83448cd050163f7d629c98b9c60df769283039984e8002d6f80dd89373e1eed3 |
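The SHA256 digests listed above can be checked against a downloaded file before installing it. This is a minimal sketch using only the Python standard library. The helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the loop assumes the files sit in the current directory under their published names.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# SHA256 values copied from the file hash tables above.
PUBLISHED_SHA256 = {
    "easyevo2-0.1.4.tar.gz":
        "e9ff3009d6892880454dba5e6d8b0c0e03465585cedd2b09ff0595926b3988f6",
    "easyevo2-0.1.4-py3-none-any.whl":
        "ddfb29ac69ccb74578e625519b07d8afaf8e4b9f484dd9c6e4ef830fc48f434e",
}

for name, expected in PUBLISHED_SHA256.items():
    path = Path(name)
    if not path.exists():
        print(f"{name}: not found, skipped")
    elif matches_published_digest(path, expected):
        print(f"{name}: OK")
    else:
        print(f"{name}: MISMATCH")
```

A mismatch usually means a truncated or altered download, in which case the file should be fetched again rather than installed.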