PyTorch implementation of AlphaGenome
AlphaGenome PyTorch
A PyTorch port of AlphaGenome, the DNA sequence model from Google DeepMind that predicts hundreds of genomic tracks at single base-pair resolution from sequences up to 1M bp.
We aim for an accessible, readable, and hackable implementation that is easy to integrate into existing PyTorch pipelines, fine-tune on custom datasets, and build upon.
Installation
Installation from PyPI:
pip install alphagenome-pytorch
Installation from repo:
pip install git+https://github.com/genomicsxai/alphagenome-pytorch
For fine-tuning (incl. BigWig data loading):
pip install "alphagenome-pytorch[finetuning]" # adds pyBigWig, pyfaidx; quotes keep shells such as zsh from expanding the brackets
Quick Start
import numpy as np
import torch
from alphagenome_pytorch import AlphaGenome
# Load pretrained model
model = AlphaGenome.from_pretrained('alphagenome.pt', device='cuda')
# Create one-hot encoded DNA sequence in NLC format (batch=1, length=131072, channels=4)
# Channels: A=0, C=1, G=2, T=3
sequence = np.random.randint(0, 4, size=(1, 131072))
dna_onehot = torch.tensor(np.eye(4)[sequence], dtype=torch.float32)
# Inference (handles dtype casting, returns float32 outputs)
outputs = model.predict(dna_onehot, organism_index=0) # organism: 0=human, 1=mouse
# outputs['atac'][1] -> (B, 131072, 256) ATAC at 1bp
# outputs['atac'][128] -> (B, 1024, 256) ATAC at 128bp
# outputs['contact_maps'] -> (B, 28, 64, 64) 3D contact
The weights for this port are available on Hugging Face.
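The Quick Start uses a random sequence. For a real DNA string, the same channel order (A=0, C=1, G=2, T=3) can be applied with a small helper. The following is a minimal sketch in plain Python; `one_hot_encode` and `BASE_TO_CHANNEL` are hypothetical names that are not part of the package, and mapping N or other ambiguous bases to an all-zero row is an assumption rather than documented behavior of this port.

```python
# Channel order follows the Quick Start comment: A=0, C=1, G=2, T=3.
BASE_TO_CHANNEL = {"A": 0, "C": 1, "G": 2, "T": 3}


def one_hot_encode(seq: str) -> list[list[float]]:
    """Return a (length, 4) nested list for one DNA sequence.

    Assumption: any base other than A/C/G/T (e.g. N) becomes an all-zero row.
    """
    rows = []
    for base in seq.upper():
        row = [0.0, 0.0, 0.0, 0.0]
        channel = BASE_TO_CHANNEL.get(base)
        if channel is not None:
            row[channel] = 1.0
        rows.append(row)
    return rows


# Short demonstration string; a real input would match the model's sequence length.
encoded = one_hot_encode("ACGTN")
```

Wrapping the result as `torch.tensor([encoded], dtype=torch.float32)` should give a (1, length, 4) tensor in the NLC layout used above, with the sequence length chosen to match what the model expects (131072 in the Quick Start).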
Extracting Embeddings
Use model.encode() to get embeddings without running the prediction heads. This is useful for building custom heads or analyzing representations:
# Get embeddings (128bp only for efficiency)
emb = model.encode(dna_onehot, organism_index=0, resolutions=(128,))
emb['embeddings_128bp'] # (B, 1024, 3072) at 128bp
Fine-tuning
Train a new head on your data with frozen trunk (linear probing) or with LoRA adapters:
from alphagenome_pytorch import AlphaGenome, TransferConfig, load_trunk, prepare_for_transfer
# Load trunk, freeze, add custom heads
model = AlphaGenome()
model = load_trunk(model, 'alphagenome.pt')
model = prepare_for_transfer(model, TransferConfig(
    mode='lora',
    new_heads={'atac': {'modality': 'atac', 'num_tracks': 1}},
    lora_rank=8,
))
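To give a feel for what `lora_rank=8` means, here is a rough arithmetic sketch of the general LoRA idea: a frozen weight of shape (d_out, d_in) gets a trainable low-rank update built from two factors of shape (d_out, r) and (r, d_in). The 3072 x 3072 layer size below is purely illustrative (borrowed from the 128bp embedding width), and which layers this port actually adapts is not stated here, so treat the numbers as an assumption-laden estimate.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Number of weights in a dense (d_out, d_in) matrix."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights in the two low-rank factors (d_out, r) and (r, d_in)."""
    return rank * (d_in + d_out)


# Hypothetical layer size; not taken from the actual model definition.
d_in = d_out = 3072
rank = 8

full = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
fraction = lora / full

print(f"full: {full}, LoRA (rank {rank}): {lora}, fraction: {fraction:.4f}")
```

Under these assumptions the adapter trains roughly half a percent of the weights of the layer it wraps, which is why LoRA sits between linear probing and full fine-tuning in cost.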
The easiest way to start fine-tuning is scripts/finetune.py, which provides a flexible command-line interface:
# LoRA fine-tuning
python scripts/finetune.py --mode lora --lora-rank 8 \
    --genome hg38.fa --modality atac --bigwig *.bw \
    --train-bed train.bed --val-bed val.bed \
    --pretrained-weights alphagenome.pt
# Multi-GPU
torchrun --nproc_per_node=4 scripts/finetune.py --mode lora ...
See examples/notebooks/finetuning_gm12878_demo.ipynb for an example of linear probing on ATAC-seq data.
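The CLI above takes separate `--train-bed` and `--val-bed` files. One common way to produce them is to hold out whole chromosomes for validation. The sketch below shows that idea on in-memory BED lines; `split_bed_by_chromosome` is a hypothetical helper, and both the chromosome hold-out strategy and the choice of chr8 are assumptions, not requirements of scripts/finetune.py.

```python
def split_bed_by_chromosome(bed_lines, val_chroms):
    """Split BED records into (train, val) lists by chromosome name.

    Blank lines and header lines (#, track, browser) are skipped.
    """
    train, val = [], []
    for line in bed_lines:
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "track", "browser")):
            continue
        chrom = stripped.split()[0]  # first BED column is the chromosome
        if chrom in val_chroms:
            val.append(stripped)
        else:
            train.append(stripped)
    return train, val


# Toy records; real intervals would come from your own regions of interest.
records = [
    "# example regions",
    "chr1\t0\t131072",
    "chr8\t1000000\t1131072",
    "chr2\t500000\t631072",
]
train_records, val_records = split_bed_by_chromosome(records, val_chroms={"chr8"})
```

Each list can then be written out with `"\n".join(...)` to train.bed and val.bed.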
Numerical Parity with JAX
This port is validated against the original JAX model, including per-head and full forward pass output comparisons as well as loss values and gradients.
See ARCHITECTURE_COMPARISON.md for technical details.
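As an illustration of what an output comparison between two implementations can look like, the sketch below computes the maximum absolute and relative error between a reference and a candidate, plus an element-wise tolerance check of the form |ref - cand| <= atol + rtol * |ref|. This is not the project's validation code; the function names and the tolerance values are placeholders chosen for the example.

```python
def max_abs_and_rel_error(reference, candidate, eps=1e-12):
    """Return (max absolute error, max relative error) over two flat sequences."""
    if len(reference) != len(candidate):
        raise ValueError("reference and candidate must have the same length")
    max_abs = 0.0
    max_rel = 0.0
    for ref, cand in zip(reference, candidate):
        abs_err = abs(ref - cand)
        max_abs = max(max_abs, abs_err)
        # eps avoids division by zero when the reference value is 0
        max_rel = max(max_rel, abs_err / (abs(ref) + eps))
    return max_abs, max_rel


def within_tolerance(reference, candidate, rtol=1e-4, atol=1e-5):
    """True if every element satisfies |ref - cand| <= atol + rtol * |ref|."""
    if len(reference) != len(candidate):
        raise ValueError("reference and candidate must have the same length")
    return all(
        abs(ref - cand) <= atol + rtol * abs(ref)
        for ref, cand in zip(reference, candidate)
    )


# Toy values standing in for flattened outputs of the two models.
jax_values = [1.0, 2.0, 0.0]
torch_values = [1.00001, 2.0, 0.000001]
max_abs, max_rel = max_abs_and_rel_error(jax_values, torch_values)
ok = within_tolerance(jax_values, torch_values)
```

With real tensors, the same check is usually done with a library routine such as `numpy.testing.assert_allclose` after moving both outputs to NumPy arrays.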
Model Outputs
| Head | Tracks | Resolutions | Description |
|---|---|---|---|
| atac | 256 | 1bp, 128bp | Chromatin accessibility |
| dnase | 384 | 1bp, 128bp | DNase-seq |
| procap | 128 | 1bp, 128bp | Transcription initiation |
| cage | 640 | 1bp, 128bp | 5' cap RNA |
| rnaseq | 768 | 1bp, 128bp | RNA expression |
| chip_tf | 1664 | 128bp | TF binding |
| chip_histone | 1152 | 128bp | Histone modifications |
| contact_maps | 28 | 64×64 | 3D chromatin contacts |
| splice_sites | 4 | 1bp | Splice site classification (D+, A+, D−, A−) |
| splice_junctions | 734 | pairwise | Junction read counts (367 tissues × 2 strands) |
| splice_site_usage | 734 | 1bp | Fraction of transcripts using splice site |
See more information about model outputs in the official AlphaGenome documentation.
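The table can be turned into a quick shape check. The sketch below copies the track counts and resolutions for the 1bp/128bp track heads and derives expected output shapes for a given input length, assuming every such head follows the (B, length / resolution, tracks) pattern that the Quick Start shows for `atac`. `TRACK_HEADS` and `expected_shapes` are hypothetical names, and the contact-map and splice heads are left out because their layouts differ.

```python
# Track counts and resolutions copied from the Model Outputs table.
TRACK_HEADS = {
    "atac": (256, (1, 128)),
    "dnase": (384, (1, 128)),
    "procap": (128, (1, 128)),
    "cage": (640, (1, 128)),
    "rnaseq": (768, (1, 128)),
    "chip_tf": (1664, (128,)),
    "chip_histone": (1152, (128,)),
}


def expected_shapes(seq_len: int, batch: int = 1) -> dict:
    """Map head name -> {resolution: (batch, bins, tracks)}.

    Assumption: bins = seq_len // resolution for every head listed above.
    """
    if seq_len % 128 != 0:
        raise ValueError("seq_len must be a multiple of 128")
    shapes = {}
    for head, (tracks, resolutions) in TRACK_HEADS.items():
        shapes[head] = {
            res: (batch, seq_len // res, tracks) for res in resolutions
        }
    return shapes


shapes = expected_shapes(131072)
```

Comparing `shapes[head][res]` against `tuple(outputs[head][res].shape)` after a call to `model.predict` is one way to sanity-check an installation, provided the assumption above holds for the head in question.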
Example Notebooks
- Demo — Basic inference and JAX comparison
- Variant Scoring — Effect prediction
- In Silico Mutagenesis — ISM analysis
- TAL1 Mutation Example - TAL1 variant effect and ISM (Figure 6 from AlphaGenome)
- Fine-tuning — ATAC-seq linear probing
- Fine-tuning — MPRA (encoder-only)
Citation
@article{avsec2026alphagenome,
title={Advancing regulatory variant effect prediction with AlphaGenome},
author={Avsec, {\v{Z}}iga and Latysheva, Natasha and Cheng, Jun and Novati, Guido and Taylor, Kyle R and Ward, Tom and Bycroft, Clare and Nicolaisen, Lauren and Arvaniti, Eirini and Pan, Joshua and others},
journal={Nature},
volume={649},
number={8099},
pages={1206--1218},
year={2026},
publisher={Nature Publishing Group UK London}
}
bioRxiv preprint
@article{avsec2025alphagenome,
title = {AlphaGenome: advancing regulatory variant effect prediction with a unified DNA sequence model},
author = {Avsec, {\v Z}iga and Latysheva, Natasha and Cheng, Jun and ...},
year = {2025},
journal = {bioRxiv},
doi = {10.1101/2025.06.25.661532}
}
Acknowledgements
We acknowledge Phil Wang, Miquel Anglada-Girotto, and Xinming Tu as developers of an older AlphaGenome PyTorch port unrelated to this repo. Note that the PyPI namespace is now linked to this repo.
License
This project is a port of the google-deepmind/alphagenome_research repository, which is licensed under the Apache License, Version 2.0:
Copyright 2026 Google LLC
The weights are subject to the model terms.
This port is licensed under the Apache License, Version 2.0 (Apache 2.0):
Copyright 2026 Danila Bredikhin, Martin Kjellberg, Christopher Zou, Alejandro Buendia, Xinming Tu, Anshul Kundaje
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this except in compliance with the License. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.