TF-MInDi: Transcription Factor Motifs and Instances Discovery
TF-MINDI: Transcription Factor Motif Instance Neighborhood Decomposition and Interpretation
TF-MINDI is a Python package for analyzing transcription factor binding patterns from deep learning model attribution scores. It identifies and clusters sequence motifs from contribution scores, maps them to DNA-binding domains, and provides comprehensive visualization tools for regulatory genomics analysis.
Getting Started
Please refer to the documentation for detailed tutorials and examples, in particular the API documentation and the Tutorials.
Key Features
- Seqlet Extraction: Identifies important sequence regions from contribution scores using recursive seqlet calling from tangermeme
- Motif Similarity Analysis: Compares extracted seqlets to known motif databases using TomTom
- Clustering & Dimensionality Reduction: Groups similar seqlets using Leiden clustering and t-SNE visualization
- DNA-Binding Domain Annotation: Maps seqlet clusters to transcription factor families
- Pattern Generation: Creates consensus motifs from clustered seqlets with alignment
- Comprehensive Visualization: Region-level contribution plots, t-SNE embeddings, motif logos, and heatmaps
Installation
tfmindi is compatible with Python versions 3.10 to 3.12.
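Before installing, it can help to confirm that the active interpreter falls inside that range. The following is a minimal sketch using only the standard library; the helper name `is_supported_python` is hypothetical and not part of tfmindi.

```python
import sys

# Supported range as stated above: Python 3.10 through 3.12 (inclusive).
MIN_SUPPORTED = (3, 10)
MAX_SUPPORTED = (3, 12)


def is_supported_python(version=None):
    """Return True if (major, minor) lies within the stated supported range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_SUPPORTED <= (major, minor) <= MAX_SUPPORTED


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    status = "supported" if is_supported_python() else "outside the stated range"
    print(f"Python {current}: {status}")
```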
CPU Version (Default)
pip install tfmindi
GPU-Accelerated Version (Recommended for large datasets)
# Requires CUDA-compatible GPU (CUDA 12.X)
pip install "tfmindi[gpu]"
The GPU version provides significant speedups for:
- PCA computation
- Neighborhood graph construction
- t-SNE embedding
- Leiden clustering
We're still working on making the tfmindi package as GPU-compatible as possible.
If tfmindi cannot find the GPU, try importing rapids_singlecell directly in Python and inspect any errors it raises.
You may need to set LD_LIBRARY_PATH explicitly for cuml, as described here.
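The troubleshooting step above can be scripted. This is a rough sketch, not part of tfmindi: the helper `diagnose_import` is hypothetical, and it only reports the import error and the current value of LD_LIBRARY_PATH so you can see what the GPU stack is missing.

```python
import importlib
import os


def diagnose_import(module_name):
    """Try to import a module; return (ok, message) instead of raising."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # ImportError, or OSError from a missing shared library
        return False, f"{type(exc).__name__}: {exc}"
    return True, "imported successfully"


if __name__ == "__main__":
    # Module names taken from the text above.
    for name in ("rapids_singlecell", "cuml"):
        ok, message = diagnose_import(name)
        print(f"{name}: {message}")
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "<not set>"))
```

If the reported error mentions a missing shared library, that is usually the cue to check LD_LIBRARY_PATH.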
Quick Start
TF-MINDI follows a scanpy-inspired workflow:
- Preprocessing (tm.pp): Extract seqlets, calculate motif similarities, and create an AnnData object
- Tools (tm.tl): Cluster seqlets and create consensus patterns
- Plotting (tm.pl): Visualize results
import tfmindi as tm

# Optional: Check GPU availability and set backend
print(f"GPU available: {tm.is_gpu_available()}")
print(f"Current backend: {tm.get_backend()}")
# tm.set_backend('gpu')  # Force GPU backend
# tm.set_backend('cpu')  # Swap back to CPU backend

# Extract seqlets from contribution scores
seqlets_df, seqlet_matrices = tm.pp.extract_seqlets(
    contrib=contrib_scores,  # (n_examples, 4, length)
    oh=one_hot_sequences,  # (n_examples, 4, length)
    threshold=0.05
)

# Calculate motif similarity
motif_collection = tm.load_motif_collection(
    tm.fetch_motif_collection()
)
similarity_matrix = tm.pp.calculate_motif_similarity(
    seqlet_matrices,
    motif_collection,
    chunk_size=10000
)

# Create AnnData object for analysis
adata = tm.pp.create_seqlet_adata(
    similarity_matrix,
    seqlets_df,
    seqlet_matrices=seqlet_matrices,
    oh_sequences=one_hot_sequences,
    contrib_scores=contrib_scores,
    motif_collection=motif_collection
)

# Cluster seqlets and annotate with DNA-binding domains
tm.tl.cluster_seqlets(adata, resolution=3.0)

# Generate consensus logos for each cluster
patterns = tm.tl.create_patterns(adata)

# Visualize results
tm.pl.tsne(adata, color_by="cluster_dbd")
tm.pl.region_contributions(adata, example_idx=0)
tm.pl.dbd_heatmap(adata)
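The quick start above assumes that `contrib_scores` and `one_hot_sequences` already exist as arrays of shape (n_examples, 4, length). The sketch below builds toy arrays of that shape with NumPy, only to illustrate the expected layout. Several things here are assumptions rather than documented behavior: the helper `one_hot_encode` is hypothetical, the A, C, G, T channel order is a guess, and the random contribution values are placeholders (real scores come from an attribution method applied to a trained model). Whether tfmindi expects scores masked to the observed base is not stated here.

```python
import numpy as np

ALPHABET = "ACGT"  # channel order is an assumption; use the order your model was trained with


def one_hot_encode(sequences):
    """Encode equal-length DNA strings as a (n_examples, 4, length) float array."""
    length = len(sequences[0])
    encoded = np.zeros((len(sequences), len(ALPHABET), length), dtype=np.float32)
    for i, seq in enumerate(sequences):
        if len(seq) != length:
            raise ValueError("all sequences must have the same length")
        for pos, base in enumerate(seq.upper()):
            channel = ALPHABET.find(base)
            if channel >= 0:  # unknown bases such as N stay all-zero
                encoded[i, channel, pos] = 1.0
    return encoded


one_hot_sequences = one_hot_encode(["ACGTAC", "TTGNCA"])

# Placeholder attribution scores: random values kept only at the observed base.
rng = np.random.default_rng(0)
contrib_scores = rng.normal(size=one_hot_sequences.shape).astype(np.float32) * one_hot_sequences

print(one_hot_sequences.shape, contrib_scores.shape)
```

Toy inputs this small will not yield meaningful seqlets; they only show the array layout that the preprocessing functions are documented to take.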
Release Notes
See the changelog.
Contact
If you found a bug, please use the issue tracker.
Citation