featlens

Model-agnostic feature-map visualization: PCA, cosine-similarity, k-means and foreground maps from any vision model and any layer.

Maintainers

tkargin

Project description

FeatLens

📖 Documentation: https://turhancan97.github.io/FeatLens/ · 🤗 Live demo: https://huggingface.co/spaces/turhancan97/FeatLens-demo

See what any vision model encodes. FeatLens renders feature maps for any vision model — DINO, DINOv2/v3, CLIP, SigLIP, MAE, DeiT, V-JEPA, CNNs, … — loaded from any source (timm, HuggingFace transformers, torch.hub, an external repo, or a model you built yourself), and from any layer, as a clean model × layer grid. Color the features by robust PCA, cosine-similarity to a seed patch, k-means segmentation, a foreground mask, or saliency — match patches across two images, roll up a ViT's attention, batch a folder, or sweep a video clip.

DINO feature maps across layers

Most "DINO PCA" scripts are welded to one model. FeatLens separates representation access (a small adapter layer over the model zoo) from visualization (PCA / cosine / k-means / foreground / saliency / attention-rollout), so you can point it at a new model in seconds and compare models/layers side by side.

Gallery

All produced by examples/quickstart.py. The per-image rows below use DINO ViT-S/8 at 768px: a small patch-8 backbone at high resolution gives a fine 96×96 feature grid, so thin structures (whiskers, feather barbs, individual fruit) stay crisp. The model × layer grid, compare, and method figures further down use DINO ViT-B/16 at 448px, a 28×28 feature grid (denser than the 14×14 grid at the default 224px).

visualize(...) — DINO ViT-S/8 @ 768px, feature maps across layers 2 / 5 / 8 / 11:

Image (original size)	Source	Feature maps
`peacock.jpg` · 1600×1280
`cat_hires.jpg` · 1600×1200
`market.jpg` · 1600×1063

grid(...) — model × layer, overlaid on cat_hires.jpg at 448px (DINO vs DINOv2 across layers 2/5/8/11):

model x layer grid overlay

compare(...) — models at the final layer on cat_hires.jpg at 448px:

compare models at last layer

custom_adapter — a ResNet-50 (CNN escape hatch) across three images at 768px, each with its layer -1 feature map:

resnet50 feature map

Same scene, six ViT-B/16 backbones — market.jpg at 1024px, last-layer features (a 64×64 grid), PCA→RGB per model. Architecture and patch size are held fixed, so the differences are purely the training objective: DINOv3 and DINO carve the scene into smooth semantic regions, MAE stays low-frequency, while SigLIP, supervised, and Perception Encoder encode much higher-frequency detail.

six ViT-B/16 backbones compared on one image

Beyond PCA — the same DINOv2 row on cat_hires.jpg at 448px, recolored by cosine-similarity to a seed patch, k-means segmentation, a foreground mask, and saliency (activation magnitude) — across layers 2 / 5 / 8 / 11:

Method	Across layers
`cosine` (seed on the cat)
`kmeans` (k=6)
`foreground`
`saliency` (activation magnitude)

correspond(...) — seed a patch in image A, find the matches in image B. Here the seed is on a real cat's eye; DINOv2 features match the same semantic part on a watercolor cat, across the photo→illustration domain gap:

cross-image correspondence
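The matching step behind a figure like this can be sketched as a nearest-neighbour search in feature space. The version below is a minimal pure-Python illustration, not FeatLens's implementation: the helper names (`cosine`, `nearest`, `match`) and the toy features are hypothetical. With `mutual=True`, a match from A to B is kept only if the matched patch in B points back to the seed patch in A.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(query, candidates):
    """Index of the candidate most similar to `query`."""
    return max(range(len(candidates)), key=lambda j: cosine(query, candidates[j]))


def match(seed_idx, feats_a, feats_b, mutual=True):
    """Match patch `seed_idx` of image A to a patch of image B.

    Returns the index in B, or None when `mutual` is set and the best match
    in B does not point back to the seed patch in A.
    """
    j = nearest(feats_a[seed_idx], feats_b)
    if mutual and nearest(feats_b[j], feats_a) != seed_idx:
        return None
    return j


# Toy per-patch features (hypothetical): 3 patches in A, 2 patches in B.
feats_a = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
feats_b = [[0.1, 0.9], [0.9, 0.1]]

print(match(0, feats_a, feats_b))  # 1: patch 0 in A pairs with patch 1 in B
print(match(2, feats_a, feats_b))  # None: the diagonal patch has no mutual match
```

The mutual check trades recall for precision: ambiguous seeds return nothing instead of a spurious match.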

attention(...) — attention-rollout for a timm ViT. Composing DINO ViT-B/16's self-attention (Abnar–Zuidema) shows the [CLS] token concentrating on the cat — overlaid, with a [0, 1] scale:

attention rollout
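As a rough illustration of the rollout named above, the sketch below composes per-layer attention matrices in the way Abnar and Zuidema describe: each layer's head-averaged attention is mixed with the identity (to account for the residual connection), rows are re-normalized, and the layers are multiplied together. It uses plain Python lists and a made-up 3-token example. The function name `attention_rollout` is hypothetical and says nothing about how FeatLens implements `attention(...)`.

```python
def attention_rollout(layers):
    """Compose head-averaged attention matrices (one per layer, each n x n).

    Each layer is mixed 50/50 with the identity to model the residual path,
    rows are re-normalized to sum to 1, and the results are multiplied in
    layer order (later layers on the left).
    """
    n = len(layers[0])
    # Start from the identity: before any layer, each token attends to itself.
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        mixed = [[0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
                 for i in range(n)]
        # Re-normalize rows so each stays a probability distribution.
        mixed = [[v / sum(row) for v in row] for row in mixed]
        # rollout <- mixed @ rollout
        rollout = [[sum(mixed[i][k] * rollout[k][j] for k in range(n))
                    for j in range(n)] for i in range(n)]
    return rollout


# Toy example: token 0 plays the role of [CLS]; two identical layers.
layer = [
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.6, 0.3],
]
cls_row = attention_rollout([layer, layer])[0]
print([round(v, 3) for v in cls_row[1:]])  # [CLS] attention over the patch tokens
```

Reshaping the patch part of the [CLS] row to the feature grid gives the kind of heatmap that is overlaid on the image.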

video(...) — per-frame feature maps over a clip, as a filmstrip (+ an animated GIF). Here a synthetic pan across market.jpg, with DINOv2 last-layer PCA features tracking the scene:

video filmstrip

Played back as the input clip beside its feature map (left: source frames, right: DINOv2 PCA):

input clip next to DINOv2 feature map

For a temporal model the whole clip is fed once and the spatiotemporal tokens are split back into per-time-step grids. Here V-JEPA 2.1 (ViT-B/16) on a real cockatoo clip, last layer, one shared PCA basis across frames so the colors stay consistent and the bird (centre) reads as it moves against the fixed perch — python examples/vjepa_video.py:

V-JEPA temporal filmstrip

Played back as the input clip beside its feature map (left: source frames, right: V-JEPA PCA):

input video next to V-JEPA feature map
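The "split back into per-time-step grids" step for temporal models can be pictured as simple index arithmetic. The sketch below assumes a flat, time-major token order (all tokens of step 0, then step 1, and so on) and uses a hypothetical helper name, `split_spatiotemporal`; the actual token layout depends on the model, and the number of time steps may be smaller than the number of input frames when the model groups frames into tubelets.

```python
def split_spatiotemporal(tokens, t, h, w):
    """Split a flat token list of length t*h*w into t grids of h rows x w cols.

    Assumes time-major ordering: all tokens of step 0, then step 1, and so on,
    each step in row-major (raster) order.
    """
    if len(tokens) != t * h * w:
        raise ValueError(f"expected {t * h * w} tokens, got {len(tokens)}")
    grids = []
    for step in range(t):
        start = step * h * w
        frame = tokens[start:start + h * w]
        grids.append([frame[r * w:(r + 1) * w] for r in range(h)])
    return grids


# 2 time steps of a 2x3 grid; integers stand in for feature vectors.
grids = split_spatiotemporal(list(range(12)), t=2, h=2, w=3)
print(grids[1])  # [[6, 7, 8], [9, 10, 11]]
```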

Install

pip install "featlens[timm]"      # from PyPI, with the timm backend (DINO, CLIP, SigLIP, DeiT, ...)
pip install -e ".[timm]"          # or an editable install from a clone of the repo
# extras: [hf] transformers · [clip] open_clip · [video] read .mp4 clips · [all]

Install PyTorch for your platform first (https://pytorch.org).

Quick start (Python)

import featlens as ll

# One model, scrub layers (shared PCA basis -> colors comparable across the row)
ll.visualize("dinov2_vitb14", "img.jpg", layers=[2, 5, 8, 11], out="row.png")

# Compare models at the final layer (per-tile basis)
ll.compare(["dino_vitb16", "mae_vitb16", "clip_large_openai"], "img.jpg", layer=-1, out="cmp.png")

# Full model x layer grid, overlaid on the image
ll.grid(["dino_vitb16", "dinov2_vitb14"], "img.jpg", layers=[2, 5, 8, 11], overlay=True, out="grid.png")

# Batch a whole folder -> one figure per image (+ a contact-sheet montage)
ll.batch("dino_vitb16", "photos/", "out/", layers=[2, 5, 8, 11], montage="sheet.png")

# Multi-frame video -> a filmstrip (frames x layers) + an animated GIF
ll.video("dinov2_vitb14", "clip.mp4", layers=[5, 11], n_frames=16, out="strip.png")  # needs featlens[video]
ll.video("vjepa2_1_vitb16", "clip.mp4", n_frames=16, out="strip.png")  # temporal: one clip, per-step grids

# Attention-rollout: where is the [CLS] token looking? (timm ViTs)
ll.attention("dino_vitb16", "img.jpg", layer=-1, overlay=True, out="attn.png")

Quick start (CLI)

featlens --models dino_vitb16 clip_large_openai --layers 2 5 8 11 \
    --images examples/images/cat.jpg --mode grid --out out/grid.png
featlens --config configs/example.yaml --images examples/images/cat.jpg --out out/grid.png

# Batch: point --images at a folder (or glob) and --out-dir at an output folder
featlens --models dino_vitb16 --layers 2 5 8 11 --images photos/ --out-dir out/

# Video (filmstrip + GIF) and attention-rollout
featlens --mode video --models dinov2_vitb14 --images clip.mp4 --n-frames 16 --out strip.png
featlens --mode attention --models dino_vitb16 --images cat.jpg --overlay --out attn.png

Image size & resizing

Images are resized to a square img_size × img_size before the model (default 224). img_size must be divisible by the model's patch size (multiples of 16 for patch-16 models, 14 for patch-14). Larger sizes give a finer feature grid at more compute:

ll.visualize("dinov2_vitb14", "img.jpg", layers=[2, 5, 8, 11], img_size=448)   # 32x32 grid
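The grid sizes quoted in this README follow directly from dividing the input size by the patch size. A small helper (hypothetical, not part of FeatLens) makes the arithmetic and the divisibility requirement explicit:

```python
def grid_side(img_size, patch_size):
    """Patches per side for a square input, or raise if the sizes don't divide."""
    if img_size % patch_size != 0:
        raise ValueError(
            f"img_size={img_size} is not a multiple of patch_size={patch_size}"
        )
    return img_size // patch_size


for img_size, patch in [(224, 16), (448, 14), (768, 8), (1024, 16)]:
    side = grid_side(img_size, patch)
    print(f"{img_size}px / patch {patch} -> {side}x{side} grid")
```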

For non-square images, choose how aspect ratio is handled with resize_mode:

`resize_mode`	behavior
`squash` (default)	resize straight to `img_size²` — may distort
`crop`	resize shortest side to `img_size`, center-crop — aspect preserved
`pad`	resize longest side to `img_size`, pad to square — keeps the whole image

ll.grid([...], "wide.jpg", resize_mode="crop")          # Python

featlens --models dino_vitb16 --images wide.jpg --resize-mode pad --img-size 448 --out g.png

(FeatureGrid(interpolation_size=…) is separate — it only upscales the rendered tiles, not the model input.)
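To make the three `resize_mode` options concrete, the sketch below computes the intermediate size an image would be resized to before any center-crop or padding. It is a geometry-only illustration under the behavior described in the table; `resize_plan` is a hypothetical name, and the exact rounding FeatLens uses is not stated in this README.

```python
def resize_plan(width, height, img_size, mode="squash"):
    """Return (resized_w, resized_h) before any crop or pad to img_size x img_size."""
    if mode == "squash":
        return img_size, img_size          # aspect ratio is not preserved
    if mode == "crop":
        side = min(width, height)          # shortest side becomes img_size
    elif mode == "pad":
        side = max(width, height)          # longest side becomes img_size
    else:
        raise ValueError(f"unknown resize_mode: {mode!r}")
    return round(width * img_size / side), round(height * img_size / side)


# A 1600x1200 image at img_size=448:
for mode in ("squash", "crop", "pad"):
    print(mode, resize_plan(1600, 1200, 448, mode))
```

For the 1600×1200 example, `crop` resizes to 597×448 and then drops 149 columns, while `pad` resizes to 448×336 and then adds 112 rows of padding.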

Model sources

Source	How to pass it	Needs
timm	friendly name (`dinov2_vitb14`) or raw id (`vit_base_patch16_224`)	`[timm]`
HuggingFace	`hf:facebook/dinov2-base`	`[hf]`
torch.hub (V-JEPA)	`vjepa2_vitl16`	network for weights
External repo (VGGT/SPA/…)	`external_adapter.load(repo_dir, builder, hook_target=…)`	the cloned repo
Your own model	`custom_adapter.load(model, feature_fn=…)`	—

Friendly names (see featlens/registry.py) cover DINO, DINOv2/v3, CLIP, SigLIP, MAE, DeiT, EVA-02, BEiT, SAM, Perception Encoder and V-JEPA; any other timm id works directly.
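One way to picture how a spec string from the table could be routed is a prefix check followed by a name lookup. This is only a sketch under assumptions: `parse_spec` and the one-entry `FRIENDLY_NAMES` dict are hypothetical, the timm id in it is an illustrative guess, and torch.hub names such as `vjepa2_vitl16` are not handled here. The real mapping lives in featlens/registry.py.

```python
# Hypothetical mini-registry; the real one is featlens/registry.py and the
# timm id below is only an illustrative guess.
FRIENDLY_NAMES = {"dino_vitb16": "vit_base_patch16_224.dino"}


def parse_spec(spec):
    """Classify a model spec string as ("hf", id) or ("timm", id)."""
    if spec.startswith("hf:"):
        return "hf", spec[len("hf:"):]
    # Friendly names resolve through the registry; anything else is
    # treated as a raw timm id and passed through unchanged.
    return "timm", FRIENDLY_NAMES.get(spec, spec)


print(parse_spec("hf:facebook/dinov2-base"))
print(parse_spec("vit_base_patch16_224"))
```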

Layers

layers=[2, 5, 8, 11] selects transformer block indices (0-based, negatives allowed, -1 = last). The same convention holds across backends — for HuggingFace models FeatLens maps block i to hidden_states[i+1] (skipping the embedding output) for you.
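The indexing convention above can be written down in a few lines. The helpers below are hypothetical (not FeatLens functions) and only illustrate the arithmetic: negative indices count from the end, and for a HuggingFace `hidden_states` tuple block i sits at position i + 1 because position 0 holds the embedding output.

```python
def resolve_block(index, n_blocks):
    """Turn a possibly negative block index into a 0-based one."""
    resolved = index + n_blocks if index < 0 else index
    if not 0 <= resolved < n_blocks:
        raise IndexError(f"block {index} out of range for {n_blocks} blocks")
    return resolved


def hidden_state_index(index, n_blocks):
    """Position of block `index` in a HuggingFace hidden_states tuple.

    hidden_states[0] is the embedding output, so block i sits at i + 1.
    """
    return resolve_block(index, n_blocks) + 1


# A 12-block ViT-B: layers=[2, 5, 8, 11] and -1 for the last block.
print([hidden_state_index(i, 12) for i in (2, 5, 8, 11)])  # [3, 6, 9, 12]
print(resolve_block(-1, 12))                               # 11
```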

Visualization methods

Every method consumes the same dense feature stack, so it works on grid / visualize / compare and across any layer:

cosine similarity follows the clicked seed patch
cosine mode: the heatmap (right, with its [-1, 1] colorbar) tracks the seed patch (white star) — click anywhere on the 🤗 live demo.

`method`	shows	extra args
`pca` (default)	robust PCA → RGB	`basis`, `remove_first_component`
`cosine`	cosine similarity to a seed patch (with a [-1, 1] colorbar)	`seed=(x, y)`, `colormap`
`kmeans`	unsupervised k-means segmentation (with a cluster legend)	`k`
`foreground`	fg/bg mask (first PCA component)	—
`saliency`	per-patch activation magnitude (with a [0, 1] colorbar)	`colormap`
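As an example of how simple a scalar method can be, the `saliency` row can be approximated in a few lines. The sketch assumes "activation magnitude" means the per-patch L2 norm and that the [0, 1] colorbar comes from min-max scaling; both are assumptions, and the function below is illustrative, not FeatLens code.

```python
import math


def saliency(grid):
    """Per-patch L2 norm of a [h][w][D] feature grid, min-max scaled to [0, 1]."""
    norms = [[math.sqrt(sum(x * x for x in vec)) for vec in row] for row in grid]
    flat = [v for row in norms for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid dividing by zero on a constant map
    return [[(v - lo) / span for v in row] for row in norms]


# A 2x2 grid of 2-D features with norms 5, 0, 1 and 10.
grid = [
    [[3.0, 4.0], [0.0, 0.0]],
    [[0.0, 1.0], [6.0, 8.0]],
]
print(saliency(grid))  # [[0.5, 0.0], [0.1, 1.0]]
```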

ll.visualize("dino_vitb16", "img.jpg", layers=[2, 5, 8, 11], method="cosine", seed=(0.5, 0.5))
ll.compare(["dino_vitb16", "dinov2_vitb14"], "img.jpg", layer=-1, method="kmeans", k=8)

# Cross-image correspondence: multiple seeds, and mutual-NN to drop spurious matches
ll.correspond("dino_vitb16", "a.jpg", "b.jpg", seed=[(0.4, 0.5), (0.6, 0.3)], mutual=True, out="corr.png")

# Get the arrays back (no PNG round-trip): RGB tiles + the underlying scalar field
res = ll.visualize("dino_vitb16", "img.jpg", layers=[2, 5, 8, 11], method="cosine", return_data=True)
res["tiles"]    # [R, C, H, W, 3] rendered RGB
res["scalars"]  # [R, C, h, w] cosine sim in [-1, 1]

seed is given in normalized image coordinates, (x, y) ∈ [0, 1]², so it is independent of resolution and model. Pass cache=True to memoize extraction on disk ($FEATLENS_CACHE_DIR, else ~/.cache/featlens) so re-renders are instant. Try it in the browser on the 🤗 live demo (or run demo/ locally): in cosine mode, click the image to move the seed. See the docs.
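Because the seed is normalized, it has to be mapped onto whatever patch grid the chosen model and `img_size` produce. The sketch below shows one plausible mapping and the resulting cosine map on a tiny grid. `seed_to_patch` and `cosine_map` are hypothetical names, and the floor-and-clamp rule is an assumption, not necessarily what FeatLens does.

```python
import math


def seed_to_patch(seed, h, w):
    """Map a normalized (x, y) seed in [0, 1] to a (row, col) cell of an h x w grid."""
    x, y = seed
    col = min(int(x * w), w - 1)  # x = 1.0 falls in the last column
    row = min(int(y * h), h - 1)
    return row, col


def cosine_map(grid, seed):
    """Cosine similarity of every patch in a [h][w][D] grid to the seed patch."""
    h, w = len(grid), len(grid[0])
    r0, c0 = seed_to_patch(seed, h, w)
    ref = grid[r0][c0]
    ref_norm = math.sqrt(sum(a * a for a in ref))

    def sim(vec):
        norm = math.sqrt(sum(a * a for a in vec)) * ref_norm
        return sum(a * b for a, b in zip(vec, ref)) / norm if norm else 0.0

    return [[sim(vec) for vec in row] for row in grid]


print(seed_to_patch((0.5, 0.5), 32, 32))  # (16, 16) on a 32x32 grid

grid = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [1.0, 0.0]],
]
print(cosine_map(grid, seed=(0.9, 0.9)))  # [[1.0, 0.0], [-1.0, 1.0]]
```

The same seed lands on the corresponding location in a 14×14 or a 96×96 grid, which is what keeps seeded comparisons across models and resolutions meaningful.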

Bring your own model

Anything that isn't built in works through the escape hatch — give a feature function or a hook target. CNNs work for free (their conv map is already spatial):

import torch.nn as nn
import torchvision
from featlens import FeatureExtractor, FeatureGrid
from featlens.adapters import custom_adapter

resnet = torchvision.models.resnet50(weights="DEFAULT")
# Drop the avgpool and fc head so the trunk returns the spatial conv map.
trunk = nn.Sequential(*list(resnet.children())[:-2])           # -> [B, 2048, h, w]
# patch_size=32 matches ResNet-50's total stride of 32.
lm = custom_adapter.load(trunk, patch_size=32, feature_fn=lambda m, x: m(x), name="resnet50")
FeatureGrid([FeatureExtractor(lm)]).render("img.jpg", out_path="resnet50.png")

For a model in its own repo, external_adapter.load(repo_dir, builder, hook_target="blocks") puts the repo on sys.path, builds the model, and hooks its blocks.

How it works

Adapters resolve a spec → a LoadedModel and drive extraction in one of three modes: forward hooks on per-block modules (ViTs/CNNs/V-JEPA), HF output_hidden_states, or a user callable.
tokens_to_grid normalizes whatever a layer emits ([B,N,D] tokens with optional CLS/register prefixes, or [B,D,h,w] maps) into a dense [B,D,h,w] grid.
Robust PCA (median-absolute-deviation outlier filtering) projects features to RGB; FeatureGrid lays out the model × layer tiles with a per-tile or shared-per-model basis.
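The token-to-grid normalization described above can be sketched without any framework. The version below works on plain nested lists for a single image, assumes a square patch grid in row-major order, and takes the number of prefix tokens as an argument. It shares the name `tokens_to_grid` with the function mentioned above only for readability; the signature and behavior are assumptions.

```python
import math


def tokens_to_grid(tokens, n_prefix=0):
    """Reshape N token vectors into a [D][h][w] grid, dropping prefix tokens.

    `tokens` is a list of N vectors of length D; the first `n_prefix` entries
    (CLS / register tokens) are discarded. Assumes a square patch grid in
    row-major order.
    """
    patches = tokens[n_prefix:]
    side = math.isqrt(len(patches))
    if side * side != len(patches):
        raise ValueError(f"{len(patches)} patch tokens do not form a square grid")
    dim = len(patches[0])
    return [
        [[patches[r * side + c][d] for c in range(side)] for r in range(side)]
        for d in range(dim)
    ]


# One CLS token followed by a 2x2 grid of 2-D patch features.
tokens = [[9, 9], [1, 10], [2, 20], [3, 30], [4, 40]]
grid = tokens_to_grid(tokens, n_prefix=1)
print(grid[0])  # [[1, 2], [3, 4]]
print(grid[1])  # [[10, 20], [30, 40]]
```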

The extraction core adapts the FrozenBackbone pattern; the PCA is adapted from the SpaRRTa feature-map script.
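The median-absolute-deviation filtering mentioned under "How it works" can be illustrated on a single projected component. The sketch below rescales values to [0, 1] using only the inliers, so one extreme patch does not wash out the color range. Where exactly FeatLens applies the filter is not stated in this README, so treat `robust_scale` and its `n_mads` threshold as a hypothetical reading of the idea.

```python
from statistics import median


def robust_scale(values, n_mads=3.0):
    """Rescale values to [0, 1] using only inliers within n_mads MADs of the median.

    Outliers do not stretch the range; they are clipped to 0 or 1.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    # Fall back to all values if the filter would leave nothing.
    inliers = [v for v in values if abs(v - med) <= n_mads * mad] or list(values)
    lo, hi = min(inliers), max(inliers)
    span = (hi - lo) or 1.0
    return [min(1.0, max(0.0, (v - lo) / span)) for v in values]


# One component of the projected features, with a single extreme patch.
component = [0.0, 1.0, 2.0, 3.0, 4.0, 100.0]
print(robust_scale(component))  # [0.0, 0.25, 0.5, 0.75, 1.0, 1.0]
```

With plain min-max scaling the first five values would all fall below 0.05; with the MAD filter they span the full range and only the outlier saturates.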

Contributing

Contributions are welcome — this is an open-source project and we're happy to accept and support them. Whether it's a new model adapter, a visualization method, a bug fix, docs, or just a question, please jump in (see CONTRIBUTING.md for the full guide):

🐛 Found a bug or have an idea? Open an issue — bug reports, feature requests, and questions are all welcome.
🔧 Want to send a change? Fork the repo, create a branch, and open a pull request. Small, focused PRs are easiest to review.
✅ Before you push: run pytest -q and, for docs changes, mkdocs build --strict. New behavior should come with a test; new models should be verified to load and forward.
💬 Not sure where to start? Open an issue describing what you'd like to do and we'll help you scope it.

By contributing you agree that your contributions are licensed under the project's MIT License.

Releasing

Releases publish to PyPI automatically via .github/workflows/publish.yml (PyPI Trusted Publishing — no API token stored in the repo).

One-time setup on PyPI: add a trusted publisher for the project (Account → Publishing) with owner turhancan97, repository FeatLens, workflow publish.yml, environment pypi. PyPI supports a pending publisher so the very first release can also go through Actions.

Then cut a release by pushing a tag:

# bump the version in pyproject.toml first, then:
git tag v0.1.0 && git push origin v0.1.0

The workflow builds the sdist + wheel, runs twine check, and uploads to PyPI.

License

MIT.

Project details

Release history

Version	Released
0.3.2 (this version)	Jun 30, 2026
0.3.1	Jun 30, 2026
0.3.0	Jun 29, 2026
0.2.6	Jun 29, 2026
0.2.5	Jun 29, 2026
0.2.4	Jun 29, 2026
0.2.3	Jun 29, 2026
0.2.2	Jun 29, 2026
0.2.1	Jun 28, 2026
0.2.0	Jun 26, 2026
0.1.0	Jun 26, 2026

Download files

Source Distribution

featlens-0.3.2.tar.gz (54.5 kB)

Uploaded Jun 30, 2026 Source

Built Distribution

featlens-0.3.2-py3-none-any.whl (51.4 kB)

Uploaded Jun 30, 2026 Python 3

File details

Details for the file featlens-0.3.2.tar.gz.

File metadata

Download URL: featlens-0.3.2.tar.gz
Upload date: Jun 30, 2026
Size: 54.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for featlens-0.3.2.tar.gz
Algorithm	Hash digest
SHA256	`5d94339f60552fc6f2e2e01e0afe74a57000449562c3907644932ab1e8a2b453`
MD5	`95eac63294896fd01a723e16bc38dd10`
BLAKE2b-256	`e7738adec05c4b91fe00553883ac748aa55c08346932b0f494d43292597b5b72`

Provenance

The following attestation bundles were made for featlens-0.3.2.tar.gz:

Publisher: publish.yml on turhancan97/FeatLens

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: featlens-0.3.2.tar.gz
- Subject digest: 5d94339f60552fc6f2e2e01e0afe74a57000449562c3907644932ab1e8a2b453
- Sigstore transparency entry: 2025797110
- Sigstore integration time: Jun 30, 2026
Source repository:
- Permalink: turhancan97/FeatLens@46f363dbfcaa2b890579d3d7d0b6379e0e9728b9
- Branch / Tag: refs/tags/v0.3.2
- Owner: https://github.com/turhancan97
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@46f363dbfcaa2b890579d3d7d0b6379e0e9728b9
- Trigger Event: push

File details

Details for the file featlens-0.3.2-py3-none-any.whl.

File metadata

Download URL: featlens-0.3.2-py3-none-any.whl
Upload date: Jun 30, 2026
Size: 51.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for featlens-0.3.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b35214a3c5c8715b928f5858ca7411fc364705c1c6493dce2f5b797a68beeacd`
MD5	`dc94b49e604effa4f832b765474ac9b3`
BLAKE2b-256	`70b101c318ac4f8fdd8cc9e7f505b1ba486d0093a041e803f2953919a70b4568`

Provenance

The following attestation bundles were made for featlens-0.3.2-py3-none-any.whl:

Publisher: publish.yml on turhancan97/FeatLens

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: featlens-0.3.2-py3-none-any.whl
- Subject digest: b35214a3c5c8715b928f5858ca7411fc364705c1c6493dce2f5b797a68beeacd
- Sigstore transparency entry: 2025797291
- Sigstore integration time: Jun 30, 2026
Source repository:
- Permalink: turhancan97/FeatLens@46f363dbfcaa2b890579d3d7d0b6379e0e9728b9
- Branch / Tag: refs/tags/v0.3.2
- Owner: https://github.com/turhancan97
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@46f363dbfcaa2b890579d3d7d0b6379e0e9728b9
- Trigger Event: push
