Easily map images (as `PIL.Images`) to features (as `np.ndarray`) from pretrained vision models.
enczoo: easily extract image features from pretrained vision models
enczoo is a Python library with a simple goal: to make it as easy as possible to map images (as PIL.Images) to features (as numpy arrays) from state-of-the-art vision models, such as ImageNet-pretrained ResNet50 and CLIP ViT-B/16.
Installation
enczoo requires Python 3.12 or above, and is installed using the wonderful uv project manager. Once you have uv installed, just run the following command in your project:
uv add enczoo
Usage
import enczoo
from PIL import Image
image = Image.open('my-image.png')
model = enczoo.ResNet50(
layer_name='avgpool',
# device=gpu
)
features = model.compute_features(images=[image]) # np.ndarray
# Want another layer? Check out: print(enczoo.ResNet50.layer_names)
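For more than a handful of images, it can help to process them in fixed-size batches and flatten the result to one row per image. The sketch below is not part of enczoo: `batched_features` is a hypothetical helper, and `compute_fn` stands in for a call such as `lambda batch: model.compute_features(images=batch)`. It assumes every batch returns an array whose first axis is the batch dimension.

```python
from typing import Callable, Sequence

import numpy as np


def batched_features(
    images: Sequence,
    compute_fn: Callable[[list], np.ndarray],
    batch_size: int = 32,
) -> np.ndarray:
    """Run compute_fn over images in chunks and return an [N, D] array."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    chunks = []
    for start in range(0, len(images), batch_size):
        batch = list(images[start:start + batch_size])
        feats = np.asarray(compute_fn(batch))
        # Flatten everything after the batch axis, e.g. [B, C, H, W] -> [B, C*H*W].
        chunks.append(feats.reshape(feats.shape[0], -1))
    if not chunks:
        # No images were given, so there is nothing to concatenate.
        return np.empty((0, 0), dtype=np.float32)
    return np.concatenate(chunks, axis=0)
```

Flattening is a convenience for downstream tools that expect a 2-D matrix; skip the reshape if the spatial layout of an intermediate layer matters for your analysis.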
Available models
Pixels
- Family: raw pixels
- Returns: float32 RGB pixels after preprocessing
- Output shape:
[B, 224, 224, 3] - Academic reference: none; this is an
enczooconvenience encoder
AlexNet
- Family: ImageNet-pretrained CNN
- Returns: intermediate activations from the requested layer
- Output shape: depends on layer_name
- Layer selection: inspect enczoo.AlexNet.layer_names
- Academic reference: AlexNet, "ImageNet Classification with Deep Convolutional Neural Networks" (Krizhevsky et al., 2012)
ResNet50
- Family: ImageNet-pretrained CNN
- Returns: intermediate activations from the requested layer
- Output shape: depends on layer_name
- Layer selection: inspect enczoo.ResNet50.layer_names
- Academic reference: ResNet, "Deep Residual Learning for Image Recognition" (He et al., 2015)
RobustResNet50
- Family: adversarially robust ImageNet ResNet-50
- Returns: intermediate activations from the requested layer
- Output shape: depends on layer_name
- Layer selection: inspect enczoo.RobustResNet50.layer_names
- Weights: downloaded on first use from the released ImageNet L2 epsilon-3.0 checkpoint
- Academic reference: Engstrom et al., "Robustness (Python Library)" release checkpoint via the MadryLab model weights
ConvNeXtB
- Family: ImageNet-pretrained CNN
- Returns: intermediate activations from the requested layer
- Output shape: depends on layer_name
- Layer selection: inspect enczoo.ConvNeXtB.layer_names
- Academic reference: ConvNeXt, "A ConvNet for the 2020s" (Liu et al., 2022)
CLIPResNet50
- Family: CLIP ResNet visual encoder
- Returns: intermediate activations from the requested visual layer
- Output shape: depends on layer_name
- Layer selection: inspect enczoo.CLIPResNet50.layer_names
- Academic reference: CLIP, "Learning Transferable Visual Models From Natural Language Supervision" (Radford et al., 2021)
CLIPViTB16
- Family: CLIP vision transformer
- Returns: the model's pooled CLS-based image embedding
- Output shape: [B, 768]
- Academic reference: CLIP, "Learning Transferable Visual Models From Natural Language Supervision" (Radford et al., 2021)
DINOv2ViTB14
- Family: self-supervised vision transformer
- Returns: the model's pooled CLS-based image embedding
- Output shape: [B, 768]
- Academic reference: DINOv2, "DINOv2: Learning Robust Visual Features without Supervision" (Oquab et al., 2023)
AligNetViTB16
- Family: AligNet-aligned vision transformer
- Returns: the SavedModel feature tensor selected from the exported pre_logits output
- Output shape: depends on the downloaded model
- Weights: downloaded on first use and cached under ENCZOO_CACHE_DIR or the platform cache directory
- Academic reference: Muttenthaler et al. 2025; weights come from the AligNet model release
UnaligNetViTB16
- Family: unaligned vision transformer from the AligNet release
- Returns: the SavedModel feature tensor selected from the exported pre_logits output
- Output shape: depends on the downloaded model
- Weights: downloaded on first use and cached under ENCZOO_CACHE_DIR or the platform cache directory
- Academic reference: Muttenthaler et al. 2025; weights come from the AligNet model release
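The cache lookup described for the AligNetViTB16 and UnaligNetViTB16 weights can be pictured roughly as follows. This is a sketch under assumptions, not enczoo's actual code: the helper name `resolve_cache_dir`, the `enczoo` subdirectory, and the per-platform fallbacks are hypothetical. Only the `ENCZOO_CACHE_DIR` variable comes from the text above.

```python
import os
import sys
from pathlib import Path


def resolve_cache_dir(env=None) -> Path:
    """Return ENCZOO_CACHE_DIR if set, else a conventional per-platform cache path."""
    env = os.environ if env is None else env
    override = env.get("ENCZOO_CACHE_DIR")
    if override:
        # An explicit override always wins.
        return Path(override).expanduser()
    home = Path.home()
    if sys.platform == "darwin":
        base = home / "Library" / "Caches"
    elif sys.platform.startswith("win"):
        base = Path(env.get("LOCALAPPDATA") or home / "AppData" / "Local")
    else:
        base = Path(env.get("XDG_CACHE_HOME") or home / ".cache")
    # Hypothetical subdirectory name; the real layout may differ.
    return base / "enczoo"
```

Setting `ENCZOO_CACHE_DIR` before the first model load is useful on shared machines where the home directory has a small quota.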
Why develop enczoo?
Under the hood, enczoo solves several tiny problems which make correctly computing image features more annoying and error-prone than it should be. For example, enczoo automatically:
- performs model-specific image transforms ("was it -1 to 1, 0 to 1, or 0-255...?"),
- ensures images are in RGB format,
- puts the model in inference, not training, mode,
- turns off autograd,
- returns tensors as np.ndarray (no more detach().cpu().numpy()),
- resizes the image while preserving aspect ratio,
- and more!
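Two of the steps above, pixel scaling and aspect-preserving resizing, are easy to get subtly wrong by hand. The sketch below illustrates them with plain NumPy. The function names and the `mode` labels are hypothetical, and enczoo's own transforms are model-specific and may differ.

```python
import numpy as np


def scale_pixels(pixels: np.ndarray, mode: str) -> np.ndarray:
    """Map uint8 pixels in [0, 255] to the value range a model expects."""
    x = pixels.astype(np.float32)
    if mode == "0-255":
        return x
    if mode == "0-1":
        return x / 255.0
    if mode == "-1-1":
        return x / 127.5 - 1.0
    raise ValueError(f"unknown mode: {mode!r}")


def aspect_preserving_size(width: int, height: int, shorter_side: int) -> tuple[int, int]:
    """Return (new_width, new_height) with the shorter side equal to shorter_side."""
    scale = shorter_side / min(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

Feeding a model pixels in the wrong range rarely raises an error; it just quietly degrades the features, which is why handling this inside the library is convenient.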