CanViT inference on Apple Silicon via MLX

CanViT-MLX

MLX implementation of CanViT, the Canvas Vision Transformer, for native Apple Silicon inference.

CanViT: Toward Active-Vision Foundation Models (arXiv:2603.22570)

Install

uv add "canvit-mlx[hub]"

Quickstart

import mlx.core as mx
from canvit_mlx import load_from_hf_hub, load_and_preprocess, Viewpoint, extract_glimpse_at_viewpoint

model = load_from_hf_hub("canvit/canvitb16-add-vpe-pretrain-g128px-s512px-in21k-dv3b16-2026-02-02-mlx")
image = load_and_preprocess("test_data/Cat03.jpg", target_size=512)

state = model.init_state(batch_size=1, canvas_grid_size=32)
vp = Viewpoint.full_scene(batch_size=1)
glimpse = extract_glimpse_at_viewpoint(image, vp, glimpse_size_px=128)
out = model(glimpse, state, vp)
mx.eval(out.state.canvas, out.state.recurrent_cls, out.local_patches)

# Canvas spatial features (linearly decodable into dense predictions)
canvas_spatial = model.get_spatial(out.state.canvas)  # [1, 1024, 1024]
print(canvas_spatial.shape)
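
The comment above says the canvas spatial features are linearly decodable into dense predictions. A minimal sketch of what that could look like follows. It uses a random NumPy array as a stand-in for `canvas_spatial`, so it runs without `canvit_mlx` or any weights. The probe weights `W` and `b`, the class count, and the row-major token layout are assumptions made for illustration and do not come from the library.

```python
import numpy as np

# Stand-in for canvas_spatial from the Quickstart: [batch, tokens, channels].
# Random values are used so the sketch runs without canvit_mlx or weights.
rng = np.random.default_rng(0)
batch, grid, channels = 1, 32, 1024
canvas_spatial = rng.standard_normal((batch, grid * grid, channels)).astype(np.float32)

# Hypothetical linear probe: one weight matrix shared by every canvas position.
num_classes = 21  # assumption, e.g. a small segmentation label set
W = rng.standard_normal((channels, num_classes)).astype(np.float32) * 0.01
b = np.zeros(num_classes, dtype=np.float32)

# Per-token logits, then fold the token axis back into a 2D grid.
# Assumes tokens are stored in row-major order, which the README does not state.
logits = canvas_spatial @ W + b                           # [1, 1024, 21]
dense = logits.reshape(batch, grid, grid, num_classes)    # [1, 32, 32, 21]
label_map = dense.argmax(axis=-1)                         # [1, 32, 32]

print(dense.shape, label_map.shape)
```

With a trained probe in place of the random `W`, `label_map` would be a coarse 32×32 prediction map that can be upsampled to the image resolution.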

Classification

from pathlib import Path
from canvit_mlx import CanViTForImageClassification, Viewpoint, extract_glimpse_at_viewpoint, load_and_preprocess

clf = CanViTForImageClassification.from_pretrained_with_probe(
    pretrained_weights=Path("weights/canvitb16-add-vpe-pretrain-g128px-s512px-in21k-dv3b16-2026-02-02.safetensors"),
    pretrained_config=Path("weights/canvitb16-add-vpe-pretrain-g128px-s512px-in21k-dv3b16-2026-02-02.json"),
    probe_weights=Path("path/to/probe.safetensors"),
)

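# Load the image and preprocess it to the 512 px scene size
image = load_and_preprocess("test_data/Cat03.jpg", target_size=512)
# Initialise the recurrent state with a canvas grid size of 32
state = clf.init_state(batch_size=1, canvas_grid_size=32)
# Take one 128 px glimpse from a viewpoint covering the full scene
vp = Viewpoint.full_scene(batch_size=1)
glimpse = extract_glimpse_at_viewpoint(image, vp, glimpse_size_px=128)
# One forward step returns class logits and the updated state
logits, new_state = clf(glimpse, state, vp)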
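
The classifier returns raw logits, so a common next step is to turn them into probabilities and read off the top classes. The sketch below does this in NumPy on stand-in data. The helper `top_k` is hypothetical and not part of `canvit_mlx`, and the `[batch, num_classes]` shape of the logits is an assumption.

```python
import numpy as np

def top_k(logits: np.ndarray, k: int = 5):
    """Return (indices, probabilities) of the k most likely classes per row."""
    # Subtract the row max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Sort descending and keep the first k class indices.
    idx = np.argsort(-probs, axis=-1)[..., :k]
    return idx, np.take_along_axis(probs, idx, axis=-1)

# Stand-in logits of shape [batch, num_classes]. With the real model, an MLX
# array would first need converting, e.g. np.array(logits) (an assumption here).
example = np.array([[0.1, 2.0, -1.0, 0.5]], dtype=np.float32)
idx, p = top_k(example, k=2)
print(idx[0].tolist())  # [1, 3]
```

Mapping the returned indices to class names requires the label list that matches the probe, which is not covered here.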

Demos

uv run --group demos python demos/basic.py
uv run --group demos python demos/basic.py --image test_data/Cat03.jpg --canvas-grid 64

Converting weights

Convert a PyTorch checkpoint from the Hugging Face Hub to MLX format:

uv run python convert.py
uv run python convert.py --verify  # includes PT vs MLX numerical comparison
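
The README does not describe what `--verify` checks internally. As a rough illustration of a numerical comparison between two backends, the sketch below compares a reference output with a candidate output under absolute and relative tolerances. The function `compare_outputs`, the tolerances, and the sample arrays are all assumptions and are not taken from `convert.py`.

```python
import numpy as np

def compare_outputs(reference, candidate, atol=1e-4, rtol=1e-4):
    """Summarise how closely a candidate output matches a reference output."""
    ref = np.asarray(reference, dtype=np.float32)
    cand = np.asarray(candidate, dtype=np.float32)
    if ref.shape != cand.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {cand.shape}")
    return {
        # Largest element-wise deviation between the two outputs.
        "max_abs_diff": float(np.max(np.abs(ref - cand))),
        # True when every element is within atol + rtol * |reference|.
        "allclose": bool(np.allclose(cand, ref, atol=atol, rtol=rtol)),
    }

# Stand-in arrays playing the role of PyTorch and MLX outputs.
pt_out = np.linspace(-1.0, 1.0, 8, dtype=np.float32)
mlx_out = pt_out + np.float32(1e-6)   # tiny drift, e.g. from kernel differences
report = compare_outputs(pt_out, mlx_out)
print(report["allclose"])
```

Small differences between frameworks are expected in float32, so a tolerance-based check is usually more informative than exact equality.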

License

MIT

