LLaVA-style graft adding vision-language capability to Mistral-family decoders (Schneewolf Labs — Project Artemis).

Project description

Artemis — Schneewolf Labs

A LLaVA-style graft that adds vision-language capability to any Mistral-family text decoder without modifying the decoder. It was built originally for the Schneewolf Labs A-series, but the architecture works with any Mistral-Nemo-class checkpoint: point it at one (A2, A3, Mahou, Flammades, etc.) and you get an ArtemisVLM around it.

Path B by design

       PIL Image
           │
           ▼
   ┌───────────────────────┐
   │  Qwen3-VL ViT         │  patches → ViT layers → merger
   │  (FROZEN, pixels only)│
   └───────────────────────┘
           │  N vectors of dim out_hidden_size
           ▼
   ┌───────────────────────┐
   │  Projector (trained)  │  2-layer MLP, out_hidden → text_hidden
   │  ~45M params          │
   └───────────────────────┘
           │  N vectors in the text decoder's hidden space
           ▼
   ┌───────────────────────────────────────────────────────────────┐
   │  Mistral-family decoder (FROZEN in Stage-1, full-FT Stage-2)   │
   │  At each <|image_pad|> position, OVERWRITE the embedding with │
   │  the next projector vector. Then run as a normal decoder.     │
   └───────────────────────────────────────────────────────────────┘
           │
           ▼
       text output (decoder's own vocab — Qwen vocab never seen)

The vision tower processes pixels only (no text tokens). The projector bridges hidden spaces, not token spaces. While the decoder is frozen (Stage-1), it stays byte-identical to the underlying Mistral checkpoint, so its vocab, weights, chat template, reasoning, tool calling, and identity are preserved by construction. Stage-2 full fine-tuning updates the decoder weights.
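
The overwrite step in the diagram can be illustrated without any ML dependencies. The sketch below is a plain-Python stand-in, not the package's code: the function name `splice_image_embeddings`, the toy vectors, and the hidden size of 2 are hypothetical, and the image token id 22 is borrowed from the A-series example further down.

```python
from typing import List, Sequence

IMAGE_TOKEN_ID = 22  # assumption: the A-series <|image_pad|> id quoted in this README


def splice_image_embeddings(
    input_ids: Sequence[int],
    text_embeds: Sequence[Sequence[float]],
    vision_embeds: Sequence[Sequence[float]],
    image_token_id: int = IMAGE_TOKEN_ID,
) -> List[List[float]]:
    """Return a copy of text_embeds in which each image-pad row is replaced,
    in order, by the next projected vision vector."""
    if len(input_ids) != len(text_embeds):
        raise ValueError("input_ids and text_embeds must have the same length")
    n_slots = sum(1 for t in input_ids if t == image_token_id)
    if n_slots != len(vision_embeds):
        raise ValueError(
            f"{n_slots} image-pad positions but {len(vision_embeds)} vision vectors"
        )
    vision_iter = iter(vision_embeds)
    out = []
    for tok, row in zip(input_ids, text_embeds):
        # Image-pad slots take the next vision vector; all other rows are kept.
        out.append(list(next(vision_iter)) if tok == image_token_id else list(row))
    return out


# Toy example: 5 tokens, two of which are image pads; hidden size 2.
ids = [1, 22, 22, 7, 9]
text = [[0.1, 0.1], [0.0, 0.0], [0.0, 0.0], [0.7, 0.7], [0.9, 0.9]]
vision = [[5.0, 5.0], [6.0, 6.0]]
spliced = splice_image_embeddings(ids, text, vision)
print(spliced)
```

The count check matters in practice: if the number of image-pad positions and the number of vision vectors disagree, the splice is ill-defined, so failing loudly is safer than silently misaligning the two.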

Install

pip install artemis-vlm

Or, from source:

git clone https://github.com/Schneewolf-Labs/Artemis.git
cd Artemis
pip install -e .

Requires transformers>=5.0.0, torch>=2.5.0, Pillow.

Quick start — load a pretrained Artemis checkpoint

import torch
from PIL import Image
from transformers import AutoTokenizer, AutoModelForCausalLM
import artemis_vlm  # registers ArtemisVLM with AutoConfig / AutoModel

REPO = "schneewolflabs/A3-preview"  # or any ArtemisVLM checkpoint

# Load the grafted model in bf16 and move it to the GPU for inference
model = AutoModelForCausalLM.from_pretrained(REPO, dtype=torch.bfloat16).to("cuda").eval()
tok = AutoTokenizer.from_pretrained(REPO)
# The processor reads its patch / merge sizes from the model's own vision config
processor = artemis_vlm.ArtemisVLMProcessor(
    tokenizer=tok, vision_config=model.visual.config,
    min_pixels=32 * 32, max_pixels=512 * 512,
)

image = Image.open("photo.jpg").convert("RGB")  # force 3 channels (RGBA / grayscale files)
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this image in detail."},
]}]
# Render the chat template to a string, then tokenize text and image together
text = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
batch = processor(text=text, images=[image], return_tensors="pt").to("cuda")
with torch.no_grad():
    out = model.generate(**batch, max_new_tokens=200, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt
print(tok.decode(out[0][batch["input_ids"].shape[1]:], skip_special_tokens=True))

Quick start — build a new graft from your own checkpoints

import torch
import artemis_vlm
from transformers import Qwen3VLForConditionalGeneration

# Take the vision tower from a pretrained Qwen3-VL checkpoint
qv = Qwen3VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen3-VL-2B-Instruct", dtype=torch.bfloat16,
)
vision = qv.model.visual
del qv  # free the Qwen3-VL decoder we don't need

# Graft onto any Mistral-class text checkpoint
model = artemis_vlm.ArtemisVLMForConditionalGeneration.from_a2_and_vision(
    "schneewolflabs/A2",  # or any Mistral-Nemo finetune
    vision_model=vision,
    image_token_id=22,    # repurposed <|image_pad|> in A-series Tekken vocab
    torch_dtype=torch.bfloat16,
)

# Stage-1: train only the projector (~45M params)
trainable, total = model.set_training_stage("stage1")
print(f"Stage-1: trainable={trainable/1e6:.1f}M / total={total/1e9:.2f}B")

Training (Stage-1 / Stage-2)

set_training_stage("stage1") freezes the ViT and the decoder, leaving only the projector trainable — the "alignment" phase. set_training_stage("stage2") unfreezes the decoder for the visual-instruction phase.
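
The staged freeze can be pictured as flipping a trainable flag per component and reporting the counts. The sketch below is a dependency-free illustration, not the library's implementation: the `Param` class, the component names, and the toy sizes are hypothetical, and it assumes that the ViT stays frozen and the projector stays trainable in Stage-2.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Param:
    """Stand-in for a tensor parameter: only size and trainability matter here."""
    numel: int
    requires_grad: bool = True


def set_training_stage(groups: Dict[str, List[Param]], stage: str) -> Tuple[int, int]:
    """Toggle requires_grad per component and return (trainable, total) counts."""
    trainable_groups = {
        "stage1": {"projector"},             # alignment: projector only
        "stage2": {"projector", "decoder"},  # instruction phase: decoder unfrozen (assumed: ViT still frozen)
    }
    if stage not in trainable_groups:
        raise ValueError(f"unknown stage: {stage!r}")
    trainable = total = 0
    for name, params in groups.items():
        for p in params:
            p.requires_grad = name in trainable_groups[stage]
            total += p.numel
            trainable += p.numel if p.requires_grad else 0
    return trainable, total


# Toy sizes, not the real parameter counts.
model = {
    "vision": [Param(400)],
    "projector": [Param(45)],
    "decoder": [Param(12_000)],
}
stage1_counts = set_training_stage(model, "stage1")
stage2_counts = set_training_stage(model, "stage2")
print(stage1_counts, stage2_counts)
```

Returning the (trainable, total) pair mirrors the way the quick-start snippet prints the Stage-1 counts, which makes it easy to confirm that the intended components are frozen before starting a long run.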

The recommended trainer is Schneewolf-Labs/Merlina, which exposes Artemis training as training_mode: "vlm_stage1" / "vlm_stage2" on its REST API. The ArtemisDataCollator provided here can be passed as data_collator= to any trainer that accepts a custom collator (Grimoire, accelerate-driven loops, HF Trainer).
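
For readers writing their own loop, the general shape of a multimodal collator is roughly as follows. This is a generic plain-Python sketch, not ArtemisDataCollator's implementation: the field names, the pad id 0, right-padding, and the choice to mask image-pad positions out of the loss are all assumptions in the style of common LLaVA-like training setups.

```python
from typing import Any, Dict, List

IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def collate(
    examples: List[Dict[str, Any]],
    pad_token_id: int = 0,      # hypothetical pad id
    image_token_id: int = 22,   # assumption: A-series <|image_pad|> id from this README
) -> Dict[str, list]:
    """Right-pad token sequences to a common length and gather images in order."""
    max_len = max(len(ex["input_ids"]) for ex in examples)
    batch: Dict[str, list] = {
        "input_ids": [], "attention_mask": [], "labels": [], "images": [],
    }
    for ex in examples:
        ids = list(ex["input_ids"])
        n_pad = max_len - len(ids)
        batch["input_ids"].append(ids + [pad_token_id] * n_pad)
        batch["attention_mask"].append([1] * len(ids) + [0] * n_pad)
        # No loss on image slots or padding: neither is text to be predicted.
        labels = [IGNORE_INDEX if t == image_token_id else t for t in ids]
        batch["labels"].append(labels + [IGNORE_INDEX] * n_pad)
        # Images are flattened across the batch, keeping per-example order.
        batch["images"].extend(ex.get("images", []))
    return batch


examples = [
    {"input_ids": [1, 22, 22, 5], "images": ["img_a"]},  # strings stand in for image data
    {"input_ids": [1, 7]},                               # text-only example
]
batch = collate(examples)
print(batch)
```

A real collator would also return tensors and typically mask the prompt portion of each sequence; those steps are left out here to keep the batching logic visible.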

Key implementation notes

Merged vision features. Qwen3VLVisionModel.forward() returns pre-merge features on last_hidden_state and merged features on pooler_output. We use pooler_output, since the merged features are what the merger's downstream consumer expects.
Patch / merge sizes come from vision_config. Qwen3-VL uses patch_size=16; Qwen2-VL's image processor defaults to patch_size=14. The processor sources patch / temporal / merge from vision_config so the <|image_pad|> expansion count can never drift from the model's merged feature count.
Image token splice. At each <|image_pad|> position in the prompt, the input embedding is overwritten with the next projector vector (via masked_scatter). The decoder sees a normal token sequence where some embeddings happen to come from vision instead of embed_tokens.
DeepStack / Interleaved-MRoPE are intentionally NOT used. Those are decoder-modification ("Path A") tricks. We chose Path B (composition).
Untied weights. A-series decoders have untied embed_tokens and lm_head. ArtemisVLMForConditionalGeneration.all_tied_weights_keys = {} is declared explicitly for transformers 5.x compatibility.
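
The patch-size note above can be made concrete with a little arithmetic. The sketch below is illustrative only: `image_pad_count` is a hypothetical helper, the spatial merge size of 2 is an assumption (in practice it should be read from vision_config), and the image is assumed to be already resized to a multiple of patch_size * merge_size.

```python
def image_pad_count(height: int, width: int, patch_size: int, merge_size: int = 2) -> int:
    """Number of merged vision vectors (and so <|image_pad|> slots) for one image.

    Assumes the image has already been resized so that both sides are
    multiples of patch_size * merge_size.
    """
    unit = patch_size * merge_size
    if height % unit or width % unit:
        raise ValueError(f"image sides must be multiples of {unit}")
    grid_h, grid_w = height // patch_size, width // patch_size
    # The merger folds each merge_size x merge_size block of patches into one vector.
    return (grid_h * grid_w) // (merge_size ** 2)


# Same 448x448 image, two different patch sizes.
count_16 = image_pad_count(448, 448, patch_size=16)  # 28 * 28 patches / 4
count_14 = image_pad_count(448, 448, patch_size=14)  # 32 * 32 patches / 4
print(count_16, count_14)
```

With these assumptions the two patch sizes yield 196 and 256 slots for the same image, which is the kind of mismatch between prompt expansion and feature count that sourcing both values from vision_config avoids.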

Tests

Four hardware-bound smoke tests live in tests/. They require a real checkpoint on disk, an ML stack, and a CUDA device, so they skip cleanly under pytest (CI won't try to run them) and are meant to be invoked as python tests/test_artemis_<name>.py on the development machine.

python tests/test_artemis_vlm.py        # model assembly + forward
python tests/test_artemis_processor.py  # chat template ↔ pad expansion
python tests/test_artemis_collator.py   # multimodal batching
python tests/test_artemis_stage_gen.py  # staged-freeze + generate()

Published checkpoints

| Checkpoint | Status | Notes |
| --- | --- | --- |
| `schneewolflabs/A3-preview` | public, apache-2.0 | 25k-sample Stage-1 smoke (proof-of-concept) |
| `schneewolflabs/A3` | training (Stage-1, 1M samples) | first real release |
| `schneewolflabs/Artemis` | planned (post Stage-2) | named flagship |

License

Apache 2.0 — see LICENSE.

Project details

Maintainers

nbeerbower

Release history

0.1.3 (this version): May 29, 2026

0.1.2: May 29, 2026

Download files

Source Distribution

artemis_vlm-0.1.3.tar.gz (26.0 kB)

Uploaded May 29, 2026 Source

Built Distribution

artemis_vlm-0.1.3-py3-none-any.whl (20.6 kB)

Uploaded May 29, 2026 Python 3

File details

Details for the file artemis_vlm-0.1.3.tar.gz.

File metadata

Download URL: artemis_vlm-0.1.3.tar.gz
Upload date: May 29, 2026
Size: 26.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for artemis_vlm-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`67093413d7f4e6e398622ed0d7efbdfd96574b4418f1981414e178e857138db0`
MD5	`bf13de2f7f7bbb1e46cee5e553d5ffea`
BLAKE2b-256	`2cfa17d30b4670180bd44341345bef95ac8dd1df24f75db79ac25fe70f2daf7f`

Provenance

The following attestation bundles were made for artemis_vlm-0.1.3.tar.gz:

Publisher: release.yml on Schneewolf-Labs/Artemis

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: artemis_vlm-0.1.3.tar.gz
- Subject digest: 67093413d7f4e6e398622ed0d7efbdfd96574b4418f1981414e178e857138db0
- Sigstore transparency entry: 1673025463
- Sigstore integration time: May 29, 2026
Source repository:
- Permalink: Schneewolf-Labs/Artemis@8e00c88142aa83dcf9f8135cd076aa828a54368d
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/Schneewolf-Labs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@8e00c88142aa83dcf9f8135cd076aa828a54368d
- Trigger Event: push

File details

Details for the file artemis_vlm-0.1.3-py3-none-any.whl.

File metadata

Download URL: artemis_vlm-0.1.3-py3-none-any.whl
Upload date: May 29, 2026
Size: 20.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for artemis_vlm-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`25e46a33e45c61084c7f42610c23cdb13b08346320aa220a488936b1b0bf7910`
MD5	`9abbf87e796b18d374e103aa89b45adf`
BLAKE2b-256	`23f263395585f9d4fef39795de4eed9479130b4473abfce02f7b43898db56800`

Provenance

The following attestation bundles were made for artemis_vlm-0.1.3-py3-none-any.whl:

Publisher: release.yml on Schneewolf-Labs/Artemis

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: artemis_vlm-0.1.3-py3-none-any.whl
- Subject digest: 25e46a33e45c61084c7f42610c23cdb13b08346320aa220a488936b1b0bf7910
- Sigstore transparency entry: 1673025486
- Sigstore integration time: May 29, 2026
Source repository:
- Permalink: Schneewolf-Labs/Artemis@8e00c88142aa83dcf9f8135cd076aa828a54368d
- Branch / Tag: refs/tags/v0.1.3
- Owner: https://github.com/Schneewolf-Labs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@8e00c88142aa83dcf9f8135cd076aa828a54368d
- Trigger Event: push

artemis-vlm 0.1.3
