
GLUE3D: General Language Understanding Evaluation for 3D Point Clouds

Giorgio Mariani, Alessandro Raganato, Simone Melzi, Gabriella Pasi

Official implementation of GLUE3D: General Language Understanding Evaluation for 3D Point Clouds.

GLUE3D is a Q&A benchmark for evaluating the object-understanding capabilities of 3D-LLMs. It is built around 128 richly textured surfaces spanning creatures, objects, architecture, and transport. Each surface is provided as a 50k-point RGB point cloud, an 8k-point RGB point cloud, a 512 × 512 RGB rendering, and five RGB-D multiviews. These multiple representations enable direct, like-for-like evaluation across modalities.

GLUE3D consists of three Q&A task types: binary question answering, multiple-choice question answering, and open-ended captioning. This diverse set of tasks enables a more robust and comprehensive assessment of multimodal understanding in 3D-LLMs.


Installation

To evaluate your question-answering model on GLUE3D, we offer a PyPI package that can be easily installed with the command:

pip install glue3d

You can install glue3d from source if you want the latest changes in the library or are interested in contributing. However, the latest version may not be stable. Feel free to open an issue if you encounter an error.

git clone https://github.com/giorgio-mariani/GLUE3D.git
cd GLUE3D

pip install -e .

Answer generation

To evaluate your model, first you need to generate your 3D-LLM's answers for the desired GLUE3D task. You can do so in two main ways:

  1. Using the dataset loader (load_GLUE3D_benchmark) with your own model and code.
  2. Using the built-in AnswerGenerator interface with generate_GLUE3D_answers. This option is preferable if your model follows the Hugging Face causal-generation procedure (e.g., LlavaLlamaForCausalLM).
Option 1: Using load_GLUE3D_benchmark

The GLUE3D benchmark data can be (down)loaded using:

import pandas as pd
from glue3d.data import load_GLUE3D_benchmark

dataset = load_GLUE3D_benchmark(
    dataset_name="GLUE3D-points-8k", # or "GLUE3D-images", "GLUE3D-multiview", "GLUE3D-points"
    qa_task="binary_task",           # or "multiplechoice_task", "captioning_task"
    cache_dir=None,                  # Optional; defaults to './cache' or $GLUE3D_CACHE_DIR
)

This procedure loads into memory and prepares the necessary GLUE3D data for the specified Q&A task and data type, automatically downloading to disk any data not yet stored there. The available tasks are binary_task, multiplechoice_task, and captioning_task. Note that the loader uses a local cache directory, which you can customize via the GLUE3D_CACHE_DIR environment variable.
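For instance, the cache location can be redirected before loading (the path below is illustrative):

```python
import os

# Redirect the GLUE3D cache before calling load_GLUE3D_benchmark
# (the path below is illustrative; pick any writable directory).
os.environ["GLUE3D_CACHE_DIR"] = "/data/glue3d_cache"
```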

Once the GLUE3D data is loaded, you can iterate through the dataset to generate answers for each question in the Q&A task:

your_model = ...  # Load your 3D-LLM

model_answers = []
for x in dataset:
    oid = x["object_id"]
    qid = x["question_id"]
    q = x["question"]
    pc = x["data"]  # e.g., (8192 x 6) np.ndarray for "GLUE3D-points-8K"

    answer = your_model.answer_question(pc, q)
    model_answers.append({
        "OBJECT_ID": oid,
        "QUESTION_ID": qid,
        "MODEL_ANSWER": answer,
    })

# Save results
pd.DataFrame.from_records(model_answers).to_csv("qa.csv", index=False)

[!IMPORTANT] Ensure your answers follow the expected format for each task.

  • For the binary_task, the model answer must be a boolean object (either True or False).
  • For the multiplechoice_task, the model answer must be one of A, B, C, D.
  • For the captioning_task, the model answer must be a string.
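If your model replies in free-form text, a small post-processing step can coerce replies into these types. The helpers below are a sketch; the parsing heuristics are assumptions, not part of the glue3d API:

```python
import re

def parse_binary(text: str) -> bool:
    """Map a free-form reply to a bool (heuristic; adapt to your model)."""
    t = text.strip().lower()
    if t.startswith(("yes", "true")):
        return True
    if t.startswith(("no", "false")):
        return False
    raise ValueError(f"Unparseable binary answer: {text!r}")

def parse_choice(text: str) -> str:
    """Extract a standalone A/B/C/D letter from a free-form reply (heuristic)."""
    m = re.search(r"\b([ABCD])\b", text.upper())
    if m is None:
        raise ValueError(f"Unparseable multiple-choice answer: {text!r}")
    return m.group(1)
```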
Option 2: Using the `AnswerGenerator` Interface

If your 3D-LLM inherits from the GeneratorMixin class (e.g., LlavaLlamaForCausalLM), you can use our *HFAnswerGenerator abstract classes to simplify the generation process. The only requirement is to implement the prepare_inputs function, which takes the point cloud (or image) and the question as input and returns the keyword arguments for the GeneratorMixin.generate() method:

import numpy as np
from typing import override
from glue3d import generate_GLUE3D_answers
from glue3d.models.hf import (
    BinaryHFAnswerGenerator,
    MultichoiceHFAnswerGenerator,
    CaptioningHFGenerator
)

# Example custom AnswerGenerator for the binary task
class YourAnswerGenerator(BinaryHFAnswerGenerator): # <- Swap with MultichoiceHFAnswerGenerator
    def __init__(self, your_model, tokenizer):      #   or CaptioningHFGenerator for other tasks.
        super().__init__(your_model, tokenizer)

    @override
    def prepare_inputs(self, data: np.ndarray, text: str) -> dict:
        ... # Preprocess data (e.g., tokenize text, move tensors to device, apply chat templates)
        return {
            "input_ids": ...,
            "points": ...,
            "do_sample": ...,
            "stopping_criteria": ...,
        }

Once you have your custom implementation, generation can be done simply by calling generate_GLUE3D_answers with your target dataset type and Q&A task:

your_model = ...  # Load your 3D-LLM
tokenizer = ...   # Its tokenizer
answer_gen = YourAnswerGenerator(your_model, tokenizer)

qa_answers = generate_GLUE3D_answers(
    qa_task="binary_task",
    dataset_type="GLUE3D-points-8K",
    answer_generator=answer_gen,
)

# `qa_answers` is returned as a pandas DataFrame
qa_answers.to_csv("qa.csv", index=False)

Q&A evaluation

As a result of the answer-generation step, you should have a .csv file containing the question-answer pairs for a given task. The file (let us call it binary-qa.csv) should have a structure similar to:

OBJECT_ID, QUESTION_ID, MODEL_ANSWER
dc5c798, 0fbac6, True
dc5c798, 556cc4, False
...
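Before running evaluation, a quick sanity check of the answer file can catch format problems early. The snippet below is a sketch; it assumes the column names produced by the generation loop shown earlier:

```python
import io
import pandas as pd

# Stand-in for pd.read_csv("binary-qa.csv"); contents are illustrative.
csv_text = (
    "OBJECT_ID,QUESTION_ID,MODEL_ANSWER\n"
    "dc5c798,0fbac6,True\n"
    "dc5c798,556cc4,False\n"
)
answers = pd.read_csv(io.StringIO(csv_text))

# Required columns and, for the binary task, boolean answers only.
assert {"OBJECT_ID", "QUESTION_ID", "MODEL_ANSWER"} <= set(answers.columns)
assert answers["MODEL_ANSWER"].isin([True, False]).all()
```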

It is then possible to evaluate the answers produced by your model using the glue3d evaluate CLI command:

glue3d evaluate --input-file binary-qa.csv --output-file out.csv --task binary_task

Or equivalently, using Python

from glue3d.evaluate_answers import evaluate_GLUE3D_answers

out = evaluate_GLUE3D_answers("binary_task", "binary-qa.csv")
out.to_csv("out.csv")

For the binary and multiple-choice tasks, the output is a dataframe indicating exact match between the ground-truth answer and the model-provided one. For the captioning task, scores for BLEU, METEOR, ROUGE-L, S-BERT, and SimCSE are provided. All scores are scaled to the range 0-100.
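For example, an overall accuracy can be derived from such an exact-match dataframe. This is a sketch; the EXACT_MATCH column name is an assumption about the output format, not guaranteed by glue3d:

```python
import pandas as pd

# Hypothetical per-question exact-match results (column name assumed).
out = pd.DataFrame({
    "OBJECT_ID": ["dc5c798", "dc5c798", "a1b2c3d"],
    "EXACT_MATCH": [True, False, True],
})

# Mean exact match, scaled to 0-100 like the other benchmark scores.
accuracy = 100.0 * out["EXACT_MATCH"].mean()
print(f"Binary accuracy: {accuracy:.1f}")
```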

[!NOTE] For the captioning task, it is also possible to switch the evaluator to use qwen3-30B-A3B as a judge. To do so, use the command:

glue3d evaluate --input-file captions.csv --output-file out.csv --task captioning_task --evaluator qwen_3_30B_A3B
