
GLUE3D: General Language Understanding Evaluation for 3D Point Clouds

Giorgio Mariani, Alessandro Raganato, Simone Melzi, Gabriella Pasi

Official implementation of GLUE3D: General Language Understanding Evaluation for 3D Point Clouds.

GLUE3D is a Q&A benchmark for evaluating the object-understanding capabilities of 3D-LLMs. It is built around 128 richly textured surfaces spanning creatures, objects, architecture, and transport. Each surface is provided as a 50k-point RGB point cloud, an 8k-point RGB point cloud, a 512 × 512 RGB rendering, and five RGB-D multiviews. These multiple representations enable like-for-like evaluation across modalities.

GLUE3D consists of three Q&A task types: binary question answering, multiple-choice question answering, and open-ended captioning. This diverse set of tasks enables a more robust and comprehensive assessment of multimodal understanding in 3D-LLMs.


Installation

To evaluate your question-answering model on GLUE3D, we offer a PyPI package that can be easily installed with the command:

pip install glue3d

You can install glue3d from source if you want the latest changes in the library or are interested in contributing. Note, however, that the development version may not be stable; feel free to open an issue if you encounter an error.

git clone https://github.com/giorgio-mariani/GLUE3D.git
cd GLUE3D

pip install -e .

Answer generation

To evaluate your model, you first need to generate your 3D-LLM's answers for the desired GLUE3D task. You can do so in two main ways:

  1. Using the dataset loader (load_GLUE3D_benchmark) with your own model and code.
  2. Using the built-in AnswerGenerator interface with generate_GLUE3D_answers. Prefer this option if your model follows the Hugging Face causal-generation procedure (e.g., LlavaLlamaForCausalLM).
Option 1: Using load_GLUE3D_benchmark

The GLUE3D benchmark data can be (down)loaded using:

import pandas as pd
from glue3d.data import load_GLUE3D_benchmark

dataset = load_GLUE3D_benchmark(
    dataset_name="GLUE3D-points-8k", # or "GLUE3D-images", "GLUE3D-multiview", "GLUE3D-points"
    qa_task="binary_task",           # or "multiplechoice_task", "captioning_task"
    cache_dir=None,                  # Optional; defaults to './cache' or $GLUE3D_CACHE_DIR
)

This procedure loads into memory and prepares the necessary GLUE3D data for the specified Q&A task and data type, automatically downloading any data that is not yet stored on disk. The available tasks are binary_task, multiplechoice_task, and captioning_task. Note that the loader uses a local cache directory, which you can customize via the GLUE3D_CACHE_DIR environment variable.
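For example, you can point the cache at a directory of your choosing before loading the benchmark (the path below is purely illustrative):

```python
import os

# Hypothetical cache location; set this before calling load_GLUE3D_benchmark.
os.environ["GLUE3D_CACHE_DIR"] = "/data/glue3d_cache"
```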

Once the GLUE3D data is loaded, you can iterate through the dataset to generate answers for each question in the Q&A task:

your_model = ...  # Load your 3D-LLM

model_answers = []
for x in dataset:
    oid = x["object_id"]
    qid = x["question_id"]
    q = x["question"]
    pc = x["data"]  # e.g., (8192 x 6) np.ndarray for "GLUE3D-points-8K"

    answer = your_model.answer_question(pc, q)
    model_answers.append({
        "OBJECT_ID": oid,
        "QUESTION_ID": qid,
        "MODEL_ANSWER": answer,
    })

# Save results
pd.DataFrame.from_records(model_answers).to_csv("qa.csv", index=False)

[!IMPORTANT] Ensure your answers follow the expected format for each task.

  • For the binary_task, the model answer must be a boolean object (either True or False).
  • For the multiplechoice_task, the model answer must be one of A, B, C, D.
  • For the captioning_task the model answer must be a string.
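Models rarely emit these types directly, so you will typically need to post-process the raw generation text into the expected format. A minimal sketch (the helper names below are our own, not part of glue3d):

```python
import re

def parse_binary(raw: str) -> bool:
    # Map free-form model text to the boolean expected by binary_task.
    return raw.strip().lower().startswith(("yes", "true"))

def parse_choice(raw: str) -> str:
    # Extract the first standalone A-D letter for multiplechoice_task.
    m = re.search(r"\b([ABCD])\b", raw.strip().upper())
    if m is None:
        raise ValueError(f"could not find a choice letter in: {raw!r}")
    return m.group(1)
```

You may want to tighten these heuristics for your own model's answer style (e.g., chain-of-thought prefixes).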
Option 2: Using the AnswerGenerator Interface

If your 3D-LLM inherits from the GeneratorMixin class (e.g., LlavaLlamaForCausalLM), then you can use our *HFAnswerGenerator abstract classes to simplify the generation process. The only requirement is to implement the prepare_inputs function, which takes as input the point cloud (or image) and the question, and returns the keyword arguments for the GeneratorMixin.generate() method:

import numpy as np
from typing import override
from glue3d import generate_GLUE3D_answers
from glue3d.models.hf import (
    BinaryHFAnswerGenerator,
    MultichoiceHFAnswerGenerator,
    CaptioningHFGenerator
)

# Example custom AnswerGenerator for the binary task
class YourAnswerGenerator(BinaryHFAnswerGenerator): # <- Swap with MultichoiceHFAnswerGenerator
    def __init__(self, your_model, tokenizer):      #   or CaptioningHFGenerator for other tasks.
        super().__init__(your_model, tokenizer)

    @override
    def prepare_inputs(self, data: np.ndarray, text: str) -> dict:
        ... # Preprocess data (e.g., tokenize text, move tensors to device, apply chat templates)
        return {
            "input_ids": ...,
            "points": ...,
            "do_sample": ...,
            "stopping_criteria": ...,
        }

Once you have your custom implementation, you can generate answers by simply calling generate_GLUE3D_answers with your target dataset type and Q&A task:

your_model = ...
your_tokenizer = ...
answer_gen = YourAnswerGenerator(your_model, your_tokenizer)

qa_answers = generate_GLUE3D_answers(
    qa_task="binary_task",
    dataset_type="GLUE3D-points-8K",
    answer_generator=answer_gen,
)

# `qa_answers` is returned as a pandas DataFrame
qa_answers.to_csv("qa.csv", index=False)

Q&A evaluation

As a result of the answer generation step, you should have a .csv file containing the question-answer pairs for a given task. The file (let us call it binary-qa.csv) should have a structure similar to:

OBJECT_ID, QUESTION_ID, MODEL_ANSWER
dc5c798, 0fbac6, True
dc5c798, 556cc4, False
...

It is then possible to evaluate the answers produced by your model using the glue3d evaluate CLI command:

glue3d evaluate --input-file binary-qa.csv --output-file out.csv --task binary_task

Or equivalently, using Python

from glue3d.evaluate_answers import evaluate_GLUE3D_answers

out = evaluate_GLUE3D_answers("binary_task", "binary-qa.csv")
out.to_csv("out.csv")

For the binary and multiple-choice tasks, the output is a dataframe indicating exact match between the reference answer and the model-provided one. For the captioning task, scores for BLEU, METEOR, ROUGE-L, S-BERT, and SimCSE are provided. All scores are scaled to the range 0-100.
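If you want a single aggregate number for the binary or multiple-choice tasks, you can average the per-question match column yourself. The column name below (`EXACT_MATCH`) is a placeholder, not confirmed by the library; check the actual header of your `out.csv`:

```python
import pandas as pd

def aggregate_accuracy(out: pd.DataFrame, match_col: str = "EXACT_MATCH") -> float:
    # Average a boolean (or 0/1) per-question match column into a 0-100 accuracy.
    return 100.0 * out[match_col].astype(float).mean()

# Toy example; real column names depend on the glue3d version you installed.
toy = pd.DataFrame({"EXACT_MATCH": [True, True, False, True]})
print(aggregate_accuracy(toy))  # 75.0
```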

[!NOTE] For the captioning task, it is also possible to switch the evaluator to use qwen3-30B-A3B as a judge. To do so, use the command:

glue3d evaluate --input-file captions.csv --output-file out.csv --task captioning_task --evaluator qwen_3_30B_A3B
