Vision language models for robotics

Project description

vlm_inference

This repository contains code for performing inference using a Vision-Language Model (VLM) for robotic navigation tasks.

Setup Instructions

Follow these steps to use the library; it is already available on PyPI:

  1. Install the library:

You have two options. Install from PyPI:

    pip install openai==1.78.1 opencv-python==4.11.0.86 pyyaml dotenv==0.9.9 pillow==9.0.1
    pip install vlm_inference

You can try the library in action using this Colab notebook:
📎 Open In Colab

Or install from the repository:

    git clone ssh://git@gitlab.iri.upc.edu:2202/mobile_robotics/moonshot_project/vlm/vlm_inference.git

    cd vlm_inference
    python3 -m pip install -r requirements.txt

Then install it in editable mode:

    pip install -e .
  2. Create a .env file and add your OpenAI API key and the VLM configuration path:

    Create a file named .env in the root directory of the repository and add the following content, replacing sk-proj-_HFJE2I64........... with your actual OpenAI API key and ensuring the VLM_CONFIG_PATH points to your configuration file:

    # .env
    OPENAI_API_KEY=sk-proj-_HFJE2I64...........
    VLM_CONFIG_PATH=vlm_inference/config.yaml

If you can't modify the Python version, use the following command:

    python3 -m pip install -r requirements.txt
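
Once the .env file exists, its values can be loaded into the environment. A minimal sketch of such a loader is shown below, assuming plain KEY=VALUE lines (the actual package depends on the dotenv library, which does this for you; the key below is a placeholder, not a real key):

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: parse KEY=VALUE lines, skipping blanks and comments."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Demo with a sample file (placeholder key, placeholder config path)
with open(".env", "w") as f:
    f.write("# .env\n"
            "OPENAI_API_KEY=sk-proj-placeholder\n"
            "VLM_CONFIG_PATH=vlm_inference/config.yaml\n")
load_env()
print(os.environ["VLM_CONFIG_PATH"])  # vlm_inference/config.yaml
```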
  3. Set the navigation goal (optional):

    If you want to specify a navigation goal as an object or person with a description, open the following file:

    vlm_navigation/prompt_manager/navigation_prompt.txt
    Locate line 7, which defines the `navigation_goal` variable, and modify it according to your desired goal. For example:

    navigation_goal = "a red chair near the window"
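
The edit can also be scripted. The sketch below builds a stand-in prompt file and rewrites its seventh line; the file layout is hypothetical, and only the line-7 convention comes from the instructions above:

```python
from pathlib import Path

# Stand-in for vlm_navigation/prompt_manager/navigation_prompt.txt,
# with navigation_goal defined on line 7 (hypothetical layout)
prompt = Path("navigation_prompt.txt")
prompt.write_text(
    "\n".join(["# navigation prompt"] * 6
              + ['navigation_goal = "an empty corridor"']) + "\n")

# Replace line 7 with the desired goal
lines = prompt.read_text().splitlines()
lines[6] = 'navigation_goal = "a red chair near the window"'
prompt.write_text("\n".join(lines) + "\n")
print(lines[6])
```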
  4. Run the inference script:

    Execute the main inference script using Python 3:

    python3 vlm_navigation/inference.py

This script will load the VLM, potentially process images (depending on the script's functionality), and output the inference results based on the configuration and any specified navigation goal.
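The script's internals aren't shown here, but a request to an OpenAI vision-capable model is typically assembled as below. This is a hedged sketch: `encode_frame` and the message layout are illustrative, not the package's actual API; only the payload shape follows the OpenAI chat completions image format.

```python
import base64

def encode_frame(jpeg_bytes: bytes) -> str:
    """Base64-encode a JPEG frame for an OpenAI-style image_url payload."""
    return base64.b64encode(jpeg_bytes).decode("ascii")

frame = b"\xff\xd8\xff\xe0" + b"\x00" * 16  # stand-in for a camera frame
messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "Navigate toward: a red chair near the window"},
        {"type": "image_url",
         "image_url": {"url": f"data:image/jpeg;base64,{encode_frame(frame)}"}},
    ],
}]
# messages would then be passed to client.chat.completions.create(...)
print(messages[0]["content"][1]["image_url"]["url"][:22])  # data:image/jpeg;base64
```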

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vlm_inference-0.1.36.tar.gz (11.2 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

vlm_inference-0.1.36-py3-none-any.whl (11.4 kB)

Uploaded Python 3

File details

Details for the file vlm_inference-0.1.36.tar.gz.

File metadata

  • Download URL: vlm_inference-0.1.36.tar.gz
  • Upload date:
  • Size: 11.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for vlm_inference-0.1.36.tar.gz

  • SHA256: fbd7120c4d7c4a866f5ec7e225d2789c95d4a679d657736a11c42a69e53908b6
  • MD5: 686d2420ff5de6335005d25a9d269ada
  • BLAKE2b-256: 3c5d003ac9b94055c081f34ae25c2685932336ffcc4b4b8ca094d44dbec015e9

See more details on using hashes here.
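
To verify a download against the SHA256 listed above, hash the file locally and compare digests. The snippet below demonstrates the hashing function on a throwaway file; run the same function on the downloaded vlm_inference-0.1.36.tar.gz:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo on a small local file; the printed digest is SHA-256 of b"hello"
with open("demo.bin", "wb") as f:
    f.write(b"hello")
print(sha256_of("demo.bin"))
# -> 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```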

File details

Details for the file vlm_inference-0.1.36-py3-none-any.whl.

File metadata

  • Download URL: vlm_inference-0.1.36-py3-none-any.whl
  • Upload date:
  • Size: 11.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for vlm_inference-0.1.36-py3-none-any.whl

  • SHA256: 6445b9cbd03368044cca8e9272483fe7ce5f08b40296ac0b99cc6f4a336e61f3
  • MD5: aa82ed4a0f8aab267da57e09c41885f4
  • BLAKE2b-256: 6137765bc756dc4bcef7f4e58d9ce8bf3422640c4fe6bccfa3166805a6f4fb5c

See more details on using hashes here.
