Vision language models for robotics

Project description

vlm_inference

This repository contains code for performing inference using a Vision-Language Model (VLM) for robotic navigation tasks.

Setup Instructions

Follow these steps to use the library; it is already available on PyPI:

  1. Install the library:

You have two options. Install from PyPI:

    pip install openai==1.78.1 opencv-python==4.11.0.86 pyyaml dotenv==0.9.9 pillow==9.0.1
    pip install vlm_inference

You can try the library in action using this Colab notebook:
📎 Open In Colab

Or install from the repository:

    git clone ssh://git@gitlab.iri.upc.edu:2202/mobile_robotics/moonshot_project/vlm/vlm_inference.git

    cd vlm_inference
    python3 -m pip install -r requirements.txt

Then install it in editable mode:

    pip install -e .
  2. Create a .env file and add your OpenAI API key and the VLM configuration path:

    Create a file named .env in the root directory of the repository and add the following content, replacing sk-proj-_HFJE2I64........... with your actual OpenAI API key and ensuring the VLM_CONFIG_PATH points to your configuration file:

    # .env
    OPENAI_API_KEY=sk-proj-_HFJE2I64...........
    VLM_CONFIG_PATH=vlm_inference/config.yaml

If you can't modify the Python version, use the following command:

    python3 -m pip install -r requirements.txt
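
Once the .env file exists, its values can be loaded into the environment. A minimal sketch of such a loader is shown below, assuming plain KEY=VALUE lines (the actual package depends on the dotenv library, which does this for you; the key below is a placeholder, not a real key):

```python
import os

def load_env(path=".env"):
    """Minimal .env loader: parse KEY=VALUE lines, skipping blanks and comments."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Demo with a sample file (placeholder key, placeholder config path)
with open(".env", "w") as f:
    f.write("# .env\n"
            "OPENAI_API_KEY=sk-proj-placeholder\n"
            "VLM_CONFIG_PATH=vlm_inference/config.yaml\n")
load_env()
print(os.environ["VLM_CONFIG_PATH"])  # vlm_inference/config.yaml
```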
  3. Set the navigation goal (optional):

    If you want to specify a navigation goal as an object or person with a description, open the following file:

    vlm_navigation/prompt_manager/navigation_prompt.txt
    Locate line 7, which defines the `navigation_goal` variable, and modify it according to your desired goal. For example:

    navigation_goal = "a red chair near the window"
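
The edit can also be scripted. The sketch below builds a stand-in prompt file and rewrites its seventh line; the file layout is hypothetical, and only the line-7 convention comes from the instructions above:

```python
from pathlib import Path

# Stand-in for vlm_navigation/prompt_manager/navigation_prompt.txt,
# with navigation_goal defined on line 7 (hypothetical layout)
prompt = Path("navigation_prompt.txt")
prompt.write_text(
    "\n".join(["# navigation prompt"] * 6
              + ['navigation_goal = "an empty corridor"']) + "\n")

# Replace line 7 with the desired goal
lines = prompt.read_text().splitlines()
lines[6] = 'navigation_goal = "a red chair near the window"'
prompt.write_text("\n".join(lines) + "\n")
print(lines[6])
```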
  4. Run the inference script:

    Execute the main inference script using Python 3:

    python3 vlm_navigation/inference.py

This script will load the VLM, potentially process images (depending on the script's functionality), and output the inference results based on the configuration and any specified navigation goal.
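The script's internals aren't shown here, but a request to an OpenAI vision-capable model is typically assembled as below. This is a hedged sketch: `encode_frame` and the message layout are illustrative, not the package's actual API; only the payload shape follows the OpenAI chat completions image format.

```python
import base64

def encode_frame(jpeg_bytes: bytes) -> str:
    """Base64-encode a JPEG frame for an OpenAI-style image_url payload."""
    return base64.b64encode(jpeg_bytes).decode("ascii")

frame = b"\xff\xd8\xff\xe0" + b"\x00" * 16  # stand-in for a camera frame
messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "Navigate toward: a red chair near the window"},
        {"type": "image_url",
         "image_url": {"url": f"data:image/jpeg;base64,{encode_frame(frame)}"}},
    ],
}]
# messages would then be passed to client.chat.completions.create(...)
print(messages[0]["content"][1]["image_url"]["url"][:22])  # data:image/jpeg;base64
```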

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vlm_inference-0.1.36.tar.gz (11.2 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

vlm_inference-0.1.36-py3-none-any.whl (11.4 kB)

Uploaded Python 3

File details

Details for the file vlm_inference-0.1.36.tar.gz.

File metadata

  • Download URL: vlm_inference-0.1.36.tar.gz
  • Upload date:
  • Size: 11.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for vlm_inference-0.1.36.tar.gz

  • SHA256: fbd7120c4d7c4a866f5ec7e225d2789c95d4a679d657736a11c42a69e53908b6
  • MD5: 686d2420ff5de6335005d25a9d269ada
  • BLAKE2b-256: 3c5d003ac9b94055c081f34ae25c2685932336ffcc4b4b8ca094d44dbec015e9

See more details on using hashes here.
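
To verify a download against the SHA256 listed above, hash the file locally and compare digests. The snippet below demonstrates the hashing function on a throwaway file; run the same function on the downloaded vlm_inference-0.1.36.tar.gz:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo on a small local file; the printed digest is SHA-256 of b"hello"
with open("demo.bin", "wb") as f:
    f.write(b"hello")
print(sha256_of("demo.bin"))
# -> 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```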

File details

Details for the file vlm_inference-0.1.36-py3-none-any.whl.

File metadata

  • Download URL: vlm_inference-0.1.36-py3-none-any.whl
  • Upload date:
  • Size: 11.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for vlm_inference-0.1.36-py3-none-any.whl

  • SHA256: 6445b9cbd03368044cca8e9272483fe7ce5f08b40296ac0b99cc6f4a336e61f3
  • MD5: aa82ed4a0f8aab267da57e09c41885f4
  • BLAKE2b-256: 6137765bc756dc4bcef7f4e58d9ce8bf3422640c4fe6bccfa3166805a6f4fb5c

See more details on using hashes here.
