Vision-language models for robotics
Project description
vlm_inference
This repository contains code for performing inference using a Vision-Language Model (VLM) for robotic navigation tasks.
Setup Instructions
Follow these steps to use the library; it is already available on PyPI:
- Install the library:
You have two options. Option 1: install from PyPI:
```
pip install openai==1.78.1 opencv-python==4.11.0.86 pyyaml dotenv==0.9.9 pillow==9.0.1
pip install vlm_inference
```
You can try the library in action using this Colab notebook: 📎
Option 2: install from the repository:
```
git clone ssh://git@gitlab.iri.upc.edu:2202/mobile_robotics/moonshot_project/vlm/vlm_inference.git
cd vlm_inference
python3 -m pip install -r requirements.txt
```
Then install the package in editable mode:
```
pip install -e .
```
- Create a `.env` file:
Create a file named `.env` in the root directory of the repository and add the following content, replacing `sk-proj-_HFJE2I64...........` with your actual OpenAI API key and making sure `VLM_CONFIG_PATH` points to your configuration file:
```
# .env
OPENAI_API_KEY=sk-proj-_HFJE2I64...........
VLM_CONFIG_PATH=vlm_inference/config.yaml
```
If you can't modify the Python version, use the following command instead:
```
python3 -m pip install -r requirements.txt
```
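As a rough illustration, here is a stdlib-only sketch of what loading this `.env` file amounts to. The package itself presumably relies on the pinned dotenv dependency instead; `load_env` is a hypothetical helper, not part of the library:

```python
import os

def load_env(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into os.environ.

    Blank lines and '#' comments are skipped, mirroring typical dotenv behavior.
    """
    env = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    os.environ.update(env)  # make OPENAI_API_KEY etc. visible to the client
    return env
```

With the file from the step above in place, `load_env()` would expose `OPENAI_API_KEY` and `VLM_CONFIG_PATH` to the rest of the code via `os.environ`.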
- Set the navigation goal (optional):
To specify a navigation goal as an object or person with a description, open the following file:
`vlm_navigation/prompt_manager/navigation_prompt.txt`
Locate line 7, which defines the `navigation_goal` variable, and modify it to match your desired goal. For example:
```
navigation_goal = "a red chair near the window"
```
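If you would rather change the goal programmatically than edit the file by hand, a hypothetical helper along these lines would do it. The file path and line number come from the step above; `set_navigation_goal` is not part of the package:

```python
from pathlib import Path

PROMPT_PATH = "vlm_navigation/prompt_manager/navigation_prompt.txt"

def set_navigation_goal(goal, path=PROMPT_PATH):
    """Overwrite line 7 of the prompt file with a new navigation_goal."""
    lines = Path(path).read_text().splitlines()
    # Line 7 (index 6) defines the navigation_goal variable; this assumes
    # the file has at least 7 lines, as described in the setup step.
    lines[6] = f'navigation_goal = "{goal}"'
    Path(path).write_text("\n".join(lines) + "\n")
```

For example, `set_navigation_goal("a red chair near the window")` rewrites only that line and leaves the rest of the prompt untouched.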
- Run the inference script:
Execute the main inference script with Python 3:
```
python3 vlm_navigation/inference.py
```
This script loads the VLM, processes input images (depending on the script's configuration), and outputs the inference results based on the configuration and any specified navigation goal.
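For orientation, here is a sketch of what a request of this kind typically looks like with the pinned openai client. The message layout follows the OpenAI vision chat format; the helper name, prompt wording, and model are illustrative assumptions, not taken from vlm_inference:

```python
import base64

def build_messages(image_bytes, navigation_goal):
    """Build an OpenAI chat payload pairing a camera image with a goal prompt."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": f"Navigation goal: {navigation_goal}. "
                            "Which direction should the robot move?",
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
                },
            ],
        }
    ]

# With OPENAI_API_KEY set in the environment, the call itself would look
# roughly like (model name is an assumption):
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.chat.completions.create(
#       model="gpt-4o", messages=build_messages(image_bytes, goal))
```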
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file vlm_inference-0.1.36.tar.gz.
File metadata
- Download URL: vlm_inference-0.1.36.tar.gz
- Upload date:
- Size: 11.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `fbd7120c4d7c4a866f5ec7e225d2789c95d4a679d657736a11c42a69e53908b6` |
| MD5 | `686d2420ff5de6335005d25a9d269ada` |
| BLAKE2b-256 | `3c5d003ac9b94055c081f34ae25c2685932336ffcc4b4b8ca094d44dbec015e9` |
File details
Details for the file vlm_inference-0.1.36-py3-none-any.whl.
File metadata
- Download URL: vlm_inference-0.1.36-py3-none-any.whl
- Upload date:
- Size: 11.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `6445b9cbd03368044cca8e9272483fe7ce5f08b40296ac0b99cc6f4a336e61f3` |
| MD5 | `aa82ed4a0f8aab267da57e09c41885f4` |
| BLAKE2b-256 | `6137765bc756dc4bcef7f4e58d9ce8bf3422640c4fe6bccfa3166805a6f4fb5c` |