
vima - PyTorch

Project description

Multi-Modality

VIMA

A simple implementation of "VIMA: General Robot Manipulation with Multimodal Prompts"

Original implementation: Link

Appreciation

  • Lucidrains
  • Agorians

Install

pip install vima


Usage

import torch
from vima import Vima

# Generate a random sequence of token ids
x = torch.randint(0, 256, (1, 1024))

# Initialize the VIMA model
model = Vima()

# Move both to the GPU if one is available
if torch.cuda.is_available():
    model, x = model.cuda(), x.cuda()

# Pass the input sequence through the model
output = model(x)

Multimodal Usage

  • Pass text and image tensors into the model

import torch
from vima.vima import VimaMultiModal

# Random image and text inputs
img = torch.randn(1, 3, 256, 256)
text = torch.randint(0, 20000, (1, 1024))

model = VimaMultiModal()
output = model(text, img)
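To see what "multimodal prompts" means in practice, here is a minimal, self-contained PyTorch sketch (not the VIMA API; the dimensions, embedding size, and patch size are illustrative assumptions) showing how text tokens and image patches can be embedded into one shared token sequence that a transformer then attends over:

```python
import torch
import torch.nn as nn

# Illustrative sketch only: all sizes below are assumptions, not VIMA's.
dim = 64

# Text token ids -> vectors
text_embed = nn.Embedding(20000, dim)

# 256x256 RGB image -> 8x8 grid of patch embeddings (32x32 patches)
patchify = nn.Conv2d(3, dim, kernel_size=32, stride=32)

text = torch.randint(0, 20000, (1, 16))   # 16 text tokens
img = torch.randn(1, 3, 256, 256)         # one RGB image

text_tokens = text_embed(text)                          # (1, 16, dim)
img_tokens = patchify(img).flatten(2).transpose(1, 2)   # (1, 64, dim)

# One interleaved prompt sequence: 16 text + 64 image tokens
prompt = torch.cat([text_tokens, img_tokens], dim=1)    # (1, 80, dim)
```

Concatenating both modalities into a single sequence is what lets one transformer condition on text and images jointly.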

License

MIT

Citations

@inproceedings{jiang2023vima,
  title     = {VIMA: General Robot Manipulation with Multimodal Prompts},
  author    = {Yunfan Jiang and Agrim Gupta and Zichen Zhang and Guanzhi Wang and Yongqiang Dou and Yanjun Chen and Li Fei-Fei and Anima Anandkumar and Yuke Zhu and Linxi Fan},
  booktitle = {Fortieth International Conference on Machine Learning},
  year      = {2023}
}
