Toolset for Large Multi-Modal Models

Project description

Large Multimodal Model Tools

LMM-Tools (Large Multimodal Model Tools) is a simple library that helps you use multimodal models to organize your image data. One problem with image data is that it can be difficult to organize and search quickly. For example, you might have a collection of pictures of houses and want to count how many yellow houses you have, or how many houses have adobe roofs. This library uses LMMs to create these tags or descriptions and lets you search over them, or use them in a database for other operations.

Getting Started

LMMs

To get started you can create an LMM and start generating text from images. The following code will grab the LLaVA-1.6 34B model and generate a description of the image you pass it.

import lmm_tools as lmt

model = lmt.lmm.get_model("llava")
model.generate("Describe this image", "image.png")
>>> "A yellow house with a green lawn."

We are hosting the LLaVA-1.6 34B model. The server shuts down when usage is low, so if a request times out, please wait ~5-10 min for it to warm up and then try again.
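Because the hosted model may need several minutes to warm up, it can help to wrap calls in a retry loop. The following is a minimal sketch, not part of LMM-Tools: `call_with_retry` is a hypothetical helper, and the exception types it catches by default (`TimeoutError`, `ConnectionError`) are an assumption, since the exception the library raises on a timeout is not documented here.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    attempts: int = 5,
    initial_delay: float = 30.0,
    backoff: float = 2.0,
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying with exponential backoff on the given exceptions.

    Hypothetical helper: retry_on is a guess at what a timed-out request
    raises; adjust it to whatever exception you actually observe.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            # Give up and re-raise once the last attempt has failed.
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= backoff
    raise ValueError("attempts must be at least 1")
```

With the model from the example above, a call might look like `call_with_retry(lambda: model.generate("Describe this image", "image.png"))`. With the default settings the waits are 30, 60, 120 and 240 seconds, which roughly covers the warm-up window mentioned above.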

DataStore

You can use the DataStore class to store your images, add new metadata to them such as descriptions, and search over different columns.

import lmm_tools as lmt
import pandas as pd

df = pd.DataFrame({"image_paths": ["image1.png", "image2.png", "image3.png"]})
ds = lmt.data.DataStore(df)
ds = ds.add_lmm(lmt.lmm.get_model("llava"))
ds = ds.add_embedder(lmt.emb.get_embedder("sentence-transformer"))

ds = ds.add_column("descriptions", "Describe this image.")

This will run the LMM on each image with the prompt you passed, "Describe this image.", and store the results in a new descriptions column. Your data will now look like this:

image_paths  image_id  descriptions
image1.png   1         "A yellow house with a green lawn."
image2.png   2         "A white house with a two door garage."
image3.png   3         "A wooden house in the middle of the forest."
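Conceptually, adding a column amounts to running one prompt against every image and storing the answers. The sketch below illustrates that idea with plain Python dictionaries. It is not the library's implementation: `add_column_sketch` and `fake_generate` are hypothetical names, and the stub stands in for a real LMM call so the example runs offline.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

# Canned answers standing in for a real LMM (assumption, for illustration only).
CANNED = {
    "image1.png": "A yellow house with a green lawn.",
    "image2.png": "A white house with a two door garage.",
    "image3.png": "A wooden house in the middle of the forest.",
}


def fake_generate(prompt: str, image_path: str) -> str:
    """Stub with the same (prompt, image) call shape as model.generate above."""
    return CANNED[image_path]


def add_column_sketch(
    rows: List[Row],
    name: str,
    prompt: str,
    generate: Callable[[str, str], str],
    image_key: str = "image_paths",
) -> List[Row]:
    """Return new rows with one extra column holding the model's answers."""
    return [{**row, name: generate(prompt, str(row[image_key]))} for row in rows]


rows = [
    {"image_paths": path, "image_id": i}
    for i, path in enumerate(CANNED, start=1)
]
rows = add_column_sketch(rows, "descriptions", "Describe this image.", fake_generate)
for row in rows:
    print(row)
```

Returning new rows instead of mutating the input mirrors the `ds = ds.add_column(...)` style used above.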

You can now create an index on the descriptions column and search over it to find images that match your query.

ds = ds.create_index("descriptions", top_k=1)
ds.search("A yellow house.")
>>> [{'image_paths': 'image1.png', 'image_id': 1, 'descriptions': 'A yellow house with a green lawn.'}]
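To illustrate the general idea behind searching an indexed text column, here is a minimal sketch that ranks rows by cosine similarity between the query and each description. It is an assumption about the overall approach, not the library's code: a real embedder such as a sentence-transformer produces dense vectors, whereas `embed` below is a toy bag-of-words stand-in so the example needs only the standard library.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(rows: List[Dict], query: str, column: str = "descriptions", top_k: int = 1) -> List[Dict]:
    """Return the top_k rows whose text in `column` is most similar to the query."""
    q = embed(query)
    ranked = sorted(rows, key=lambda r: cosine(q, embed(r[column])), reverse=True)
    return ranked[:top_k]


rows = [
    {"image_paths": "image1.png", "image_id": 1, "descriptions": "A yellow house with a green lawn."},
    {"image_paths": "image2.png", "image_id": 2, "descriptions": "A white house with a two door garage."},
    {"image_paths": "image3.png", "image_id": 3, "descriptions": "A wooden house in the middle of the forest."},
]
print(search(rows, "A yellow house."))
```

A word-count embedding only matches exact words, so it would miss a query like "lemon-colored home". Capturing that kind of similarity is the reason to use a learned embedder.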

You can also create other columns for your data, such as is_yellow:

ds = ds.add_column("is_yellow", "Is the house in this image yellow? Please answer yes or no.")

which would give you a dataset similar to this:

image_paths  image_id  descriptions                                   is_yellow
image1.png   1         "A yellow house with a green lawn."            "yes"
image2.png   2         "A white house with a two door garage."        "no"
image3.png   3         "A wooden house in the middle of the forest."  "no"
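A yes/no column like this makes the counting task from the introduction straightforward. The sketch below counts the yellow houses in the rows above. `is_yes` is a hypothetical helper, and the normalization it performs assumes that a model may answer with variations such as "Yes." or "yes, it is" instead of a bare "yes".

```python
import re
from typing import Dict, List


def is_yes(answer: str) -> bool:
    """Treat an answer as affirmative if its first word is 'yes' (case-insensitive)."""
    words = re.findall(r"[a-z]+", answer.lower())
    return bool(words) and words[0] == "yes"


def count_yes(rows: List[Dict], column: str) -> int:
    """Count rows whose answer in `column` is affirmative."""
    return sum(is_yes(str(row[column])) for row in rows)


rows = [
    {"image_paths": "image1.png", "image_id": 1, "is_yellow": "yes"},
    {"image_paths": "image2.png", "image_id": 2, "is_yellow": "no"},
    {"image_paths": "image3.png", "image_id": 3, "is_yellow": "no"},
]
print(count_yes(rows, "is_yellow"))  # prints 1
```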

Download files

Source Distribution

lmm_tools-0.0.4.tar.gz (9.2 kB)

Uploaded Source

Built Distribution

lmm_tools-0.0.4-py3-none-any.whl (10.5 kB)

Uploaded Python 3

File details

Details for the file lmm_tools-0.0.4.tar.gz.

File metadata

  • Download URL: lmm_tools-0.0.4.tar.gz
  • Upload date:
  • Size: 9.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.2 CPython/3.10.11 Linux/6.2.0-1019-azure

File hashes

Hashes for lmm_tools-0.0.4.tar.gz
Algorithm Hash digest
SHA256 2c573a5c4e1d429cb74b7cad1eb792c3d9d722b9f32e8607937b50a4432db62c
MD5 2f1f39727864f6ee9db6ea99d40a4e0f
BLAKE2b-256 a981f9159225d1cd1654802ade9acec26a54e133ff15460d15ffb878671836c5


File details

Details for the file lmm_tools-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: lmm_tools-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 10.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.2 CPython/3.10.11 Linux/6.2.0-1019-azure

File hashes

Hashes for lmm_tools-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 10bf060271593b4001e787806331ea7a34102f7d39100029233451eab8adcfd3
MD5 a80eb297995f2c2e3df45d1d474a5826
BLAKE2b-256 69d7effb1d58b59d0bf18a2d0b86309143d28aa0816d9deb4560a1a8e9e6f273

