
[Vector Forge logo]


🐍 Vector Forge is a Python package designed for easy transformation of various data types into feature vectors.




💡Core ideas

🌄 For image embeddings, Vector Forge uses pre-trained networks, which means the models have already learned features from a large set of images called ImageNet. When we use these models in Vector Forge, we skip the part that identifies objects, and instead, we use the part that understands the image features. This way, we get a bunch of numbers (a vector) representing the image, which can be used in many different tasks like finding similar images, clustering, classification and many more.
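
To make this concrete, here is a minimal sketch of the general idea (independent of Vector Forge's internals, assuming TensorFlow/Keras is installed; example.jpg is a placeholder path): a pre-trained Xception is loaded without its classification head, so its output is the learned feature vector rather than class predictions.

import numpy as np
from tensorflow.keras.applications.xception import Xception, preprocess_input
from tensorflow.keras.preprocessing import image

# Load Xception trained on ImageNet, drop the classification head,
# and average-pool the last feature maps into a single vector.
model = Xception(weights="imagenet", include_top=False, pooling="avg")

img = image.load_img("example.jpg", target_size=(299, 299))  # placeholder path
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
features = model.predict(x)  # shape (1, 2048): the image as a vector of numbers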

📄 Text embeddings are a way to convert words or sentences into numbers, making it possible for computers to understand and process them. In Vector Forge, the CLIP ViT-B/32 model is utilized to generate these embeddings. When you provide any text, be it a single word or a sentence, to CLIP, it transforms this text into a fixed-size vector. Each vector has a consistent length, no matter how long or short the original text is. This consistency in size is valuable, especially when comparing different pieces of text or measuring how similar a piece of text is to an image.
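
As a rough illustration (a sketch using OpenAI's clip package directly, not Vector Forge's API), the snippet below encodes a short and a long piece of text with CLIP ViT-B/32; both come out as 512-dimensional vectors.

import clip
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

# Texts of very different lengths...
tokens = clip.tokenize(["bird", "A couple of colorful birds sitting on a wire"]).to(device)
with torch.no_grad():
    text_features = model.encode_text(tokens)

print(text_features.shape)  # torch.Size([2, 512]) -- same fixed size for both texts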

🧩 Features

  • Image to Vector conversion: Easily convert individual images into feature vectors by specifying your desired model to extract meaningful representations.
  • Batch processing: Provide a folder path to process multiple images in bulk. Select your preferred model and let Vector Forge swiftly handle all the images in the specified directory.
  • Text to Vector transformation: Effortlessly convert textual data into vectors. Choose your model, and Vector Forge will transform your text input into a high-dimensional vector representation.
  • Support for multiple models: Vector Forge supports various models for vectorization, including CLIP, Xception, and VGG16, to provide flexibility in handling different data types.

⚙️ Requirements

Python > 3.11

📦 Supported models

Model Name     | Implementation | Parameter Value     | Supports Image | Supports Text | Embedding Size
CLIP ViT-B/32  | PyTorch        | vf.Models.CLIP      | ✅             | ✅            | (512,)
VGG16          | Keras          | vf.Models.VGG16     | ✅             | ❌            | (2048,)
Xception       | Keras          | vf.Models.XCEPTION  | ✅             | ❌            | (2048,)

🎛️ Usage

You can process many images at once or a single piece of text with simple commands, making it easy to get your data ready for further use or analysis.

🔧 Installation

pip install vector_forge

🔌 Create a vectorizer

Import the library

import vector_forge as vf

Default vectorizer

By default, the vectorizer is CLIP ViT-B/32, as it works for text and images.

vectorizer = vf.Vectorizer()  

Text to Vector

An example of how to convert a text prompt to a vector.

text_embedding = vectorizer.text_to_vector("Nice text!")

Image to Vector

An example of how to convert an image from a file path to a vector.

image_embedding = vectorizer.image_to_vector("/path/to/image.jpg")

Change the vectorizer to use a different model

An example of how to change the vectorizer model, in this case to Xception.
Keep in mind that not all models work for text prompts. If you want to compare images and texts, I recommend using CLIP ViT-B/32.

vectorizer = vf.Vectorizer(model=vf.Models.XCEPTION)

Return types

In Vector Forge, you have the flexibility to choose the format in which the vectors are returned. This is controlled by the return_type parameter available in the image_to_vector and text_to_vector methods of the Vectorizer class. Here are the available return types along with examples:

a) return_type="numpy"

This is the default return type. Vectors are returned as NumPy arrays.

image_embedding = vectorizer.image_to_vector("/path/to/image.jpg", return_type="numpy")
# Output: array([0.0234, 0.0345, ..., 0.0456])
# Shape: (2048,)  # for Xception and VGG16, (512,) for CLIP

b) return_type="str"

Vectors are returned as a string representation of the NumPy array.

image_embedding = vectorizer.image_to_vector("/path/to/image.jpg", return_type="str")
# Output: "[0.0234, 0.0345, ..., 0.0456]"

c) return_type="list"

Vectors are returned as a list of values.

image_embedding = vectorizer.image_to_vector("/path/to/image.jpg", return_type="list")
# Output: [0.0234, 0.0345, ..., 0.0456]

d) return_type="2darray"

Vectors are returned as a 2-dimensional NumPy array, where each vector is a row in the array. This format is especially useful when you want to compute similarities or perform other vectorized operations.

image_embedding = vectorizer.image_to_vector("/path/to/image.jpg", return_type="2darray")
# Output: array([[0.0234, 0.0345, ..., 0.0456]])
# Shape: (1, 2048)  # for Xception and VGG16, (1, 512) for CLIP

Batch Processing for images

Vector Forge can process multiple images in a folder in one go. Just provide the folder path, and the load_from_folder method will handle the rest.

# Convert all valid images in a folder to vectors
for vector in vectorizer.load_from_folder("/path/to/folder"):
    print(vector.shape)

You can specify the return_type and width parameters to control the output format and the width to which images should be resized, respectively.

# Example with return_type and width parameters
for vector in vectorizer.load_from_folder("/path/to/folder", return_type="str", width=300):
    print(vector)  # Each vector is now a string and images are resized to a width of 300 pixels.
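
If you want all of a folder's vectors in a single matrix (for example, to feed them into a similarity search), one approach is to request 2darray vectors and stack them with NumPy. This is just a sketch built on top of load_from_folder, not a separate feature of the library.

import numpy as np

# Collect every image vector from the folder into one (n_images, dim) matrix.
vectors = list(vectorizer.load_from_folder("/path/to/folder", return_type="2darray"))
matrix = np.vstack(vectors)  # e.g. (n_images, 512) for CLIP
print(matrix.shape)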

🧪 A complete example

import vector_forge as vf
from sklearn.metrics.pairwise import cosine_similarity


def compute_similarity(vectorizer, text, image_path):
    # Convert text and image to vectors with return type '2darray'
    text_embedding = vectorizer.text_to_vector(text, return_type="2darray")
    image_embedding = vectorizer.image_to_vector(image_path, return_type="2darray")

    # Compute cosine similarity
    similarity = cosine_similarity(text_embedding, image_embedding)[0][0]
    return similarity


# Create a vectorizer with the default CLIP model
vectorizer = vf.Vectorizer()

# Define text and image paths
text = "A couple of birds"
image_path_1 = "vector_forge/test_data/birds.jpg"
image_path_2 = "vector_forge/test_data/sample.jpg"

# Compute and print similarity scores
similarity_1 = compute_similarity(vectorizer, text, image_path_1)
similarity_2 = compute_similarity(vectorizer, text, image_path_2)

print(f"Similarity between text and first image: {similarity_1}")
print(f"Similarity between text and second image: {similarity_2}")

🔮 Future features

New image models

[ ] Add support for VGG19
[ ] Add possibility for index creation when using load_from_folder

New text models

[ ] Add support for GloVe

