A package to extract semantic activations from images using pre-trained models.

ImageInsight

A Python pipeline for extracting visual activations from images, processing them, and generating semantic descriptions with a neural network model. The pipeline uses PyTorch, a pre-trained AlexNet model, and a custom recurrent neural network (RNN) decoder composed of bidirectional GRUs (Gated Recurrent Units). The activations from the penultimate layer of AlexNet are passed through fully connected (FC) layers and then decoded by the RNN into semantic descriptors of the image (e.g., is red, is green, is round). Finally, the penultimate layer of the RNN can be extracted for further evaluation.
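
The data flow described above can be illustrated with a minimal PyTorch sketch. It is not the package's actual model: the class name ActivationDecoderSketch, the hidden size, the vocabulary size, and the sequence length are all hypothetical, and the 4096-dimensional input simply mirrors the width of AlexNet's penultimate fully connected layer in torchvision.

```python
import torch
from torch import nn


class ActivationDecoderSketch(nn.Module):
    """Hypothetical FC + bidirectional GRU decoder (illustration only)."""

    def __init__(self, activation_dim=4096, hidden_dim=256, vocab_size=100, max_len=8):
        super().__init__()
        self.max_len = max_len
        # Fully connected layers that project the AlexNet activations.
        self.fc = nn.Sequential(
            nn.Linear(activation_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
        )
        # Bidirectional GRU: every time step yields 2 * hidden_dim features.
        self.gru = nn.GRU(hidden_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Output layer that scores each descriptor token at each step.
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, activations):
        projected = self.fc(activations)                            # (batch, hidden_dim)
        steps = projected.unsqueeze(1).repeat(1, self.max_len, 1)   # (batch, max_len, hidden_dim)
        hidden_states, _ = self.gru(steps)                          # (batch, max_len, 2 * hidden_dim)
        logits = self.out(hidden_states)                            # (batch, max_len, vocab_size)
        return logits, hidden_states


decoder = ActivationDecoderSketch()
fake_activations = torch.randn(2, 4096)  # stand-in for AlexNet penultimate-layer output
logits, hidden_states = decoder(fake_activations)
print(logits.shape, hidden_states.shape)
```

In this sketch, hidden_states plays the role of the RNN's penultimate layer: it is the representation just before the output scores and is returned so it can be inspected separately.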

For more information, please refer to --> (link to paper)

Features

  • Activation Extraction: Extract visual activations from images using a pre-trained AlexNet model.
  • Semantic Descriptions: Generate descriptions from the extracted activations using an RNN.
  • Device Support: Optionally run on GPU or CPU.
  • Configurable Model: Easily switch between different AlexNet layers for activation extraction.

Table of Contents

  1. Installation
  2. Usage
  3. Directory Structure
  4. Dependencies
  5. License

Installation

  1. Clone this repository:

    git clone https://github.com/kikiluvbrains/ImageInsight.git
    
  2. Navigate into the project directory:

    cd ImageInsight
    
  3. Install the required dependencies:

    pip install -r requirements.txt
    

Download the ImageInsight Model

Download the pre-trained semantic model. Its file path is later passed to ImageInsight as model_path.

Download Model from Google Drive

Usage

After installing the required dependencies, you can run the pipeline using your own set of images and a pre-trained model. Here's an example of how to use the pipeline:

   from ImageInsight import ImageInsight  # Import the ImageInsight class

   # Define paths and settings for running the pipeline
   image_folder = "path/to/your/images"  # Folder containing the input images
   model_name = "alexnet"  # Name of the pre-trained model to use
   layer_index = 4  # The index of the layer from which activations will be extracted
   use_gpu = False  # Set to True if a GPU is available for faster processing
   csv_output_path = "path/to/output/folder"  # Path to the folder where the CSV output will be saved
   csv_file_name = "visual_activations_output.csv"  # Name of the CSV file for the visual activations
   model_path = "path/to/your/model.pt"  # Path to the pre-trained model

   # Initialize ImageInsight with the path to a pre-trained model and the GPU option,
   # or use the semantic model provided on our GitHub page
   insight = ImageInsight(model_path=model_path, use_gpu=use_gpu)

   # Run the pipeline
   semantic_activations = insight.run_pipeline(
      image_folder=image_folder,
      model_name=model_name,
      layer_index=layer_index,
      csv_output_path=csv_output_path,
      csv_file_name=csv_file_name
   )

   # Print the generated semantic activations and descriptions
   print(semantic_activations)
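
Before running the pipeline on a large folder, it can help to confirm that the folder exists and actually contains images. The helper below is a hypothetical pre-flight check, not part of ImageInsight; the list of extensions is an assumption and may differ from what the package accepts.

```python
from pathlib import Path

# Extensions treated as images here; an assumption, not the package's own list.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(folder):
    """Return the image files directly inside `folder`, sorted by name."""
    folder = Path(folder)
    if not folder.is_dir():
        raise FileNotFoundError(f"Image folder not found: {folder}")
    return sorted(
        path for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )
```

Calling list_images(image_folder) before insight.run_pipeline(...) gives an early, readable error when the path is wrong, and an empty list when the folder holds no supported images.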

Directory Structure

my_pipeline/
│
├── my_pipeline/
│   ├── __init__.py               # Package initialization
│   ├── main.py                   # Main pipeline logic
│   ├── models.py                 # Model definitions (e.g., ActivationToDescriptionModel)
│   ├── utils.py                  # Utility functions (e.g., image activation extraction)
│   └── tokenizer.py              # Tokenizer setup and handling
│
├── README.md                     # Project documentation
├── requirements.txt              # Python dependencies
└── setup.py                      # Packaging information for pip

Dependencies

torch
torchvision
transformers
Pillow
numpy
scikit-learn
matplotlib

To install all dependencies, run:

pip install -r requirements.txt
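
To check whether the install succeeded, a small standard-library sketch such as the following can report which of the listed packages are still missing. The function name missing_dependencies is hypothetical. Note that two packages are imported under a different name than they are installed under: Pillow as PIL and scikit-learn as sklearn.

```python
from importlib.util import find_spec

# Install name -> import name for the dependencies listed above.
IMPORT_NAMES = {
    "torch": "torch",
    "torchvision": "torchvision",
    "transformers": "transformers",
    "Pillow": "PIL",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "matplotlib": "matplotlib",
}


def missing_dependencies():
    """Return the listed packages whose import name cannot be found."""
    return [pkg for pkg, module in IMPORT_NAMES.items() if find_spec(module) is None]


print(missing_dependencies())
```

An empty list means every dependency can be located in the current environment. It does not verify version compatibility.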

License

This project is licensed under the following terms:

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, subject to the following conditions:

  1. Citation Requirement: Any use of the Software in research or publications must cite the following GitHub repository:

  2. Attribution: The above copyright notice, this permission notice, and the citation requirement must be included in all copies or substantial portions of the Software.

See the full LICENSE file for more details.

Download files

Source Distribution

ImageInsight-0.2.1.tar.gz (7.4 kB view details)

Uploaded Source

Built Distribution

ImageInsight-0.2.1-py3-none-any.whl (8.5 kB view details)

Uploaded Python 3

File details

Details for the file ImageInsight-0.2.1.tar.gz.

File metadata

  • Download URL: ImageInsight-0.2.1.tar.gz
  • Upload date:
  • Size: 7.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.2

File hashes

Hashes for ImageInsight-0.2.1.tar.gz
Algorithm Hash digest
SHA256 fd2a4745f2163d6feefff40a56cb883dd50cfbe0c9f230e1ef893188e6a3ccf6
MD5 9dd403d3bff1de77879075e619929860
BLAKE2b-256 cb5bd7a1a5156521b7ac29d5a6d53e12b5014d5b80e7d466cd5ef90823334356

File details

Details for the file ImageInsight-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: ImageInsight-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 8.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.2

File hashes

Hashes for ImageInsight-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 eb36c3e3974fae4c3828606e92e146eb505d2ae441129152c04b06d9004e7b5f
MD5 30d02a249242a1c8ff1c7de90c3a1aa6
BLAKE2b-256 459f335865fc6b1412e783eeb744207f5d75f88bd760885857d7afea9164dd4e
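
The published digests can be used to confirm that a downloaded file is intact. The following is a minimal sketch using Python's standard hashlib module; the function names are hypothetical, and the file path in the usage note assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with the published value, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_hash("ImageInsight-0.2.1.tar.gz", "fd2a4745f2163d6feefff40a56cb883dd50cfbe0c9f230e1ef893188e6a3ccf6") should return True for an unmodified copy of the source distribution.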
