
Explanation Text Generator

Version of 12 January 2024

A class to generate text explanations of image classifications in different modes, given main and part labels.

Main Usage

Package

The current version of the package can be installed with pip using the following command:

pip install ExplanationText==0.1.5

The Explanation Generator can then be imported with:

from explanation_text import ExplanationGenerator

After importing the Explanation Generator, the following lines of code are sufficient. They are described in more detail below:

explanation_generator = ExplanationGenerator(<api_token>, <mode>)
explanation_generator.initModels() # optional, loads used models on HuggingFace platform
explanation_text = explanation_generator.generate_explanation(<labels>)

First, create an ExplanationGenerator and set an explanation mode and your HuggingFace API token, if you want to use modes that rely on the HuggingFace API. The different explanation modes are described under Explanation Modes below. If you leave the mode empty, the GeoGuessr mode is used. At the beginning of a pipeline, for example right after login, the initModels function (not available in version 0.1.5) can be run to load the language models on the HuggingFace platform, since the first request otherwise often takes some time. Afterwards, you can call the generate_explanation method with your list of labels and receive an explanation text. To generate multiple explanations, generate_explanation also accepts a list of label sets and returns a list of individual explanation texts. You can set two more configuration values in the constructor: minimumRelevance (default 10) filters out part labels whose relevance percentage is below that value, and maximumPartCount (default 5) sets the maximum number of part labels used for the explanation text.
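
A minimal sketch of this flow; the keyword argument names minimumRelevance and maximumPartCount are assumptions based on the description above, and the exact parameter names may differ:

from explanation_text import ExplanationGenerator

# Keyword argument names below are assumed from the description above.
explanation_generator = ExplanationGenerator(
    "<api_token>",
    "ExplanationGeneratorV1",
    minimumRelevance=10,   # assumed name: drop part labels below 10% relevance
    maximumPartCount=5,    # assumed name: use at most 5 part labels
)
explanation_generator.initModels()  # optional: warm up the HuggingFace models

labels = {"objects": [{"label": "plane", "probability": 0.9}]}
explanation_text = explanation_generator.generate_explanation(labels)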

GeoGuessr Mode Usage

To use the GeoGuessr Mode, the mode has to be set to "ExplanationGeneratorGG". This mode is currently the default mode, so you don't have to set it manually. In order to use the Landmark Detection feature with this mode, you also have to provide the Google API Key in the constructor of the ExplanationGenerator. However, the GeoGuessr Mode can also be used without Landmark Detection.

explanation_generator = ExplanationGenerator(<api_token>, <google_api_token>, "ExplanationGeneratorGG")

The Google Vision API Key can be created using a free trial account on Google Cloud Platform. You simply have to create a new project and enable the Google Vision API. Then you can create an API key in the credentials section of the project.
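
Putting the pieces together, a GeoGuessr-mode session with Landmark Detection might look like the following sketch (both tokens are placeholders):

from explanation_text import ExplanationGenerator

explanation_generator = ExplanationGenerator(
    "<huggingface_api_token>",
    "<google_api_token>",
    "ExplanationGeneratorGG",
)
# labels follows the input format described in the next section.
explanation_text = explanation_generator.generate_explanation(labels)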

Input Format

The following JSON objects are examples of the current format for the labels that serve as input to the explanation generator. In addition to the image (img), the input contains a list of objects. Each object has a label, a heatmap and a list of parts. Optionally, an object can also contain a probability. Each part contains an image, an optional relevancy, a position (rect) and a labels dictionary with a main label as key and a list of part labels as value. Example Portugal:

{
    "img": "base64",
    "objects": [
        {
            "heatmap": "image",
            "label": "portugal",
            "probability": 0.9,
            "parts": [
                {
                    "img": "base64",
                    "relevancy": 0.3,
                    "rect": "",
                    "labels": {
                        "portugal": ["hills"]
                    }
                },
                {
                    "img": "base64",
                    "relevancy": 0.4,
                    "rect": "",
                    "labels": {
                        "portugal": ["traffic light"]
                    }
                },
                {
                    "img": "base64",
                    "relevancy": 0.45,
                    "rect": "",
                    "labels": {
                        "portugal": ["building"]
                    }
                }
            ]
        }
    ]
}

Example Germany:

{
    "img": "base64",
    "objects": [
        {
            "heatmap": "image",
            "label": "germany",
            "probability": 0.9,
            "parts": [
                {
                    "img": "base64",
                    "relevancy": 0.3,
                    "rect": "",
                    "labels": {
                        "germany": ["apartments"]
                    }
                },
                {
                    "img": "base64",
                    "relevancy": 0.5,
                    "rect": "",
                    "labels": {
                        "germany": ["traffic light"]
                    }
                },
                {
                    "img": "base64",
                    "relevancy": 0.5,
                    "rect": "",
                    "labels": {
                        "germany": ["building"]
                    }
                }
            ]
        }
    ]
}
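
Given labels in this format, generating the explanation is a single call. A minimal sketch (the file name is a placeholder, and the token value must be your own):

import json

from explanation_text import ExplanationGenerator

# Load a labels dictionary in the format shown above (path is a placeholder).
with open("germany_labels.json") as file:
    labels = json.load(file)

explanation_generator = ExplanationGenerator("<api_token>", "ExplanationGeneratorGG")
explanation_text = explanation_generator.generate_explanation(labels)
print(explanation_text)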

Internal Format

After communicating with the other groups, a new format emerged that we now use as the input. Internally, as of September 2023, we parse that format into our old one, shown below, to make sure all our components still work. Our internal label format is a Python dictionary or a list of dictionaries. More specifically, each object has to contain a non-empty key main_label, which is the minimum requirement. Optionally, a probability with a float value can be set. Part labels are set as a list under the key parts. Each part has to contain a part_label field and, optionally, a relevance and a position.

{
  "objects": [
    {
      "label": "plane", 
      "probability": 0.9, 
      "position": "left",
      "parts": [
        {"part_label": "cockpit", "relevance": 0.6}, 
        {"part_label": "wings", "relevance": 0.2}
       ]
    },
    {
      "label": "tree", 
      "probability": 0.6, 
      "position": "top"
    }
  ]
}

This is an example of the label format for a plane object. Internally, the method generate_explanation validates, filters and sorts the given main and part labels.
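
The filtering and sorting step can be pictured roughly like the sketch below; filter_parts is a hypothetical helper, not the package's actual function, and it applies the minimumRelevance and maximumPartCount settings described above:

def filter_parts(parts, minimum_relevance=10, maximum_part_count=5):
    """Hypothetical sketch: keep only the most relevant parts of an object."""
    # Drop parts whose relevance percentage is below the threshold.
    kept = [p for p in parts if p.get("relevance", 0) * 100 >= minimum_relevance]
    # Sort by descending relevance and keep at most maximum_part_count parts.
    kept.sort(key=lambda p: p.get("relevance", 0), reverse=True)
    return kept[:maximum_part_count]

parts = [{"part_label": "cockpit", "relevance": 0.6},
         {"part_label": "wings", "relevance": 0.2}]
print(filter_parts(parts))  # cockpit first, wings second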

Test Bench Usage

To test the different explanation methods, we created a test bench. It reads and parses sample labels from a folder with .txt files and creates explanation texts with a given mode. Additionally, it can randomize the samples and write the explanations into an output file.

demoApiToken = "<Your Huggingface API Token>"
testBench = TestBench('testData', demoApiToken, mode="ExplanationGeneratorV1")
testBench.test_data(100, writeToFile=True, randomize=True)

All parameters except inputFolderPath and sampleCount are optional. If the sample count is higher than the number of available samples, the test bench uses all of them. The default output folder path is ExplanationText/output/ and can be changed with the additional parameter outputPath. If the parameter mode is set to All, all implemented modes are tested with the given data.
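
For example, testing every implemented mode and writing the results to a custom folder could look like this (assuming outputPath is passed as a keyword argument; token and paths are placeholders):

demoApiToken = "<Your Huggingface API Token>"
testBench = TestBench('testData', demoApiToken, mode="All", outputPath="myOutput/")
testBench.test_data(50, writeToFile=True, randomize=False)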

Explanation Modes

Here is an overview of the different explanation text modes.

ExplanationGeneratorV1

This is currently the best mode for generating texts. It follows the structure shown in the diagram in Confluence under 'Language Models' -> 'Umsetzung' -> 'aktueller Stand' and described in chapter 3 of the 'Zwischenbericht' (interim report). To use it, set the attribute mode to 'ExplanationGeneratorV1'.

Positive example of current version:

    "overview": "This image contains a train, a horse and a boat.",
    "horse": {
        "medium": 
            "The horse in the image was classified as a horse with 
            99% certainty. The horse was mainly classified that 
            way, because of the head with 40% relevance and neck
            with 24% relevance. The head is strongly connected to
            the horse, while the neck is very strongly connected
            to the horse.",
        "detailed": 
            "The main function of the following object is to 
            provide transportation and support to its rider. The 
            object in the image was classified as a horse with a
            99% certainty, due to the distinctive head and neck 
            features. The head was found to have a high relevance,
            as it is a prominent feature of the horse. The neck 
            was also found to be important, as it is connected to
            the head and appears to provide support to the horse."
    },
    ....

Note that the implementation of the partConnection component is currently mocked with random connections.

ExplanationGeneratorGG

This mode generates explanation texts for GeoGuessr images. The structure of the mode and its components can be found in the Language Model V3 PDF file.

Example of current version:

'overview': 'The image was classified as being located in Germany.', 
'germany': {
    'medium': 
        'The model classified the image as Germany with a high degree of confidence.
         The location of the image was primarily identified based on the presence of 
         a traffic light, a building, and apartments. The relevance of the different
         elements of the image was also taken into consideration, with the traffic light
         having a high relevance, the building having a moderate relevance, and the 
         apartments having a low relevance.', 
    'detailed': 
        'The model classified the image as Germany with a high degree of certainty.
         Germany is a country in the western region of Central Europe. The traffic light
         in the image was relevant for the classification. The image contained buildings
         that had a high relevancy, and the location of the image was identified due to 
         the presence of apartments with a medium relevancy. In urban Germany, apartment 
         buildings are a common sight. They typically line the streets and are three to five
         storeys high. The apartments are usually bland in color and have a simple layout,
         with kitchens, bathrooms, and living rooms on different floors.'
         }

Overview

The Overview mode describes only the main labels of an image and is meant to give an overview of the detected objects. The mode can be set using the string 'Overview'.

The image contains a <main label 1>, ... and a <main label N>.
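
A sentence of this shape can be produced with a simple join. The function build_overview below is a hypothetical illustration, not the package's internal code:

def build_overview(main_labels):
    """Hypothetical sketch: join main labels into the overview sentence."""
    if len(main_labels) == 1:
        return f"The image contains a {main_labels[0]}."
    return ("The image contains a " + ", a ".join(main_labels[:-1])
            + " and a " + main_labels[-1] + ".")

print(build_overview(["train", "horse", "boat"]))
# The image contains a train, a horse and a boat.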

Sentence Construction

Sentence Construction is the simplest way of creating a detailed explanation text. It puts the main and part labels, together with their additional information, into a fixed sentence structure without any further processing. The current variant creates two sentences with the following structure:

Object in image was classified as a <main label> with <percentage>%
certainty. The <main label> was mainly classified that way, because
of the <part label> with <percentage>% relevance at position
<position>, <more part labels> ... and <last part label>.

The whole second sentence, as well as the relevance and position information, is optional and only included if the necessary information is given. This method can be used with the string 'SentenceConstruction'.
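
A rough sketch of how such a template can be filled; construct_sentences is hypothetical and simplified, omitting the optional position information:

def construct_sentences(main_label, probability, parts):
    """Hypothetical sketch of the fixed sentence structure described above."""
    text = (f"Object in image was classified as a {main_label} "
            f"with {round(probability * 100)}% certainty.")
    if parts:
        # Optional second sentence, built only when part labels are given.
        reasons = " and ".join(
            f"the {p['part_label']} with {round(p['relevance'] * 100)}% relevance"
            for p in parts)
        text += (f" The {main_label} was mainly classified that way, "
                 f"because of {reasons}.")
    return text

print(construct_sentences("plane", 0.9,
                          [{"part_label": "cockpit", "relevance": 0.6}]))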

API Test

API Test is a method to evaluate large language models on HuggingFace. It uses the HuggingFace API to test sets of models, tasks and prompts. For this, a test can be set up in the class api_test and run with prompts provided by the test bench. Examples of test setups can be found in the class api_test_utils. It is necessary to create an account on HuggingFace and provide a custom API key at the top of api_test.

This method can be used with the string 'APITest'.
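
Under the hood, querying a hosted model through the HuggingFace Inference API boils down to an authenticated POST request. The following standalone sketch shows the general pattern such a test exercises; the model name is chosen only for illustration and this is not the package's exact code:

import requests

API_URL = "https://api-inference.huggingface.co/models/gpt2"  # example model
headers = {"Authorization": "Bearer <Your Huggingface API Token>"}

def query(prompt):
    """Send a prompt to the hosted model and return the JSON response."""
    response = requests.post(API_URL, headers=headers, json={"inputs": prompt})
    response.raise_for_status()
    return response.json()

print(query("The image was classified as"))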
