Python implementation of FALCON: Feedback Adaptive Loop for Content-Based Retrieval

halcon

halcon is a Python implementation of the Feedback Adaptive Loop for Content-Based Retrieval (FALCON) algorithm, as described in:

  • Leejay Wu, Christos Faloutsos, Katia P. Sycara, and Terry R. Payne. 2000. FALCON: Feedback Adaptive Loop for Content-Based Retrieval. In Proceedings of the 26th International Conference on Very Large Data Bases (VLDB '00), Amr El Abbadi, Michael L. Brodie, Sharma Chakravarthy, Umeshwar Dayal, Nabil Kamel, Gunter Schlageter, and Kyu-Young Whang (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 297-306.

FALCON is, as described in the article abstract, "a novel method that is designed to handle disjunctive queries within metric spaces. The user provides weights for positive examples; our system 'learns' the implied concept and returns similar objects."
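
To make "disjunctive queries within metric spaces" a bit more concrete, here is a minimal sketch of the aggregate dissimilarity as I read it in the paper: a power mean, D_G(x) = ((1/k) * sum_i d(x, g_i)^alpha)^(1/alpha), over the distances from a candidate x to the k good examples g_i. This is not halcon's actual code. The function names are my own, the initial scores (weights) are left out, and any distance function can be plugged in, which is all a metric space requires.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cityblock(a, b):
    """Manhattan distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def aggregate_dissimilarity(x, good_set, alpha=-5, dist=euclidean):
    """Power mean of the distances from x to every good example.

    Illustrative only: the name and signature are hypothetical and the
    per-example weights described in the paper are ignored here.
    """
    if alpha == 0:
        raise ValueError("alpha must be non-zero")
    distances = [dist(x, g) for g in good_set]
    # With a negative alpha, one zero distance drives the aggregate to zero.
    if alpha < 0 and any(d == 0 for d in distances):
        return 0.0
    mean_power = sum(d ** alpha for d in distances) / len(distances)
    return mean_power ** (1.0 / alpha)


# Two good examples that are far apart from each other.
good = [[0.0, 0.0], [10.0, 10.0]]
near_first = aggregate_dissimilarity([1.0, 0.0], good)
near_second = aggregate_dissimilarity([10.0, 9.0], good)
in_between = aggregate_dissimilarity([5.0, 5.0], good)
print(near_first, near_second, in_between)
```

With a negative alpha the aggregate behaves like a soft minimum: a candidate close to either good example scores lower (better) than one halfway between them, which is what lets a single query describe "like this one or like that one".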

Pre-Requisites

Installation

To install halcon run

pip3 install --user halcon

Usage

There is only one function that you need to know about:

halcon.search.query(good_set, candidates, alpha=-5,
        metric='euclidean', normalization='zscore', debug=False)

Here is a brief description of each of the input arguments

  • good_set and candidates are two lists of lists where each member of both lists has the same shape.

    record = [ <identifier>, <initial_score>, <feature_vector>]
    

    For example in wine.py, I download a CSV file where the first feature_vector looks like this

    1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065
    

    and then I modify it like this

    good_set = []
    identifier = 'wine00'
    initial_score = 1
    feature_vector = [1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065]
    good_set.append([identifier, initial_score, feature_vector])
    

    For more information about the definition of the initial score, please refer to the article. In all my examples I use an initial score of 1, that is, all images have the same weight. The identifier should be unique (though this is not enforced), so you can tell images apart.

    This package assumes every object is represented by a feature vector. Feature calculation and feature selection are beyond the scope of this package. There are many feature calculation and machine learning packages out there that you might find useful.

  • alpha. For more information about alpha, please refer to the article. The recommended value by the paper is -5, which is the default value used in this package.

  • metric. In the research article, a measure of distance d is used to calculate the distance between two feature vectors. The default value is euclidean (Euclidean distance) and other supported metrics are 1) cityblock (Manhattan distance) and 2) hamming (Hamming distance).

  • normalization. Feature normalization option. Default is zscore. Alternative option is standard.

  • debug. If the debug flag is on, it should print more information about the calculations as they happen.
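
Putting the arguments above together, a minimal sketch of preparing the input might look like the following. The helpers make_records and run_query are hypothetical names of my own, not part of halcon, and the feature vectors are made-up toy values. The call itself only spells out the documented signature and defaults. I make no claim here about what query returns, so inspect the result yourself.

```python
def make_records(rows, prefix, initial_score=1):
    """Wrap raw feature rows as [identifier, initial_score, feature_vector]."""
    return [["%s%02d" % (prefix, index), initial_score, list(row)]
            for index, row in enumerate(rows)]


def run_query(good_set, candidates):
    """Call halcon with its documented defaults spelled out."""
    # Imported lazily so make_records works even without halcon installed.
    import halcon.search
    return halcon.search.query(good_set, candidates, alpha=-5,
                               metric='euclidean', normalization='zscore',
                               debug=False)


# Made-up toy feature vectors, not taken from any dataset.
rows = [[5.0, 1.0, 0.2], [4.9, 1.1, 0.2], [7.0, 4.5, 1.5]]
good_set = make_records(rows[:1], "query")
candidates = make_records(rows[1:], "item")
print(good_set)
print(candidates)
```

With halcon installed, run_query(good_set, candidates) performs the search. Every record in both lists has the same three-part shape, and every feature vector has the same length.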

Examples

iris.py

$ python examples/iris.py
This example uses the iris dataset from
Machine Learning Repository
Center for Machine Learning and Intelligent Systems
http://archive.ics.uci.edu/ml/datasets/Iris
I will use the first feature vector as my query image
[[0, 1, array([ 5.1,  3.5,  1.4,  0.2,  1. ])]]
And I will use the rest of the feature vectors to find the most similar images
Now notice that feature vector with iid1 has the same values iid0
[1, 1, array([ 5.1,  3.5,  1.4,  0.2,  1. ])]
So I expect that if FALCON is working correctly, then iid1 should be the top hit!
Elapsed time: 0.0221660137177 seconds

  Ranking    Identifier  Class                  Score
---------  ------------  ---------------  -----------
        0             1  Iris-setosa      0
        1            28  Iris-setosa      1.27788e-43
        2             5  Iris-setosa      2.40121e-40
        3            29  Iris-setosa      2.40121e-40
        4            40  Iris-setosa      5.83391e-40
        5             8  Iris-setosa      7.04398e-39
        6            18  Iris-setosa      1.1259e-35
        7            41  Iris-setosa      1.51906e-34
        8            50  Iris-versicolor  6.99696e-34
        9            37  Iris-setosa      1.09221e-32
       10            12  Iris-setosa      1.22203e-32
       11            49  Iris-setosa      2.05046e-32
       12            11  Iris-setosa      4.25801e-31
       13            21  Iris-setosa      6.55842e-31
       14            47  Iris-setosa      5.54098e-29
       15            36  Iris-setosa      7.93943e-29
       16             7  Iris-setosa      2.16985e-28
       17            20  Iris-setosa      4.23544e-28
       18            25  Iris-setosa      1.67453e-27
       19             3  Iris-setosa      2.40919e-27
Do the top results in the list above belong to the same class as the query image?
If so, then SCORE! It seems to work.
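
One way to answer that question with a number is precision at k, the fraction of the top k hits that share the query's class. The sketch below is not part of halcon. It copies the Class column from the table above by hand, and it assumes the query image is Iris-setosa because iid1, which has identical values, is listed with that class.

```python
def precision_at_k(ranked_classes, query_class, k):
    """Fraction of the first k ranked items whose class matches the query."""
    top = ranked_classes[:k]
    if not top:
        raise ValueError("k must select at least one item")
    return sum(1 for c in top if c == query_class) / float(len(top))


# Class column of the ranking above, in rank order (ranks 0 to 19).
ranked_classes = (["Iris-setosa"] * 8
                  + ["Iris-versicolor"]
                  + ["Iris-setosa"] * 11)

print(precision_at_k(ranked_classes, "Iris-setosa", 8))
print(precision_at_k(ranked_classes, "Iris-setosa", 20))
```

For this ranking, the first eight hits all match the query class, and 19 of the 20 listed hits do.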

wine.py

$ python examples/wine.py
This example uses the wine dataset from
Machine Learning Repository
Center for Machine Learning and Intelligent Systems
http://archive.ics.uci.edu/ml/datasets/Wine
I will use the first three feature vectors as my query wine set
And I will use the rest of the feature vectors to find the most similar images
Elapsed time: 0.0280928611755 seconds

  Ranking  Identifier          Score
---------  ------------  -----------
        0  wine1         0
        1  wine2         0
        2  wine3         0
        3  wine21        2.77663e-05
        4  wine30        0.000629879
        5  wine23        0.00252617
        6  wine49        0.00318536
        7  wine57        0.00456123
        8  wine36        0.0152067
        9  wine39        0.0197516
       10  wine58        0.0243848
       11  wine9         0.024467
       12  wine55        0.045762
       13  wine24        0.046893
       14  wine7         0.113906
       15  wine45        0.188355
       16  wine27        0.201802
       17  wine41        0.206469
       18  wine31        0.288536
       19  wine56        0.291853

Bugs and Questions

To submit bugs about the source code visit

To submit bugs about the documentation visit

For any other inquiries visit those links as well.
