ISCC - Semantic Image-Code

iscc-sci is a proof-of-concept implementation of a Semantic Image-Code for the ISCC (International Standard Content Code). Semantic Image-Codes are designed to capture and represent the semantic content of images for improved similarity detection.

[!CAUTION] This is a proof of concept. All releases with version numbers below v1.0.0 may break backward compatibility and produce incompatible Semantic Image-Codes. The algorithms of this iscc-sci repository are experimental and not part of the official ISO 24138:2024 standard.

What is ISCC Semantic Image-Code

The ISCC framework already includes an Image-Code that is based on perceptual hashing and can match near-duplicates. The ISCC Semantic Image-Code is planned as an additional ISCC-UNIT that captures a more abstract, broader semantic similarity. As such, the Semantic Image-Code is engineered to be robust against a wider range of variations that the perceptual Image-Code cannot match.

Features

  • Semantic Similarity: Leverages deep learning models to generate codes that reflect the semantic content of images.
  • Bit-Length Flexibility: Supports generating codes of various bit lengths (up to 256 bits), allowing for adjustable granularity in similarity detection.
  • ISCC Compatible: Generates codes that are fully compatible with the ISCC specification, facilitating integration with existing ISCC-based systems.

Installation

Ensure you have Python 3.11 or newer installed on your system. The package requires an ONNX runtime that is selected via install extras. For CPU inference (works everywhere):

pip install "iscc-sci[cpu]"

For NVIDIA CUDA accelerated inference (requires CUDA 12.x and cuDNN 9.x):

pip install "iscc-sci[gpu]"

[!NOTE] Install exactly one of the cpu/gpu extras. The underlying onnxruntime and onnxruntime-gpu packages unpack into the same directory and overwrite each other, so installing both silently disables GPU support. A plain pip install iscc-sci installs no ONNX runtime, and importing the package then fails with an error that includes installation instructions.
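
To verify which runtime ended up active after installation, it can help to ask ONNX Runtime which execution providers it reports. The following is a minimal sketch and not part of the iscc-sci API: preferred_providers is a hypothetical helper, and the check only relies on onnxruntime.get_available_providers().

```python
import importlib.util


def preferred_providers(available):
    """Hypothetical helper: keep known providers, CUDA first, CPU as fallback."""
    order = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [name for name in order if name in available]


if importlib.util.find_spec("onnxruntime") is None:
    # Neither the cpu nor the gpu extra has been installed.
    print('No ONNX runtime found; install "iscc-sci[cpu]" or "iscc-sci[gpu]"')
else:
    import onnxruntime as ort

    # With a working GPU install, CUDAExecutionProvider should be listed here.
    print(preferred_providers(ort.get_available_providers()))
```

If the gpu extra was installed but only CPUExecutionProvider is listed, a conflicting onnxruntime install is one possible cause worth checking.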

Usage

To generate a Semantic Image-Code for an image, use the code_image_semantic function. You can specify the bit length of the code to control the level of granularity in the semantic representation.

import iscc_sci as sci

# Generate a 64-bit ISCC Semantic Image-Code for an image file
image_file_path = "path/to/your/image.jpg"
semantic_code = sci.code_image_semantic(image_file_path, bits=64)

print(semantic_code)
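
Similarity between two codes of the same bit length is typically measured as the Hamming distance between their bit strings. The sketch below illustrates that idea with hypothetical helpers that are not part of iscc-sci. It assumes each code is a string of the form "ISCC:" followed by unpadded base32 data whose first 2 bytes are a header, as is the case for standard ISCC-UNITs; adjust header_bytes if that assumption does not hold for your codes.

```python
import base64


def decode_body(iscc: str, header_bytes: int = 2) -> bytes:
    """Hypothetical helper: base32-decode a code and drop its (assumed 2-byte) header."""
    text = iscc.removeprefix("ISCC:")
    padded = text + "=" * (-len(text) % 8)  # restore the padding base32 expects
    return base64.b32decode(padded)[header_bytes:]


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the differing bits between two equally long byte strings."""
    if len(a) != len(b):
        raise ValueError("codes must have the same bit length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

A small distance relative to the code length suggests semantically similar images; a suitable threshold depends on the application and has to be determined empirically.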

How It Works

iscc-sci uses a pre-trained deep learning model based on the 1st Place Solution of the Image Similarity Challenge (ISC21) to create semantic embeddings of images. The model generates a feature vector that captures the essential characteristics of the image. This vector is then binarized to produce a Semantic Image-Code that is robust to variations in image presentation but sensitive to content differences.
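
The binarization step can be pictured with a small sketch. It is an illustration under stated assumptions, not the iscc-sci implementation: it assumes a simple sign threshold (values at or above zero become 1, others 0) applied to the first bits dimensions of the embedding, and binarize is a hypothetical name.

```python
def binarize(embedding, bits=64):
    """Hypothetical sketch: sign-threshold the first `bits` values into bytes."""
    if bits % 8 != 0 or not 0 < bits <= len(embedding):
        raise ValueError("bits must be a multiple of 8 and at most len(embedding)")
    value = 0
    for x in embedding[:bits]:
        # Shift in one bit per dimension: 1 for non-negative values, else 0.
        value = (value << 1) | (1 if x >= 0 else 0)
    return value.to_bytes(bits // 8, "big")


vector = [0.7, -0.2, 0.1, -0.9, 0.4, 0.3, -0.5, -0.1]
print(binarize(vector, bits=8).hex())  # prints "ac"
```

Because each bit depends only on the sign of one dimension, small perturbations of the embedding flip few bits, which is what makes the resulting codes comparable by bit difference.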

Development

This project is a proof of concept, and contributions that enhance its capabilities, efficiency, and compatibility with the broader ISCC ecosystem are welcome. For development, install the project with uv. By default, uv sync installs the test group, which provides a CPU ONNX runtime:

git clone https://github.com/iscc/iscc-sci.git
cd iscc-sci
uv sync

Contributing

Contributions are welcome! If you have suggestions for improvements or bug fixes, please open an issue or pull request. For major changes, please open an issue first to discuss what you would like to change.
