
SDK for creating ISCCs (International Standard Content Codes)


ISCC Software Development Kit (iscc-sdk)


A comprehensive Python toolkit for creating and managing ISCC (International Standard Content Code) identifiers for digital media assets.

Overview


What is an ISCC?

The International Standard Content Code (ISCC) is a content-dependent, similarity-preserving identifier and fingerprint system for digital content, standardized as ISO 24138:2024.

ISCCs are not assigned, whether manually or automatically; they are derived from the digital content itself. They are generated algorithmically using various hash algorithms and form composite identifiers with similarity-preserving properties (soft hashes), so unrelated parties can independently derive the same ISCC from the same media asset.

Digital content is dynamic: it is continuously re-encoded, resized, and re-compressed as it travels through complex networks. The ISCC remains robust across these transformations while preserving estimates of data, content, and metadata similarity.

The component-based structure of the ISCC identifies content at multiple levels of abstraction, creating a multi-layered fingerprint. Together, the components form an identifier that stays stable despite modifications to the underlying digital asset, which lets the ISCC track content throughout its lifecycle.
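To make "similarity-preserving" more concrete: soft hashes of this kind are commonly compared bit by bit, so that a small number of differing bits suggests similar content. The following is a minimal sketch under stated assumptions, not the ISCC algorithm itself. The 64-bit values are made up for illustration and are not real ISCC components, and `hamming_distance` and `similarity` are hypothetical helper names that are not part of iscc-sdk.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def similarity(a: int, b: int, nbits: int = 64) -> float:
    """Return the fraction of matching bits between two nbits-wide fingerprints."""
    return 1.0 - hamming_distance(a, b) / nbits


# Made-up 64-bit fingerprints for illustration only (not real ISCC components).
original = 0xF0F0F0F0F0F0F0F0
recompressed = original ^ 0b101  # the same fingerprint with two bits flipped
unrelated = 0x0F0F0F0F0F0F0F0F  # every bit differs from `original`

print(hamming_distance(original, recompressed))  # small distance: likely similar
print(similarity(original, unrelated))  # no matching bits: likely unrelated
```

A cryptographic hash would not behave this way: any change to the input is expected to flip roughly half of the output bits, which is why it can confirm exact identity but cannot estimate similarity.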

Each component is self-describing, modular, and can be used separately or together, enabling ISCCs to support numerous digital asset management use-cases across all domains concerned with producing, processing, and distributing digital information (science, journalism, books, music, film, etc.):

  • Content deduplication and discovery
  • Database synchronization and indexing
  • Integrity verification and timestamping
  • Versioning and data provenance tracking
  • Similarity clustering and matching
  • Anomaly detection in content collections
  • Usage tracking and royalty allocation
  • Fact-checking and content verification
  • Interoperability between different systems and actors
  • Association with higher-level identifiers (work/product identifiers)

What is iscc-sdk?

iscc-sdk builds on top of iscc-core to provide high-level features for generating and handling ISCC codes across different media types. It serves as a complete toolkit for implementing ISCC-based workflows in Python applications.

Features

  • Comprehensive Media Support: Process text, image, audio, and video files
  • Mediatype Detection: Automatically identify file formats
  • Metadata Management: Extract and embed metadata across different file formats
  • Content Processing: Handle mediatype-specific content extraction and normalization
  • Rich CLI: Command-line interface for easy integration into workflows
  • Cross-Platform: Works on Windows, macOS, and Linux
  • Built-in Tools: Includes necessary binaries for media processing
  • Standards Compliant: Built on top of the ISO 24138:2024 reference implementation

Requirements

  • Python 3.9 to 3.13 on 64-bit systems
  • Supported platforms: Windows, macOS, Linux
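If you want to check an environment against these requirements before installing, a small script along the following lines may help. This is only a sketch: `is_supported` is a hypothetical helper that is not part of iscc-sdk, and it checks nothing beyond the Python version range and the 64-bit requirement stated above.

```python
import struct
import sys


def is_supported(version=None, pointer_bits=None) -> bool:
    """Check an interpreter against the stated requirements.

    Both arguments default to the running interpreter; they can be passed
    explicitly to check a different configuration.
    """
    if version is None:
        version = sys.version_info[:2]
    if pointer_bits is None:
        # The size of a C pointer in bits tells us whether this is a 64-bit build.
        pointer_bits = struct.calcsize("P") * 8
    return (3, 9) <= tuple(version) <= (3, 13) and pointer_bits == 64


if __name__ == "__main__":
    print("supported" if is_supported() else "not supported")
```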

Installation

Using pip

pip install iscc-sdk

Using uv

uv add iscc-sdk

Usage

Python API

Create an ISCC-CODE for a media file:

import iscc_sdk as idk

# Generate a complete ISCC code
iscc_meta = idk.code_iscc("/path/to/mediafile.jpg")
print(iscc_meta.iscc)  # Full ISCC code
print(iscc_meta.json(indent=2))  # All metadata as JSON

# Generate specific ISCC components
meta_code = idk.code_meta("/path/to/mediafile.jpg")
content_code = idk.code_content("/path/to/mediafile.jpg")
data_code = idk.code_data("/path/to/mediafile.jpg")
instance_code = idk.code_instance("/path/to/mediafile.jpg")

# Process specific media types
text_code = idk.code_text("/path/to/document.pdf")
image_code = idk.code_image("/path/to/image.png")
audio_code = idk.code_audio("/path/to/audio.mp3")
video_code = idk.code_video("/path/to/video.mp4")
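When applying these calls to several files, you may want one failing file not to abort the whole run. The following is a minimal sketch of that pattern, assuming only the `idk.code_iscc` call shown above. `code_many` is a hypothetical helper that is not part of iscc-sdk; it accepts an optional `generate` callable so the loop can be exercised without the SDK installed.

```python
from typing import Any, Callable, Dict, Iterable, Optional, Tuple


def code_many(
    paths: Iterable[str],
    generate: Optional[Callable[[str], Any]] = None,
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Run a code generator over several files and collect results and errors.

    Returns two dictionaries keyed by path: successful results, and error
    descriptions for the files that raised an exception.
    """
    if generate is None:
        # Imported lazily so the helper itself has no hard dependency.
        import iscc_sdk as idk

        generate = idk.code_iscc

    results: Dict[str, Any] = {}
    errors: Dict[str, str] = {}
    for path in paths:
        key = str(path)
        try:
            results[key] = generate(key)
        except Exception as exc:  # keep going; record what went wrong
            errors[key] = repr(exc)
    return results, errors
```

With iscc-sdk installed, `code_many(["/path/to/mediafile.jpg"])` would call `idk.code_iscc` once per path. For whole directories, the `iscc-sdk batch` command described below is likely the simpler choice.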

Extract and embed metadata:

import iscc_sdk as idk
from iscc_schema import IsccMeta

# Extract metadata
metadata = idk.extract_metadata("/path/to/mediafile.jpg")

# Create custom metadata
custom_meta = IsccMeta(
    name="My Asset Title",
    description="Description of the asset",
    creator="Creator Name",
    license="https://creativecommons.org/licenses/by/4.0/",
)

# Embed metadata into a copy of the file
new_file = idk.embed_metadata("/path/to/mediafile.jpg", custom_meta)

Command Line Interface

The SDK includes a command-line interface called iscc-sdk.

Create an ISCC code for a single file:

iscc-sdk create /path/to/mediafile.jpg

Process multiple files in a directory:

iscc-sdk batch /folder_with_media_files

Install required binaries:

iscc-sdk install

Run self-tests:

iscc-sdk selftest

Documentation

For complete documentation, visit https://sdk.iscc.codes

Project Status

The ISCC is an official standard published as ISO 24138:2024 - International Standard Content Code within ISO/TC 46/SC 9/WG 18.

Note: The iscc-sdk library and the accompanying documentation are under active development. Expect API changes and other backward-incompatible changes until a stable v1.0 release.

Development

Setup

# Install dependencies
uv sync

# Install pre-commit hooks (using prek, a fast Rust-based drop-in replacement for pre-commit)
uv tool install prek
prek install
prek install --hook-type pre-push

Quality Gates

Pre-commit hooks run automatically via prek:

On commit (fast, auto-fix):

  • File hygiene (line endings, trailing whitespace, EOF)
  • Config validation (YAML, JSON, TOML)
  • Markdown formatting (mdformat)
  • Ruff linting with auto-fix
  • Ruff formatting

On push (thorough):

  • Type checking (zuban)
  • Security scan (Ruff S rules)
  • Complexity check (Ruff C901)
  • Tests with 100% coverage

Run all hooks manually:

prek run --all-files                # pre-commit hooks
prek run --all-files --hook-stage pre-push  # pre-push hooks

Testing

uv run pytest --cov=iscc_sdk --cov-fail-under=100 -p no:warnings

Contributing

Contributions are welcome! Here's how you can help:

  1. Issues: Report bugs or suggest features via the issue tracker
  2. Pull Requests: Submit PRs for bug fixes or new features
  3. Discussion: For significant changes, please open an issue first to discuss your plans
  4. Testing: Please make sure to update tests as appropriate

Join our developer chat on Telegram at https://t.me/iscc_dev.

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
