ISCC - Codec & Algorithms

iscc-core is the reference implementation of the core algorithms of the ISCC (International Standard Content Code).

What is the ISCC

The ISCC is a similarity-preserving fingerprint and identifier for digital media assets.

ISCCs are generated algorithmically from digital content, just like cryptographic hashes. However, instead of using a single cryptographic hash function to identify data only, the ISCC uses various algorithms to create a composite identifier that exhibits similarity-preserving properties (soft hash).
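To make the contrast concrete, here is a minimal sketch that uses only the Python standard library and does not call iscc-core. The helper name `hamming_distance` is an illustrative assumption, not part of the library. It shows that SHA-256 digests of two nearly identical texts differ in roughly half of their bits, so the bit distance between cryptographic hashes says nothing about similarity. A soft hash is designed so that this distance stays small for similar inputs.

```python
import hashlib


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the differing bits between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("digests must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Two texts that differ by a single character.
text_a = b"ISCC Test Document!"
text_b = b"ISCC Test Document?"

digest_a = hashlib.sha256(text_a).digest()
digest_b = hashlib.sha256(text_b).digest()

# For a cryptographic hash the distance is typically close to 128 of 256 bits,
# no matter how similar the inputs are.
print(hamming_distance(digest_a, digest_b))
```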

The component-based structure of the ISCC identifies content at multiple levels of abstraction. Each component is self-describing, modular, and can be used separately or with others to aid in various content identification tasks. The algorithmic design supports content deduplication, database synchronization, indexing, integrity verification, timestamping, versioning, data provenance, similarity clustering, anomaly detection, usage tracking, allocation of royalties, fact-checking and general digital asset management use-cases.

What is iscc-core

iscc-core is the Python-based reference implementation of the ISCC core algorithms as defined by ISO 24138. It is also a good reference for porting ISCC to other programming languages.

!!! tip This is a low-level reference implementation that does not include features like mediatype detection, metadata extraction, or file-format-specific content extraction. Please have a look at iscc-sdk, which adds those higher-level features on top of the iscc-core library.

Implementors Guide

Reproducible Environment

For a reproducible installation of the reference implementation, we include a poetry.lock file with pinned dependencies. Install them using Python Poetry by running poetry install in the root folder.

Repository structure

iscc-core
├── docs       # Markdown and other assets for mkdocs documentation
├── examples   # Example scripts using the reference code
├── iscc_core  # Actual source code of the reference implementation
├── tests      # Tests for the reference implementation
└── tools      # Development tools

Testing & Conformance

The reference implementation comes with 100% test coverage. To run the conformance selftest from the repository root use poetry run python -m iscc_core. To run the complete test suite use poetry run pytest.

To build a conformant implementation, work through the following top-level entrypoint functions:

gen_meta_code_v0
gen_text_code_v0
gen_image_code_v0
gen_audio_code_v0
gen_video_code_v0
gen_mixed_code_v0
gen_data_code_v0
gen_instance_code_v0
gen_iscc_code_v0

The corresponding test vectors can be found in iscc_core/data.json.
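A port could check itself against the vectors with a small harness like the sketch below. It rests on an assumption not stated above: that the JSON maps each function name to named test cases, each with an `inputs` list and an `outputs` object. Check the file itself for the actual layout and for how binary inputs are encoded. `run_vectors`, `gen_demo_code_v0`, and the demo vector are hypothetical stand-ins so the sketch runs without iscc-core. They are not real ISCC functions or data.

```python
import json


def run_vectors(vectors: dict, functions: dict) -> list:
    """Return the names of failing cases as 'function/test' strings."""
    failures = []
    for func_name, tests in vectors.items():
        func = functions[func_name]
        for test_name, case in tests.items():
            result = func(*case["inputs"])
            expected = case["outputs"]
            # Compare only the keys that the vector specifies.
            if any(result.get(key) != value for key, value in expected.items()):
                failures.append(f"{func_name}/{test_name}")
    return failures


# Hypothetical stand-in for a ported entrypoint function (NOT a real ISCC algorithm).
def gen_demo_code_v0(text: str) -> dict:
    return {"iscc": "DEMO:" + text.upper()}


# Hand-made vector in the ASSUMED layout (NOT taken from data.json).
demo_vectors = json.loads(
    """
    {
      "gen_demo_code_v0": {
        "test_0001_ok": {"inputs": ["abc"], "outputs": {"iscc": "DEMO:ABC"}},
        "test_0002_bad": {"inputs": ["abc"], "outputs": {"iscc": "DEMO:XYZ"}}
      }
    }
    """
)

failed = run_vectors(demo_vectors, {"gen_demo_code_v0": gen_demo_code_v0})
print(failed)
```

In a real port, the `functions` mapping would point at your own implementations of the entrypoints listed above, and `vectors` would be loaded from the file with `json.load`.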

ISCC Architecture

ISCC MainTypes

Idx  Slug      Bits  Purpose
0    META      0000  Match on metadata similarity
1    SEMANTIC  0001  Match on semantic content similarity
2    CONTENT   0010  Match on perceptual content similarity
3    DATA      0011  Match on data similarity
4    INSTANCE  0100  Match on data identity
5    ISCC      0101  Composite of two or more components with common header
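As a rough illustration of the Bits column, the sketch below reads the MainType from an ISCC string using only the standard library. It assumes that the code after the `ISCC:` prefix is RFC 4648 base32 and that the MainType occupies the leading 4 bits of the decoded bytes. Both assumptions are consistent with the Quick Start output on this page, but they are not a substitute for the header parsing in iscc-core. The function name `main_type` is hypothetical.

```python
import base64

# Index-to-slug mapping copied from the MainTypes table above.
MAIN_TYPES = {
    0: "META",
    1: "SEMANTIC",
    2: "CONTENT",
    3: "DATA",
    4: "INSTANCE",
    5: "ISCC",
}


def main_type(iscc: str) -> str:
    """Return the MainType slug read from the first 4 bits of an ISCC."""
    body = iscc.split(":", 1)[-1].upper()
    if len(body) < 8:
        raise ValueError("ISCC string too short")
    # 8 base32 characters decode to exactly 5 bytes, so no padding is needed.
    first_byte = base64.b32decode(body[:8])[0]
    return MAIN_TYPES[first_byte >> 4]


print(main_type("ISCC:AAAT4EBWK27737D2"))
print(main_type("ISCC:IAA3Y7HR2FEZCU4N"))
```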

Installation

Use the package manager pip to install iscc-core as a library.

pip install iscc-core

Quick Start

import json
import iscc_core as ic

meta_code = ic.gen_meta_code(name="ISCC Test Document!")

print(f"Meta-Code:     {meta_code['iscc']}")
print(f"Structure:     {ic.iscc_explain(meta_code['iscc'])}\n")

# Extract text from file
with open("demo.txt", "rt", encoding="utf-8") as stream:
    text = stream.read()
    text_code = ic.gen_text_code_v0(text)
    print(f"Text-Code:     {text_code['iscc']}")
    print(f"Structure:     {ic.iscc_explain(text_code['iscc'])}\n")

# Process raw bytes of textfile
with open("demo.txt", "rb") as stream:
    data_code = ic.gen_data_code(stream)
    print(f"Data-Code:     {data_code['iscc']}")
    print(f"Structure:     {ic.iscc_explain(data_code['iscc'])}\n")

    stream.seek(0)
    instance_code = ic.gen_instance_code(stream)
    print(f"Instance-Code: {instance_code['iscc']}")
    print(f"Structure:     {ic.iscc_explain(instance_code['iscc'])}\n")

# Combine ISCC-UNITs into ISCC-CODE
iscc_code = ic.gen_iscc_code(
    (meta_code["iscc"], text_code["iscc"], data_code["iscc"], instance_code["iscc"])
)

# Create convenience `Code` object from ISCC string
iscc_obj = ic.Code(iscc_code["iscc"])
print(f"ISCC-CODE:     {ic.iscc_normalize(iscc_obj.code)}")
print(f"Structure:     {iscc_obj.explain}")
print(f"Multiformat:   {iscc_obj.mf_base32}\n")

# Compare with changed ISCC-CODE:
new_dc, new_ic = ic.Code.rnd(mt=ic.MT.DATA), ic.Code.rnd(mt=ic.MT.INSTANCE)
new_iscc = ic.gen_iscc_code((meta_code["iscc"], text_code["iscc"], new_dc.uri, new_ic.uri))
print(f"Compare ISCC-CODES:\n{iscc_obj.uri}\n{new_iscc['iscc']}")
print(json.dumps(ic.iscc_compare(iscc_obj.code, new_iscc["iscc"]), indent=2))

The output of this example is as follows:

Meta-Code:     ISCC:AAAT4EBWK27737D2
Structure:     META-NONE-V0-64-3e103656bffdfc7a

Text-Code:     ISCC:EAAQMBEYQF6457DP
Structure:     CONTENT-TEXT-V0-64-060498817dcefc6f

Data-Code:     ISCC:GAA7UJMLDXHPPENG
Structure:     DATA-NONE-V0-64-fa258b1dcef791a6

Instance-Code: ISCC:IAA3Y7HR2FEZCU4N
Structure:     INSTANCE-NONE-V0-64-bc7cf1d14991538d

ISCC-CODE:     ISCC:KACT4EBWK27737D2AYCJRAL5Z36G76RFRMO4554RU26HZ4ORJGIVHDI
Structure:     ISCC-TEXT-V0-MCDI-3e103656bffdfc7a060498817dcefc6ffa258b1dcef791a6bc7cf1d14991538d
Multiformat:   bzqavabj6ca3fnp757r5ambeyqf6457dp7isywhoo66i2npd46hiutektru

Compare ISCC-CODES:
ISCC:KACT4EBWK27737D2AYCJRAL5Z36G76RFRMO4554RU26HZ4ORJGIVHDI
ISCC:KACT4EBWK27737D2AYCJRAL5Z36G7Y7HA2BMECKMVRBEQXR2BJOS6NA
{
  "meta_dist": 0,
  "content_dist": 0,
  "data_dist": 33,
  "instance_match": false
}
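In this output, the hex body of the ISCC-CODE is the four 64-bit unit digests concatenated in the order Meta, Content, Data, Instance. The sketch below splits the body back into those units and defines a bit-distance helper. It uses only the standard library, and the names `UNIT_HEX_LEN`, `units`, and `bit_distance` are illustrative assumptions. That the `*_dist` values are bit (Hamming) distances between the corresponding units is also an assumption here, though a value near 32 is what two unrelated 64-bit digests would typically give.

```python
UNIT_HEX_LEN = 16  # 64 bits are 16 hex characters

# Hex body taken from the "Structure" line of the ISCC-CODE above.
composite = "3e103656bffdfc7a060498817dcefc6ffa258b1dcef791a6bc7cf1d14991538d"

# Cut the body into consecutive 64-bit units.
units = [
    composite[i:i + UNIT_HEX_LEN]
    for i in range(0, len(composite), UNIT_HEX_LEN)
]


def bit_distance(hex_a: str, hex_b: str) -> int:
    """Count the differing bits between two hex-encoded digests."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


for label, unit in zip(("meta", "content", "data", "instance"), units):
    print(f"{label:<9}{unit}")
```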

Documentation

Documentation is published at https://core.iscc.codes

Development

Requirements

  • Python 3.7.2 or higher for code generation and static site building.
  • Poetry for installation and dependency management.

Development Setup

git clone https://github.com/iscc/iscc-core.git
cd iscc-core
poetry install

Development Tasks

Tests, coverage, code formatting and other tasks can be run with the poe command:

poe

Poe the Poet - A task runner that works well with poetry.
version 0.18.1

Result: No task specified.

USAGE
  poe [-h] [-v | -q] [--root PATH] [--ansi | --no-ansi] task [task arguments]

GLOBAL OPTIONS
  -h, --help     Show this help page and exit
  --version      Print the version and exit
  -v, --verbose  Increase command output (repeatable)
  -q, --quiet    Decrease command output (repeatable)
  -d, --dry-run  Print the task contents but don't actually run it
  --root PATH    Specify where to find the pyproject.toml
  --ansi         Force enable ANSI output
  --no-ansi      Force disable ANSI output
CONFIGURED TASKS
  gentests       Generate conformance test data
  format         Code style formating with black
  docs           Copy README.md to /docs
  format-md      Markdown formating with mdformat
  lf             Convert line endings to lf
  test           Run tests with coverage
  sec            Security check with bandit
  all

Use poe all to run all tasks before committing any changes.

Maintainers

@titusz

Contributing

Pull requests are welcome. For significant changes, please open an issue first to discuss your plans. Please make sure to update tests as appropriate.

You may also want to join our developer chat on Telegram at https://t.me/iscc_dev.
