Crackle 3D dense segmentation compression codec.
Crackle: Next gen. 3D segmentation compression codec.
crackle data.npy # creates data.ckl
crackle -d data.ckl # recovers data.npy
import crackle
import numpy as np
labels = np.load("example.npy") # a 2D or 3D dense segmentation
binary = crackle.compress(labels)
labels = crackle.decompress(binary)
# get unique labels without decompressing
uniq = crackle.labels(binary)
# Remap labels without decompressing. Could
# be useful for e.g. proofreading.
remapped = crackle.remap(
    binary, { 1: 2, 2: 3, ... },
    preserve_missing_labels=True
)
# for working with files
# if .gz is appended to the filename, the file will be
# automatically gzipped (or ungzipped)
crackle.save(labels, "example.ckl.gz")
labels = crackle.load("example.ckl.gz")
arr = crackle.CrackleArray(binary)
res = arr[:10,:10,:10] # array slicing (efficient z ranges)
20 in arr # highly efficient search
This repository is currently experimental.
Crackle is a new codec inspired by Compresso [1] for creating highly compressed 3D dense segmentation images. Compresso innovated by separating labels from boundary structures. There were conceptually four (but really five) elements in the format: header, labels, bit packed and RLE encoded binary image boundaries, and indeterminate boundary locations.
Crackle improves upon Compresso by replacing the bit-packed boundary map with a "crack code" and by using 3D information to reduce redundancy in labels via "pins". Like Compresso, Crackle uses a two-pass compression strategy in which the output of Crackle may be further compressed with a bitstream compressor such as gzip, bzip2, zstd, or lzma.
Based on benchmarks, the output of Crackle is likely to be roughly 20% to 50% of the size of the equivalent Compresso output. After the second compression stage, the Crackle file will likely be about 60% to 85% of the size of the equivalent Compresso file.
Boundary Structure: Crack Code
Our different approach is partially inspired by the work of Zingaretti et al. [2]. We represent the boundary not by border voxels, but by a "crack code" that represents the edges between voxels. This code can be thought of as directions to draw edges on a graph where the vertices are where the corners of four pixels touch and the edges are the cracks in between them.
Since this regular graph is 4-connected, each "move" in a cardinal direction can be described using two bits. To represent special symbols such as "branch" and "terminate", an impossible pair of instructions on an undirected graph, such as "left-right" or "up-down", can be used (occupying 4 bits). To avoid creating palindromic sequences such as (3, 0, 3), which means (down, branch) but can also be read as (terminate, down), we can use the left-right impossible directions to rewrite it as (3, 2, 1).
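To make the two-bits-per-move idea concrete, here is a minimal sketch that packs a list of move codes into bytes and unpacks them again. The direction numbering, the bit order within a byte, and the function names `pack_moves` and `unpack_moves` are assumptions made for illustration; they are not taken from the Crackle format itself.

```python
# Hypothetical direction numbering, chosen only for this illustration.
UP, RIGHT, LEFT, DOWN = 0, 1, 2, 3

def pack_moves(moves):
    """Pack 2-bit move codes into bytes, four moves per byte, lowest bits first."""
    out = bytearray((len(moves) + 3) // 4)
    for i, move in enumerate(moves):
        if not 0 <= move <= 3:
            raise ValueError(f"move {move} does not fit in two bits")
        out[i // 4] |= move << (2 * (i % 4))
    return bytes(out)

def unpack_moves(data, count):
    """Recover `count` 2-bit move codes from packed bytes."""
    return [(data[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(count)]

# (DOWN, LEFT, RIGHT) mirrors the (3, 2, 1) sequence discussed above.
moves = [DOWN, LEFT, RIGHT, RIGHT, UP]
packed = pack_moves(moves)
restored = unpack_moves(packed, len(moves))
```

Five moves fit in two bytes here, compared with five bytes if each move were stored in its own byte. The move count has to be stored separately because the final byte may be only partly used.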
While the image is 3D, we treat the image in layers because working in 3D introduces a large increase in geometric complexity (a cube has 6 faces, 12 edges, and 8 corners while a square has 4 edges and 4 corners). This increase in complexity would inflate the size of the crack code and make the implementation more difficult.
Label Map: Method of Pins
Each 2D connected components labeling (CCL) region must have a label assigned. Because the crack code is 2D, we cannot use 3D CCL. As a result, a solid cube of height 100, for example, would need 100 labels to represent the same color on every slice, as in Compresso.
It is still possible to reduce the amount of redundant information even without 3D CCL. For each label, we find a set of vertical line segments ("pins") that fully cover the label's 2D CCL regions. Sharp readers may note that this is the NP-hard set cover problem.
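Because exact set cover is NP-hard, a practical encoder would likely settle for an approximation. The sketch below uses the standard greedy heuristic for set cover (repeatedly pick the pin that covers the most still-uncovered regions). It is an illustration of the problem, not a description of how Crackle selects pins; the function name `greedy_pin_cover` and the toy region and pin identifiers are hypothetical.

```python
def greedy_pin_cover(regions, candidate_pins):
    """
    Greedy set cover approximation.

    regions: set of 2D CCL region ids belonging to one label.
    candidate_pins: dict mapping a pin id to the set of region ids it passes through.
    Returns the list of chosen pin ids, in the order they were picked.
    """
    uncovered = set(regions)
    chosen = []
    while uncovered:
        # Pick the pin that covers the most regions not yet covered.
        best = max(candidate_pins, key=lambda p: len(candidate_pins[p] & uncovered))
        gained = candidate_pins[best] & uncovered
        if not gained:
            raise ValueError("remaining regions cannot be covered by any candidate pin")
        chosen.append(best)
        uncovered -= gained
    return chosen

# Toy example: one label split into five 2D regions across four z-slices.
regions = {"z0", "z1", "z2", "z3a", "z3b"}
candidates = {
    "pin_a": {"z0", "z1", "z2", "z3a"},
    "pin_b": {"z2", "z3b"},
    "pin_c": {"z0", "z1"},
}
pins = greedy_pin_cover(regions, candidates)
```

In this toy case two pins cover all five regions, so only two label entries are needed instead of five.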
Once a reasonably small or minimal set of pins is found, they can be encoded in two forms:
Condensed Form: [label][num_pins][pin_1][pin_2]...[pin_N]
Fixed Width Form: [label][pin_1][label][pin_2]...[label][pin_N]
Pin Format: [linear index of pin top][number of voxels to bottom]
Fixed width example with label 1 with a pin between (1,1,1) and (1,1,5) on a 10x10x10 image: [1][111][4]
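The fixed width example can be reproduced with a short sketch. It assumes the linear index is computed with x varying fastest, that is `x + sx * (y + sy * z)`; the source does not state the ordering, and on this cubic image with a pin top at (1,1,1) either ordering gives 111. The helper names `encode_pin` and `decode_pin` are hypothetical.

```python
def encode_pin(label, top, bottom, shape):
    """Return (label, linear index of pin top, number of voxels to bottom)."""
    sx, sy, sz = shape
    x, y, z0 = top
    x2, y2, z1 = bottom
    if (x, y) != (x2, y2) or z1 < z0:
        raise ValueError("a pin must be a vertical segment with top above bottom")
    index = x + sx * (y + sy * z0)  # assumed x-fastest ordering
    return (label, index, z1 - z0)

def decode_pin(pin, shape):
    """Invert encode_pin: return (label, top, bottom)."""
    sx, sy, sz = shape
    label, index, depth = pin
    x = index % sx
    y = (index // sx) % sy
    z0 = index // (sx * sy)
    return (label, (x, y, z0), (x, y, z0 + depth))

pin = encode_pin(1, (1, 1, 1), (1, 1, 5), (10, 10, 10))
roundtrip = decode_pin(pin, (10, 10, 10))
```

Here `pin` is `(1, 111, 4)`, matching the `[1][111][4]` example above.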
An alternative formulation, [label][idx1][idx2], was shown in an experiment on connectomics.npy.cpso to compress slightly worse than Compresso labels. However, this alternative formulation theoretically allows arbitrary pin orientations and so might be useful for reducing the overall number of pins.
The condensed format is a bit smaller than the fixed width format, but the fixed width format enables rapid searches if the set of pins is sorted. Sorting by label enables a fast "label in file" check. Sorting by top index is likely more useful, because it helps filter candidate pins when performing random access to a z-slice.
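As an illustration of why sorted fixed width records allow rapid searches, the sketch below performs a binary search for a label over pins stored as `(label, index, depth)` tuples sorted by label. It works on an in-memory Python list rather than on the actual Crackle byte layout, and `contains_label` is a hypothetical helper, not part of the crackle API.

```python
import bisect

def contains_label(sorted_pins, label):
    """Binary search for `label` in pins sorted by their first field."""
    # The 1-tuple (label,) sorts before every (label, index, depth) record,
    # so bisect_left lands on the first record whose label is >= the target.
    i = bisect.bisect_left(sorted_pins, (label,))
    return i < len(sorted_pins) and sorted_pins[i][0] == label

pins = sorted([(7, 350, 2), (1, 111, 4), (4, 25, 9), (1, 530, 1)])
found = contains_label(pins, 4)
missing = contains_label(pins, 5)
```

Because every record has the same width, the same idea carries over to a flat byte buffer: the record at position i starts at a fixed multiple of the record size, so a lookup needs only O(log N) comparisons.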
References
1. Matejek, B., Haehn, D., Lekschas, F., Mitzenmacher, M., Pfister, H., 2017. Compresso: Efficient Compression of Segmentation Data for Connectomics, in: Descoteaux, M., Maier-Hein, L., Franz, A., Jannin, P., Collins, D.L., Duchesne, S. (Eds.), Medical Image Computing and Computer Assisted Intervention − MICCAI 2017, Lecture Notes in Computer Science. Springer International Publishing, Cham, pp. 781–788. https://doi.org/10.1007/978-3-319-66182-7_89
2. Zingaretti, P., Gasparroni, M., Vecci, L., 1998. Fast chain coding of region boundaries. IEEE Transactions on Pattern Analysis and Machine Intelligence 20, 407–415. https://doi.org/10.1109/34.677272
3. Freeman, H., 1974. Computer Processing of Line-Drawing Images. ACM Comput. Surv. 6, 57–97. https://doi.org/10.1145/356625.356627