Search, hash, sort, and process strings faster via SWAR and SIMD

StringZilla 🦖

Strings are the first fundamental data type every programming language implements in software rather than hardware, so dedicated CPU instructions are rare - and the few that exist are hardly ideal. That's why most languages lean on the C standard library (libc) for their string operations, which, despite its name, ships its hottest code in hand-tuned assembly. It does exploit SIMD, but it isn't perfect. 1️⃣ Even on ubiquitous hardware - over a billion 64-bit ARM CPUs - routines such as strstr and memmem top out at roughly one-third of available throughput. 2️⃣ SIMD coverage is uneven: fast forward scans don't guarantee speedy reverse searches, and hashing and case-mapping are not even part of the standard. 3️⃣ Many higher-level languages can't rely on libc at all because their strings aren't NUL-terminated - or may even contain embedded zeroes. That's why StringZilla exists: predictable, high performance on every modern platform, OS, and programming language.

StringZilla is the GodZilla of string libraries, using SIMD and SWAR to accelerate binary and UTF-8 string operations on modern CPUs and GPUs. It delivers up to 10x higher CPU throughput in C, C++, Rust, Python, and other languages, and can be 100x faster than existing GPU kernels, covering a broad range of functionality. It accelerates exact and fuzzy string matching, hashing, edit distance computations, sorting, provides allocation-free lazily-evaluated smart-iterators, and even random-string generators.

  • 🐂 C: Upgrade LibC's <string.h> to <stringzilla/stringzilla.h> in C 99
  • 🐉 C++: Upgrade STL's <string> to <stringzilla/stringzilla.hpp> in C++ 11
  • 🧮 CUDA: Process in-bulk with <stringzillas/stringzillas.cuh> in CUDA C++ 17
  • 🐍 Python: Upgrade your str to faster Str
  • 🦀 Rust: Use the StringZilla traits crate
  • 🦫 Go: Use the StringZilla cGo module
  • 🍎 Swift: Use the String+StringZilla extension
  • 🟨 JavaScript: Use the StringZilla library
  • 🐚 Shell: Accelerate common CLI tools with sz- prefix
  • 📚 Researcher? Jump to Algorithms & Design Decisions
  • 💡 Thinking of contributing? Look for "good first issues"
  • 🤝 And check the guide to set up the environment
  • Want more bindings or features? Let me know!

Who is this for?

  • For data-engineers parsing large datasets, like the CommonCrawl, RedPajama, or LAION.
  • For software engineers optimizing strings in their apps and services.
  • For bioinformaticians and search engineers looking for edit-distances for USearch.
  • For DBMS devs, optimizing LIKE, ORDER BY, and GROUP BY operations.
  • For hardware designers, needing a SWAR baseline for string-processing functionality.
  • For students studying SIMD/SWAR applications to non-data-parallel operations.

Performance

Each workload lists the available C, C++, and Python baselines, followed by StringZilla. Bracketed numbers refer to the notes after the benchmarks.

Unicode case-folding, expanding characters like ß → ss
  • Python .casefold: x86: 0.4 GB/s
  • StringZilla sz.utf8_case_fold: x86: 1.3 GB/s

Unicode case-insensitive substring search
  • ICU icu.StringSearch: x86: 0.02 GB/s
  • StringZilla utf8_case_insensitive_find: x86: 3.0 GB/s

Find the first occurrence of a random word from text, ≅ 5 bytes long
  • C strstr [1]: x86: 7.4 · arm: 2.0 GB/s
  • C++ .find: x86: 2.9 · arm: 1.6 GB/s
  • Python .find: x86: 1.1 · arm: 0.6 GB/s
  • StringZilla sz_find: x86: 10.6 · arm: 7.1 GB/s

Find the last occurrence of a random word from text, ≅ 5 bytes long
  • C++ .rfind: x86: 0.5 · arm: 0.4 GB/s
  • Python .rfind: x86: 0.9 · arm: 0.5 GB/s
  • StringZilla sz_rfind: x86: 10.8 · arm: 6.7 GB/s

Split lines separated by \n or \r [2]
  • C strcspn [1]: x86: 5.42 · arm: 2.19 GB/s
  • C++ .find_first_of: x86: 0.59 · arm: 0.46 GB/s
  • Python re.finditer: x86: 0.06 · arm: 0.02 GB/s
  • StringZilla sz_find_byteset: x86: 4.08 · arm: 3.22 GB/s

Find the last occurrence of any of 6 whitespaces [2]
  • C++ .find_last_of: x86: 0.25 · arm: 0.25 GB/s
  • StringZilla sz_rfind_byteset: x86: 0.43 · arm: 0.23 GB/s

Random string from a given alphabet, 20 bytes long [3]
  • C rand() % n: x86: 18.0 · arm: 9.4 MB/s
  • C++ uniform_int_distribution: x86: 47.2 · arm: 20.4 MB/s
  • Python join(random.choices(x)): x86: 13.3 · arm: 5.9 MB/s
  • StringZilla sz_fill_random: x86: 56.2 · arm: 25.8 MB/s

Mapping characters with lookup table transforms
  • C++ std::transform: x86: 3.81 · arm: 2.65 GB/s
  • Python str.translate: x86: 260.0 · arm: 140.0 MB/s
  • StringZilla sz_lookup: x86: 21.2 · arm: 8.5 GB/s

Get sorted order, ≅ 8 million English words [4]
  • C qsort_r: x86: 3.55 · arm: 5.77 s
  • C++ std::sort: x86: 2.79 · arm: 4.02 s
  • Python numpy.argsort: x86: 7.58 · arm: 13.00 s
  • StringZilla sz_sequence_argsort: x86: 1.91 · arm: 2.37 s

Levenshtein edit distance, text lines ≅ 100 bytes long
  • Python, via NLTK [5] and CuDF: x86: 1,615,306 · arm: 1,349,980 · cuda: 6,532,411,354 CUPS
  • StringZilla szs_levenshtein_distances_t: x86: 3,434,427,548 · arm: 1,605,340,403 · cuda: 93,662,026,653 CUPS

Needleman-Wunsch alignment scores, proteins ≅ 1 K amino acids long
  • Python, via biopython [6]: x86: 575,981,513 · arm: 436,350,732 CUPS
  • StringZilla szs_needleman_wunsch_scores_t: x86: 452,629,942 · arm: 520,170,239 · cuda: 9,017,327,818 CUPS

Most StringZilla modules ship ready-to-run benchmarks for C, C++, Python, and more. Grab them from ./scripts, and see CONTRIBUTING.md for instructions. On CPUs that permit misaligned loads, even the 64-bit SWAR baseline outruns both libc and the STL. For wider head-to-heads against Rust and Python favorites, browse the StringWars repository. To inspect collision resistance and distribution shapes for our hashers, see HashEvals.
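To make the SWAR term (SIMD Within A Register) more concrete, here is a minimal sketch of a byte search that tests eight haystack bytes per step using ordinary 64-bit integer arithmetic. It is pure Python, so it only illustrates the bit trick and will not be fast; the name `swar_find_byte` is invented for this example and is not a StringZilla function.

```python
ONES = 0x0101010101010101    # 0x01 repeated in every byte lane
HIGHS = 0x8080808080808080   # 0x80 repeated in every byte lane
MASK64 = 0xFFFFFFFFFFFFFFFF  # emulate 64-bit wrap-around in Python


def swar_find_byte(haystack: bytes, needle: int) -> int:
    """Return the index of the first `needle` byte (0..255), or -1."""
    pattern = needle * ONES  # broadcast the needle into all 8 lanes
    full = len(haystack) - len(haystack) % 8
    for offset in range(0, full, 8):
        word = int.from_bytes(haystack[offset:offset + 8], "little")
        diff = word ^ pattern  # matching lanes become 0x00
        # Classic "has zero byte" test: the lowest set bit marks the
        # first lane of `diff` that is zero.
        zeros = ((diff - ONES) & MASK64) & (~diff & MASK64) & HIGHS
        if zeros:
            lowest = zeros & -zeros
            return offset + (lowest.bit_length() - 1) // 8
    # Handle the tail that does not fill a whole 64-bit word.
    for offset in range(full, len(haystack)):
        if haystack[offset] == needle:
            return offset
    return -1
```

Only the lowest set bit is trusted, because a borrow from a matching lane can set spurious bits in higher lanes.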

Most benchmarks were conducted on a 1 GB English text corpus, with an average word length of 6 characters. The code was compiled with GCC 12, using glibc v2.35. The benchmarks were performed on Arm-based Graviton3 AWS c7g instances and r7iz Intel Sapphire Rapids. Most modern Arm-based 64-bit CPUs will have similar relative speedups. Variance within x86 CPUs will be larger. For CUDA benchmarks, the Nvidia H100 GPUs were used.

  1. Unlike other libraries, LibC requires strings to be NUL-terminated.
  2. The six whitespaces in the ASCII set are the space, \t, \n, \v, \f, and \r. Python's and other standard libraries have specialized functions for those.
  3. All modulo operations were conducted with uint8_t to allow compilers more optimization opportunities. The C++ STL and StringZilla benchmarks used a 64-bit Mersenne Twister as the generator. For C, C++, and StringZilla, an in-place update of the string was used. In Python every string had to be allocated as a new object, which makes it less fair.
  4. Contrary to popular opinion, Python's default sorted function works faster than the C and C++ standard libraries. That holds for large lists or tuples of strings, but fails as soon as you need more complex logic, like sorting dictionaries by a string key, or producing the "sorted order" permutation. The latter is very common in database engines and is most similar to numpy.argsort. The current StringZilla solution can be at least 4x faster without loss of generality.
  5. Most Python libraries for strings are also implemented in C.
  6. Unlike the rest of BioPython, the alignment score computation is implemented in C.

Functionality

StringZilla is compatible with most modern CPUs, and provides a broad range of functionality. It's split into 2 layers:

  1. StringZilla: single-header C library and C++ wrapper for high-performance string operations.
  2. StringZillas: parallel CPU/GPU backends used for large-batch operations and accelerators.

Having a second C++/CUDA layer greatly simplifies the implementation of similarity scoring and fingerprinting functions, which would otherwise require too much error-prone boilerplate code in pure C. Both layers are designed to be extremely portable:

  • across both little-endian and big-endian architectures.
  • across 32-bit and 64-bit hardware architectures.
  • across operating systems and compilers.
  • across ASCII and UTF-8 encoded inputs.

Not all features are available across all bindings. Consider contributing if you need a feature that's not yet implemented.

Maturity C C++ Python Rust JS Swift Go
Substring Search 🌳
Character Set Search 🌳
Sorting & Sequence Operations 🌳
Lazy Ranges, Compressed Arrays 🌳
One-Shot & Streaming Hashes 🌳
Cryptographic Hashes 🌳
Small String Class 🧐
Random String Generation 🌳
Unicode Case Folding 🧐
Case-Insensitive UTF-8 Search 🚧
TR29 Word Boundary Detection 🚧
Parallel Similarity Scoring 🌳
Parallel Rolling Fingerprints 🌳

🌳 parts are used in production. 🧐 parts are in beta. 🚧 parts are under active development, and are likely to break in subsequent releases. ✅ are implemented. ⚪ are considered. ❌ are not intended.

Quick Start: Python

Python bindings are available on PyPI for Python 3.8+, and can be installed with pip.

pip install stringzilla         # for serial algorithms
pip install stringzillas-cpus   # for parallel multi-CPU backends
pip install stringzillas-cuda   # for parallel Nvidia GPU backend

You can immediately check the installed version and the detected hardware capabilities with the following commands:

python -c "import stringzilla; print(stringzilla.__version__)"
python -c "import stringzillas; print(stringzillas.__version__)"
python -c "import stringzilla; print(stringzilla.__capabilities__)"     # for serial algorithms
python -c "import stringzillas; print(stringzillas.__capabilities__)"   # for parallel algorithms

Basic Usage

If you've ever used the Python str, bytes, bytearray, or memoryview classes, you'll know what to expect. StringZilla's Str class is a hybrid of the above, providing a str-like interface to byte arrays.

from stringzilla import Str, File

text_from_str = Str('some-string') # no copies, just a view
text_from_bytes = Str(b'some-array') # no copies, just a view
text_from_file = Str(File('some-file.txt')) # memory-mapped file

import numpy as np
alphabet_array = np.arange(ord("a"), ord("z") + 1, dtype=np.uint8) # the end is exclusive, so add 1 to include "z"
text_from_array = Str(memoryview(alphabet_array))

The File class memory-maps a file from persistent storage without loading its copy into RAM. The contents of that file would remain immutable, and the mapping can be shared by multiple Python processes simultaneously. A standard dataset pre-processing use case would be to map a sizable textual dataset like Common Crawl into memory, spawn child processes, and split the job between them.
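For readers new to memory mapping, the idea can be sketched with the standard-library `mmap` module. This is not how StringZilla's File is implemented; it only shows that a mapped file can be searched without first reading it into a Python bytes object. The helper `count_lines_mapped` and the temporary demo file are assumptions made for this example.

```python
import mmap
import os
import tempfile


def count_lines_mapped(path: str) -> int:
    """Count newline bytes in a file through a read-only memory mapping."""
    with open(path, "rb") as handle:
        if os.fstat(handle.fileno()).st_size == 0:
            return 0  # mmap cannot map an empty file
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            count = 0
            position = mapped.find(b"\n")
            while position != -1:
                count += 1
                position = mapped.find(b"\n", position + 1)
            return count


# Small demo on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"first\nsecond\nthird\n")
demo_path = tmp.name
demo_lines = count_lines_mapped(demo_path)
os.remove(demo_path)
```

Because the mapping is read-only and backed by the page cache, several processes can map the same file and each scan a different byte range.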

Basic Operations

  • Length: len(text) -> int
  • Indexing: text[42] -> str
  • Slicing: text[42:46] -> Str
  • Substring check: 'substring' in text -> bool
  • Hashing: hash(text) -> int
  • String conversion: str(text) -> str

Advanced Operations

import sys

x: bool = text.contains('substring', start=0, end=sys.maxsize)
x: int = text.find('substring', start=0, end=sys.maxsize)
x: int = text.count('substring', start=0, end=sys.maxsize, allowoverlap=False)
x: str = text.decode(encoding='utf-8', errors='strict')
x: Strs = text.split(separator=' ', maxsplit=sys.maxsize, keepseparator=False)
x: Strs = text.rsplit(separator=' ', maxsplit=sys.maxsize, keepseparator=False)
x: Strs = text.splitlines(keeplinebreaks=False, maxsplit=sys.maxsize)

It's important to note that the last function's behavior is slightly different from Python's str.splitlines. The native version matches \n, \r, \v or \x0b, \f or \x0c, \x1c, \x1d, \x1e, \x85, \r\n, \u2028, \u2029, including 3x two-byte-long runes. The StringZilla version matches only \n, \v, \f, \r, \x1c, \x1d, \x1e, \x85, avoiding two-byte-long runes.

Character Set Operations

Python strings don't natively support character set operations. This forces people to use regular expressions, which are slow and hard to read. To avoid the need for re.finditer, StringZilla provides the following interfaces:

x: int = text.find_first_of('chars', start=0, end=sys.maxsize)
x: int = text.find_last_of('chars', start=0, end=sys.maxsize)
x: int = text.find_first_not_of('chars', start=0, end=sys.maxsize)
x: int = text.find_last_not_of('chars', start=0, end=sys.maxsize)
x: Strs = text.split_byteset(separator='chars', maxsplit=sys.maxsize, keepseparator=False)
x: Strs = text.rsplit_byteset(separator='chars', maxsplit=sys.maxsize, keepseparator=False)
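For comparison, this is roughly what the regular-expression workaround looks like with the standard library alone. The helper names mirror the methods above but are defined locally for this sketch; they say nothing about how StringZilla implements them.

```python
import re


def find_first_of(text: str, chars: str) -> int:
    """Index of the first character of `text` that is in `chars`, else -1."""
    if not chars:
        return -1
    match = re.search("[" + re.escape(chars) + "]", text)
    return match.start() if match else -1


def find_first_not_of(text: str, chars: str) -> int:
    """Index of the first character of `text` that is not in `chars`, else -1."""
    if not chars:
        return 0 if text else -1
    match = re.search("[^" + re.escape(chars) + "]", text)
    return match.start() if match else -1
```

Every call has to build and match a character class, which is the overhead a dedicated byte-set search avoids.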

StringZilla also provides string trimming functions and random string generation:

x: str = text.lstrip('chars')  # Strip leading characters
x: str = text.rstrip('chars')  # Strip trailing characters
x: str = text.strip('chars')   # Strip both ends
x: bytes = sz.random(length=100, seed=42, alphabet='ACGT')  # Random string generation
sz.fill_random(buffer, seed=42, alphabet=None)  # Fill mutable buffer with random bytes

You can also transform the string using Look-Up Tables (LUTs), mapping it to a different character set. This would result in a copy - str for str inputs and bytes for other types.

x: str = text.translate('chars', {}, start=0, end=sys.maxsize, inplace=False)
x: bytes = text.translate(b'chars', {}, start=0, end=sys.maxsize, inplace=False)

For efficiency reasons, pass the LUT as a string or bytes object, not as a dictionary. This can be useful in high-throughput applications dealing with binary data, including bioinformatics and image processing. Here is an example:

import stringzilla as sz
look_up_table = bytes(range(256)) # Identity LUT
image = bytearray(open("/image/path.jpeg", "rb").read()) # mutable buffer, so the in-place update has somewhere to write
sz.translate(image, look_up_table, inplace=True)
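A 256-byte look-up table is the same shape of argument that Python's built-in `bytes.translate` accepts, so the idea can be tried with the standard library first. The sketch below builds a DNA complement table; the `reverse_complement` helper is an illustration for this section, not a StringZilla function.

```python
# Start from the identity table and override only the nucleotide letters.
complement_table = bytearray(range(256))
for source, target in zip(b"ACGTacgt", b"TGCAtgca"):
    complement_table[source] = target
complement_table = bytes(complement_table)


def reverse_complement(sequence: bytes) -> bytes:
    """Map every byte through the table, then reverse the result."""
    return sequence.translate(complement_table)[::-1]
```

Bytes that are not in the table's overridden range, such as `N`, pass through unchanged because the table starts as the identity mapping.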

Hash

Single-shot and incremental hashing are both supported:

import stringzilla as sz

# One-shot - stable 64-bit output across all platforms!
one = sz.hash(b"Hello, world!", seed=42)

# Incremental updates return itself; digest does not consume state
hasher = sz.Hasher(seed=42)
hasher.update(b"Hello, ").update(b"world!")
streamed = hasher.digest() # or `hexdigest()` for a string
assert one == streamed

SHA-256 Checksums

SHA-256 cryptographic checksums are also available for single-shot and incremental hashing:

import stringzilla as sz

# One-shot SHA-256
digest_bytes = sz.sha256(b"Hello, world!")
assert len(digest_bytes) == 32

# Incremental SHA-256
hasher = sz.Sha256()
hasher.update(b"Hello, ").update(b"world!")
digest_bytes = hasher.digest()
digest_hex = hasher.hexdigest()  # 64 character lowercase hex string

# HMAC-SHA256 for message authentication
mac = sz.hmac_sha256(key=b"secret", message=b"Hello, world!")
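SHA-256 and HMAC-SHA256 are standardized, so any conforming implementation should produce the same bytes for the same input. A quick way to cross-check results is to compute reference values with the standard library, as sketched below; only `hashlib` and `hmac` are used here, and comparing against the `sz` calls above is left to the reader.

```python
import hashlib
import hmac

message = b"Hello, world!"

# One-shot reference digest: 32 raw bytes.
reference_digest = hashlib.sha256(message).digest()

# The same input fed in two pieces must yield the same digest.
streaming = hashlib.sha256()
streaming.update(b"Hello, ")
streaming.update(b"world!")
streamed_digest = streaming.digest()

# Reference HMAC-SHA256 tag for the key and message used above.
reference_mac = hmac.new(b"secret", message, hashlib.sha256).digest()
```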

StringZilla integrates seamlessly with memory-mapped files for efficient large file processing. The traditional approach with hashlib:

import hashlib

with open("xlsum.csv", "rb") as streamed_file:
    hasher = hashlib.sha256()
    while chunk := streamed_file.read(4096):
        hasher.update(chunk)
    checksum = hasher.hexdigest()

Can be simplified with StringZilla:

from stringzilla import Sha256, File

mapped_file = File("xlsum.csv")
checksum = Sha256().update(mapped_file).hexdigest()

Both output the same digest: 7278165ce01a4ac1e8806c97f32feae908036ca3d910f5177d2cf375e20aeae1. OpenSSL (powering hashlib) has faster Assembly kernels, but StringZilla avoids file I/O overhead with memory mapping and skips Python's abstraction layers:

  • OpenSSL-backed hashlib.sha256: 12.6s
  • StringZilla end-to-end: 4.0s — 3× faster!

Unicode Case-Folding and Case-Insensitive Search

StringZilla implements both Unicode Case Folding and Case-Insensitive UTF-8 Search. Unlike most libraries, which can only lower-case the ASCII-represented English alphabet, StringZilla covers over 1M codepoints. The case-folding API expects the output buffer to be at least 3× larger than the input, to accommodate worst-case character expansions.

import stringzilla as sz

sz.utf8_case_fold('HELLO')      # b'hello'
sz.utf8_case_fold('Straße')     # b'strasse': ß (1 char) expands to "ss" (2 chars)
sz.utf8_case_fold('eﬃcient')    # b'efficient': ﬃ ligature (1 char) expands to "ffi" (3 chars)
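The expansions can be reproduced with Python's built-in `str.casefold`, which also applies full Unicode case folding. The sketch below measures how much the UTF-8 encoding grows; the `expansion_ratio` helper is defined only for this example, and U+0390 is picked as one codepoint that triples in size.

```python
def expansion_ratio(text: str) -> float:
    """UTF-8 size of the case-folded text divided by the original size."""
    return len(text.casefold().encode("utf-8")) / len(text.encode("utf-8"))


folded_street = "Straße".casefold()              # ß becomes "ss"
folded_ligature = "e\ufb03cient".casefold()      # the ffi ligature U+FB03 becomes "ffi"
worst_case = expansion_ratio("\u0390")           # 2 bytes in, 6 bytes out
```

Note that ß itself does not grow in bytes (2 bytes in, 2 bytes out), while U+0390 folds into three 2-byte codepoints, which is the kind of case a 3× output buffer covers.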

The case-insensitive search returns the byte offset of the match, handling expansions correctly.

import stringzilla as sz

sz.utf8_case_insensitive_find('Der große Hund', 'GROSSE')   # 4: finds "große" at byte offset 4
sz.utf8_case_insensitive_find('Straße', 'STRASSE')          # 0: ß matches "SS"
sz.utf8_case_insensitive_find('eﬃcient', 'EFFICIENT')       # 0: ﬃ ligature matches "FFI"

# Iterator for finding ALL matches
haystack = 'Straße STRASSE strasse'
for match in sz.utf8_case_insensitive_find_iter(haystack, 'strasse'):
    print(match, match.offset_within(haystack))  # Yields: 'Straße', 'STRASSE', 'strasse'

# With overlapping matches
list(sz.utf8_case_insensitive_find_iter('aaaa', 'aa'))  # ['aa', 'aa'] — 2 non-overlapping
list(sz.utf8_case_insensitive_find_iter('aaaa', 'aa', include_overlapping=True))  # 3 matches
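The difference between the two counts comes down to where the search resumes after a hit. The sketch below shows that with plain, case-sensitive `str.find`; the `find_all` generator is a stand-in written for this example and does not perform any case folding.

```python
from typing import Iterator


def find_all(haystack: str, needle: str, include_overlapping: bool = False) -> Iterator[int]:
    """Yield the start offsets of every match of `needle` in `haystack`."""
    if not needle:
        raise ValueError("needle must not be empty")
    # Resume one character later to allow overlaps, or past the whole match otherwise.
    step = 1 if include_overlapping else len(needle)
    position = haystack.find(needle)
    while position != -1:
        yield position
        position = haystack.find(needle, position + step)
```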

Collection-Level Operations

Once split into a Strs object, you can sort, shuffle, and reorganize the slices with minimal memory footprint. If all the chunks are located in consecutive memory regions, the memory overhead can be as low as 4 bytes per chunk.

lines: Strs = text.split(separator='\n') # 4 bytes per line overhead for under 4 GB of text
batch: Strs = lines.sample(seed=42) # 10x faster than `random.choices`
lines_shuffled: Strs = lines.shuffled(seed=42) # or shuffle all lines and shard with slices
lines_sorted: Strs = lines.sorted() # returns a new Strs in sorted order
order: tuple = lines.argsort() # similar to `numpy.argsort`
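One way to picture the low per-chunk overhead is a "tape": a single contiguous buffer plus an array of integer offsets, one per chunk. The sketch below is an assumption-laden illustration of that layout using the standard `array` module, not StringZilla's actual data structure; `build_offsets`, `chunk`, and `argsort` are names invented here.

```python
from array import array


def build_offsets(data: bytes, separator: int = ord("\n")) -> array:
    """Start offset of every chunk, plus one sentinel past the end."""
    offsets = array("I", [0])  # unsigned int, typically 4 bytes per entry
    for position, byte in enumerate(data):
        if byte == separator:
            offsets.append(position + 1)
    offsets.append(len(data) + 1)
    return offsets


def chunk(data: bytes, offsets: array, index: int) -> memoryview:
    """Zero-copy view of one chunk, excluding its trailing separator."""
    return memoryview(data)[offsets[index]:offsets[index + 1] - 1]


def argsort(data: bytes, offsets: array) -> list:
    """Permutation that would sort the chunks, similar to numpy.argsort."""
    count = len(offsets) - 1
    return sorted(range(count), key=lambda i: bytes(chunk(data, offsets, i)))


tape = b"pear\napple\nfig"
tape_offsets = build_offsets(tape)
tape_order = argsort(tape, tape_offsets)
```

With 32-bit offsets the addressable text is capped at 4 GB, which is why larger corpora need 8 bytes per chunk instead.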

Working on RedPajama, addressing 20 billion annotated English documents, one will need only 160 GB of RAM instead of terabytes. Once loaded, the data will be memory-mapped, and can be reused between multiple Python processes without copies. And of course, you can use slices to navigate the dataset and shard it between multiple workers.

lines[::3] # every third line
lines[1::2] # every odd line
lines[:-101:-1] # last 100 lines in reverse order

Iterators and Memory Efficiency

Python's operations like split() and readlines() immediately materialize a list of copied parts. This can be very memory-inefficient for large datasets. StringZilla saves a lot of memory by viewing existing memory regions as substrings, but even more memory can be saved by using lazily evaluated iterators.

x: SplitIterator[Str] = text.split_iter(separator=' ', keepseparator=False)
x: SplitIterator[Str] = text.rsplit_iter(separator=' ', keepseparator=False)
x: SplitIterator[Str] = text.split_byteset_iter(separator='chars', keepseparator=False)
x: SplitIterator[Str] = text.rsplit_byteset_iter(separator='chars', keepseparator=False)

StringZilla can easily be 10x more memory efficient than native Python classes for tokenization. With lazy operations, it practically becomes free.

import stringzilla as sz
%load_ext memory_profiler

text = open("enwik9.txt", "r").read() # 1 GB, mean word length 7.73 bytes
%memit text.split() # increment: 8670.12 MiB (152 ms)
%memit sz.split(text) # increment: 530.75 MiB (25 ms)
%memit sum(1 for _ in sz.split_iter(text)) # increment: 0.00 MiB
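The reason the lazy variant allocates almost nothing is that only one part exists at a time. A minimal generator with the same shape can be written in plain Python, as sketched below; `split_lazy` is a name made up for this example, and unlike StringZilla's views each yielded `str` slice is still a copy.

```python
from typing import Iterator


def split_lazy(text: str, separator: str = " ") -> Iterator[str]:
    """Yield the parts of `text` one at a time instead of building a list."""
    if not separator:
        raise ValueError("separator must not be empty")
    start = 0
    while True:
        end = text.find(separator, start)
        if end == -1:
            yield text[start:]  # last part, possibly empty
            return
        yield text[start:end]
        start = end + len(separator)
```

Counting tokens with `sum(1 for _ in split_lazy(text))` never holds more than one part in memory.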

Low-Level Python API

Aside from calling the methods on the Str and Strs classes, you can also call the global functions directly on str and bytes instances. Because the StringZilla CPython bindings are implemented without intermediate tools like SWIG or PyBind, the call latency should be similar to native classes.

import sys
import stringzilla as sz

contains: bool = sz.contains("haystack", "needle", start=0, end=sys.maxsize)
offset: int = sz.find("haystack", "needle", start=0, end=sys.maxsize)
count: int = sz.count("haystack", "needle", start=0, end=sys.maxsize, allowoverlap=False)

Similarity Scores

StringZilla exposes high-performance, batch-oriented similarity via the stringzillas module. Use DeviceScope to pick hardware and optionally limit capabilities per engine.

import stringzilla as sz
import stringzillas as szs

cpu_scope = szs.DeviceScope(cpu_cores=4)    # force CPU-only
gpu_scope = szs.DeviceScope(gpu_device=0)   # pick GPU 0 if available

strings_a = sz.Strs(["kitten", "flaw"])
strings_b = sz.Strs(["sitting", "lawn"])

strings_a = szs.to_device(strings_a) # optional ahead of time transfer
strings_b = szs.to_device(strings_b) # optional ahead of time transfer

engine = szs.LevenshteinDistances(
    match=0, mismatch=1,        # unit costs here, but costs don't have to be 1
    open=1, extend=1,           # may be different in Bio
    capabilities=("serial",)    # avoid SIMD 🤭
)
distances = engine(strings_a, strings_b, device=cpu_scope)
assert int(distances[0]) == 3 and int(distances[1]) == 2

Note that this computes byte-level distances. For UTF-8 codepoints, use a different engine class:

strings_a = sz.Strs(["café", "αβγδ"])
strings_b = sz.Strs(["cafe", "αγδ"])
engine = szs.LevenshteinDistancesUTF8(capabilities=("serial",))
distances = engine(strings_a, strings_b, device=cpu_scope)
assert int(distances[0]) == 1 and int(distances[1]) == 1
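The gap between byte-level and codepoint-level distances is easy to see with a reference implementation. Below is a minimal sketch of the textbook Wagner-Fischer dynamic program with unit costs; it works on both `str` (codepoints) and `bytes`, and is only a slow baseline for checking small inputs, not the algorithm StringZilla uses.

```python
def levenshtein(a, b) -> int:
    """Unit-cost edit distance between two sequences (str or bytes)."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (item_a != item_b),  # match or substitution
            ))
        previous = current
    return previous[-1]


codepoint_distance = levenshtein("café", "cafe")
byte_distance = levenshtein("café".encode("utf-8"), "cafe".encode("utf-8"))
```

Here `é` is two bytes in UTF-8, so replacing it with `e` costs one edit on codepoints but two on bytes.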

For alignment scoring provide a 256×256 substitution matrix using NumPy:

import numpy as np
import stringzilla as sz
import stringzillas as szs

substitution_matrix = np.zeros((256, 256), dtype=np.int8)
substitution_matrix.fill(-1)                # mismatch score
np.fill_diagonal(substitution_matrix, 0)    # match score

engine = szs.NeedlemanWunsch(substitution_matrix=substitution_matrix, open=1, extend=1)
scores = engine(strings_a, strings_b, device=cpu_scope)
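For small inputs, a pure-Python reference can help sanity-check alignment scores. The sketch below is a textbook Needleman-Wunsch global alignment score under a stated assumption: gaps use a single linear penalty that is subtracted from the score. How StringZilla interprets the sign and the affine `open`/`extend` split is not covered here, so treat the result as a baseline to compare against, not a specification.

```python
def needleman_wunsch_score(a: bytes, b: bytes, substitution, gap: int) -> int:
    """Global alignment score with a linear gap penalty (assumed subtracted)."""
    previous = [-gap * j for j in range(len(b) + 1)]
    for i, byte_a in enumerate(a, start=1):
        current = [-gap * i]
        for j, byte_b in enumerate(b, start=1):
            current.append(max(
                previous[j - 1] + substitution[byte_a][byte_b],  # align two bytes
                previous[j] - gap,                               # gap in `b`
                current[j - 1] - gap,                            # gap in `a`
            ))
        previous = current
    return previous[-1]


# Same shape as the matrix above: 0 on the diagonal, -1 elsewhere.
unit_matrix = [[0 if row == column else -1 for column in range(256)] for row in range(256)]
reference_score = needleman_wunsch_score(b"kitten", b"sitting", unit_matrix, gap=1)
```

With this matrix and a gap penalty of 1, the score is simply the negated unit-cost Levenshtein distance.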

Several Python libraries provide edit distance computation. Most are implemented in C but may be slower than StringZilla on large inputs. For proteins ~10k chars, 100 pairs:

Using the same proteins for Needleman-Wunsch alignment scores:

Example converting from BioPython to StringZilla:
import numpy as np
from Bio import Align
from Bio.Align import substitution_matrices

aligner = Align.PairwiseAligner()
aligner.substitution_matrix = substitution_matrices.load("BLOSUM62")
aligner.open_gap_score = 1
aligner.extend_gap_score = 1

# Convert the matrix to NumPy
subs_packed = np.array(aligner.substitution_matrix).astype(np.int8)
subs_reconstructed = np.zeros((256, 256), dtype=np.int8)

# Initialize all banned characters to the largest possible penalty
subs_reconstructed.fill(127)
for packed_row, packed_row_aminoacid in enumerate(aligner.substitution_matrix.alphabet):
    for packed_column, packed_column_aminoacid in enumerate(aligner.substitution_matrix.alphabet):
        reconstructed_row = ord(packed_row_aminoacid)
        reconstructed_column = ord(packed_column_aminoacid)
        subs_reconstructed[reconstructed_row, reconstructed_column] = subs_packed[packed_row, packed_column]

# Let's pick two examples of tripeptides (made of 3 amino acids)
glutathione = "ECG" # Need to rebuild human tissue?
thyrotropin_releasing_hormone = "QHP" # Or to regulate your metabolism?

import stringzilla as sz
import stringzillas as szs
engine = szs.NeedlemanWunsch(substitution_matrix=subs_reconstructed, open=1, extend=1)
score = int(engine(sz.Strs([glutathione]), sz.Strs([thyrotropin_releasing_hormone]))[0])
assert score == aligner.score(glutathione, thyrotropin_releasing_hormone) # Equal to 6

Rolling Fingerprints

MinHashing is a common technique for Information Retrieval, producing compact representations of large documents. For $D$ hash-functions and a text of length $L$, in the worst case it involves computing $O(D \cdot L)$ hashes.

import numpy as np
import stringzilla as sz
import stringzillas as szs

texts = sz.Strs([
    "quick brown fox jumps over the lazy dog",
    "quick brown fox jumped over a very lazy dog",
])

cpu = szs.DeviceScope(cpu_cores=4)
ndim = 1024
window_widths = np.array([4, 6, 8, 10], dtype=np.uint64)
engine = szs.Fingerprints(
    ndim=ndim,
    window_widths=window_widths,    # optional
    alphabet_size=256,              # default for byte strings
    capabilities=("serial",),       # defaults to all, can also pass a `DeviceScope`
)

hashes, counts = engine(texts, device=cpu)
assert hashes.shape == (len(texts), ndim)
assert counts.shape == (len(texts), ndim)
assert hashes.dtype == np.uint32 and counts.dtype == np.uint32
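To see where the $O(D \cdot L)$ cost comes from, here is a naive MinHash sketch in pure Python: each of the $D$ dimensions uses a differently salted hash and keeps the minimum over all $L$ windows. It recomputes every window hash from scratch with `hashlib.blake2b` instead of using a rolling hash, so it is a conceptual baseline only; the function names and parameters are assumptions for this example.

```python
import hashlib


def min_hash_signature(text: bytes, ndim: int = 16, window: int = 4) -> list:
    """One minimum per salted hash function, taken over all sliding windows."""
    window_count = max(1, len(text) - window + 1)
    signature = []
    for dimension in range(ndim):
        salt = dimension.to_bytes(8, "little")  # a distinct hash function per dimension
        smallest = min(
            int.from_bytes(
                hashlib.blake2b(text[start:start + window], digest_size=8, salt=salt).digest(),
                "little",
            )
            for start in range(window_count)
        )
        signature.append(smallest)
    return signature


def estimated_jaccard(first: list, second: list) -> float:
    """Fraction of dimensions on which two signatures agree."""
    return sum(x == y for x, y in zip(first, second)) / len(first)


signature_a = min_hash_signature(b"quick brown fox jumps over the lazy dog")
signature_b = min_hash_signature(b"quick brown fox jumped over a very lazy dog")
similarity = estimated_jaccard(signature_a, signature_b)
```

The fraction of matching minima estimates the Jaccard similarity of the two sets of windows, and it gets more precise as the number of dimensions grows.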

Serialization

Filesystem

Similar to how File can be used to read a large file, other interfaces can be used to dump strings to disk faster. The Str class has write_to to write the string to a file, and offset_within to obtain the integer offset of a substring view within a larger string for navigation.

web_archive = Str("<html>...</html><html>...</html>")
_, end_tag, next_doc = web_archive.partition("</html>") # or use `find`
next_doc_offset = next_doc.offset_within(web_archive)
next_doc.write_to("next_doc.html") # no GIL, no copies, just a view

PyArrow

A Str is easy to cast to PyArrow buffers.

from pyarrow import foreign_buffer
from stringzilla import Strs

strs = Strs(["alpha", "beta", "gamma"])
arrow = foreign_buffer(strs.address, strs.nbytes, strs)

And only slightly harder to convert in reverse direction:

import pyarrow as pa

arr = pa.Array.from_buffers(
    pa.large_string() if strs.offsets_are_large else pa.string(),
    len(strs),
    [None,
     pa.foreign_buffer(strs.offsets_address, strs.offsets_nbytes, strs),
     pa.foreign_buffer(strs.tape_address, strs.tape_nbytes, strs)],
)

That means you can convert Str to pyarrow.Buffer and Strs to pyarrow.Array without extra copies. For more details on the tape-like layouts, refer to the StringTape repository.

Quick Start: C/C++

The C library is header-only, so you can just copy the stringzilla.h header into your project. Same applies to C++, where you would copy the stringzilla.hpp header. Alternatively, add it as a submodule, and include it in your build system.

git submodule add https://github.com/ashvardanian/StringZilla.git external/stringzilla
git submodule update --init --recursive

Or using a pure CMake approach:

include(FetchContent)
FetchContent_Declare(
    stringzilla
    GIT_REPOSITORY https://github.com/ashvardanian/StringZilla.git
    GIT_TAG main  # or specify a version tag
)
FetchContent_MakeAvailable(stringzilla)

Last but not least, you can also install it as a library and link against it. This approach is worse for inlining, but brings dynamic runtime dispatch for the most advanced CPU features.

Basic Usage with C 99 and Newer

There is a stable C 99 interface, where all function names are prefixed with sz_. Most interfaces are well documented, and come with self-explanatory names and examples. In some cases, hardware-specific overloads are available, like sz_find_skylake or sz_find_neon. Both are companions of sz_find: the first targets x86 CPUs with AVX-512 support, the second Arm CPUs with NEON.

#include <stringzilla/stringzilla.h>

// Initialize your haystack and needle
sz_string_view_t haystack = {your_text, your_text_length};
sz_string_view_t needle = {your_subtext, your_subtext_length};

// Perform string-level operations auto-picking the backend or dispatching manually
sz_cptr_t ptr = sz_find(haystack.start, haystack.length, needle.start, needle.length);
sz_size_t substring_position = ptr ? (sz_size_t)(ptr - haystack.start) : SZ_SIZE_MAX; // SZ_SIZE_MAX if not found

// Backend-specific variants return pointers as well
sz_cptr_t ptr_skylake = sz_find_skylake(haystack.start, haystack.length, needle.start, needle.length);
sz_cptr_t ptr_haswell = sz_find_haswell(haystack.start, haystack.length, needle.start, needle.length);
sz_cptr_t ptr_westmere = sz_find_westmere(haystack.start, haystack.length, needle.start, needle.length);
sz_cptr_t ptr_neon = sz_find_neon(haystack.start, haystack.length, needle.start, needle.length);

// Hash strings at once
sz_u64_t hash = sz_hash(haystack.start, haystack.length, 42);    // 42 is the seed
sz_u64_t checksum = sz_bytesum(haystack.start, haystack.length); // or accumulate byte values

// Hash strings incrementally with "init", "update", and "digest":
sz_hash_state_t state;
sz_hash_state_init(&state, 42);
sz_hash_state_update(&state, haystack.start, 1);                        // first char
sz_hash_state_update(&state, haystack.start + 1, haystack.length - 1);  // rest of the string
sz_u64_t streamed_hash = sz_hash_state_digest(&state);

// SHA-256 cryptographic checksums
sz_u8_t digest[32];
sz_sha256_state_t sha_state;
sz_sha256_state_init(&sha_state);
sz_sha256_state_update(&sha_state, haystack.start, haystack.length);
sz_sha256_state_digest(&sha_state, digest);

// Perform collection level operations
sz_sequence_t array = {your_handle, your_count, your_get_start, your_get_length};
sz_sorted_idx_t order[your_count];
sz_sequence_argsort(&array, NULL, order); // NULL allocator uses default

Mapping from LibC to StringZilla

By design, StringZilla has a couple of notable differences from LibC:

  1. all strings are expected to have a length, and are not necessarily null-terminated.
  2. every operation has a reverse order counterpart.

That way, sz_find and sz_rfind play the roles of strstr and the non-standard strrstr. Similarly, sz_find_byte and sz_rfind_byte replace memchr and memrchr. The sz_find_byteset maps to strspn and strcspn, while sz_rfind_byteset has no sibling in LibC.

LibC Functionality → StringZilla Equivalents

  • memchr(haystack, needle, haystack_length), strchr → sz_find_byte(haystack, haystack_length, needle)
  • memrchr(haystack, needle, haystack_length) → sz_rfind_byte(haystack, haystack_length, needle)
  • memcmp, strcmp → sz_order, sz_equal
  • strlen(haystack) → sz_find_byte(haystack, haystack_length, needle)
  • strcspn(haystack, reject) → sz_find_byteset(haystack, haystack_length, reject_bitset)
  • strspn(haystack, accept) → sz_find_byte_not_from(haystack, haystack_length, accept, accept_length)
  • memmem(haystack, haystack_length, needle, needle_length), strstr → sz_find(haystack, haystack_length, needle, needle_length)
  • memcpy(destination, source, destination_length) → sz_copy(destination, source, destination_length)
  • memmove(destination, source, destination_length) → sz_move(destination, source, destination_length)
  • memset(destination, value, destination_length) → sz_fill(destination, destination_length, value)

Basic Usage with C++ 11 and Newer

There is a stable C++ 11 interface available in the ashvardanian::stringzilla namespace. It comes with two STL-like classes: string_view and string. The first is a non-owning view of a string, and the second is a mutable string with a Small String Optimization.

#include <stringzilla/stringzilla.hpp>

namespace sz = ashvardanian::stringzilla;

sz::string haystack = "some string";
sz::string_view needle = sz::string_view(haystack).substr(0, 4);

auto substring_position = haystack.find(needle); // Or `rfind`
auto hash = std::hash<sz::string_view>{}(haystack); // Compatible with STL's `std::hash`

haystack.end() - haystack.begin() == haystack.size(); // Or `rbegin`, `rend`
haystack.find_first_of(" \v\t") == 4; // Or `find_last_of`, `find_first_not_of`, `find_last_not_of`
haystack.starts_with(needle) == true; // Or `ends_with`
haystack.remove_prefix(needle.size()); // Why is this operation in-place?!
haystack.contains(needle) == true; // STL has this only from C++ 23 onwards
haystack.compare(needle) == 1; // Or `haystack <=> needle` in C++ 20 and beyond

StringZilla also provides string literals for automatic type resolution, similar to STL:

using sz::literals::operator""_sv;
using std::literals::operator""sv;

auto a = "some string"; // char const *
auto b = "some string"sv; // std::string_view
auto c = "some string"_sv; // sz::string_view

Unicode Case-Folding and Case-Insensitive Search

StringZilla implements both Unicode Case Folding and Case-Insensitive UTF-8 Search. Unlike most libraries, which can only lower-case the ASCII-represented English alphabet, StringZilla covers over 1M codepoints. The case-folding API expects the output buffer to be at least 3× larger than the input, to accommodate worst-case character expansions.

char source[] = "Straße";  // German: "Street"
char destination[64];      // Must be at least 3x source length
sz_size_t result_len = sz_utf8_case_fold(source, strlen(source), destination);
// destination now contains "strasse" (7 bytes), result_len = 7

The case-insensitive search API returns a pointer to the start of the first relevant glyph in the haystack, or NULL if not found. It outputs the length of the matched haystack substring in bytes, and accepts a metadata structure to speed up repeated searches for the same needle.

sz_utf8_case_insensitive_needle_metadata_t metadata = {};
sz_size_t match_length;
sz_cptr_t match = sz_utf8_case_insensitive_find(
    haystack, haystack_len,
    needle, needle_len,
    &metadata,      // Reuse for queries with the same needle
    &match_length   // Output: bytes consumed in haystack
);

The same functionality is available in C++:

namespace sz = ashvardanian::stringzilla;

sz::string_view text = "Hello World"; // Single search
auto [offset, length] = text.utf8_case_insensitive_find("HELLO");

sz::utf8_case_insensitive_needle pattern("hello"); // Repeated searches with pre-compiled pattern
for (auto const& haystack : haystacks)
    auto match = haystack.utf8_case_insensitive_find(pattern);

Similarity Scores

StringZilla exposes high-performance, batch-oriented similarity via the stringzillas/stringzillas.h header. Use szs_device_scope_t to pick hardware and optionally limit capabilities per engine.

#include <stringzillas/stringzillas.h>

szs_device_scope_t device = NULL;
szs_device_scope_init_default(&device);

szs_levenshtein_distances_t engine = NULL;
szs_levenshtein_distances_init(0, 1, 1, 1, /*alloc*/ NULL, /*caps*/ sz_cap_serial_k, &engine);

sz_sequence_u32tape_t strings_a {data_a, offsets_a, count}; // or `sz_sequence_u64tape_t` for large inputs
sz_sequence_u32tape_t strings_b {data_b, offsets_b, count}; // or `sz_sequence_t` to pass generic containers

sz_size_t distances[count];
szs_levenshtein_distances_u32tape(engine, device, &strings_a, &strings_b, distances, sizeof(distances[0]));

szs_levenshtein_distances_free(engine);
szs_device_scope_free(device);

To target a different device, use the appropriate szs_device_scope_init_{cpu_cores,gpu_device} function. When dealing with GPU backends, make sure to use the "unified memory" allocators exposed as szs_unified_{alloc,free}. Similar stable C ABIs are exposed for other workloads as well.

  • UTF-8: szs_levenshtein_distances_utf8_{sequence,u32tape,u64tape}
  • Needleman-Wunsch: szs_needleman_wunsch_scores_{sequence,u32tape,u64tape}
  • Smith-Waterman: szs_smith_waterman_scores_{sequence,u32tape,u64tape}

Moreover, in C++ codebases one can tap into the raw templates implementing that functionality, customizing them with custom executors, SIMD plugins, etc. For that include stringzillas/similarities.hpp for C++ and stringzillas/similarities.cuh for CUDA.

#include <stringzillas/similarities.hpp>
#include <stringzilla/types.hpp>       // tape of strings
#include <fork_union.hpp>              // optional thread pool

namespace sz = ashvardanian::stringzilla;
namespace szs = ashvardanian::stringzillas;

// Pack strings into an Arrow-like tape
std::vector<std::string> left = {"kitten", "flaw"};
std::vector<std::string> right = {"sitting", "lawn"};
sz::arrow_strings_tape<char, sz::size_t, std::allocator<char>> tape_a, tape_b;
auto _ = tape_a.try_assign(left.begin(), left.end());
auto _ = tape_b.try_assign(right.begin(), right.end());

// Run on the current thread
using levenshtein_t = szs::levenshtein_distances<char, szs::linear_gap_costs_t, std::allocator<char>, sz_cap_serial_k>;
levenshtein_t engine {szs::uniform_substitution_costs_t{0,1}, szs::linear_gap_costs_t{1}};
std::size_t distances[2];
auto _ = engine(tape_a, tape_b, distances);

// Or run in parallel with a pool
fork_union::basic_pool_t pool;
auto _ = pool.try_spawn(std::thread::hardware_concurrency());
auto _ = engine(tape_a, tape_b, distances, pool);

All of the potentially failing StringZillas' interfaces return error codes, and none raise C++ exceptions. Parallelism is enabled at both collection-level and within individual pairs of large inputs.

Rolling Fingerprints

StringZilla exposes parallel fingerprinting (Min-Hashes or Count-Min-Sketches) via the stringzillas/stringzillas.h header. Use szs_device_scope_t to pick hardware and optionally limit capabilities per engine.

#include <stringzillas/stringzillas.h>

szs_device_scope_t device = NULL;
szs_device_scope_init_default(&device);

szs_fingerprints_t engine = NULL;
sz_size_t const dims = 1024; sz_size_t const window_widths[] = {4, 6, 8, 10};
szs_fingerprints_init(dims, /*alphabet*/ 256, window_widths, 4, /*alloc*/ NULL, /*caps*/ sz_cap_serial_k, &engine);

sz_sequence_u32tape_t texts = {data, offsets, count};
sz_u32_t *min_hashes = (sz_u32_t*)szs_unified_alloc(count * dims * sizeof(*min_hashes));
sz_u32_t *min_counts = (sz_u32_t*)szs_unified_alloc(count * dims * sizeof(*min_counts));
szs_fingerprints_u32tape(engine, device, &texts,
    min_hashes, dims * sizeof(*min_hashes),     // support strided matrices
    min_counts, dims * sizeof(*min_counts));    // for both output arguments

szs_fingerprints_free(engine);
szs_device_scope_free(device);

Moreover, in C++ codebases one can tap into the raw templates implementing that functionality, customizing them with custom executors, SIMD plugins, etc. For that include stringzillas/fingerprints.hpp for C++ and stringzillas/fingerprints.cuh for CUDA.

#include <stringzillas/fingerprints.hpp>
#include <stringzilla/types.hpp>       // tape of strings
#include <fork_union.hpp>              // optional thread pool

namespace sz = ashvardanian::stringzilla;
namespace szs = ashvardanian::stringzillas;

// Pack strings into an Arrow-like tape
std::vector<std::string> docs = {"alpha beta", "alpha betta"};
sz::arrow_strings_tape<char, sz::size_t, std::allocator<char>> tape;
auto _ = tape.try_assign(docs.begin(), docs.end());

// Run on the current thread with a Rabin-Karp family hasher
constexpr std::size_t dimensions_k = 256;
constexpr std::size_t window_width_k = 7;
using row_t = std::array<sz_u32_t, dimensions_k>;
using fingerprinter_t = szs::floating_rolling_hashers<sz_cap_serial_k, dimensions_k>;
fingerprinter_t engine;
auto _ = engine.try_extend(window_width_k, dimensions_k);
std::vector<row_t> hashes(docs.size()), counts(docs.size());
auto _ = engine(tape, hashes, counts);

// Or run in parallel with a pool
fork_union::basic_pool_t pool;
auto _ = pool.try_spawn(std::thread::hardware_concurrency());
auto _ = engine(tape, hashes, counts, pool);
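To make the idea of a rolling Min-Hash fingerprint more concrete, here is a minimal serial sketch using only the C++ standard library. Everything in it is an assumption made for illustration: the names `min_hash_row_t` and `reference_min_hashes`, the choice of multipliers, and the 64-to-32-bit folding. It is not the hashing scheme StringZillas uses, and its output will not match the library's.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string_view>
#include <vector>

// Hypothetical output row: one slot per dimension.
struct min_hash_row_t {
    std::vector<std::uint32_t> hashes; // smallest window hash seen per dimension
    std::vector<std::uint32_t> counts; // how many windows produced that smallest hash
};

// Polynomial (Rabin-Karp style) rolling hash over every window of `window_width` bytes,
// keeping the minimum per dimension. Arithmetic wraps modulo 2^64 by design.
inline min_hash_row_t reference_min_hashes(std::string_view text, std::size_t window_width,
                                           std::size_t dimensions) {
    min_hash_row_t row {
        std::vector<std::uint32_t>(dimensions, std::numeric_limits<std::uint32_t>::max()),
        std::vector<std::uint32_t>(dimensions, 0)};
    if (window_width == 0 || text.size() < window_width) return row;

    for (std::size_t d = 0; d < dimensions; ++d) {
        std::uint64_t const multiplier = 257 + 2 * d; // arbitrary odd value, distinct per dimension
        std::uint64_t highest_power = 1;              // multiplier^(window_width - 1)
        for (std::size_t k = 1; k < window_width; ++k) highest_power *= multiplier;

        std::uint64_t hash = 0; // hash of the first window
        for (std::size_t k = 0; k < window_width; ++k)
            hash = hash * multiplier + static_cast<unsigned char>(text[k]);

        for (std::size_t start = 0;; ++start) {
            auto const folded = static_cast<std::uint32_t>(hash ^ (hash >> 32));
            if (folded < row.hashes[d]) {
                row.hashes[d] = folded;
                row.counts[d] = 1;
            }
            else if (folded == row.hashes[d]) { ++row.counts[d]; }
            if (start + window_width == text.size()) break;
            // Slide the window: drop the oldest byte, append the next one.
            hash -= static_cast<unsigned char>(text[start]) * highest_power;
            hash = hash * multiplier + static_cast<unsigned char>(text[start + window_width]);
        }
    }
    return row;
}
```

Two documents that share most of their windows will agree on many slots of `hashes`, which is what makes such rows usable as a cheap similarity estimate.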

CUDA

StringZilla provides CUDA C++ templates for composable string batch-processing operations. Different GPUs have varying warp sizes, shared memory capacities, and register counts, affecting algorithm selection, so it's important to query the gpu_specs_t via gpu_specs_fetch. For memory management, ensure that you use GPU-visible "unified memory" exposed in an STL-compatible manner as a unified_alloc template class. For error handling, cuda_status_t extends the traditional status_t with GPU-specific information. It's implicitly convertible to status_t, so you can use it in places expecting a status_t.

Most algorithms can load-balance both a large number of small strings and a small number of large strings. Still, with large H100-scale GPUs, it's best to submit thousands of inputs at once.

Memory Ownership and Small String Optimization

Most operations in StringZilla don't assume any memory ownership. But in addition to the read-only search-like operations, StringZilla provides minimalistic C and C++ implementations of a memory-owning string "class". Like other efficient string implementations, it uses the Small String Optimization (SSO) to avoid heap allocations for short strings.

typedef union sz_string_t {
    struct internal {
        sz_ptr_t start;
        sz_u8_t length;
        char chars[SZ_STRING_INTERNAL_SPACE]; /// Ends with a null-terminator.
    } internal;

    struct external {
        sz_ptr_t start;
        sz_size_t length;        
        sz_size_t space; /// The length of the heap-allocated buffer.
        sz_size_t padding;
    } external;

} sz_string_t;

As one can see, a short string can be kept on the stack if it fits within the internal.chars array. Before 2015, the GCC string implementation was just 8 bytes and could only fit 7 characters. Different STL implementations today have different thresholds for the Small String Optimization. Similar to GCC, StringZilla is 32 bytes in size, and similar to Clang it can fit 22 characters on the stack. Our layout might be preferable if you want to avoid branches. If you use a different compiler, you may want to check its SSO buffer size with a simple Gist.

                 libstdc++ in GCC 13   libc++ in Clang 17   StringZilla
String sizeof    32                    24                   32
Inner Capacity   15                    22                   22
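If you want to verify those numbers on your own toolchain, a check along the following lines should do. It is a minimal sketch relying only on the standard library. The helper names `stored_inline` and `inline_capacity` are made up for this example, and the assumption that a default-constructed string reports its SSO capacity holds for the mainstream implementations but is not guaranteed by the Standard.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// True if the character buffer lives inside the `std::string` object itself.
inline bool stored_inline(std::string const &s) {
    auto const *object_begin = reinterpret_cast<char const *>(&s);
    auto const *object_end = object_begin + sizeof(std::string);
    std::less<char const *> before; // gives a total order even for unrelated pointers
    return !before(s.data(), object_begin) && before(s.data(), object_end);
}

// Capacity of a default-constructed string, which equals the SSO capacity
// on implementations that use the optimization (an assumption, see above).
inline std::size_t inline_capacity() { return std::string().capacity(); }
```

Printing `sizeof(std::string)` next to `inline_capacity()` gives the two rows of the table above for your compiler and standard library.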

This design has since been ported to many high-level programming languages. Swift, for example, can store 15 bytes in the String instance itself. StringZilla implements SSO at the C level, providing the sz_string_t union and a simple API for primary operations.

sz_memory_allocator_t allocator;
sz_string_t string;

// Init and make sure we are on stack
sz_string_init(&string);
sz_string_is_on_stack(&string); // == sz_true_k

// Optionally pre-allocate space on the heap for future insertions.
sz_string_grow(&string, 100, &allocator); // == sz_true_k

// Append, erase, insert into the string.
sz_string_expand(&string, 0, "_Hello_", 7, &allocator); // == sz_true_k
sz_string_expand(&string, SZ_SIZE_MAX, "world", 5, &allocator); // == sz_true_k
sz_string_erase(&string, 0, 1);

// Unpacking & introspection.
sz_ptr_t string_start;
sz_size_t string_length;
sz_size_t string_space;
sz_bool_t string_is_external;
sz_string_unpack(&string, &string_start, &string_length, &string_space, &string_is_external);
sz_equal(string_start, "Hello_world", 11); // == sz_true_k

// Reclaim some memory.
sz_string_shrink_to_fit(&string, &allocator); // == sz_true_k
sz_string_free(&string, &allocator);

Unlike the conventional C strings, the sz_string_t is allowed to contain null characters. To safely print those, pass the string_length to printf as well.

printf("%.*s\n", (int)string_length, string_start);
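One caveat worth keeping in mind: the precision in "%.*s" is an upper bound, so printf-family functions still stop at the first embedded NUL. If every byte has to reach the output, a length-based call such as fwrite is the safer choice. The sketch below contrasts the two using only the C standard I/O functions; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Formats up to `length` bytes with "%.*s"; returns the number of characters produced.
// Formatting stops early if a NUL byte is encountered before `length` bytes.
inline int format_with_precision(char const *start, std::size_t length, char *out, std::size_t out_size) {
    return std::snprintf(out, out_size, "%.*s", static_cast<int>(length), start);
}

// Writes exactly `length` bytes to the stream, embedded NULs included.
inline std::size_t write_all_bytes(char const *start, std::size_t length, std::FILE *stream) {
    return std::fwrite(start, 1, length, stream);
}
```

For the 7-byte buffer "abc\0def", the first helper produces only 3 characters, while the second writes all 7 bytes.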

What's Wrong with the C Standard Library?

StringZilla is not a drop-in replacement for the C Standard Library. It's designed to be a safer and more modern alternative. Conceptually:

  1. LibC strings are expected to be null-terminated, so to use the efficient LibC implementations on slices of larger strings, you'd have to copy them, which is more expensive than the original string operation.
  2. LibC functionality is asymmetric - you can find the first and the last occurrence of a character within a string, but you can't find the last occurrence of a substring.
  3. LibC function names are typically very short and cryptic.
  4. LibC lacks crucial functionality like hashing and doesn't provide primitives for less critical but relevant operations like fuzzy matching.
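To illustrate the first two points, here is what a length-based reverse substring search can look like. It is a sketch built on std::find_end from the C++ standard library, with a hypothetical helper name. It is not how StringZilla implements reverse search, but it shows the interface shape that LibC lacks: both arguments are (pointer, length) pairs, so slices need no copying or NUL-termination.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: last occurrence of `needle` in `haystack`, both passed as
// (pointer, length) pairs. Returns `nullptr` if there is no match or the needle is empty.
inline char const *find_last_substring(char const *haystack, std::size_t haystack_length,
                                       char const *needle, std::size_t needle_length) {
    char const *haystack_end = haystack + haystack_length;
    char const *match = std::find_end(haystack, haystack_end, needle, needle + needle_length);
    return match == haystack_end ? nullptr : match;
}
```

Searching only the first 7 bytes of "abc abc abc" finds the match at offset 4, without copying the slice into a separate NUL-terminated buffer.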

Something has to be said about its support for UTF-8. Aside from a single-byte char type, LibC provides wchar_t:

  • The size of wchar_t is not consistent across platforms. On Windows, it's typically 16 bits (suitable for UTF-16), while on Unix-like systems, it's usually 32 bits (suitable for UTF-32). This inconsistency can lead to portability issues when writing cross-platform code.
  • wchar_t is designed to represent wide characters in a fixed-width format (UTF-16 or UTF-32). In contrast, UTF-8 is a variable-length encoding, where each character can take from 1 to 4 bytes. This fundamental difference means that wchar_t and UTF-8 are incompatible.

StringZilla partially addresses those issues.

What's Wrong with the C++ Standard Library?

C++ Code                               Evaluation Result   Invoked Signature
"Loose"s.replace(2, 2, "vath"s, 1)     "Loathe" 🤢         (pos1, count1, str2, pos2)
"Loose"s.replace(2, 2, "vath", 1)      "Love" 🥰           (pos1, count1, str2, count2)

StringZilla is designed to be a drop-in replacement for the C++ Standard Template Library. That said, some of the design decisions of STL strings are highly controversial, error-prone, and expensive. Most notably:

  1. Argument order for replace, insert, erase and similar functions is impossible to guess.
  2. Bounds-checking exceptions for substr-like functions are only thrown for one side of the range.
  3. Returning string copies in substr-like functions results in absurd volume of allocations.
  4. Incremental construction via push_back-like functions goes through too many branches.
  5. Inconsistency between string and string_view methods, like the lack of remove_prefix and remove_suffix.

Check the following set of asserts validating the std::string specification. It's not realistic to expect the average developer to remember the 14 overloads of std::string::replace.

using str = std::string;

assert(str("hello world").substr(6) == "world");
assert(str("hello world").substr(6, 100) == "world"); // 106 is beyond the length of the string, but it's OK
assert_throws(str("hello world").substr(100), std::out_of_range);   // 100 is beyond the length of the string
assert_throws(str("hello world").substr(20, 5), std::out_of_range); // 20 is beyond the length of the string
assert_throws(str("hello world").substr(-1, 5), std::out_of_range); // -1 casts to unsigned without any warnings...
assert(str("hello world").substr(0, -1) == "hello world");          // -1 casts to unsigned without any warnings...

assert(str("hello").replace(1, 2, "123") == "h123lo");
assert(str("hello").replace(1, 2, str("123"), 1) == "h23lo");
assert(str("hello").replace(1, 2, "123", 1) == "h1lo");
assert(str("hello").replace(1, 2, "123", 1, 1) == "h2lo");
assert(str("hello").replace(1, 2, str("123"), 1, 1) == "h2lo");
assert(str("hello").replace(1, 2, 3, 'a') == "haaalo");
assert(str("hello").replace(1, 2, {'a', 'b'}) == "hablo");

To avoid those issues, StringZilla provides an alternative consistent interface. It supports signed arguments and doesn't have more than 3 arguments per function. The standard API and our alternative can be conditionally disabled with SZ_SAFETY_OVER_COMPATIBILITY=1. When it's enabled, the subjectively risky overloads from the Standard will be disabled.

using str = sz::string;

str("a:b").front(1) == "a"; // no checks, unlike `substr`
str("a:b").front(2) == "a:"; // take first 2 characters
str("a:b").back(-1) == "b"; // accepting negative indices
str("a:b").back(-2) == ":b"; // similar to Python's `"a:b"[-2:]`
str("a:b").sub(1, -1) == ":"; // similar to Python's `"a:b"[1:-1]`
str("a:b").sub(-2, -1) == ":"; // similar to Python's `"a:b"[-2:-1]`
str("a:b").sub(-2, 1) == ""; // similar to Python's `"a:b"[-2:1]`
"a:b"_sv[{-2, -1}] == ":"; // works on views and overloads `operator[]`

Since StringZilla is a header-only library, you can use the full API in some translation units and gradually transition to the safer restricted API in others. Bonus: all the bounds checking is branchless, so it has a constant cost and won't hurt your branch predictor.
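The signed-offset semantics are easy to reason about with a small model. The sketch below mimics Python-style slicing on top of std::string_view. The helper name `python_like_slice` is hypothetical, and unlike the library's implementation this version uses ordinary branches and std::clamp, so treat it as a specification aid rather than an equivalent.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper, not StringZilla's implementation: Python-like `text[start:end]`.
inline std::string_view python_like_slice(std::string_view text, std::ptrdiff_t start, std::ptrdiff_t end) {
    auto const length = static_cast<std::ptrdiff_t>(text.size());
    // Negative offsets count from the end; everything is clamped into [0, length].
    auto normalize = [length](std::ptrdiff_t i) {
        if (i < 0) i += length;
        return std::clamp<std::ptrdiff_t>(i, 0, length);
    };
    std::ptrdiff_t const first = normalize(start);
    std::ptrdiff_t const last = std::max(first, normalize(end)); // inverted ranges become empty
    return text.substr(static_cast<std::size_t>(first), static_cast<std::size_t>(last - first));
}
```

Out-of-range offsets are clamped instead of throwing, so `python_like_slice("a:b", 0, 100)` simply returns the whole string.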

Beyond the C++ Standard Library - Learning from Python

Python is arguably the most popular programming language for data science. In part, that's due to the simplicity of its standard interfaces. StringZilla brings some of that functionality to C++.

  • Content checks: isalnum, isalpha, isascii, isdigit, islower, isspace, isupper.
  • Trimming character sets: lstrip, rstrip, strip.
  • Trimming string matches: remove_prefix, remove_suffix.
  • Ranges of search results: splitlines, split, rsplit.
  • Number of non-overlapping substring matches: count.
  • Partitioning: partition, rpartition.

For example, when parsing documents, it is often useful to split them into substrings. Most often, after that, you would compute the length of the skipped part, and the offset and length of the remaining part. This results in a lot of pointer arithmetic and is error-prone. StringZilla provides a convenient partition function, which returns a tuple of three string views, making the code cleaner.

auto parts = haystack.partition(':'); // Matching a character
auto [before, match, after] = haystack.partition(':'); // Structure unpacking
auto [before, match, after] = haystack.partition(sz::byteset(":;")); // Character-set argument
auto [before, match, after] = haystack.partition(" : "); // String argument
auto [before, match, after] = haystack.rpartition(sz::whitespaces_set()); // Split around the last whitespace

Combining those with the split function, one can easily parse a CSV file or HTTP headers.

for (auto line : haystack.split("\r\n")) {
    auto [key, _, value] = line.partition(':');
    headers[key.strip()] = value.strip();
}
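For readers who want to see what such a partition boils down to, here is a minimal model on top of std::string_view. The helpers `partition_once` and `strip_spaces` are hypothetical names for this sketch and only approximate the behavior described above. In particular, returning two empty views when the delimiter is missing is an assumption borrowed from Python.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: splits around the first `delimiter` into {before, match, after}.
// If the delimiter is absent, returns {text, "", ""}, mirroring Python's `str.partition`.
inline std::array<std::string_view, 3> partition_once(std::string_view text, char delimiter) {
    std::size_t const position = text.find(delimiter);
    if (position == std::string_view::npos) return {text, std::string_view {}, std::string_view {}};
    return {text.substr(0, position), text.substr(position, 1), text.substr(position + 1)};
}

// Hypothetical helper: trims leading and trailing spaces and tabs without copying.
inline std::string_view strip_spaces(std::string_view text) {
    std::size_t const first = text.find_first_not_of(" \t");
    if (first == std::string_view::npos) return {};
    std::size_t const last = text.find_last_not_of(" \t");
    return text.substr(first, last - first + 1);
}
```

All three parts are views into the original buffer, so parsing a header line like "Content-Type: text/html" allocates nothing.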

Some other extensions are not present in the Python standard library either. Let's go through the C++ functionality category by category.

Some of the StringZilla interfaces are not available even in Python's native str class. Here is a sneak peek of the most useful ones.

text.hash(); // -> 64 bit unsigned integer 
text.ssize(); // -> 64 bit signed length to avoid `static_cast<std::ssize_t>(text.size())`
text.contains_only(" \v\t"); // == text.find_first_not_of(sz::byteset(" \v\t")) == npos;
text.contains(sz::whitespaces_set()); // == text.find(sz::byteset(sz::whitespaces_set())) != npos;

// Simpler slicing than `substr`
text.front(10); // -> sz::string_view
text.back(10); // -> sz::string_view

// Safe variants, which clamp the range into the string bounds
using sz::string::cap;
text.front(10, cap) == text.front(std::min(10, text.size()));
text.back(10, cap) == text.back(std::min(10, text.size()));

// Character set filtering
text.lstrip(sz::whitespaces_set()).rstrip(sz::newlines_set()); // like Python
text.front(sz::whitespaces_set()); // all leading whitespaces
text.back(sz::digits_set()); // all numerical symbols forming the suffix

// Incremental construction
using sz::string::unchecked;
text.push_back('x'); // no surprises here
text.push_back('x', unchecked); // no bounds checking, Rust style
text.try_push_back('x'); // returns `false` if the string is full and the allocation failed

sz::concatenate(text, "@", domain, ".", tld); // No allocations

Splits and Ranges

One of the most common use cases is to split a string into a collection of substrings. That often results in StackOverflow lookups and snippets like the one below.

std::vector<std::string> lines = split(haystack, "\r\n"); // string delimiter
std::vector<std::string> words = split(lines, ' '); // character delimiter

Those allocate memory for each string and for the temporary vectors. Each allocation can be orders of magnitude more expensive than even a serial for-loop over characters. To avoid those, StringZilla provides lazily-evaluated ranges, compatible with the Range-v3 library.

for (auto line : haystack.split("\r\n"))
    for (auto word : line.split(sz::byteset(" \v\t.,;:!?")))
        std::cout << word << std::endl;

Each of those is available in reverse order as well. It also allows interleaving matches, if you want both inclusions of xx in xxx. Debugging pointer offsets is not a pleasant exercise, so keep the following functions in mind.

  • haystack.[r]find_all(needle, interleaving)
  • haystack.[r]find_all(sz::byteset(""))
  • haystack.[r]split(needle)
  • haystack.[r]split(sz::byteset(""))

For $N$ matches the split functions will report $N+1$ parts, potentially including empty strings. Ranges have a few convenience methods as well:

range.size(); // -> std::size_t
range.empty(); // -> bool
range.template to<std::set<std::string>>(); 
range.template to<std::vector<std::string_view>>(); 
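The mechanics of a lazily-evaluated split are simple enough to sketch with the standard library alone. The class below is a hypothetical stand-in, not StringZilla's range type: it yields one std::string_view at a time, never allocates, and produces $N+1$ parts for $N$ delimiter matches, empty parts included.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical lazy splitter: yields views into the original buffer one at a time.
class lazy_split_t {
    std::string_view remaining_;
    std::string_view delimiter_;
    bool exhausted_ = false;

  public:
    lazy_split_t(std::string_view text, std::string_view delimiter)
        : remaining_(text), delimiter_(delimiter) {}

    // Fetches the next part; returns `false` once all parts have been produced.
    bool next(std::string_view &part) {
        if (exhausted_) return false;
        // An empty delimiter never matches, so the whole text becomes a single part.
        std::size_t const position =
            delimiter_.empty() ? std::string_view::npos : remaining_.find(delimiter_);
        if (position == std::string_view::npos) {
            part = remaining_;
            exhausted_ = true;
        }
        else {
            part = remaining_.substr(0, position);
            remaining_.remove_prefix(position + delimiter_.size());
        }
        return true;
    }
};

// Counts the parts without materializing them.
inline std::size_t count_parts(std::string_view text, std::string_view delimiter) {
    lazy_split_t splitter(text, delimiter);
    std::string_view part;
    std::size_t count = 0;
    while (splitter.next(part)) ++count;
    return count;
}
```

Splitting "a,b,,c" by "," therefore yields four parts, the third of which is empty.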

Concatenating Strings without Allocations

Another common string operation is concatenation. The STL provides std::string::operator+ and std::string::append, but those are not very efficient if multiple invocations are chained.

std::string name, domain, tld;
auto email = name + "@" + domain + "." + tld; // 4 allocations

The efficient approach would be to pre-allocate the memory and copy the strings into it.

std::string email;
email.reserve(name.size() + domain.size() + tld.size() + 2);
email.append(name), email.append("@"), email.append(domain), email.append("."), email.append(tld);

That's a mouthful and error-prone. StringZilla provides a more convenient concatenate function, which takes a variadic number of arguments. It also overloads the operator| to concatenate strings lazily, without any allocations.

auto email = sz::concatenate(name, "@", domain, ".", tld);   // 0 allocations
auto email = name | "@" | domain | "." | tld;                // 0 allocations
sz::string email = name | "@" | domain | "." | tld;          // 1 allocation
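The single-allocation materialization can be modeled in a few lines of C++17. The function below is a hypothetical standard-library sketch, not the sz::concatenate implementation: it sums the lengths with a fold expression, reserves once, and then appends every part.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: materializes the concatenation with a single `reserve` call.
// Every argument must be convertible to `std::string_view`.
template <typename... parts_type>
std::string concatenate_once(parts_type const &...parts) {
    std::size_t const total = (std::size_t {0} + ... + std::string_view(parts).size());
    std::string result;
    result.reserve(total);
    (result.append(std::string_view(parts)), ...); // appends left to right
    return result;
}
```

Unlike the lazy operator| expression, this always builds the final string, but it avoids the intermediate temporaries of chained operator+ calls.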

Random Generation

Software developers often need to generate random strings for testing purposes. The STL provides std::generate and std::random_device, that can be used with StringZilla.

sz::string random_string(std::size_t length, char const *alphabet, std::size_t cardinality) {
    sz::string result(length, '\0');
    static std::random_device seed_source; // Expensive to construct - due to system calls
    static std::mt19937 generator(seed_source()); // Also expensive - due to the state size
    std::uniform_int_distribution<std::size_t> distribution(0, cardinality - 1); // bounds are inclusive
    std::generate(result.begin(), result.end(), [&]() { return alphabet[distribution(generator)]; });
    return result;
}

Mouthful and slow. StringZilla provides a C native method - sz_fill_random and a convenient C++ wrapper - sz::generate. Similar to Python it also defines the commonly used character sets.

auto protein = sz::string::random(300, "ARNDCQEGHILKMFPSTWYV"); // static method
auto dna = sz::basic_string<custom_allocator>::random(3_000_000_000, "ACGT");

dna.fill_random("ACGT"); // `noexcept` pre-allocated version
dna.fill_random(&std::rand, "ACGT"); // pass any generator, like `std::mt19937`

char uuid[36];
sz::fill_random(sz::string_span(uuid, 36), "0123456789abcdef-"); // Overwrite any buffer

Bulk Replacements

In text processing, it's often necessary to replace all occurrences of a specific substring or set of characters within a string. Standard library functions may not offer the most efficient or convenient methods for performing bulk replacements, especially when dealing with large strings or performance-critical applications.

  • haystack.replace_all(needle_string, replacement_string)
  • haystack.replace_all(sz::byteset(""), replacement_string)
  • haystack.try_replace_all(needle_string, replacement_string)
  • haystack.try_replace_all(sz::byteset(""), replacement_string)
  • haystack.lookup(sz::look_up_table::identity())
  • haystack.lookup(sz::look_up_table::identity(), haystack.data())
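For comparison, here is roughly what those operations look like when written against the standard library. This is a baseline sketch with hypothetical helper names, not the StringZilla implementation. `replace_all_copy` handles non-overlapping substring matches from left to right, and `apply_lookup_table` maps every byte through a 256-entry table in place.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Replaces every non-overlapping occurrence of `needle`, scanning left to right.
inline std::string replace_all_copy(std::string_view haystack, std::string_view needle,
                                    std::string_view replacement) {
    if (needle.empty()) return std::string(haystack); // nothing sensible to replace
    std::string result;
    std::size_t cursor = 0;
    for (std::size_t match; (match = haystack.find(needle, cursor)) != std::string_view::npos;
         cursor = match + needle.size()) {
        result.append(haystack.substr(cursor, match - cursor)); // unchanged part before the match
        result.append(replacement);
    }
    result.append(haystack.substr(cursor)); // tail after the last match
    return result;
}

// Identity table: every byte maps to itself.
inline std::array<char, 256> identity_table() {
    std::array<char, 256> table {};
    for (std::size_t i = 0; i < 256; ++i) table[i] = static_cast<char>(i);
    return table;
}

// Maps each byte of `text` through a 256-entry table, in place.
inline void apply_lookup_table(std::string &text, std::array<char, 256> const &table) {
    for (char &c : text) c = table[static_cast<unsigned char>(c)];
}
```

Starting from the identity table and overriding a handful of entries is a compact way to express byte-level substitutions such as case mapping or masking.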

Sorting in C and C++

LibC provides qsort and STL provides std::sort. Both have their quirks. The LibC standard has no way to pass a context to the comparison function; that's only possible with platform-specific extensions. Those have a different argument order on every OS.

// Linux: https://linux.die.net/man/3/qsort_r
void qsort_r(void *elements, size_t count, size_t element_width, 
    int (*compare)(void const *left, void const *right, void *context),
    void *context);
// macOS and FreeBSD: https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/qsort_r.3.html
void qsort_r(void *elements, size_t count, size_t element_width, 
    void *context,
    int (*compare)(void *context, void const *left, void const *right));
// Windows conflicts with ISO `qsort_s`: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170
void qsort_s(void *elements, size_t count, size_t element_width, 
    int (*compare)(void *context, void const *left, void const *right),
    void *context);

The C++ generic algorithm is not perfect either. There is no guarantee in the standard that std::sort won't allocate any memory. If you are running on embedded, in real-time, or on 100+ CPU cores per node, you may want to avoid that. StringZilla doesn't solve the general case, but hopes to improve the performance for strings. Use sz_sequence_argsort, or the high-level sz::argsort, which can be used to sort any collection of elements convertible to sz::string_view.

std::vector<std::string> data({"c", "b", "a"});
std::vector<std::size_t> order = sz::argsort(data); //< Simple shortcut

// Or, taking care of memory allocation:
sz::argsort(data.begin(), data.end(), order.data(), [](auto const &x) -> sz::string_view { return x; });
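If "argsort" is an unfamiliar term, the following standard-library baseline shows the contract: instead of moving the strings, it returns the permutation of indices that would sort them. The helper name `reference_argsort` is hypothetical, and this version says nothing about how StringZilla orders strings internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical baseline: indices of `data` in ascending lexicographic order.
inline std::vector<std::size_t> reference_argsort(std::vector<std::string> const &data) {
    std::vector<std::size_t> order(data.size());
    std::iota(order.begin(), order.end(), std::size_t {0}); // 0, 1, 2, ...
    std::sort(order.begin(), order.end(), [&](std::size_t i, std::size_t j) {
        return std::string_view(data[i]) < std::string_view(data[j]);
    });
    return order;
}
```

For the input {"c", "b", "a"} the resulting order is {2, 1, 0}, and `data[order[k]]` walks the strings in sorted order without copying them.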

Standard C++ Containers with String Keys

The C++ Standard Template Library provides several associative containers, often used with string keys.

std::map<std::string, int, std::less<std::string>> sorted_words;
std::unordered_map<std::string, int, std::hash<std::string>, std::equal_to<std::string>> words;

The performance of those containers is often limited by the performance of the string keys, especially on reads. StringZilla can be used to accelerate containers with std::string keys, by overriding the default comparator and hash functions.

std::map<std::string, int, sz::less> sorted_words;
std::unordered_map<std::string, int, sz::hash, sz::equal_to> words;

Alternatively, a better approach would be to use the sz::string class as a key. The right hash function and comparator would be automatically selected and the performance gains would be more noticeable if the keys are short.

std::map<sz::string, int> sorted_words;
std::unordered_map<sz::string, int> words;

Compilation Settings and Debugging

SZ_DEBUG:

For maximal performance, the C library does not perform any bounds checking in Release builds. In C++, bounds checking happens only in places where the STL std::string would do it. If you want to enable more aggressive bounds-checking, define SZ_DEBUG before including the header. If not explicitly set, it will be inferred from the build type.

SZ_USE_GOLDMONT, SZ_USE_WESTMERE, SZ_USE_HASWELL, SZ_USE_SKYLAKE, SZ_USE_ICE, SZ_USE_NEON, SZ_USE_NEON_AES, SZ_USE_NEON_SHA, SZ_USE_SVE, SZ_USE_SVE2, SZ_USE_SVE2_AES:

One can explicitly disable certain families of SIMD instructions for compatibility purposes. Default values are inferred at compile time depending on compiler support (for dynamic dispatch) and the target architecture (for static dispatch).

SZ_USE_CUDA, SZ_USE_KEPLER, SZ_USE_HOPPER:

One can explicitly disable certain families of PTX instructions for compatibility purposes. Default values are inferred at compile time depending on compiler support (for dynamic dispatch) and the target architecture (for static dispatch).

SZ_ENFORCE_SVE_OVER_NEON:

SVE and SVE2 are expected to supersede NEON on ARM architectures. Still, oftentimes the equivalent SVE kernels are slower due to equally small register files and higher complexity of the instructions. By default, when both SVE and NEON are available, SVE is used selectively only for the algorithms that benefit from it. If you want to enforce SVE usage everywhere, define this flag.

SZ_DYNAMIC_DISPATCH:

By default, StringZilla is a header-only library. But if you are running on different generations of devices, it makes sense to pre-compile the library for all supported generations at once, and dispatch at runtime. This flag does just that and is used to produce the stringzilla.so shared library, as well as the Python bindings.

SZ_USE_MISALIGNED_LOADS:

Default is platform-dependent: enabled on x86 (where unaligned accesses are fast) and disabled elsewhere. When enabled, many byte-level operations use word-sized loads, which can significantly accelerate the serial (SWAR) backend. Consider enabling it explicitly if you are targeting platforms that support fast unaligned loads.
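To give a feel for what word-sized loads buy the serial backend, here is a minimal SWAR byte search. It is an illustrative sketch with a hypothetical function name, not StringZilla's kernel: it loads 8 bytes at a time through memcpy (which compilers typically lower to a single load where misaligned access is cheap) and uses the classic "has zero byte" bit trick to skip words that cannot contain the needle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical SWAR search: returns the offset of the first `needle` byte, or `length` if absent.
inline std::size_t swar_find_byte(char const *text, std::size_t length, char needle) {
    std::uint64_t const ones = 0x0101010101010101ull;
    std::uint64_t const highs = 0x8080808080808080ull;
    std::uint64_t const pattern = ones * static_cast<unsigned char>(needle); // needle in every byte

    std::size_t i = 0;
    for (; i + 8 <= length; i += 8) {
        std::uint64_t word;
        std::memcpy(&word, text + i, 8);           // alignment-agnostic 8-byte load
        std::uint64_t const diff = word ^ pattern; // matching bytes become zero
        if ((diff - ones) & ~diff & highs) break;  // some byte of `diff` is zero
    }
    // Finish serially: either inside the word that contains a match, or over the short tail.
    for (; i < length; ++i)
        if (text[i] == needle) return i;
    return length;
}
```

Locating the exact byte inside a matching word is done serially here to stay endianness-neutral. A tuned kernel would typically use a count-trailing-zeros instruction instead.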

SZ_AVOID_LIBC and SZ_OVERRIDE_LIBC:

When using the C header-only library, one can disable the use of LibC. This may affect the type resolution system on obscure hardware platforms. Moreover, one may let StringZilla override common symbols like memcpy and memset with its own implementations. In that case you can use the LD_PRELOAD trick to prioritize its symbols over the ones from LibC and accelerate existing string-heavy applications without recompiling them. It also adds a layer of security, as StringZilla's behavior is defined for NULL inputs, unlike LibC calls such as memcpy(NULL, NULL, 0).

SZ_AVOID_STL and SZ_SAFETY_OVER_COMPATIBILITY:

When using the C++ interface one can disable implicit conversions from std::string to sz::string and back. If not needed, the <string> and <string_view> headers will be excluded, reducing compilation time. Moreover, if STL compatibility is a low priority, one can make the API safer by disabling the overloads, which are subjectively error prone.

STRINGZILLA_BUILD_SHARED, STRINGZILLA_BUILD_TEST, STRINGZILLA_BUILD_BENCHMARK, STRINGZILLA_TARGET_ARCH for CMake users:

When compiling the tests and benchmarks, you can explicitly set the target hardware architecture. It's synonymous to GCC's -march flag and is used to enable/disable the appropriate instruction sets. You can also disable the shared library build, if you don't need it.

Quick Start: Rust

StringZilla is available as a Rust crate, with documentation available on docs.rs/stringzilla. You can immediately check the installed version and the used hardware capabilities with the following commands:

cargo add stringzilla
cargo run --example version

To use the latest crate release in your project, add the following to your Cargo.toml:

[dependencies]
stringzilla = ">=3"                                     # for serial algorithms
stringzilla = { version = ">=3", features = ["cpus"] }  # for parallel multi-CPU backends
stringzilla = { version = ">=3", features = ["cuda"] }  # for parallel Nvidia GPU backend

Or if you want to use the latest pre-release version from the repository:

[dependencies]
stringzilla = { git = "https://github.com/ashvardanian/stringzilla", branch = "main-dev" }

Once installed, all of the functionality is available through the stringzilla namespace. Many interfaces will look familiar to the users of the memchr Rust crate.

use stringzilla::sz;

// Identical to `memchr::memmem::find` and `memchr::memmem::rfind` functions
sz::find("Hello, world!", "world") // 7
sz::rfind("Hello, world!", "world") // 7

// Generalizations of `memchr::memrchr[123]`
sz::find_byte_from("Hello, world!", "world") // 2
sz::rfind_byte_from("Hello, world!", "world") // 11

It also imposes no constraints on the size of the character set, while memchr allows only 1, 2, or 3 characters. In addition to global functions, stringzilla provides a StringZilla extension trait:

use std::borrow::Cow;
use stringzilla::StringZilla;

let my_string: String = String::from("Hello, world!");
let my_str = my_string.as_str();
let my_cow_str = Cow::from(&my_string);

// Use the generic function with a String
assert_eq!(my_string.sz_find("world"), Some(7));
assert_eq!(my_string.sz_rfind("world"), Some(7));
assert_eq!(my_string.sz_find_byte_from("world"), Some(2));
assert_eq!(my_string.sz_rfind_byte_from("world"), Some(11));
assert_eq!(my_string.sz_find_byte_not_from("world"), Some(0));
assert_eq!(my_string.sz_rfind_byte_not_from("world"), Some(12));

// Same works for &str and Cow<'_, str>
assert_eq!(my_str.sz_find("world"), Some(7));
assert_eq!(my_cow_str.as_ref().sz_find("world"), Some(7));

Hash

Single-shot and incremental hashing are both supported:

let mut hasher = sz::Hasher::new(42);
hasher.write(b"Hello, ");
hasher.write(b"world!");
let streamed = hasher.finish();

let mut hasher = sz::Hasher::new(42);
hasher.write(b"Hello, world!");
assert_eq!(streamed, hasher.finish());

To use StringZilla with std::collections:

use std::collections::HashMap;
let mut map: HashMap<&str, i32, sz::BuildSzHasher> =
    HashMap::with_hasher(sz::BuildSzHasher::with_seed(42));
map.insert("a", 1);
assert_eq!(map.get("a"), Some(&1));

SHA-256 Checksums

SHA-256 cryptographic checksums are available:

use stringzilla::sz;

// One-shot SHA-256
let digest = sz::Sha256::hash(b"Hello, world!");
assert_eq!(digest.len(), 32);

// Incremental SHA-256
let mut hasher = sz::Sha256::new();
hasher.update(b"Hello, ");
hasher.update(b"world!");
let digest = hasher.digest();

// HMAC-SHA256 for message authentication
let mac = sz::hmac_sha256(b"secret", b"Hello, world!");

Unicode Case-Folding and Case-Insensitive Search

StringZilla implements both Unicode Case Folding and Case-Insensitive UTF-8 Search. Unlike most libraries, which can only lower-case the ASCII-represented English alphabet, StringZilla covers over 1M codepoints. The case-folding API expects the output buffer to be at least 3× larger than the input, to accommodate the worst-case character expansions.

use stringzilla::stringzilla as sz;

let source = "Straße";           // German: "Street"
let mut dest = [0u8; 64];        // Must be at least 3x source length
let len = sz::utf8_case_fold(source, &mut dest);
assert_eq!(&dest[..len], b"strasse");  // ß (2 bytes) → "ss" (2 bytes)

The case-insensitive search returns Some((offset, matched_length)) or None. The matched_length may differ from needle length due to expansions.

use stringzilla::stringzilla::{utf8_case_insensitive_find, Utf8CaseInsensitiveNeedle};

// Single search — ß (C3 9F) matches "SS"
if let Some((offset, len)) = utf8_case_insensitive_find("Straße", "STRASSE") {
    assert_eq!(offset, 0);
    assert_eq!(len, 7);  // "Straße" is 7 bytes
}

// Repeated searches with pre-compiled needle metadata
let needle = Utf8CaseInsensitiveNeedle::new(b"STRASSE");
for haystack in &["Straße", "STRASSE", "strasse"] {
    if let Some((offset, len)) = utf8_case_insensitive_find(haystack, &needle) {
        println!("Found at byte {} with length {}", offset, len);
    }
}

Similarity Scores

StringZilla exposes high-performance, batch-oriented similarity via the szs module. Use DeviceScope to pick hardware and optionally limit capabilities per engine.

use stringzilla::szs; // re-exported as `szs`

let cpu_scope = szs::DeviceScope::cpu_cores(4).unwrap();    // force CPU-only
let gpu_scope = szs::DeviceScope::gpu_device(0).unwrap();   // pick GPU 0 if available
let strings_a = vec!["kitten", "flaw"];
let strings_b = vec!["sitting", "lawn"];

let engine = szs::LevenshteinDistances::new(
    &cpu_scope,
    0,  // match cost
    2,  // mismatch cost - costs don't have to be 1
    3,  // open cost - may be different in Bio
    1,  // extend cost
).unwrap();
let distances = engine.compute(&cpu_scope, &strings_a, &strings_b).unwrap();
assert_eq!(distances[0], 3);
assert_eq!(distances[1], 2);

Note that this computes byte-level distances. For UTF-8 codepoints, use a different engine class:

let strings_a = vec!["café", "αβγδ"];
let strings_b = vec!["cafe", "αγδ"];
let engine = szs::LevenshteinDistancesUtf8::new(&cpu_scope, 0, 1, 1, 1).unwrap();
let distances = engine.compute(&cpu_scope, &strings_a, &strings_b).unwrap();
assert_eq!(distances, vec![1, 1]);

Similarly, for variable substitution costs, also pass in a weights matrix:

let mut substitution_matrix = [-1i8; 256 * 256];
for i in 0..256 { substitution_matrix[i * 256 + i] = 0; }
let engine = szs::NeedlemanWunschScores::new(&cpu_scope, &substitution_matrix, -3, -1).unwrap();
let scores = engine.compute(&cpu_scope, &strings_a, &strings_b).unwrap();

Or for local alignment scores:

let engine = szs::SmithWatermanScores::new(&cpu_scope, &substitution_matrix, -3, -1).unwrap();
let local_scores = engine.compute(&cpu_scope, &strings_a, &strings_b).unwrap();

For high-performance applications, use the StringTape crate to pass strings to compute_into methods without extra memory allocations:

use stringzilla::{szs, StringTape};

// Create StringTape from data (zero-copy compatible format)
let tape_a = StringTape::from_strings(&["kitten", "sitting", "flaw"]);
let tape_b = StringTape::from_strings(&["sitting", "kitten", "lawn"]);

// Use unified memory vector for GPU compatibility, initialized with max values
let mut distances = szs::UnifiedVec::<u32>::from_elem(u32::MAX, tape_a.len());

let engine = szs::LevenshteinDistances::new(&gpu_scope, 0, 1, 1, 1).unwrap();
engine.compute_into(&gpu_scope, &tape_a, &tape_b, &mut distances).unwrap();

// Results computed directly into unified memory, accessible from both CPU/GPU
assert_eq!(distances[0], 3);  // kitten -> sitting
assert_eq!(distances[1], 3);  // sitting -> kitten
assert_eq!(distances[2], 2);  // flaw -> lawn

Rolling Fingerprints

MinHashing is a common technique for Information Retrieval, producing compact representations of large documents. For $D$ hash-functions and a text of length $L$, in the worst case it involves computing $O(D \cdot L)$ hashes.

use stringzilla::szs;

let texts = vec![
    "quick brown fox jumps over the lazy dog",
    "quick brown fox jumped over a very lazy dog",
];
let cpu = szs::DeviceScope::cpu_cores(4).unwrap();
let ndim = 1024;
let window_widths = vec![4u64, 6, 8, 10];

let engine = szs::Fingerprints::new(
    ndim,           // number of hash functions & dimensions
    &window_widths, // optional predefined window widths
    256,            // default alphabet size for byte strings
    &cpu            // device scope
).unwrap();

let (hashes, counts) = engine.compute(&cpu, &texts).unwrap();
assert_eq!(hashes.len(), texts.len() * ndim);
assert_eq!(counts.len(), texts.len() * ndim);

For zero-copy processing with StringTape format and unified memory:

use stringzilla::{szs, StringTape};

let tape = StringTape::from_strings(&[
    "quick brown fox jumps over the lazy dog",
    "quick brown fox jumped over a very lazy dog",
]);

// Pre-allocate unified memory buffers
let mut hashes = szs::UnifiedVec::<u32>::from_elem(u32::MAX, tape.len() * ndim);
let mut counts = szs::UnifiedVec::<u32>::from_elem(u32::MAX, tape.len() * ndim);

let engine = szs::Fingerprints::new(ndim, &window_widths, 256, &cpu).unwrap();
engine.compute_into(&cpu, &tape, &mut hashes, &mut counts).unwrap();

// Results computed directly into unified memory buffers
assert!(hashes.iter().any(|&h| h != u32::MAX));  // Verify computation occurred
assert!(counts.iter().any(|&c| c != u32::MAX));
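To make the $O(D \cdot L)$ cost concrete, here is a minimal Python sketch of MinHashing with a polynomial rolling hash. It is not StringZilla's kernel: the function names, the modulus, and the per-dimension multipliers `257 + 2 * d` are all assumptions chosen for illustration.

```python
MOD = (1 << 61) - 1  # hypothetical modulus, a Mersenne prime

def minhash(text: bytes, ndim: int, width: int) -> list[int]:
    """For each of `ndim` hash functions, keep the minimum rolling hash
    over all windows of `width` bytes: O(ndim * len(text)) work in total."""
    signature = []
    for d in range(ndim):
        base = 257 + 2 * d               # hypothetical per-dimension multiplier
        top = pow(base, width - 1, MOD)  # weight of the byte leaving the window
        h = 0
        for byte in text[:width]:        # hash of the first window
            h = (h * base + byte) % MOD
        best = h
        for i in range(width, len(text)):
            # Slide the window: drop text[i - width], append text[i]
            h = ((h - text[i - width] * top) * base + text[i]) % MOD
            best = min(best, h)
        signature.append(best)
    return signature

def jaccard_estimate(a: list[int], b: list[int]) -> float:
    """Fraction of matching slots approximates window-set similarity."""
    return sum(x == y for x, y in zip(a, b)) / len(a)
```

Two documents that share most of their windows will agree in most signature slots, so comparing fixed-size signatures stands in for comparing the full texts.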

Quick Start: JavaScript

Install the Node.js package and use zero-copy Buffer APIs.

npm install stringzilla
node -p "require('stringzilla').default.capabilities" # for CommonJS
node -e "import('stringzilla').then(m=>console.log(m.default.capabilities)).catch(console.error)" # for ESM

import sz from 'stringzilla';

const haystack = Buffer.from('Hello, world!');
const needle = Buffer.from('world');

// Substring search (BigInt offsets)
const firstIndex = sz.find(haystack, needle);      // 7n
const lastIndex = sz.findLast(haystack, needle);   // 7n

// Character / charset search
const firstOIndex = sz.findByte(haystack, 'o'.charCodeAt(0));                 // 4n
const firstVowelIndex = sz.findByteFrom(haystack, Buffer.from('aeiou'));      // 1n
const lastVowelIndex = sz.findLastByteFrom(haystack, Buffer.from('aeiou'));   // 8n

// Counting (optionally overlapping)
const lCount = sz.count(haystack, Buffer.from('l'));                // 3n
const llOverlapCount = sz.count(haystack, Buffer.from('ll'), true); // 1n

// Equality/ordering utilities
const isEqual = sz.equal(Buffer.from('a'), Buffer.from('a'));
const order = sz.compare(Buffer.from('a'), Buffer.from('b')); // -1, 0, or 1

// Other helpers
const byteSum = sz.byteSum(haystack); // sum of bytes as BigInt

Unicode Case-Folding and Case-Insensitive Search

StringZilla provides full Unicode case folding (including expansions like ß → ss, ligatures like ﬁ (U+FB01) → fi, and special folds like the micro sign µ (U+00B5) → μ (U+03BC) and the Kelvin sign K (U+212A) → k) and a case-insensitive substring search that accounts for those expansions.

import sz from "stringzilla";

// Case folding (returns a UTF-8 Buffer)
console.log(sz.utf8CaseFold(Buffer.from("Straße")).toString("utf8")); // "strasse"
console.log(sz.utf8CaseFold(Buffer.from("of\uFB01ce")).toString("utf8"));  // "office" (U+FB01 ligature)

// Case-insensitive substring search (full Unicode case folding)
const text = Buffer.from(
    "Die Temperaturschwankungen im kosmischen Mikrowellenhintergrund sind ein Maß von etwa 20 µK.\n" +
    "Typografisch sieht man auch: ein Maß von etwa 20 μK."
);
const patternBytes = Buffer.from("EIN MASS VON ETWA 20 μK");

const first = sz.utf8CaseInsensitiveFind(text, patternBytes);
console.log(first); // { index: 69n, length: ... } (byte offsets)

// Reuse the same needle efficiently
const pattern = new sz.Utf8CaseInsensitiveNeedle(patternBytes);
const again = pattern.findIn(text);
console.log(again.index === first.index);

Hash

Single-shot and incremental hashing are both supported:

import sz from 'stringzilla';

// One-shot - stable 64-bit output across all platforms!
const hash = sz.hash(Buffer.from('Hello, world!'), 42); // returns BigInt

// Incremental updates - hasher maintains state
const hasher = new sz.Hasher(42); // seed: 42
hasher.update(Buffer.from('Hello, '));
hasher.update(Buffer.from('world!'));
const streamedHash = hasher.digest(); // returns BigInt
console.assert(hash === streamedHash);

SHA-256 Checksums

SHA-256 cryptographic checksums are available:

import sz from 'stringzilla';

// One-shot SHA-256
const digest = sz.sha256(Buffer.from('Hello, world!')); // returns Buffer (32 bytes)

// Incremental SHA-256
const hasher = new sz.Sha256();
hasher.update(Buffer.from('Hello, '));
hasher.update(Buffer.from('world!'));
const digestBuffer = hasher.digest();     // returns Buffer (32 bytes)
const digestHex = hasher.hexdigest();     // returns string (64 hex chars)

Quick Start: Swift

StringZilla can be added as a dependency in the Swift Package Manager. In your Package.swift file, add the following:

dependencies: [
    .package(url: "https://github.com/ashvardanian/stringzilla")
]

The package currently covers only the most basic functionality, but is planned to be extended to cover the full C++ API.

var s = "Hello, world! Welcome to StringZilla. 👋"
s[s.findFirst(substring: "world")!...] // "world! Welcome to StringZilla. 👋"
s[s.findLast(substring: "o")!...] // "o StringZilla. 👋"
s[s.findFirst(characterFrom: "aeiou")!...] // "ello, world! Welcome to StringZilla. 👋"
s[s.findLast(characterFrom: "aeiou")!...] // "a. 👋"
s[s.findFirst(characterNotFrom: "aeiou")!...] // "Hello, world! Welcome to StringZilla. 👋"

Unicode Case-Folding and Case-Insensitive Search

import StringZilla

let folded = "Straße".utf8CaseFoldedBytes()
print(String(decoding: folded, as: UTF8.self)) // "strasse"

let haystack =
    "Die Temperaturschwankungen im kosmischen Mikrowellenhintergrund sind ein Maß von etwa 20 µK.\n"
    + "Typografisch sieht man auch: ein Maß von etwa 20 μK."
let needle = "EIN MASS VON ETWA 20 μK"

if let range = haystack.utf8CaseInsensitiveFind(substring: needle) {
    print(haystack[range]) // "ein Maß von etwa 20 µK"
}

// Reuse the same needle efficiently
let compiledNeedle = Utf8CaseInsensitiveNeedle(needle)
if let range = compiledNeedle.findFirst(in: haystack) {
    print(haystack[range])
}

Hash

StringZilla provides high-performance hashing for Swift strings:

import StringZilla

// One-shot hashing - stable 64-bit output across all platforms!
let hash = "Hello, world!".hash(seed: 42)

// Incremental hashing for streaming data
var hasher = StringZillaHasher(seed: 42)
hasher.update("Hello, ")
hasher.update("world!")
let streamedHash = hasher.digest()
assert(hash == streamedHash)

SHA-256 Checksums

SHA-256 cryptographic checksums are available:

import StringZilla

// One-shot SHA-256
let digest = "Hello, world!".sha256() // returns [UInt8] (32 bytes)

// Incremental SHA-256
var hasher = StringZillaSha256()
hasher.update("Hello, ")
hasher.update("world!")
let digestBytes = hasher.digest()     // [UInt8] (32 bytes)
let digestHex = hasher.hexdigest()    // String (64 hex chars)

Quick Start: GoLang

Add the Go binding as a module dependency:

go get github.com/ashvardanian/stringzilla/golang@latest

Build the shared C library once, then ensure your runtime can locate it (Linux shown):

cmake -B build_shared -D STRINGZILLA_BUILD_SHARED=1 -D CMAKE_BUILD_TYPE=Release
cmake --build build_shared --target stringzilla_shared --config Release
export LD_LIBRARY_PATH="$PWD/build_shared:$LD_LIBRARY_PATH"

Use finders (substring, bytes, and sets):

package main

import (
    "fmt"
    sz "github.com/ashvardanian/stringzilla/golang"
)

func main() {
    s := "the quick brown fox jumps over the lazy dog"

    // Substrings
    fmt.Println(sz.Contains(s, "brown"))        // true
    fmt.Println(sz.Index(s, "the"))             // 0
    fmt.Println(sz.LastIndex(s, "the"))         // 31

    // Single bytes
    fmt.Println(sz.IndexByte(s, 'o'))            // 12
    fmt.Println(sz.LastIndexByte(s, 'o'))        // 41

    // Byte sets
    fmt.Println(sz.IndexAny(s, "aeiou"))        // 2  (first vowel)
    fmt.Println(sz.LastIndexAny(s, "aeiou"))    // 41 (last vowel)

    // Counting with/without overlaps
    fmt.Println(sz.Count("aaaaa", "aa", false)) // 2
    fmt.Println(sz.Count("aaaaa", "aa", true))  // 4
    fmt.Println(sz.Count("abc", "", false))     // 4
    fmt.Println(sz.Bytesum("ABC"), sz.Bytesum("ABCD"))
}

Unicode Case-Folding and Case-Insensitive Search

package main

import (
    "fmt"
    sz "github.com/ashvardanian/stringzilla/golang"
)

func main() {
    folded, _ := sz.Utf8CaseFold("Straße", true)
    fmt.Println(folded) // "strasse"

    haystack := "Die Temperaturschwankungen im kosmischen Mikrowellenhintergrund sind ein Maß von etwa 20 µK.\n" +
        "Typografisch sieht man auch: ein Maß von etwa 20 μK."
    needle := "EIN MASS VON ETWA 20 μK"

    start64, len64, _ := sz.Utf8CaseInsensitiveFind(haystack, needle, true)
    start, end := int(start64), int(start64+len64)
    fmt.Println(haystack[start:end]) // "ein Maß von etwa 20 µK"

    // Reuse the same needle efficiently
    compiled, _ := sz.NewUtf8CaseInsensitiveNeedle(needle, true)
    start64, len64, _ = compiled.FindIn(haystack, true)
    start, end = int(start64), int(start64+len64)
    fmt.Println(haystack[start:end])
}

Hash

Single-shot and incremental hashing are both supported. The Hasher type implements Go's standard hash.Hash64 and io.Writer interfaces:

import (
    "fmt"
    "io"
    "os"
    sz "github.com/ashvardanian/stringzilla/golang"
)

// One-shot hashing
one := sz.Hash("Hello, world!", 42)

// Streaming hasher (implements hash.Hash64 and io.Writer)
hasher := sz.NewHasher(42)
hasher.Write([]byte("Hello, "))
hasher.Write([]byte("world!"))
streamed := hasher.Digest()         // or hasher.Sum64()
fmt.Println(one == streamed)        // true

// Works with io.Copy and any io.Reader
file, _ := os.Open("data.txt")
hasher.Reset()
io.Copy(hasher, file)
fileHash := hasher.Sum64()

SHA-256 Checksums

SHA-256 cryptographic checksums are available. The Sha256 type implements Go's standard hash.Hash and io.Writer interfaces:

import (
    "fmt"
    "io"
    "os"
    sz "github.com/ashvardanian/stringzilla/golang"
)

// One-shot SHA-256
digest := sz.HashSha256([]byte("Hello, world!"))
fmt.Printf("%x\n", digest)          // prints 32-byte hash in hex

// Streaming SHA-256 (implements hash.Hash and io.Writer)
hasher := sz.NewSha256()
hasher.Write([]byte("Hello, "))
hasher.Write([]byte("world!"))
digestBytes := hasher.Digest()      // [32]byte
digestHex := hasher.Hexdigest()     // string (64 hex chars)

// Works with io.Copy and any io.Reader
file, _ := os.Open("data.bin")
hasher.Reset()
io.Copy(hasher, file)
fileDigest := hasher.Digest()

// Standard hash.Hash interface methods
sum := hasher.Sum(nil)              // []byte with 32 bytes
size := hasher.Size()               // 32
blockSize := hasher.BlockSize()     // 64

Algorithms & Design Decisions

StringZilla aims to optimize some of the slowest string operations. Some popular operations, however, like equality comparisons and relative-order checks, almost always terminate within the first few bytes of either string. Vectorization is almost useless for those, unless the strings are huge and very similar. StringZilla implements those operations as well, but they won't show substantial speedups. Where vectorization stops being effective, parallelism takes over with the new layered-cake architecture:

  • StringZilla C library w/out dependencies
  • StringZillas parallel extensions:
    • Parallel C++ algorithms built with Fork Union
    • Parallel CUDA algorithms for Nvidia GPUs
    • Parallel ROCm algorithms for AMD GPUs 🔜

Exact Substring Search

Substring search algorithms are generally divided into: comparison-based, automaton-based, and bit-parallel. Different families are effective for different alphabet sizes and needle lengths. The more operations are needed per-character - the more effective SIMD would be. The longer the needle - the more effective the skip-tables are. StringZilla uses different exact substring search algorithms for different needle lengths and backends:

  • When no SIMD is available - SWAR (SIMD Within A Register) algorithms are used on 64-bit words.
  • Boyer-Moore-Horspool (BMH) algorithm with Raita heuristic variation for longer needles.
  • SIMD backends compare characters at multiple strategically chosen offsets within the needle to reduce degeneracy.

On very short needles, especially 1-4 characters long, brute force with SIMD is the fastest solution. On mid-length needles, bit-parallel algorithms are effective, as the character masks fit into 32-bit or 64-bit words. Either way, if the needle is under 64-bytes long, on haystack traversal we will still fetch every CPU cache line. So the only way to improve performance is to reduce the number of comparisons.

For 2-byte needles, see sz_find_2byte_serial_ in include/stringzilla/find.h:

https://github.com/ashvardanian/StringZilla/blob/e1966de91600298d3c5cf4fe7be40d434f0f405e/include/stringzilla/find.h#L422-L463
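As a rough illustration of the SWAR idea, the following Python sketch searches for a single byte by comparing eight haystack bytes at a time inside one 64-bit integer. It is a simplified stand-in, not the library's code: `swar_find_byte` and the constant names are hypothetical.

```python
LOW_BITS = 0x0101010101010101   # 0x01 in every byte, used to broadcast the needle
LOW_7 = 0x7F7F7F7F7F7F7F7F      # low 7 bits of every byte
HIGH_BITS = 0x8080808080808080  # high bit of every byte

def swar_find_byte(haystack: bytes, needle: int) -> int:
    """Return the index of the first `needle` byte, or -1 if absent."""
    pattern = LOW_BITS * needle  # the needle repeated in all 8 byte lanes
    offset = 0
    while offset + 8 <= len(haystack):
        word = int.from_bytes(haystack[offset:offset + 8], "little")
        diff = word ^ pattern    # matching lanes become 0x00
        # High bit of each lane is set iff that lane is non-zero;
        # adding 0x7F to 7-bit values never carries into the next lane.
        nonzero = ((diff & LOW_7) + LOW_7) | diff
        matches = ~nonzero & HIGH_BITS
        if matches:
            lowest = matches & -matches  # isolate the lowest set bit
            return offset + (lowest.bit_length() - 1) // 8
        offset += 8
    for i in range(offset, len(haystack)):  # scalar tail
        if haystack[i] == needle:
            return i
    return -1
```

The same broadcast, XOR, and zero-lane-detection steps map directly onto wider SIMD registers; only the lane count changes.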

Going beyond that, to long needles, Boyer-Moore (BM) and its variants are often the best choice. BM uses two tables: the good-suffix shift and the bad-character shift. A common choice is the simplified BMH algorithm, which only uses the bad-character shift table, reducing the pre-processing time. We do the same for mid-length needles up to 256 bytes long. That way the stack-allocated shift table remains small.

For mid-length needles (≤256 bytes), see sz_find_horspool_upto_256bytes_serial_ in include/stringzilla/find.h:

https://github.com/ashvardanian/StringZilla/blob/e1966de91600298d3c5cf4fe7be40d434f0f405e/include/stringzilla/find.h#L620-L667
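For readers unfamiliar with Horspool's simplification, a minimal Python sketch of the bad-character shift table follows. It shows the textbook algorithm rather than the linked C implementation, and `horspool_find` is a hypothetical name.

```python
def horspool_find(haystack: bytes, needle: bytes) -> int:
    """Textbook Boyer-Moore-Horspool: first match offset, or -1."""
    n, h = len(needle), len(haystack)
    if n == 0:
        return 0
    if n > h:
        return -1
    # Bad-character table: how far the window may jump when its last byte is `b`.
    # Bytes absent from the needle allow a jump over the whole needle.
    shift = [n] * 256
    for i in range(n - 1):  # the last needle byte is deliberately excluded
        shift[needle[i]] = n - 1 - i
    pos = 0
    while pos <= h - n:
        if haystack[pos:pos + n] == needle:
            return pos
        pos += shift[haystack[pos + n - 1]]
    return -1
```

Every table entry is bounded by the needle length, which is why capping the needle at 256 bytes keeps the table compact.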

In the C++ Standard Library, the std::string::find function uses the BMH algorithm with Raita's heuristic. Before comparing the entire string, it matches the first, last, and middle characters. This is very practical, but can be slow for repetitive characters. Both SWAR and SIMD backends of StringZilla have a cheap pre-processing step, where we locate unique characters. This makes the library a lot more practical when dealing with non-English corpora.

The offset selection heuristic is implemented in sz_locate_needle_anomalies_ in include/stringzilla/find.h:

https://github.com/ashvardanian/StringZilla/blob/e1966de91600298d3c5cf4fe7be40d434f0f405e/include/stringzilla/find.h#L244-L305

All of those still have $O(hn)$ worst-case complexity, where $h$ and $n$ are the haystack and needle lengths. To guarantee $O(h)$ worst-case time complexity, the Apostolico-Giancarlo (AG) algorithm adds an additional skip-table. Its preprocessing phase is $O(n+\sigma)$ in time and space, where $\sigma$ is the alphabet size. On traversal, it performs from $h/n$ to $3h/2$ comparisons. It isn't, however, practical on modern CPUs. A simpler idea, the Galil rule, might be a more relevant optimization if many matches must be found.

Other algorithms previously considered and deprecated:

  • Apostolico-Giancarlo algorithm for longer needles. Control-flow is too complex for efficient vectorization.
  • Shift-Or-based Bitap algorithm for short needles. Slower than SWAR.
  • Horspool-style bad-character check in SIMD backends. Effective only for very long needles, and very uneven character distributions between the needle and the haystack. Faster "character-in-set" check needed to generalize.

§ Reading materials. Exact String Matching Algorithms in Java. SIMD-friendly algorithms for substring searching.

Exact Multiple Substring Search

Few algorithms for multiple substring search are known. Most are based on the Aho-Corasick automaton, which is a generalization of the KMP algorithm. The naive implementation, however:

  • Allocates disjoint memory for each Trie node and Automaton state.
  • Requires a lot of pointer chasing, limiting speculative execution.
  • Has a lot of branches and conditional moves, which are hard to predict.
  • Matches text a character at a time, which is slow on modern CPUs.

There are several ways to improve the original algorithm. One is to use sparse DFA representation, which is more cache-friendly, but would require extra processing to navigate state transitions.
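For reference, here is a compact Python sketch of the Aho-Corasick automaton stored as a dense transition table, one 256-entry row per state, so the scan loop is a single table lookup per byte with no failure-link chasing. It is a teaching sketch under the assumption of non-empty patterns, not StringZilla's design; `build_automaton` and `find_all` are hypothetical names.

```python
from collections import deque

def build_automaton(patterns: list[bytes]):
    """Build a dense DFA: goto[state][byte] -> next state, out[state] -> pattern ids."""
    goto, out = [[-1] * 256], [[]]
    for index, pattern in enumerate(patterns):  # 1. insert patterns into a trie
        state = 0
        for byte in pattern:
            if goto[state][byte] == -1:
                goto.append([-1] * 256)
                out.append([])
                goto[state][byte] = len(goto) - 1
            state = goto[state][byte]
        out[state].append(index)
    fail = [0] * len(goto)
    queue = deque()
    for byte in range(256):                     # 2. depth-1 states fail to the root
        target = goto[0][byte]
        if target == -1:
            goto[0][byte] = 0
        else:
            queue.append(target)
    while queue:                                # 3. BFS fills the missing transitions
        state = queue.popleft()
        for byte in range(256):
            target = goto[state][byte]
            if target == -1:
                goto[state][byte] = goto[fail[state]][byte]
            else:
                fail[target] = goto[fail[state]][byte]
                out[target] = out[target] + out[fail[target]]
                queue.append(target)
    return goto, out

def find_all(haystack: bytes, patterns: list[bytes]) -> list[tuple[int, int]]:
    """Return (start_offset, pattern_index) for every occurrence."""
    goto, out = build_automaton(patterns)
    state, matches = 0, []
    for position, byte in enumerate(haystack):
        state = goto[state][byte]
        for index in out[state]:
            matches.append((position - len(patterns[index]) + 1, index))
    return matches
```

The dense rows trade memory for predictability: 256 entries per state is wasteful for large dictionaries, which is the motivation for the sparse representations mentioned above.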

Levenshtein Edit Distance

Levenshtein distance is the best-known edit distance for strings. It counts how many insertions, deletions, and substitutions are needed to transform one string into another. It's extensively used in approximate string-matching, spell-checking, and bioinformatics.

The computational cost of the Levenshtein distance is $O(n \cdot m)$, where $n$ and $m$ are the lengths of the string arguments. To compute that, the naive approach requires $O(n \cdot m)$ space to store the "Levenshtein matrix", the bottom-right corner of which will contain the Levenshtein distance. The algorithm producing the matrix was independently studied and discovered by the Soviet mathematicians Vladimir Levenshtein in 1965 and Taras Vintsyuk in 1968, and by the American computer scientists Robert Wagner, David Sankoff, and Michael J. Fischer in the following years. Several optimizations are known:

  1. Space Optimization: The matrix can be computed in $O(\min(n,m))$ space, by only storing the last two rows of the matrix.
  2. Divide and Conquer: Hirschberg's algorithm can be applied to decompose the computation into subtasks.
  3. Automata: Levenshtein automata can be effective, if one of the strings doesn't change, and is a subject to many comparisons.
  4. Shift-Or: Bit-parallel algorithms transpose the matrix into a bit-matrix, and perform bitwise operations on it.

The last approach is quite powerful and performant, and is used by the great RapidFuzz library. It's less known than the others. It is derived from the Baeza-Yates-Gonnet algorithm, was extended to bounded edit-distance search by Manber and Wu in the 1990s, and was further extended by Gene Myers in 1999 and Heikki Hyyrö between 2002 and 2004.
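The bit-parallel formulation is compact enough to sketch in Python, whose arbitrary-precision integers stand in for machine words. The version below follows the commonly published Myers/Hyyrö recurrences for global edit distance; it is an illustration of that family, not code from RapidFuzz or StringZilla, and `myers_distance` is a hypothetical name.

```python
def myers_distance(pattern: str, text: str) -> int:
    """Unit-cost Levenshtein distance via bit-vectors of vertical deltas."""
    m = len(pattern)
    if m == 0:
        return len(text)
    # peq[ch] has bit i set iff pattern[i] == ch
    peq: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)
    mask = (1 << m) - 1   # emulates a fixed m-bit register
    last = 1 << (m - 1)   # bit of the bottom matrix row
    pv, mv, score = mask, 0, m  # first column: every vertical delta is +1
    for ch in text:
        eq = peq.get(ch, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask  # horizontal delta +1
        mh = pv & xh                   # horizontal delta -1
        if ph & last:
            score += 1
        elif mh & last:
            score -= 1
        ph = ((ph << 1) | 1) & mask    # top row grows by 1 per column
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
    return score
```

Each text character costs a constant number of word operations per 64 pattern characters, instead of one cell update per pattern character.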

StringZilla focuses on a different approach, extensively used in Unum's internal combinatorial optimization libraries. It doesn't change the number of trivial operations, but performs them in a different order, removing the data dependency that occurs when computing the insertion costs. StringZilla evaluates diagonals instead of rows, exploiting the fact that all cells within a diagonal are independent and can be computed in parallel. We'll store 3 diagonals instead of the 2 rows, and each consecutive diagonal will be computed from the previous two. Substitution costs will come from the earlier diagonal, while insertion and deletion costs will come from the later diagonal.

Row-by-Row Algorithm
Computing row 4:
    ∅  A  B  C  D  E
 ∅  0  1  2  3  4  5
 P  1  ░  ░  ░  ░  ░
 Q  2  ■  ■  ■  ■  ■
 R  3  ■  ■  □  →  .
 S  4  .  .  .  .  .
 T  5  .  .  .  .  .
Anti-Diagonal Algorithm
Computing diagonal 5:
    ∅  A  B  C  D  E
 ∅  0  1  2  3  4  5
 P  1  ░  ░  ■  ■  □
 Q  2  ░  ■  ■  □  ↘
 R  3  ■  ■  □  ↘  .
 S  4  ■  □  ↘  .  .
 T  5  □  ↘  .  .  .
Legend:
0,1,2,3... = initialization constants    ░ = cells processed and forgotten    ■ = stored cells    □ = computing in parallel    → ↘ = movement direction    . = cells to compute later

This results in much better vectorization for intra-core parallelism and potentially multi-core evaluation of a single request. Moreover, it's easy to generalize to weighted edit-distances, where the cost of a substitution between two characters may not be the same for all pairs, often used in bioinformatics.
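The traversal order can be sketched in scalar Python. The inner loop below touches only the two previous diagonals, so its iterations are independent of each other and could be replaced by one vector operation. This is a minimal illustration with unit costs; the function name and buffer layout (diagonals indexed by row) are assumptions, not the library's internals.

```python
def levenshtein_antidiagonal(a: str, b: str) -> int:
    """Levenshtein distance computed one anti-diagonal at a time."""
    n, m = len(a), len(b)
    # Three rotating buffers; slot i holds cell (i, d - i) of diagonal d.
    older = [0] * (n + 1)     # diagonal d - 2
    previous = [0] * (n + 1)  # diagonal d - 1
    current = [0] * (n + 1)   # diagonal d
    for d in range(n + m + 1):
        first, last = max(0, d - m), min(n, d)
        for i in range(first, last + 1):  # independent cells: vectorizable
            j = d - i
            if i == 0:
                current[i] = j            # top border
            elif j == 0:
                current[i] = i            # left border
            else:
                substitution = older[i - 1] + (a[i - 1] != b[j - 1])
                deletion = previous[i - 1] + 1  # cell (i - 1, j)
                insertion = previous[i] + 1     # cell (i, j - 1)
                current[i] = min(substitution, deletion, insertion)
        older, previous, current = previous, current, older
    return previous[n]  # the last diagonal holds only the bottom-right cell
```

Replacing the `!=` comparison with a lookup into a substitution matrix is all it takes to generalize this loop to weighted distances.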

§ Reading materials. Faster Levenshtein Distances with a SIMD-friendly Traversal Order.

Needleman-Wunsch and Smith-Waterman Scores for Bioinformatics

The field of bioinformatics studies various representations of biological structures. The "primary" representations are generally strings over sparse alphabets:

  • DNA sequences, where the alphabet is {A, C, G, T}, ranging from ~100 characters for short reads to 3 billion for the human genome.
  • RNA sequences, where the alphabet is {A, C, G, U}, ranging from ~50 characters for tRNA to thousands for mRNA.
  • Proteins, where the alphabet is made of 22 amino acids, ranging from 2 characters for dipeptide to 35,000 for Titin, the longest protein.

The shorter the representation, the more often researchers may want to use custom substitution matrices, meaning that the cost of a substitution between two characters may not be the same for all pairs. In the general case, the serial algorithm is supposed to work with arbitrary substitution costs for each of the 256×256 possible character pairs. That lookup table, however, is too large to fit into CPU registers, so the upcoming design instead focuses on 32×32 substitution matrices, which fit into 1 KB with single-byte "error costs". That said, most BLOSUM and PAM substitution matrices only contain 4-bit values, so they can be packed even further.
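A small Python sketch shows how a 32×32 table of signed single-byte costs plugs into Needleman-Wunsch scoring. Several simplifications are assumed here and are not taken from the library: letters are mapped to indices with `byte & 31`, gaps use one linear penalty instead of separate open and extend costs, and the names are hypothetical.

```python
from array import array

ALPHABET = 32  # reduced alphabet: 32 x 32 one-byte costs = 1 KB

def make_matrix(match: int, mismatch: int) -> array:
    """Flat 32x32 table of signed bytes with a uniform diagonal."""
    matrix = array("b", [mismatch]) * (ALPHABET * ALPHABET)
    for i in range(ALPHABET):
        matrix[i * ALPHABET + i] = match
    return matrix

def needleman_wunsch(a: bytes, b: bytes, matrix: array, gap: int) -> int:
    """Global alignment score, keeping only two rows of the matrix."""
    previous = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        current = [i * gap]
        for j, y in enumerate(b, 1):
            # `& 31` folds ASCII letters into indices 1..26 (an assumption)
            substitution = matrix[(x & 31) * ALPHABET + (y & 31)]
            current.append(max(previous[j - 1] + substitution,
                               previous[j] + gap,
                               current[j - 1] + gap))
        previous = current
    return previous[-1]
```

With a zero diagonal, `-1` elsewhere, and a gap penalty of `-1`, the score is simply the negated Levenshtein distance, which makes the sketch easy to sanity-check.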

Next design goals:

  • Needleman-Wunsch Automata

Memory Copying, Fills, and Moves

A lot has been written about the time computers spend copying memory and how that operation is implemented in LibC. Interestingly, the operation can still be improved, as most Assembly implementations use outdated instructions. Even performance-oriented STL replacements, like Meta's Folly v2024.09.23 focus on AVX2, and don't take advantage of the new masked instructions in AVX-512 or SVE.

In AVX-512, StringZilla uses non-temporal stores to avoid cache pollution, when dealing with very large strings. Moreover, it handles the unaligned head and the tails of the target buffer separately, ensuring that writes in big copies are always aligned to cache-line boundaries. That's true for both AVX2 and AVX-512 backends.
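The head/body/tail arithmetic is simple to show in isolation. The Python helper below only computes the three lengths for a destination address, assuming 64-byte cache lines; `split_for_aligned_stores` is a hypothetical name and no actual copying is performed.

```python
def split_for_aligned_stores(address: int, length: int, line: int = 64):
    """Split a write of `length` bytes at `address` into an unaligned head,
    a cache-line-aligned body, and an unaligned tail."""
    head = min(length, (-address) % line)  # bytes up to the next line boundary
    body = (length - head) // line * line  # whole cache lines only
    tail = length - head - body            # leftover bytes after the last full line
    return head, body, tail
```

For example, a 1000-byte write starting at address 100 splits into a 28-byte head (reaching address 128), 960 bytes of aligned full-line stores, and a 12-byte tail. With masked instructions the head and tail each become one predicated store instead of a scalar loop.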

StringZilla also contains "drafts" of smarter, but less efficient algorithms, that minimize the number of unaligned loads, performing shuffles and permutations. That's a topic for future research, as the performance gains are not yet satisfactory.

§ Reading materials. memset benchmarks by Nadav Rotem. Cache Associativity by Sergey Slotin.

Hashing

StringZilla implements a high-performance 64-bit hash function inspired by the "AquaHash", "aHash", and "GxHash" design and optimized for modern CPU architectures. The algorithm utilizes AES encryption rounds combined with shuffle-and-add operations to achieve exceptional mixing properties while maintaining consistent output across platforms. It passes the rigorous SMHasher test suite, including the --extra flag with no collisions.

The core algorithm operates on a dual-state design:

  • AES State: Initialized with seed XOR-ed against π constants.
  • Sum State: Accumulates shuffled input data with a permutation.

For strings ≤64 bytes, a minimal state processes data in 16-byte blocks. Longer strings employ a 4× wider state (512 bits) that processes 64-byte chunks, maximizing throughput on modern superscalar CPUs. The algorithm can be expressed in pseudocode as:

function sz_hash(text: u8[], length: usize, seed: u64) -> u64:
    # 1024 bits worth of π constants
    pi: u64[16] = [
        0x243F6A8885A308D3, 0x13198A2E03707344, 0xA4093822299F31D0, 0x082EFA98EC4E6C89,
        0x452821E638D01377, 0xBE5466CF34E90C6C, 0xC0AC29B7C97C50DD, 0x3F84D5B5B5470917,
        0x9216D5D98979FB1B, 0xD1310BA698DFB5AC, 0x2FFD72DBD01ADFB7, 0xB8E1AFED6A267E96,
        0xBA7C9045F12C7F99, 0x24A19947B3916CF7, 0x0801F2E2858EFC16, 0x636920D871574E69]

    # Permutation order for the sum state
    shuffle_pattern: u8[16] = [
        0x04, 0x0b, 0x09, 0x06, 0x08, 0x0d, 0x0f, 0x05,
        0x0e, 0x03, 0x01, 0x0c, 0x00, 0x07, 0x0a, 0x02]

    # Initialize key and states
    keys_u64s: u64[2] = [seed, seed]
    aes_u64s: u64[2] = [seed ⊕ pi[0], seed ⊕ pi[1]]
    sum_u64s: u64[2] = [seed ⊕ pi[8], seed ⊕ pi[9]]

    if length ≤ 64:
        # Small input: process 1-4 zero-padded blocks of 16 bytes each
        blocks_u8s: u8[16][] = split_into_blocks(text, length, 16)
        for each block_u8s: u8[16] in blocks_u8s:
            aes_u64s = AESENC(aes_u64s, block_u8s)
            sum_u64s = SHUFFLE(sum_u64s, shuffle_pattern) + block_u8s
    else:
        # Large input: use 4× wider 512-bits states
        aes_u64s: u64[8] = [
            seed ⊕ pi[0], seed ⊕ pi[1], seed ⊕ pi[2], seed ⊕ pi[3],
            seed ⊕ pi[4], seed ⊕ pi[5], seed ⊕ pi[6], seed ⊕ pi[7]]
        sum_u64s: u64[8] = [
            seed ⊕ pi[8], seed ⊕ pi[9], seed ⊕ pi[10], seed ⊕ pi[11],
            seed ⊕ pi[12], seed ⊕ pi[13], seed ⊕ pi[14], seed ⊕ pi[15]]

        # Process 64-byte chunks (4×16-byte blocks)
        for each chunk_u8s: u8[64] in text:
            blocks_u8s: u8[16][4] = split_chunk_into_4_blocks(chunk_u8s)
            for i in 0..3:
                offset: usize = i * 2  # Each lane stores two u64s
                aes_u64s[offset:offset+1] = AESENC(aes_u64s[offset:offset+1], blocks_u8s[i])
                sum_u64s[offset:offset+1] = SHUFFLE(sum_u64s[offset:offset+1], shuffle_pattern) + blocks_u8s[i]

        # Fold 8×u64 state back to 2×u64 for finalization
        aes_u64s: u64[2] = fold_to_2u64(aes_u64s)
        sum_u64s: u64[2] = fold_to_2u64(sum_u64s)

    # Finalization: mix length into key
    key_with_length: u64[2] = [keys_u64s[0] + length, keys_u64s[1]]

    # Multiple AES rounds for SMHasher compliance
    mixed_u64s: u64[2] = AESENC(sum_u64s, aes_u64s)
    result_u64s: u64[2] = AESENC(AESENC(mixed_u64s, key_with_length), mixed_u64s)

    return result_u64s[0]  # Extract low 64 bits

This allows us to balance several design trade-offs. First, it allows us to achieve a high port-level parallelism. Looking at AVX-512 capable CPUs and their ZMM instructions, on each cycle, we'll have at least 2 ports busy when dealing with long strings:

  • VAESENC: 5 cycles on port 0 on Intel Ice Lake, 4 cycles on ports 0/1 on AMD Zen4.
  • VPSHUFB_Z: 3 cycles on port 5 on Intel Ice Lake, 2 cycles on ports 1/2 on AMD Zen4.
  • VPADDQ: 1 cycle on ports 0/5 on Intel Ice Lake, 1 cycle on ports 0/1/2/3 on AMD Zen4.

When dealing with smaller strings, we design our approach to avoid large registers and maintain the CPU at the same energy state, thereby avoiding downclocking and expensive power-state transitions.

Unlike some AES-accelerated alternatives, the length of the input is not mixed into the AES block at the start to allow incremental construction, when the final length is not known in advance. Also, unlike some alternatives, with "masked" AVX-512 and "predicated" SVE loads, we avoid expensive block-shuffling procedures on non-divisible-by-16 lengths.

§ Reading materials. Stress-testing hash functions for avalanche behaviour, collision bias, and distribution.

SHA-256 Checksums

In addition to the fast AES-based hash, StringZilla implements hardware-accelerated SHA-256 cryptographic checksums. The implementation follows the FIPS 180-4 specification and provides multiple backends.

Random Generation

StringZilla implements a fast Pseudorandom Number Generator inspired by the "AES-CTR-128" algorithm, reusing the same AES primitives as the hash function. Unlike "NIST SP 800-90A" which uses multiple AES rounds, StringZilla uses only one round of AES mixing for performance while maintaining reproducible output across platforms. The generator operates in counter mode with AESENC(nonce + lane_index, nonce ⊕ pi_constants), rotating through the first 512 bits of π for each 16-byte block. The only state required to reproduce an output is a 64-bit nonce, which is much cheaper than a Mersenne Twister.

Sorting

For lexicographic sorting of string collections, StringZilla exports pointer-sized n‑grams ("pgrams") into a contiguous buffer to improve locality, then recursively QuickSorts those pgrams with a 3‑way partition and dives into equal pgrams to compare deeper characters. Very small inputs fall back to insertion sort.

  • Average time complexity: O(n log n)
  • Worst-case time complexity: quadratic (due to QuickSort), mitigated in practice by 3‑way partitioning and the n‑gram staging
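The overall shape of that algorithm can be sketched in a few lines of Python. This version is deliberately simplified and is not the library's implementation: it builds new lists instead of partitioning in place, recomputes n-grams on each call instead of staging them in a buffer, and skips the insertion-sort fallback. The names `pgram` and `sort_strings` are hypothetical.

```python
def pgram(s: bytes, depth: int, width: int = 8) -> tuple[int, int]:
    """Fixed-width prefix of s[depth:] as a big-endian integer, plus its true
    length so that zero-padding cannot be confused with real NUL bytes."""
    chunk = s[depth:depth + width]
    return int.from_bytes(chunk.ljust(width, b"\0"), "big"), len(chunk)

def sort_strings(strings: list[bytes], depth: int = 0, width: int = 8) -> list[bytes]:
    """3-way QuickSort on pgrams; ties recurse on deeper characters."""
    if len(strings) <= 1:
        return list(strings)
    keyed = [(pgram(s, depth, width), s) for s in strings]
    pivot = keyed[len(keyed) // 2][0]
    less = [s for key, s in keyed if key < pivot]
    equal = [s for key, s in keyed if key == pivot]
    greater = [s for key, s in keyed if key > pivot]
    if pivot[1] == width:  # full-width tie: strings may still differ further in
        equal = sort_strings(equal, depth + width, width)
    return sort_strings(less, depth, width) + equal + sort_strings(greater, depth, width)
```

Most comparisons resolve on a single integer comparison of the staged prefixes; only strings sharing a full 8-byte prefix pay for a deeper look.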

Unicode 17, UTF-8, and Wide Characters

Most StringZilla operations are byte-level, so they work well with ASCII and UTF-8 content out of the box. In some cases, like edit-distance computation, the result of byte-level evaluation and character-level evaluation may differ.

  • szs_levenshtein_distances_utf8("αβγδ", "αγδ") == 1 — one unicode symbol.
  • szs_levenshtein_distances("αβγδ", "αγδ") == 2 — one unicode symbol is two bytes long.
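
The difference between the two results can be reproduced without StringZilla. The sketch below is a textbook dynamic-programming Levenshtein distance in plain Python, not the library's kernel. Running it once over code points and once over UTF-8 bytes gives the same two numbers as above.

```python
def levenshtein(a, b) -> int:
    """Classic two-row dynamic programming over any two sequences."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # deletion
                current[j - 1] + 1,          # insertion
                previous[j - 1] + (x != y),  # substitution or match
            ))
        previous = current
    return previous[-1]


first, second = "αβγδ", "αγδ"

# A Python 3 str is a sequence of code points: one symbol is removed.
by_codepoint = levenshtein(first, second)

# The encoded form is a sequence of bytes: "β" occupies two of them.
by_byte = levenshtein(first.encode("utf-8"), second.encode("utf-8"))
```

Here `by_codepoint` is 1 and `by_byte` is 2, because each of these Greek letters is encoded as two bytes in UTF-8.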

Java, JavaScript, Python 2, C#, and Objective-C, however, use wide characters (wchar) - two-byte-long code units, instead of the more reasonable fixed-length UTF-32 or variable-length UTF-8. This leads to all kinds of offset-counting issues when facing characters outside the Basic Multilingual Plane, which take four bytes (a surrogate pair) in UTF-16. StringZilla uses proper 32-bit "runes" to represent unpacked Unicode codepoints, ensuring correct results in all operations. Moreover, it implements the Unicode 17.0 standard, being practically the only library besides ICU and PCRE2 to do so, but with order(s) of magnitude better performance.
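
A short Python 3 illustration of the offset-counting problem, using only the standard library. The same three-character text has a different length, and a different offset for its last character, depending on whether you count code points, two-byte UTF-16 code units, or UTF-8 bytes.

```python
text = "a😀b"  # U+1F600 lies outside the Basic Multilingual Plane

code_points = len(text)                            # Python 3 counts code points
utf16_units = len(text.encode("utf-16-le")) // 2   # two-byte code units
utf8_bytes = len(text.encode("utf-8"))

# The offset of "b" differs in each representation.
offset_code_points = text.index("b")
prefix = text[:offset_code_points]
offset_utf16 = len(prefix.encode("utf-16-le")) // 2
offset_utf8 = len(prefix.encode("utf-8"))
```

The lengths come out as 3, 4, and 6, and the offsets of "b" as 2, 3, and 5, so an index produced in one representation cannot be reused in another without conversion.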

Case-Folding and Case-Insensitive Search

StringZilla provides Unicode-aware case-insensitive substring search that handles the full complexity of Unicode case folding. This includes multi-character expansions:

Character  Codepoint  UTF-8 Bytes  Case-Folds To  Result Bytes
ß          U+00DF     C3 9F        ss             73 73
ﬃ          U+FB03     EF AC 83     ffi            66 66 69
İ          U+0130     C4 B0        i + ◌̇          69 CC 87

The search returns byte offsets and lengths in the original haystack, correctly handling length differences. For example, searching for "STRASSE" (7 bytes) in "Straße" (7 bytes: 53 74 72 61 C3 9F 65) succeeds because both case-fold to "strasse".

Note that Turkish İ and ASCII I are distinct: İstanbul case-folds to i̇stanbul (with combining dot), while ISTANBUL case-folds to istanbul (without). They will not match each other — this is correct Unicode behavior for Turkish locale handling.
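
The expansions above can be checked with Python's built-in `str.casefold`, which applies Unicode full case folding using whichever Unicode version the interpreter ships with. This is a stand-in for illustration, not StringZilla's search API, and the helper names are hypothetical.

```python
def utf8_hex(text: str) -> str:
    """Render the UTF-8 encoding of `text` as space-separated hex bytes."""
    return " ".join(f"{byte:02X}" for byte in text.encode("utf-8"))


def folds_equal(a: str, b: str) -> bool:
    """Compare two strings after full case folding."""
    return a.casefold() == b.casefold()


# Each expanding character from the table, with its folded bytes.
expansions = {
    ch: utf8_hex(ch.casefold())
    for ch in ("\u00DF", "\uFB03", "\u0130")  # ß, ﬃ, İ
}

strasse_matches = folds_equal("Straße", "STRASSE")
istanbul_matches = folds_equal("\u0130stanbul", "ISTANBUL")
```

`strasse_matches` is true, while `istanbul_matches` is false because the folded İ keeps its combining dot (U+0307).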

For wide-character environments (Java, JavaScript, Python 2, C#), consider transcoding with simdutf.

Dynamic Dispatch

Due to the high level of fragmentation of SIMD support across CPUs, StringZilla names its backends after select Intel and ARM CPU generations. You can query the supported backends and call them manually, either to guarantee consistent performance or to explore how different algorithms scale on your hardware.

sz_find(text, length, pattern, 3);          // Auto-dispatch
sz_find_westmere(text, length, pattern, 3);  // Intel Westmere+ SSE4.2
sz_find_haswell(text, length, pattern, 3);  // Intel Haswell+ AVX2
sz_find_skylake(text, length, pattern, 3);  // Intel Skylake+ AVX-512
sz_find_neon(text, length, pattern, 3);     // Arm NEON 128-bit
sz_find_sve(text, length, pattern, 3);      // Arm SVE 128/256/512/1024/2048-bit

StringZilla automatically picks the most advanced backend for the given CPU. Similarly, in Python, you can log the auto-detected capabilities:

python -c "import stringzilla; print(stringzilla.__capabilities__)"         # ('serial', 'westmere', 'haswell', 'skylake', 'ice', 'neon', 'sve', 'sve2+aes')
python -c "import stringzilla; print(stringzilla.__capabilities_str__)"     # "haswell, skylake, ice, neon, sve, sve2+aes"

You can also explicitly set the backend to use, or scope the backend to a specific function.

import stringzilla as sz
sz.reset_capabilities(('serial',))          # Force SWAR backend
sz.reset_capabilities(('haswell',))         # Force AVX2 backend
sz.reset_capabilities(('neon',))            # Force NEON backend
sz.reset_capabilities(sz.__capabilities__)  # Reset to auto-dispatch
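
The dispatch idea itself can be sketched in a few lines of plain Python. Everything here is hypothetical and for illustration only: the table, the preference order, and the kernels are stand-ins that all call the same `bytes.find`, whereas a real library would bind each name to a differently vectorized routine.

```python
def find_serial(haystack: bytes, needle: bytes) -> int:
    """Stand-in kernel; a real backend would use SWAR or SIMD here."""
    return haystack.find(needle)


# Hypothetical dispatch table keyed by backend name.
BACKENDS = {
    "serial": find_serial,
    "haswell": find_serial,
    "skylake": find_serial,
}

# Assumed preference order, from most to least advanced.
PREFERENCE = ("skylake", "haswell", "serial")


def pick_backend(capabilities):
    """Return (name, kernel) for the most advanced supported backend."""
    for name in PREFERENCE:
        if name in capabilities:
            return name, BACKENDS[name]
    raise RuntimeError("no usable backend among the given capabilities")
```

Restricting the capability tuple, as `reset_capabilities` does above, simply shrinks the set that a lookup like `pick_backend` can choose from.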

Contributing 👾

Please check out the contributing guide for more details on how to set up the development environment and contribute to this project. If you like this project, you may also enjoy USearch, UCall, UForm, and SimSIMD. 🤗

If you like strings and value efficiency, you may also enjoy the following projects:

  • simdutf - transcoding UTF-8, UTF-16, and UTF-32 LE and BE.
  • hyperscan - regular expressions with SIMD acceleration.
  • pyahocorasick - Aho-Corasick algorithm in Python.
  • rapidfuzz - fast string matching in C++ and Python.
  • memchr - fast string search in Rust.

License 📜

Feel free to use the project under Apache 2.0 or the Three-clause BSD license at your preference.

This version

4.6.2

File details

Details for the file stringzilla-4.6.2.tar.gz.

File metadata

  • Download URL: stringzilla-4.6.2.tar.gz
  • Upload date:
  • Size: 646.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for stringzilla-4.6.2.tar.gz
Algorithm Hash digest
SHA256 f30fb35d84578236723aaf9442e9281ae29c8ad5255eb118173c9370af4c0e71
MD5 c5228f50691b59961434637cf04cb368
BLAKE2b-256 7eb540951e5eaef033aa377c111f4a18a62dbce6d6211eff754f88c73aeebe49

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-win_arm64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-win_arm64.whl
Algorithm Hash digest
SHA256 005b4450a5d1f5e199cd4fc28c60a67695555a034981dfc4a8b0fa0b0989f0f1
MD5 5552f8138cda48474a63c33ae184e8a2
BLAKE2b-256 661237808d822c5c2ff188eff46f0ee2637530e0ce781b9aac0b147bb9713c67

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-win_amd64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-win_amd64.whl
Algorithm Hash digest
SHA256 34e3c30bd65bba91ccb6b73258e4899fa025e9e21fbb1c028ab32974fd51e377
MD5 5951a6bb5ccafa80c379132440a0adb4
BLAKE2b-256 6234e13e520393cd35e9e2cd0c4b9392781a2e30c76888d997523c549829faaa

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-win32.whl.

File metadata

  • Download URL: stringzilla-4.6.2-cp314-cp314-win32.whl
  • Upload date:
  • Size: 118.0 kB
  • Tags: CPython 3.14, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-win32.whl
Algorithm Hash digest
SHA256 8cad620429de33881cca0e78032bba301176342e987705ddefb827403bc8caf3
MD5 d047584a339f1fecfefcd51df59e62e1
BLAKE2b-256 1fcea3bfb84c0da5ca9347ec7f1774a821664763827bb8e854f38fc814ac24e2

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-musllinux_1_2_x86_64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 1a4895a105fd3de40fc82ef96f0e8a1030645e10fa13b9b6d15efeaf02e52efc
MD5 d939562e4e56b8b166e5770bc30a4748
BLAKE2b-256 4d6465ea631fc3c6b1c836156041c2895aae4874efa03ac829355224bd60c31d

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-musllinux_1_2_s390x.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-musllinux_1_2_s390x.whl
Algorithm Hash digest
SHA256 64c10e42b88e84fa1cf36f42dc8ba97421f8700d1c11873bab8de13cb5723dc1
MD5 dc31ebb7fa32c15a03bdee0d890cdaba
BLAKE2b-256 29175e2377010f92d34f3bfe4f811be6710fc495705e14901a24829a78847b51

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-musllinux_1_2_riscv64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-musllinux_1_2_riscv64.whl
Algorithm Hash digest
SHA256 e6f4d02db0efb228b724d69cb7dbf0126b534ebf87dc47469057aabe591a91f6
MD5 681e61eb8865f19ea493d06bf1a0047c
BLAKE2b-256 93d864879da8fba98bed4c4645524c57579bbc13b4c4687152d22458d0ad71f4

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-musllinux_1_2_ppc64le.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-musllinux_1_2_ppc64le.whl
Algorithm Hash digest
SHA256 4666e0b76da6e54a5a4d81561905ce09743adc8ea03ca59cbd306305095607cd
MD5 f98653ff57fbe79641f98070d1e9578e
BLAKE2b-256 881feeaa1a37a7feb7369642d17d001a8bb1c5f28ecfbd26088ed06af63b00f2

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-musllinux_1_2_i686.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-musllinux_1_2_i686.whl
Algorithm Hash digest
SHA256 573abfc3eca6d903e0aab4ab6a8de841a0e437a93536fd2f75a6c9e70db29729
MD5 e50f2ee1f2e86587c25ab4f1acd1bb50
BLAKE2b-256 b1ad1890f124a9ff99c620d032ad637191ab7bcdd06c0881e90133d0bfe81298

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-musllinux_1_2_armv7l.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-musllinux_1_2_armv7l.whl
Algorithm Hash digest
SHA256 0a57f406cad5225ae266f1dbfa9f07a8468a33921bd18cd03797871aa334db5e
MD5 c1a9accf7795da8ce4daaf8bd52ddc6e
BLAKE2b-256 a21f3ed79edbbf6212e0f813f85aafb15f57b60508c862c73b470002a0bf24d1

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-musllinux_1_2_aarch64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 aba0640812f13520380318930ff716fa5a434393a70d8e7db57ced637ecd0705
MD5 0dafc88a124f3a2932c074c20e9aae49
BLAKE2b-256 ae7aeece4bb72d24cdb08ac4acf5d73aec0ccf8b970508b6f4d1633aef2513e7

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl
Algorithm Hash digest
SHA256 f8475647325b1cb923aebdc3b22dad2a8805eb77c3cc300d27960a4a12293f9c
MD5 6255c712b70d979d611d38517c439ebf
BLAKE2b-256 0d2811e549e5fada3a92b62ca3d750e88043a47f75c835ba08bc792409050024

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 bbd0df9824f32f3fc113d9348dfdcc976d4df2ac2118c53074126abb56351bd0
MD5 50bd5ee9f8efea7c739f26fd821ae901
BLAKE2b-256 6fd456be7c6ed92a96eb261a6f90cefbbea6f4f2e390ac587aa821c65c9b9a9e

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl
Algorithm Hash digest
SHA256 9862fab01ccd8007b676c30e92c9c4978201954260340f962eb4a89d5af482cb
MD5 df1fdd78a60edc8e1d40dac1f0e873de
BLAKE2b-256 0a094d8c852ab4d55ad7321672cddfc9ad5629293964b92b978afe8b4cfd363a

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl
Algorithm Hash digest
SHA256 1ff91faf487f5cef54ce23375691d70562f7134dee485b0cb7bee215a0cd82c9
MD5 0e8328d3f80a1a8f95765a0c0a22e199
BLAKE2b-256 3011278fa91c272368e816929ac927172371d4629299b9afbf8be84c6b159ac8

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl
Algorithm Hash digest
SHA256 3ca51a91d4606649789aef673e30f78b0381b915ca4963fffea18c0ca6e5bb9d
MD5 36aece8b23e18805caec63c03c57fb6d
BLAKE2b-256 9c94f67c9b02a495e45ecc7eab586b0dece8b7fdaded82a23fb132187192b488

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 3afa1595ddede82fff17df64d162ae8498ee61b35aa6c730a55419558cdc7663
MD5 987735b1010ca371189270c8c7d667d6
BLAKE2b-256 f090d09b435ef6dc56090f1b03a99d49cb16c0d4e7dd315320291f82137d7f6a

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl
Algorithm Hash digest
SHA256 1a612afc970c67bc9c5de0b1ae11ac2ad61bd80cb1c7f80fa68c16402ab4f656
MD5 0bf2f75d4bba7af839045789d7befb11
BLAKE2b-256 f061afeb2ac52b6541e91562f89fa126d9b0380e02b0a4ca169cce4e6910a614

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 f4e50aaddb4aae588b988925c6e5664db08fe31555a57c649fdfa217b0b88113
MD5 9b8d8c2734eac6bd39f31700af3c7be4
BLAKE2b-256 9bc1b248df98ce0f25e795a1f5bee5e0cff9b337af3fc0c12f7e7040695e27e6

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp314-cp314-macosx_10_15_x86_64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp314-cp314-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 fbf773dd128460ae44b3839c3a85f2587e42869f1004ea327b11950d1d4a1c11
MD5 fe92ac0df32b0f78c062181caccb163c
BLAKE2b-256 044373302749e84315461722068a89fcbc892ce35ee5204d4fb112d0c1aa00d4

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-win_arm64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-win_arm64.whl
Algorithm Hash digest
SHA256 9c182254bd7943e4c61ce83c58e6096a180905fadbd2d8b4de8b57fc5cb7c9cc
MD5 6a8be8a06c0acb9bcc3a1a64fccb6e94
BLAKE2b-256 0c73a052bc96a5166181c413c49ade7c8bd4bd6ea7bbb92f0791066c5b899053

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-win_amd64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 1bdd87d0a92a83bc013a817afe69bc906b55476370c210791f3e132457dd461a
MD5 2069ca876cfb2b345bd4b29b539a4706
BLAKE2b-256 d82141039cf8b99f21a17709beeea7b63aef9fe8b627a1a843990e2af075af3c

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-win32.whl.

File metadata

  • Download URL: stringzilla-4.6.2-cp313-cp313-win32.whl
  • Upload date:
  • Size: 114.7 kB
  • Tags: CPython 3.13, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-win32.whl
Algorithm Hash digest
SHA256 66dace3fa65ebdbdac3644e513348678f75e94e59b7234f614eae33535be1bdc
MD5 7bb2da13e3edca3fb960c6da5f98c394
BLAKE2b-256 831fe36f84703db11b568489cf950634ff55ba20b780476989992a4fb35497b3

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-musllinux_1_2_x86_64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 4c8c09a22da4bf6dd054a36b8eeed40760f4b5318b9f001145fa1159844475ab
MD5 4216c72808032a750ef4b26213bc8f2a
BLAKE2b-256 03c3f4299a496892111aa200b92ac3e3a4295ab44f21238afeb23f0bff6d70a6

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-musllinux_1_2_s390x.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-musllinux_1_2_s390x.whl
Algorithm Hash digest
SHA256 9d7d3d89c903ff519111eb5f9614324e7640c912ce55975eb240de662d7196c9
MD5 0fd38b3e4e189201ce8055fc0784147d
BLAKE2b-256 b6659dad1d7a415e8fd893f3106f787d499f2bd00e12ed0023a85960ecd68b22

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-musllinux_1_2_riscv64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-musllinux_1_2_riscv64.whl
Algorithm Hash digest
SHA256 f0436f1f2f2f659335bcd132ce79b970703f63b2ae478547e4f3fe83032e04d2
MD5 a33d6fd0d434fd92df905b5654dd6fe2
BLAKE2b-256 e3e6cfe4bb910a4b806089bf920dc3a875c98318a5acde034f63087f728f52d9

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-musllinux_1_2_ppc64le.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-musllinux_1_2_ppc64le.whl
Algorithm Hash digest
SHA256 2016e170f08bf956ba6f19cdf2eed6850e93a60cef7c4d481c32468ae877f340
MD5 d6038a8f025a126d379f7b8b5e38a6fa
BLAKE2b-256 3fb15a9ee7e66e320d13cc2a451c8fdc0ff186973754a12b227b0431b74bf9ca

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-musllinux_1_2_i686.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-musllinux_1_2_i686.whl
Algorithm Hash digest
SHA256 24dbf7a571da484d1efb4f1674cc2f9fab8ebed4c1eb3c647833098b7cbc4b46
MD5 159e375acaa92cc8305cd59f5aa5bba9
BLAKE2b-256 378b9822536bd67d4c86e4919010e0b0cf397af9a9786f2bb04363df873ec718

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-musllinux_1_2_armv7l.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-musllinux_1_2_armv7l.whl
Algorithm Hash digest
SHA256 8e8e831ce10fd629d4573a11d86bf6bf851ef456e4b757a3bb239b3ea30e5013
MD5 4f41113eb1ff13ad0264deb416841d34
BLAKE2b-256 0b94b7c71735aae038a5a74323f41691d581855fe906ba4090c63de733e2316e

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-musllinux_1_2_aarch64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 716c4120bfed0d0b75a16ff2c99b644de1c5c41d68c57543439ffdc0caf8bd05
MD5 8dc65f6c81c68265d61003564d647ec2
BLAKE2b-256 d29a98d567f8633449e1cf659cd6ce7682688168403ca8195417fc41bb15e217

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl
Algorithm Hash digest
SHA256 fcb95e147f7709527fa4e3eddc16b9a4e033ca48cba85cc20deda62270c85e0c
MD5 dd409dc7fcb76ab9f91d4d4a11f2ec41
BLAKE2b-256 55c233ace5325d88229b565ee144a350d77bdded7a3103e2e326de8acc961d3c

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 09a67387037c66d167819cfa8d4234c713fbc6df90131a6f61dc3cd84ab19fd4
MD5 6c6323ff9ee54412304baea2ee052f3f
BLAKE2b-256 c7aec97a1689dc0cc466f71c2f958adca6ae1a8e6bf5bdbe33164c213a77525c

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl
Algorithm Hash digest
SHA256 8c2e30dfd179206e9bfced4c4e91d3f8af9eae7d6fce5ba5e5f4e66aa9215be7
MD5 9cbf56fa21898b7c1447bd528d19a23b
BLAKE2b-256 aa14ff3e14615132504806a467819999ba5840a181cc4e907044987fece7eabb

See more details on using hashes here.

File details

Details for the file stringzilla-4.6.2-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl
Algorithm Hash digest
SHA256 53b7dbc30c3c22fb5761e90c4854213ab62bf153fe3b25011348b312bb49df03
MD5 d108429ecf9d27ac597bec738c1a4b45
BLAKE2b-256 14d372119c2e900be6451114d1d1c5947e00c79dc725c0e15a7cb34c185f55bd

File details

Details for the file stringzilla-4.6.2-cp313-cp313-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl
Algorithm Hash digest
SHA256 6a01e629f2677522abae4e9e65ada2f983ead838266d3a9b99bb512c731f325c
MD5 3ce7e00f833bcfaf42ac98ed0931bf69
BLAKE2b-256 7c00ed9bad99ef5e9c002634c61822585b792b22c19d66f40954e4efad694a72

File details

Details for the file stringzilla-4.6.2-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 c0d7149226d0455caf1aa2eec2b425fa276f1e2a61ba3760ff492f6b9f25cc23
MD5 db712bd217ba1b463fdf40a54de67497
BLAKE2b-256 1dfbb7f9f5f5068c47378017eb2747df52d3fc77f43bd4764b5719a30ed1cc9a

File details

Details for the file stringzilla-4.6.2-cp313-cp313-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl
Algorithm Hash digest
SHA256 23a2ebe9e8cceff2d0c2c0a664a72417ae7d6a9b10a5d9049c95bd152ec70b3e
MD5 2d84c60122ff8820f638692a248834c7
BLAKE2b-256 5d2447ac8d851ab53a8ee3178755a74eaf5987bb4b01b5a9b748bf2120d26022

File details

Details for the file stringzilla-4.6.2-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 b2bbea488adb0320fd317cf07f35486fcb7d8e487aa5b23e60b55b36a6e9faa9
MD5 6a8dc84b9a98606e02ac46b12ee05248
BLAKE2b-256 dae39f626134a7e7a40e9cfc875aed379b123df2a5fd8484234914431840712e

File details

Details for the file stringzilla-4.6.2-cp313-cp313-macosx_10_13_x86_64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp313-cp313-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 00a791addb965cc3268e1e983d933f7aa81f19ea1a5bfed78ed3c319623597f9
MD5 dab068a92f85e06fab65d3e09896090f
BLAKE2b-256 68d8d92cc16d4e7b7e8b05f4f78d5546bfdd635d42ae5e39bcf80198d50f7d0c

File details

Details for the file stringzilla-4.6.2-cp312-cp312-win_arm64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-win_arm64.whl
Algorithm Hash digest
SHA256 ec028b7db898a2131a10fcebced14cc0dad5af0d45ad05f27e20ca0feb226446
MD5 8afd36a8ac23236534ebda3fe14eb470
BLAKE2b-256 e501cb7608daeb23f427e0da01ecb4cb211b7f860055b2d94578845f5ac4f1a4

File details

Details for the file stringzilla-4.6.2-cp312-cp312-win_amd64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 42dc9bf6379edb174607b7061cb2c234d3f8a57f701fee31755cbcb5fd55e93a
MD5 6539ec1b481cd286d57038c6d872bd5b
BLAKE2b-256 200f14df6be0e23b3cdd7e28732eb7f93437ce57d20510ba43875e3d98be02d8

File details

Details for the file stringzilla-4.6.2-cp312-cp312-win32.whl.

File metadata

  • Download URL: stringzilla-4.6.2-cp312-cp312-win32.whl
  • Upload date:
  • Size: 114.7 kB
  • Tags: CPython 3.12, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-win32.whl
Algorithm Hash digest
SHA256 0638c6bdbcc45de73ab9c85eab622712174993ef91b5b0463f333a11e9a2bbfb
MD5 3c04ab9c714e4b508daac2fdc12bae9c
BLAKE2b-256 ea3a40848f18b3801019af4f0efd8a317f7953f8287560f32ff3c4cb4c38cea2

File details

Details for the file stringzilla-4.6.2-cp312-cp312-musllinux_1_2_x86_64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 215d9e5ea1f3caeeebaac1b6cf18579a8f7e519a16afba036954230bae17d12c
MD5 7ea3ff9f55a9b429764cd02ad7e72ab1
BLAKE2b-256 6f6f52211a7b7871497547ea71c93ac4a9e6911f99e6c412102792c29889dcb6

File details

Details for the file stringzilla-4.6.2-cp312-cp312-musllinux_1_2_s390x.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-musllinux_1_2_s390x.whl
Algorithm Hash digest
SHA256 50d61f06b66a9b4d737f9ed3611da41a53e9cb8aa3c25d9c3d34dbb67f272e6c
MD5 b7ee0bfa965cf01065175d3a392439bf
BLAKE2b-256 856203a153b149edcf433d87672e3b62917423c53ca2e30df8ac1cbb08fbaf72

File details

Details for the file stringzilla-4.6.2-cp312-cp312-musllinux_1_2_riscv64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-musllinux_1_2_riscv64.whl
Algorithm Hash digest
SHA256 9248aa322f24f28a888991f38d23a75d4dd27587becd029b00e94a1f155f9276
MD5 001c7eac29240aaa34d246dd0135eb3d
BLAKE2b-256 f55e84af0d65bc2f661b5df0a5e9202f722a746a67a77649fc5d1634d5b22455

File details

Details for the file stringzilla-4.6.2-cp312-cp312-musllinux_1_2_ppc64le.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-musllinux_1_2_ppc64le.whl
Algorithm Hash digest
SHA256 a321e1b909378769f7c05697468fe88ea65089148785be50b65f26f591350204
MD5 50bcdd86310e2b153e5ed2f42b082bed
BLAKE2b-256 590548bbe7225e55f0ff5c7c4bc00296733ab88e09135dcb6a39f614fcaf0c4b

File details

Details for the file stringzilla-4.6.2-cp312-cp312-musllinux_1_2_i686.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-musllinux_1_2_i686.whl
Algorithm Hash digest
SHA256 4ca9adcadc1fe1564e31d4bda14d76f5d7614c616c10c16d7bccc48ee486b0eb
MD5 1dade664c4dc2cd6c60863290fcab494
BLAKE2b-256 8e470ca2d6131b65e925e8d5498d32f72b0353de58bd49346ccc0d193d79fc5d

File details

Details for the file stringzilla-4.6.2-cp312-cp312-musllinux_1_2_armv7l.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-musllinux_1_2_armv7l.whl
Algorithm Hash digest
SHA256 cadbc0ac8474cdfa85084f797f080b797c1eb951cd7aa0c73881616d1ddf0639
MD5 45ce12bb9556d7183618eaed325d50e0
BLAKE2b-256 78520f784920c1a4e5f2241772f961c4b72d2444d066ef588c34edb5490c7995

File details

Details for the file stringzilla-4.6.2-cp312-cp312-musllinux_1_2_aarch64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 bbbeea11ddd1553a63650c1abb720710989879d94d151a43600f10af1d8a39a5
MD5 7c8431b2ebf310ca87f15a613e3cf1e0
BLAKE2b-256 7f9c8b1e21c13c94a7b405c36a84fe12aa746a19215a1eaf4610233356188c78

File details

Details for the file stringzilla-4.6.2-cp312-cp312-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl
Algorithm Hash digest
SHA256 da84c50cb0eb3d5b6f03de83b791bf891b1f49a296e3dac0301a27cf2a827d0c
MD5 2adcc9c5909556b1866262b312e70dc4
BLAKE2b-256 48b38a7ca75ee56f82e0d120932e6d4ed96c304b82f28d8e4c63f040f279b04e

File details

Details for the file stringzilla-4.6.2-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 83bc6f26f02628c831ddfc547f1077c69a6fd7902231fea2aee8f93e9da8bd46
MD5 5c9707d9551bdf120fd69d48fdc73b13
BLAKE2b-256 0f5465df9199a334066cf9d8ead48573f80b6c3bc2cb7d2a0a0fce09f31b2860

File details

Details for the file stringzilla-4.6.2-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl
Algorithm Hash digest
SHA256 b307cd338cf719137db7efaa1cb3da3421d50ea26c61059424bac97ddd5138d5
MD5 e687e50190901aff9759eb3b3a90e72b
BLAKE2b-256 448d08a810a6b9a13f15a5885f290fb14ea6d75e783b95852c95c5a938016f25

File details

Details for the file stringzilla-4.6.2-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl
Algorithm Hash digest
SHA256 02a9979e6992ed6975ab6bc24d19148839a8cfad88ee828608296cef9002a141
MD5 d62fe7c6a8e377ebdf64bb2120c02f1a
BLAKE2b-256 5bf8869e728e232645c93817c4bdd794bcd75bc895334b110011e46c5cbd90e9

File details

Details for the file stringzilla-4.6.2-cp312-cp312-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl
Algorithm Hash digest
SHA256 3c60de67871c87a4aebcdca3f120d87252446cc4cd344156ae5391d17147bed2
MD5 cfe0c8e949f0c68d80cc5e2b31a53713
BLAKE2b-256 7383946c776fd76af08a20f21c4671bbc4cc5a8a0e9b3ba578b67e724a5639b1

File details

Details for the file stringzilla-4.6.2-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 a5737b993627d8994654dc8780b858ed610bd4ed6a9c1ed6bd9972c33b258b06
MD5 f34bcf60a9069857c66d0d10bfd7f341
BLAKE2b-256 254f80eda1295021033ad67b775ef7e4933b3b52f6f715fb464676cd0330bbdf

File details

Details for the file stringzilla-4.6.2-cp312-cp312-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl
Algorithm Hash digest
SHA256 b54e6be715a19faf407837bccf302082866134b09673d470c162cbbffb7b1570
MD5 59c3561b5aeaa1e6dc85c2db7ef90f38
BLAKE2b-256 0dc2b7764bb3dfb08f221f0b96267b2b72715489a165c89973bbce2edc2d8f97

File details

Details for the file stringzilla-4.6.2-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 dae0ce637947ab3bfbca8e26236de1149e2a181a7b7286a174936a986b370689
MD5 ab03bea2d966cedef7982090baf89b9d
BLAKE2b-256 882b62dd1a69ee3bb55ac507da01f4312ecba338e77e158e8b03c01eb79884fe

File details

Details for the file stringzilla-4.6.2-cp312-cp312-macosx_10_13_x86_64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp312-cp312-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 4fd0728a45baa861927abf37d365c656695c89b13a671108f34b7bdc88653c20
MD5 bfe021b0c19c80643820718e483e1d00
BLAKE2b-256 a42c71021ff18092efeef3bd9e9f75c04b2f5b648c66aa88f80981f12f1777f0

File details

Details for the file stringzilla-4.6.2-cp311-cp311-win_arm64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp311-cp311-win_arm64.whl
Algorithm Hash digest
SHA256 283ad47988be6f594f1d37be1b6031b5ad987b162e33115c2e9711d051b894bb
MD5 bd11d51a9ed333be33b3960194feb28e
BLAKE2b-256 864c8a33d0d924351419d4a9a909102373d73bdb2c1b12c179fddecaadbd8644

File details

Details for the file stringzilla-4.6.2-cp311-cp311-win_amd64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 795371c6d00e21e037bd338ed29a633fc22ff992b3385a359c55f597f9c4e4f8
MD5 46d98195475a24fe10c53c12c9c299da
BLAKE2b-256 5a4afde3ecc52dcb0f949087b6efeaa1ba636b19a9bc54d2e7c9167902b263f7

File details

Details for the file stringzilla-4.6.2-cp311-cp311-win32.whl.

File metadata

  • Download URL: stringzilla-4.6.2-cp311-cp311-win32.whl
  • Upload date:
  • Size: 114.4 kB
  • Tags: CPython 3.11, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for stringzilla-4.6.2-cp311-cp311-win32.whl
Algorithm Hash digest
SHA256 445b2fbe485fe4743725e6ff189ada43a73f73d9ea94bf4cfafb68b4ecbb5141
MD5 3459a3aa9fcf7853a12a3924dfcab1c8
BLAKE2b-256 ac121e4fb2fe9993e11ccbdbf887489439b02dff3f8e63ae13c0bcb87b662dee

File details

Details for the file stringzilla-4.6.2-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 0381e4e9d1081eb9de6242a9a98e14b884651e6cf9d0bd50eac30927699a5b72
MD5 b81a8b1523b394c5a6bf3d920d725999
BLAKE2b-256 43c433fd6afac7ffb6953b4cf86dbf91dfe8d5a110adb7b42eb59e6a16d74f55

File details

Details for the file stringzilla-4.6.2-cp311-cp311-macosx_10_13_x86_64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp311-cp311-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 56e8a999ee954a34fbe729523ddf8f3f1024180b444090c42d10d71db50449c4
MD5 ea2fe4a5641af5243d6bda706c3502a2
BLAKE2b-256 04f3193fa3a4fd79e2ff4bf1622f908c30088daf9b65c1b7344b090b7b86a6b9

File details

Details for the file stringzilla-4.6.2-cp310-cp310-win_arm64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-win_arm64.whl
Algorithm Hash digest
SHA256 fe6f9562d88ef8d812ab7ead2f5da59b5903ce5ef318e718ca96bedcb7492625
MD5 9bf3e6fef42a6fa7bd750e0b0328d966
BLAKE2b-256 d1e6b39ee8b58fcd376219afd2e390e0447299a56e26be038e0875555f56576b

File details

Details for the file stringzilla-4.6.2-cp310-cp310-win_amd64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 2d1eaf16aef37c86d0910fff169eb1af4e7b0fb3b80fa2ac88f70eb3fecd5f39
MD5 5e40ad59ac67a5119abff8f84c45e088
BLAKE2b-256 a86f069902e9679f17908ee49d5418e36b2a6dc6ac4f8caa33013fb679581e53

File details

Details for the file stringzilla-4.6.2-cp310-cp310-win32.whl.

File metadata

  • Download URL: stringzilla-4.6.2-cp310-cp310-win32.whl
  • Upload date:
  • Size: 114.4 kB
  • Tags: CPython 3.10, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-win32.whl
Algorithm Hash digest
SHA256 d8e47dc8acc0f0d8d031a8f69e9f798545465e200c2122d970181e741b8f8911
MD5 3d6e93f8de9c4637c02aeaa386848627
BLAKE2b-256 3ab221e2e5546d3a679bcf2440f1cb5d96fcfaac128bc459bab9d3916f6868e8

File details

Details for the file stringzilla-4.6.2-cp310-cp310-musllinux_1_2_x86_64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 25be2a763cc43b35be6d5717d239be4e9a8f05ed4e5a9fb907896f95ba8cb27c
MD5 753b9e9012de7a8c1ebc610a43d55f50
BLAKE2b-256 b5421fb69af5552d373d1e790dfd58031004bfaba406c8222b006d6d2229c0c6

File details

Details for the file stringzilla-4.6.2-cp310-cp310-musllinux_1_2_s390x.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-musllinux_1_2_s390x.whl
Algorithm Hash digest
SHA256 6af8df7d4d35c5b816e460841a4ca0e1e086e7acab0aa5b939de915b0112e43c
MD5 365e7c267b95cf4f1713e4e62ea27c21
BLAKE2b-256 90a059cd1122b3bdce491656bd330cabc189d514defc69a11a428008476c55dd

File details

Details for the file stringzilla-4.6.2-cp310-cp310-musllinux_1_2_riscv64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-musllinux_1_2_riscv64.whl
Algorithm Hash digest
SHA256 25cb1fc08837c555fe927228fef1b1217e45f703eb1b9785331c0192a2ea6629
MD5 8b4af9491b643b5237660e8d483fdc3e
BLAKE2b-256 0134d50a53251cc5f07f30a3ad32dac9c6abb6e279939a2767564e7b8fb61052

File details

Details for the file stringzilla-4.6.2-cp310-cp310-musllinux_1_2_ppc64le.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-musllinux_1_2_ppc64le.whl
Algorithm Hash digest
SHA256 eaa6b10fa206e2eafc1f9ca7e95c96b917f31f694497138e84e09eb6d2d813a1
MD5 57d52b9598f84453c38a00901b918456
BLAKE2b-256 3b643001c758b6cd74b4f55328b76b275bcb47c6a5fb79441e290b7d24da607a

File details

Details for the file stringzilla-4.6.2-cp310-cp310-musllinux_1_2_i686.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-musllinux_1_2_i686.whl
Algorithm Hash digest
SHA256 4a7901ba376efd0a43d353193fe64d55324d5c68407f16f17ab50ef80004c10e
MD5 2dc1930d3d5008aadb9fecca7ffe7c35
BLAKE2b-256 3c8ba7d3d4c8bca156a5791e145428a072440096d58b4a98b1c55c7e5f1db1fe

File details

Details for the file stringzilla-4.6.2-cp310-cp310-musllinux_1_2_armv7l.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-musllinux_1_2_armv7l.whl
Algorithm Hash digest
SHA256 0bef1c6c7e755b2ada69383e051abfbeceae22b87817ab4d73cf5ade73e8a9dd
MD5 a745fce1a748ac11d0949091584ddcc7
BLAKE2b-256 c74ead74bd7107530c3ac989ac1163e5fc5547e4b46a1ec5453277a73526774c

File details

Details for the file stringzilla-4.6.2-cp310-cp310-musllinux_1_2_aarch64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 f8b5ab3fb6dc5abcd78fbbac8d3be6f5c96d94c7c1983d279f9fedccbdda8571
MD5 aaa4af6af5af79b4d29714563ae2319d
BLAKE2b-256 13560063591f24abc6f9e69abde70b051ba589685dd51974028d093e3161d48b

File details

Details for the file stringzilla-4.6.2-cp310-cp310-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-manylinux_2_34_riscv64.manylinux_2_39_riscv64.whl
Algorithm Hash digest
SHA256 6f2377d628a08115342973b432c36ec121358748705b6fa726043a88f8e5bbc2
MD5 d02ee4d6b0c1982f87de8f2aec4aa6aa
BLAKE2b-256 6270d709c62ca69d310399d1a1cf61a049ffa3d2901eb03922a1430a08bcb691

File details

Details for the file stringzilla-4.6.2-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 133cf8ff87f4f5db54a12cf9c97b79c186e2b979cb97131a321720f13123a049
MD5 df18288db3a0bfb13486fe613c3db8cf
BLAKE2b-256 82d920f561d072ad6c242af8b426c306bd49df33494e550e53160486fd5364d8

File details

Details for the file stringzilla-4.6.2-cp310-cp310-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl
Algorithm Hash digest
SHA256 43cf9adc320ced59ed79ce43735173f2df1fdc8a7387e19d5e9e14d810d633d8
MD5 08bedd70f7539b404cc88408953c7eff
BLAKE2b-256 9bbb54f86db591a9e71b4fed4f7c7d0fba0bd4c030b0f47f6d0d027370c23a20

File details

Details for the file stringzilla-4.6.2-cp310-cp310-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl
Algorithm Hash digest
SHA256 a52dc2307c8136553db2bdd62eab70f7feaf9a8010dc52fd2612f63679d4210f
MD5 8050cc4d5f2e61845a942d127020dbd1
BLAKE2b-256 cce84411217539c15a41ebbe816e6c6a3df7b6fd18817e12fccd6f848e130af6

File details

Details for the file stringzilla-4.6.2-cp310-cp310-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-manylinux2014_armv7l.manylinux_2_17_armv7l.manylinux_2_31_armv7l.whl
Algorithm Hash digest
SHA256 53fb72393d920c700f93547f5dba8f7aca6ebe2a586e200a1e9ba365a6a0af2e
MD5 65a096bec08bfc96d6312e6e5f65e91b
BLAKE2b-256 695bcf96fd6b6863314820998184c8396f2834e3086b822c747617c21972dfc4

File details

Details for the file stringzilla-4.6.2-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 8f9c90d4f7f6bca00cc6fe613f6450fdbe9be7e522990cfc309adb50e95fec4c
MD5 5f9a9ee8c1abc4944edbaf31ae20e07e
BLAKE2b-256 82ec63e2e1557ce60e579858a22492b4077ddce3f23e2fc88c7e4e1f2c03b97f

File details

Details for the file stringzilla-4.6.2-cp310-cp310-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-manylinux1_i686.manylinux_2_28_i686.manylinux_2_5_i686.whl
Algorithm Hash digest
SHA256 ef9c472362458e8e0bd138d43018d55720fec26c22a46c8ad7a3cce6947e199c
MD5 32c9ac8c24a9e8922a348a6648c2894f
BLAKE2b-256 870389fc5e21d063b1991a75b8506aadf712e72a945c9a0ad1c879a61d0e1cf3

File details

Details for the file stringzilla-4.6.2-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 d9f5d7856a0f9c6371a6db104996aded2a0f23327101e4a5ce34d0ae22a7c81d
MD5 e73fd10d31be6c287ee9ed007d40a2f5
BLAKE2b-256 aca4520efe0720db1be83549c79b0411af980b1e39516a5c37689d33e1fa1781

File details

Details for the file stringzilla-4.6.2-cp310-cp310-macosx_10_13_x86_64.whl.

File metadata

File hashes

Hashes for stringzilla-4.6.2-cp310-cp310-macosx_10_13_x86_64.whl
Algorithm Hash digest
SHA256 05692b0e808371848afeef774792df8019ae3b3721ba24ecbdaff059ff5116e1
MD5 666a81be82f67898f6bfb906f7749788
BLAKE2b-256 d81bc57c344ee4846225df15dd8b1092e3deea3d0007c85af430384dafd97955
