Crunch 100+ GB Strings in Python with ease
Stringzilla
Crunch 100+ GB Strings in Python with ease, leveraging SIMD Assembly
Stringzilla was born many years ago as a tutorial for SIMD-accelerated string processing.
But one day, while processing 100+ GB chemistry and AI datasets, I decided to turn it into a library.
It's designed to replace open(...).readlines(), str().splitlines(), and many other common workloads with very long strings.
Usage
pip install stringzilla
There are two classes you can use interchangeably:
from stringzilla import Str, File, Slices
text: str = 'some-string'
text: Str = Str('some-string')
text: File = File('some-file.txt')
Once constructed, the following interfaces are supported:
len(text) -> int
'substring' in text -> bool
text[42] -> str
text.contains(
'substring',
start=0, # optional
end=9223372036854775807, # optional
) -> bool
text.find(
'substring',
start=0, # optional
end=9223372036854775807, # optional
) -> int
text.count(
'substring',
start=0, # optional
end=9223372036854775807, # optional
*, # non-traditional arguments:
allowoverlap=False, # optional
) -> int
text.splitlines(
keeplinebreaks=False, # optional
*, # non-traditional arguments:
separator='\n', # optional
) -> Slices # similar to list[str]
text.split(
separator=' ', # optional
maxsplit=9223372036854775807, # optional
*, # non-traditional arguments:
keepseparator=False, # optional
) -> Slices # similar to list[str]
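The non-traditional arguments in the listing above are easiest to read against plain Python. The sketch below is a pure-Python reference for what `allowoverlap` and `keepseparator` are assumed to mean. It is not Stringzilla's implementation, and the helper names `count_occurrences` and `split_keeping` are hypothetical. In particular, it assumes that `keepseparator=True` leaves the separator attached to the end of the piece it terminates, which the listing does not state.

```python
import sys
from typing import List


def count_occurrences(haystack: str, needle: str, start: int = 0,
                      end: int = sys.maxsize, *,
                      allowoverlap: bool = False) -> int:
    """Count matches of `needle` within haystack[start:end] (hypothetical helper)."""
    if not needle:
        raise ValueError("needle must be non-empty")
    count = 0
    position = haystack.find(needle, start, end)
    while position != -1:
        count += 1
        # Overlapping matches advance by one character;
        # disjoint matches skip past the whole needle.
        step = 1 if allowoverlap else len(needle)
        position = haystack.find(needle, position + step, end)
    return count


def split_keeping(text: str, separator: str = " ",
                  maxsplit: int = sys.maxsize, *,
                  keepseparator: bool = False) -> List[str]:
    """Split on `separator`, optionally keeping it on each piece (assumed semantics)."""
    if not separator:
        raise ValueError("separator must be non-empty")
    pieces: List[str] = []
    cursor = 0
    while len(pieces) < maxsplit:
        position = text.find(separator, cursor)
        if position == -1:
            break
        piece_end = position + len(separator) if keepseparator else position
        pieces.append(text[cursor:piece_end])
        cursor = position + len(separator)
    pieces.append(text[cursor:])  # remainder after the last split
    return pieces


print(count_occurrences("aaaa", "aa"))                     # 2
print(count_occurrences("aaaa", "aa", allowoverlap=True))  # 3
print(split_keeping("a b c", keepseparator=True))          # ['a ', 'b ', 'c']
```

With the defaults, both helpers agree with the built-in `str.count` and `str.split(separator)`, which is the behaviour a drop-in replacement would be expected to preserve.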
Development
rm -rf build && pip install -e . && pytest scripts/test.py -s -x
To benchmark on some custom file and pattern combination:
python scripts/bench.py --path "your file" --pattern "your pattern"
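For a quick baseline to compare against, a timing harness over the built-in `str` operations can be sketched as follows. This is not `scripts/bench.py`. The function name `time_search` and its parameters are hypothetical, and only the standard library is used, so the numbers it prints say nothing about Stringzilla itself.

```python
import timeit
from typing import Dict


def time_search(haystack: str, pattern: str, repeats: int = 5) -> Dict[str, float]:
    """Return the best-of-`repeats` wall time, in seconds, per built-in operation."""
    operations = {
        "find": lambda: haystack.find(pattern),
        "count": lambda: haystack.count(pattern),
        "split": lambda: haystack.split(pattern),
    }
    # Each operation runs once per repetition; the minimum is the least noisy sample.
    return {
        name: min(timeit.repeat(operation, number=1, repeat=repeats))
        for name, operation in operations.items()
    }


if __name__ == "__main__":
    # Synthetic haystack; swap in the contents of your own file and pattern.
    sample = "lorem ipsum dolor sit amet " * 100_000
    for name, seconds in time_search(sample, "dolor").items():
        print(f"str.{name}: {seconds * 1e3:.3f} ms")
```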
To validate packaging:
cibuildwheel --platform linux