The fastest semantic text chunking library

chonkie-core

the fastest text chunking library — up to 1 TB/s throughput


you know how every chunking library claims to be fast? yeah, we actually meant it.

chonkie-core splits text at semantic boundaries (periods, newlines, the usual suspects) and does it stupid fast. we're talking "chunk the entire english wikipedia in 120ms" fast.

want to know how? read the blog post where we nerd out about SIMD instructions and lookup tables.
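for intuition only, here is a plain-python sketch of size-bounded chunking at delimiter boundaries. it is not the library's implementation (no SIMD here, and it will be far slower); `make_table` and `chunk_bytes` are hypothetical names invented for this example, and the 256-entry table is just one simple way to test "is this byte a delimiter?" in constant time.

```python
def make_table(delimiters: bytes) -> list[bool]:
    """Build a 256-entry table: table[b] is True when byte b is a delimiter."""
    table = [False] * 256
    for b in delimiters:
        table[b] = True
    return table


def chunk_bytes(data: bytes, size: int = 4096, delimiters: bytes = b"\n.?"):
    """Yield memoryview slices of at most `size` bytes, preferring delimiter boundaries."""
    if size <= 0:
        raise ValueError("size must be positive")
    table = make_table(delimiters)
    view = memoryview(data)
    start = 0
    while start < len(data):
        end = min(start + size, len(data))
        if end < len(data):
            # scan backward from the size limit for the last delimiter in the window
            cut = end
            while cut > start and not table[data[cut - 1]]:
                cut -= 1
            if cut > start:
                end = cut  # split just after the delimiter
            # otherwise no delimiter was found: fall back to a hard cut at `size`
        yield view[start:end]
        start = end


text = b"Hello world. How are you? I'm fine.\nThanks for asking."
chunks = [bytes(c) for c in chunk_bytes(text, size=20)]
# the chunks always concatenate back to the input, and none exceeds `size`
assert b"".join(chunks) == text
assert all(len(c) <= 20 for c in chunks)
```

the backward scan is what keeps each chunk ending on a sentence or line boundary whenever one exists inside the window.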

📦 installation

pip install chonkie-core

looking for rust or javascript?

🚀 usage

from chonkie_core import Chunker

text = "Hello world. How are you? I'm fine.\nThanks for asking."

# with defaults (4KB chunks, split at \n . ?)
for chunk in Chunker(text):
    print(bytes(chunk))

# with custom size
for chunk in Chunker(text, size=1024):
    print(bytes(chunk))

# with custom delimiters
for chunk in Chunker(text, delimiters=".?!\n"):
    print(bytes(chunk))

# with multi-byte pattern (e.g., metaspace ▁ for SentencePiece tokenizers)
for chunk in Chunker(text, pattern="▁", prefix=True):
    print(bytes(chunk))

# with consecutive pattern handling (split at START of runs, not middle)
for chunk in Chunker("word   next", pattern=" ", consecutive=True):
    print(bytes(chunk))

# with forward fallback (search forward if no pattern in backward window)
for chunk in Chunker(text, pattern=" ", forward_fallback=True):
    print(bytes(chunk))

# collect all chunks
chunks = list(Chunker(text))
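
the `prefix` and `consecutive` options above are easier to picture with a small stand-alone sketch. this is one reading of the comments in the usage examples, not the library's code: `split_point` is a hypothetical helper, and how the real chunker combines these flags (or handles edge cases such as a delimiter at the very start of the window) is an assumption here.

```python
def split_point(window: bytes, pattern: bytes,
                prefix: bool = False, consecutive: bool = False) -> int:
    """Return the index at which to cut `window`, or -1 if `pattern` is absent."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    pos = window.rfind(pattern)  # last occurrence, i.e. a backward search
    if pos == -1:
        return -1
    if consecutive:
        # walk back to the first occurrence in a run of repeated patterns
        n = len(pattern)
        while pos >= n and window[pos - n:pos] == pattern:
            pos -= n
    # prefix=True: the delimiter starts the next chunk; otherwise it ends this one
    return pos if prefix else pos + len(pattern)


window = b"word   next"
cut = split_point(window, b" ", prefix=True, consecutive=True)
print(window[:cut], window[cut:])  # b'word' b'   next'

# multi-byte patterns work the same way on the UTF-8 bytes
meta = "Hello▁world".encode("utf-8")
cut = split_point(meta, "▁".encode("utf-8"), prefix=True)
print(meta[:cut].decode("utf-8"))  # Hello
```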

chunks are returned as memoryview objects (zero-copy slices of the original text). wrap a chunk in bytes(), as the examples above do, when you need an owned copy.
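
if memoryview is new to you, the standard-library snippet below shows what "zero-copy" means in practice. it does not use chonkie-core at all; it assumes the chunks are views over UTF-8 encoded bytes, which is an assumption about the library rather than something stated above.

```python
data = "Hello world. How are you?".encode("utf-8")
view = memoryview(data)

chunk = view[:12]          # slicing a memoryview does not copy the bytes
assert chunk.obj is data   # the slice still refers to the original buffer

as_bytes = bytes(chunk)    # explicit copy into a new bytes object
as_text = as_bytes.decode("utf-8")
print(as_text)  # Hello world.
```

decoding only works if the chunk ends on a character boundary. a cut right after an ASCII delimiter such as "." or "\n" always does, but a hard cut in the middle of non-ASCII text may not.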

📝 citation

if you use chonkie-core in your research, please cite it as follows:

@software{chunk2025,
  author = {Minhas, Bhavnick},
  title = {chunk: The fastest text chunking library},
  year = {2025},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/chonkie-inc/chunk}},
}

📄 license

licensed under either of Apache License, Version 2.0 or MIT license at your option.
