Advanced Python chunking with stride, filtering, and expressions

smartchunks

smartchunks is a chunking utility for Python that goes beyond fixed-size slicing: it supports selecting items or chunks by position, overlapping (strided) chunks, padding, and conditional filtering of chunks.


✨ Features

  • ✅ Standard fixed-size chunking
  • nth_position: Take every nth item, either from the input before chunking or from each chunk afterwards (see apply_nth_before_chunk)
  • chunk_position: Pick every nth chunk
  • stride: Create overlapping chunks (sliding window)
  • fillvalue: Pad final incomplete chunks
  • filter_fn: Apply a custom condition to keep or discard chunks
  • materialize: Toggle between generator and list output

📦 Installation

pip install smartchunks

🧪 Usage Examples

from smartchunks import chunked

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

🔹 Standard Chunking

print(list(chunked(data, size=3)))
# → [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

🔹 nth_position (Before Chunking)

print(list(chunked(data, size=2, nth_position=2)))
# → [[1, 3], [5, 7], [9]]
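
The output above is consistent with keeping every second item, counted from the first, and then chunking what remains. A plain-Python sketch of that reading follows; the semantics are inferred from the example output, not taken from the smartchunks source.

```python
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Assumed semantics: keep the items at indices 0, 2, 4, ... (every 2nd item).
selected = data[::2]

# Chunk the surviving items into lists of at most `size` elements.
size = 2
result = [selected[i:i + size] for i in range(0, len(selected), size)]
print(result)  # [[1, 3], [5, 7], [9]]
```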

🔹 chunk_position (Every nth Chunk)

print(list(chunked(data, size=2, chunk_position=2)))
# → [[3, 4], [7, 8]]
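
For comparison, the same selection can be written with list slicing. This sketch assumes that chunk_position=2 keeps the 2nd, 4th, ... chunk (counted from 1), which is what the output above suggests.

```python
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
size, n = 2, 2

# Fixed-size chunks: [[1, 2], [3, 4], [5, 6], [7, 8], [9]]
chunks = [data[i:i + size] for i in range(0, len(data), size)]

# Assumed semantics: keep chunk number n, 2n, 3n, ... (1-based).
result = chunks[n - 1::n]
print(result)  # [[3, 4], [7, 8]]
```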

🔹 nth_position After Chunking

print(list(chunked(data, size=3, nth_position=2, apply_nth_before_chunk=False)))
# → [[2], [5], [8]]
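
A plain-Python sketch of the after-chunking behaviour, assuming nth_position=2 then means "the 2nd element of each chunk", counted from 1. Dropping chunks that are too short is a further assumption of this sketch; the example above does not exercise that case.

```python
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
size, n = 3, 2

# Fixed-size chunks: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
chunks = [data[i:i + size] for i in range(0, len(data), size)]

# Keep the n-th element (1-based) of each chunk, wrapped in a list.
# Chunks shorter than n are skipped (assumption).
result = [[chunk[n - 1]] for chunk in chunks if len(chunk) >= n]
print(result)  # [[2], [5], [8]]
```

Note that in the before-chunking example, nth_position=2 keeps the 1st, 3rd, 5th, ... items, whereas here position 2 is counted from 1 within each chunk.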

🔹 Overlapping chunks (stride)

print(list(chunked(data, size=3, stride=1)))
# → [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7], [6, 7, 8], [7, 8, 9]]
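
A sliding window like this can be sketched in a few lines of plain Python. The helper name sliding_windows is hypothetical and this is not the library's implementation. It emits full windows only, which matches the output shown above.

```python
def sliding_windows(seq, size, stride=1):
    """Yield every full window of `size` items, advancing `stride` items each time."""
    # The last valid start index is len(seq) - size, so short tails are never emitted.
    for start in range(0, len(seq) - size + 1, stride):
        yield list(seq[start:start + size])

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
result = list(sliding_windows(data, size=3, stride=1))
print(result[0], result[-1], len(result))  # [1, 2, 3] [7, 8, 9] 7
```

With stride equal to size the windows no longer overlap and the helper reduces to fixed-size chunking of the full-length chunks.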

🔹 Padding incomplete chunks

print(list(chunked([1, 2, 3, 4, 5], size=3, fillvalue=0)))
# → [[1, 2, 3], [4, 5, 0]]
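
The same padding behaviour is available in the standard library through itertools.zip_longest. The sketch below uses the well-known "grouper" idiom; padded_chunks is a hypothetical helper name, shown for comparison rather than as how smartchunks does it.

```python
from itertools import zip_longest

def padded_chunks(iterable, size, fillvalue=None):
    """Return fixed-size chunks, padding the last one with `fillvalue`."""
    # The same iterator repeated `size` times: zip_longest pulls `size`
    # consecutive items per output tuple and pads once the input runs out.
    iters = [iter(iterable)] * size
    return [list(group) for group in zip_longest(*iters, fillvalue=fillvalue)]

result = padded_chunks([1, 2, 3, 4, 5], 3, fillvalue=0)
print(result)  # [[1, 2, 3], [4, 5, 0]]
```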

🔹 Filtering chunks (keep those whose sum > 15)

print(list(chunked(data, size=3, stride=1, filter_fn=lambda x: sum(x) > 15)))
# → [[5, 6, 7], [6, 7, 8], [7, 8, 9], [8, 9]]
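
The trailing [8, 9] in the output above implies that short windows at the end of the input are also passed to the filter. The sketch below assumes exactly that; windows is a hypothetical helper and the behaviour is inferred from the example, not from the smartchunks source.

```python
def windows(seq, size, stride):
    """Yield a window at every `stride`-th start index, including short ones at the end."""
    for start in range(0, len(seq), stride):
        yield seq[start:start + size]

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Keep only the windows whose sum exceeds 15.
# [4, 5, 6] sums to exactly 15 and is dropped; [9] sums to 9 and is dropped.
kept = [w for w in windows(data, size=3, stride=1) if sum(w) > 15]
print(kept)  # [[5, 6, 7], [6, 7, 8], [7, 8, 9], [8, 9]]
```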

🔹 Materialized output

chunks = chunked(data, size=2)
print(next(chunks))  # Lazy generator by default

chunks = chunked(data, size=2, materialize=True)
print(chunks)  # Fully materialized list of chunks
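
The practical difference between the two modes is the usual generator-versus-list trade-off, illustrated here with a hypothetical stand-in generator (lazy_chunks is not part of smartchunks): a generator produces chunks on demand and can be consumed only once, while a list holds every chunk in memory and can be indexed and reused.

```python
from itertools import islice
from types import GeneratorType

def lazy_chunks(iterable, size):
    """Generator yielding lists of at most `size` items, one chunk at a time."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

lazy = lazy_chunks(data, 2)
first = next(lazy)       # only the first chunk has been computed so far
remaining = list(lazy)   # consuming the rest exhausts the generator

eager = list(lazy_chunks(data, 2))  # all chunks at once; supports len() and indexing
print(first, len(remaining), eager[-1])  # [1, 2] 4 [9]
```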

🧠 Advanced Combinations

# Chunk with stride 2, keep every 2nd chunk, then take the 2nd element of each kept chunk
print(list(chunked(data, size=2, stride=2, chunk_position=2, nth_position=2, apply_nth_before_chunk=False)))

📜 License

MIT License © 2024 Maurya Allimuthu


🧩 Special Examples

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

🔹 nth_position then chunk_position

# Pick every 2nd element first, then every 2nd chunk
print(list(chunked(data, size=2, nth_position=2, chunk_position=2, apply_nth_before_chunk=True)))
# → [[5, 7]]

🔹 chunk_position then nth_position

# Chunk normally, keep every 2nd chunk, then take 2nd element from each
print(list(chunked(data, size=2, chunk_position=2, nth_position=2, apply_nth_before_chunk=False)))
# → [[4], [8]]
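
The two orderings give different results because the selections do not commute. The sketch below composes small hypothetical helpers in both orders; their semantics are inferred from the example outputs in this README, not from the smartchunks source.

```python
def chunk(seq, size):
    """Split `seq` into consecutive lists of at most `size` items."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]

def every_nth_item(seq, n):
    """Keep items at indices 0, n, 2n, ... (assumed pre-chunking semantics)."""
    return seq[::n]

def every_nth_chunk(chunks, n):
    """Keep chunk number n, 2n, ... counted from 1 (assumed semantics)."""
    return chunks[n - 1::n]

def nth_of_each(chunks, n):
    """Keep the n-th element (1-based) of each chunk that is long enough."""
    return [[c[n - 1]] for c in chunks if len(c) >= n]

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Select items first, then chunk, then pick every 2nd chunk.
items_first = every_nth_chunk(chunk(every_nth_item(data, 2), 2), 2)

# Chunk first, pick every 2nd chunk, then take the 2nd element of each.
chunks_first = nth_of_each(every_nth_chunk(chunk(data, 2), 2), 2)

print(items_first)   # [[5, 7]]
print(chunks_first)  # [[4], [8]]
```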
