Advanced Python chunking with stride, filtering, and expressions
smartchunks
smartchunks is a flexible chunking utility for Python that goes beyond plain slicing: it supports pattern-based, overlapping, and conditional chunking.
✨ Features
- ✅ Standard fixed-size chunking
- ✅ nth_position: Take every nth item before or after chunking
- ✅ chunk_position: Pick every nth chunk
- ✅ stride: Create overlapping chunks (sliding window)
- ✅ fillvalue: Pad final incomplete chunks
- ✅ filter_fn: Apply a custom condition to keep or discard chunks
- ✅ materialize: Toggle between generator and list output
📦 Installation
pip install smartchunks
🧪 Usage Examples
from smartchunks import chunked
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
🔹 Standard Chunking
print(list(chunked(data, size=3)))
# → [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
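For readers who want to see what fixed-size chunking amounts to, here is a minimal standard-library sketch. It is not the smartchunks implementation; the helper name `fixed_chunks` is hypothetical and exists only for illustration.

```python
from itertools import islice


def fixed_chunks(iterable, size):
    """Yield consecutive lists of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    # Pull `size` items at a time until the iterator is exhausted.
    while chunk := list(islice(iterator, size)):
        yield chunk


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(list(fixed_chunks(data, 3)))
# → [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Because the helper works on an iterator, it also accepts inputs that have no length, such as generators or file objects.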
🔹 nth_position (Before Chunking)
print(list(chunked(data, size=2, nth_position=2)))
# → [[1, 3], [5, 7], [9]]
🔹 chunk_position (Every nth Chunk)
print(list(chunked(data, size=2, chunk_position=2)))
# → [[3, 4], [7, 8]]
🔹 nth_position After Chunking
print(list(chunked(data, size=3, nth_position=2, apply_nth_before_chunk=False)))
# → [[2], [5], [8]]
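The difference between applying nth_position before and after chunking can be sketched in plain Python. The two helpers below are hypothetical and only mirror the outputs shown in the examples above; the slicing offsets are inferred from those outputs, not taken from the smartchunks source.

```python
def nth_before(items, size, n):
    """Keep every nth item (starting with the first), then chunk the result."""
    kept = list(items)[::n]
    return [kept[i:i + size] for i in range(0, len(kept), size)]


def nth_after(items, size, n):
    """Chunk first, then keep every nth element inside each chunk."""
    items = list(items)
    chunks = [items[i:i + size] for i in range(0, len(items), size)]
    return [chunk[n - 1::n] for chunk in chunks]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(nth_before(data, size=2, n=2))
# → [[1, 3], [5, 7], [9]]
print(nth_after(data, size=3, n=2))
# → [[2], [5], [8]]
```

In this sketch the "before" mode thins the input and so changes which items end up together, while the "after" mode leaves chunk boundaries alone and only thins each chunk.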
🔹 Overlapping chunks (stride)
print(list(chunked(data, size=3, stride=1)))
# → [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7], [6, 7, 8], [7, 8, 9]]
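As a rough illustration of the sliding-window idea, the sketch below builds overlapping windows with list slicing. The function name `windows` is hypothetical, and the sketch assumes that only full windows are kept; whether smartchunks also emits trailing partial windows is not confirmed here.

```python
def windows(items, size, stride):
    """Return full windows of `size` items, advancing `stride` items per step."""
    if size < 1 or stride < 1:
        raise ValueError("size and stride must be at least 1")
    items = list(items)
    # Assumption: stop at the last start index that still yields a full window.
    last_start = len(items) - size
    return [items[i:i + size] for i in range(0, last_start + 1, stride)]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(windows(data, size=3, stride=1))
# → [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7], [6, 7, 8], [7, 8, 9]]
print(windows(data, size=3, stride=2))
# → [[1, 2, 3], [3, 4, 5], [5, 6, 7], [7, 8, 9]]
```

A stride smaller than the size produces overlap; a stride equal to the size gives ordinary non-overlapping chunks.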
🔹 Padding incomplete chunks
print(list(chunked([1, 2, 3, 4, 5], size=3, fillvalue=0)))
# → [[1, 2, 3], [4, 5, 0]]
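Padding can be pictured as topping up the last chunk until it reaches the requested size. The sketch below is a standard-library illustration with a hypothetical helper name, `padded_chunks`; it is not taken from the smartchunks source.

```python
def padded_chunks(items, size, fillvalue):
    """Chunk `items` and pad the final chunk with `fillvalue` if it is short."""
    items = list(items)
    chunks = [items[i:i + size] for i in range(0, len(items), size)]
    if chunks and len(chunks[-1]) < size:
        missing = size - len(chunks[-1])
        chunks[-1] = chunks[-1] + [fillvalue] * missing
    return chunks


print(padded_chunks([1, 2, 3, 4, 5], size=3, fillvalue=0))
# → [[1, 2, 3], [4, 5, 0]]
```

Padding is useful when downstream code expects every chunk to have the same length, for example when stacking chunks into a fixed-width table.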
🔹 Filtering chunks (keep those whose sum > 15)
print(list(chunked(data, size=3, stride=1, filter_fn=lambda x: sum(x) > 15)))
# → [[5, 6, 7], [6, 7, 8], [7, 8, 9], [8, 9]]
🔹 Materialized output
chunks = chunked(data, size=2)
print(next(chunks)) # Lazy generator by default
chunks = chunked(data, size=2, materialize=True)
print(chunks) # Fully materialized list of chunks
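The practical difference between lazy and materialized output can be shown without smartchunks at all. The sketch below uses a plain generator expression and a list comprehension as stand-ins; the variable names are illustrative only.

```python
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Lazy: chunks are produced one at a time, and only once.
lazy = (data[i:i + 2] for i in range(0, len(data), 2))
first = next(lazy)
remaining = list(lazy)
exhausted = list(lazy)  # a consumed generator yields nothing more

# Materialized: all chunks exist up front and support len() and indexing.
materialized = [data[i:i + 2] for i in range(0, len(data), 2)]

print(first)              # → [1, 2]
print(remaining)          # → [[3, 4], [5, 6], [7, 8], [9]]
print(exhausted)          # → []
print(len(materialized))  # → 5
print(materialized[-1])   # → [9]
```

A generator keeps memory use low for large inputs, while a list is the better fit when the chunks need to be counted, indexed, or iterated more than once.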
🧠 Advanced Combinations
print(list(chunked(data, size=2, stride=2, chunk_position=2, nth_position=2, apply_nth_before_chunk=False)))
# Apply chunking with stride, select every 2nd chunk, then take the 2nd element from each
📜 License
MIT License © 2024 Maurya Allimuthu
🧩 Special Examples
🔹 nth_position then chunk_position
# Pick every 2nd element first, then every 2nd chunk
print(list(chunked(data, size=2, nth_position=2, chunk_position=2, apply_nth_before_chunk=True)))
# → [[5, 7]]
🔹 chunk_position then nth_position
# Chunk normally, keep every 2nd chunk, then take 2nd element from each
print(list(chunked(data, size=2, chunk_position=2, nth_position=2, apply_nth_before_chunk=False)))
# → [[4], [8]]
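The two orderings above can be modelled as a small pipeline of slicing steps. This is a sketch under stated assumptions: the function `select` and its parameter `nth_before_chunk` are hypothetical, and the offsets are chosen to reproduce the two outputs documented above rather than copied from the smartchunks source.

```python
def select(items, size, nth_position=1, chunk_position=1, nth_before_chunk=True):
    """Chunk `items`, applying element and chunk selection in the chosen order."""
    items = list(items)
    if nth_before_chunk:
        # Thin the input first: keep every nth item, starting with the first.
        items = items[::nth_position]
    chunks = [items[i:i + size] for i in range(0, len(items), size)]
    # Keep every nth chunk (the 2nd, 4th, ... when chunk_position is 2).
    chunks = chunks[chunk_position - 1::chunk_position]
    if not nth_before_chunk:
        # Thin each surviving chunk instead of the input.
        chunks = [chunk[nth_position - 1::nth_position] for chunk in chunks]
    return chunks


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(select(data, size=2, nth_position=2, chunk_position=2, nth_before_chunk=True))
# → [[5, 7]]
print(select(data, size=2, nth_position=2, chunk_position=2, nth_before_chunk=False))
# → [[4], [8]]
```

The same arguments give different results because the first ordering changes which items are grouped together, whereas the second only trims chunks that were already formed.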
Download files
Source Distribution
smartchunks-0.1.1.tar.gz (3.5 kB)
Built Distribution
smartchunks-0.1.1-py3-none-any.whl (3.7 kB)
File details
Details for the file smartchunks-0.1.1.tar.gz.
File metadata
- Download URL: smartchunks-0.1.1.tar.gz
- Upload date:
- Size: 3.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e69f61897809f7eacd065eab55d09a9bbd611aaff502d945a5a00ffce0abd94f |
| MD5 | c29d79287253117cfa6b625d5afe796b |
| BLAKE2b-256 | e4a9f3084ceea0d431c44c82109360e543b8538e3231b8fba3c4318f07ae3e9c |
File details
Details for the file smartchunks-0.1.1-py3-none-any.whl.
File metadata
- Download URL: smartchunks-0.1.1-py3-none-any.whl
- Upload date:
- Size: 3.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2a703b70f2fb997b5efe63aba6118b6c62566adbc9d836bb926eac48f99da3ef |
| MD5 | 6ee7ede32cd4cacf9074fe391fbdc7b1 |
| BLAKE2b-256 | 6f1f0130353aa119fe271ea1451d010c2f239ed00748ad012b4f48ddccbea7a9 |