High-performance n-gram generation for Polars
ngram_polars - N-Gram Generation for Polars
A high-performance Polars plugin for generating n-grams from text data in Python.
Installation
pip install ngram-polars
Basic example
import polars as pl
from ngram_polars import ngrams
df = pl.DataFrame({
    "id": [1, 2],
    "words": [
        ["the", "quick", "brown", "fox"],
        ["hello", "world"]
    ]
})
# Generate bigrams
result = df.with_columns(
    bigrams=ngrams(pl.col("words"), n_range=[2])
)
More advanced examples
# Multiple n-gram sizes
df.with_columns(
    multi_ngrams=ngrams(pl.col("words"), n_range=[1, 2, 3])
)
# Custom delimiter
df.with_columns(
    underscored=ngrams(pl.col("words"), n_range=[2], delimiter="_")
)
# Lazy evaluation
(
    df.lazy()
    .with_columns(
        ngrams=ngrams(pl.col("words"), n_range=[2, 3])
    )
    .collect()
)
API Reference
ngrams(expr, n_range, delimiter)
Generate n-grams from a list of strings.
Parameters:
- expr: IntoExpr - Polars expression representing a list of strings
- n_range: list[int] - List of n-gram sizes to generate (default: [1])
- delimiter: str - String delimiter between words (default: " ")
Returns:
- pl.Expr - Expression that generates lists of n-gram strings
Behavior:
- Returns a new list column containing all generated n-grams
- Works element-wise on list columns
- Changes the length of the output (each input list produces a new list of n-grams)
- Supports both eager and lazy evaluation
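To make the behavior above concrete, here is a minimal pure-Python sketch of what generating n-grams from a single list of words means. It is not the plugin's implementation. The name `ngrams_reference` is hypothetical, and both the output ordering (grouped by size, then by position) and the handling of lists shorter than `n` (no n-grams emitted) are assumptions that may differ from what `ngrams` actually returns.

```python
from typing import Iterable, List


def ngrams_reference(
    words: List[str],
    n_range: Iterable[int] = (1,),
    delimiter: str = " ",
) -> List[str]:
    """Illustrative n-gram generation for one list of words (hypothetical helper)."""
    out: List[str] = []
    for n in n_range:
        if n < 1:
            raise ValueError("n-gram sizes must be positive")
        # Slide a window of width n across the word list.
        # Assumption: lists shorter than n contribute nothing for that size.
        for start in range(len(words) - n + 1):
            out.append(delimiter.join(words[start:start + n]))
    return out


print(ngrams_reference(["the", "quick", "brown", "fox"], n_range=[2]))
# ['the quick', 'quick brown', 'brown fox']
print(ngrams_reference(["hello", "world"], n_range=[2, 3], delimiter="_"))
# ['hello_world']
```

A reference like this can be handy for spot-checking the plugin's output on a few rows, provided the ordering assumptions are verified first.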
Performance Tips
- Use Lazy Evaluation: For large datasets, use lazy evaluation to optimize query planning
- Batch N-Gram Sizes: Generate multiple n-gram sizes in one call when possible
- Choose Appropriate N-Range: Only generate the n-gram sizes you actually need
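As a rough illustration of why the choice of sizes matters, the sketch below counts how many n-grams a plain sliding window produces for one list. It assumes every contiguous run of `n` words is emitted, which is the usual definition but is not confirmed for this plugin. `ngram_count` is a hypothetical helper, not part of ngram_polars.

```python
from typing import Iterable


def ngram_count(num_words: int, n_range: Iterable[int]) -> int:
    """Count n-grams from a sliding window over num_words words (assumed semantics)."""
    # A list of L words yields L - n + 1 n-grams of size n, or none if L < n.
    return sum(max(num_words - n + 1, 0) for n in n_range)


# A 100-word document: bigrams only versus sizes 1 through 5.
print(ngram_count(100, [2]))              # 99
print(ngram_count(100, [1, 2, 3, 4, 5]))  # 490
```

Under these assumptions, each additional size adds roughly one more string per word, so output size grows with the number of sizes requested.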
Requirements
- Python 3.9 to 3.13
- Polars version requirements are not fully tested; the plugin has only been tested against the latest Polars release.
- Compatible with both eager and lazy Polars APIs