High-performance n-gram generation for Polars

ngram_polars - N-Gram Generation for Polars

A high-performance Polars plugin for generating n-grams from text data in Python.

Installation

pip install ngram-polars

Basic example

import polars as pl
from ngram_polars import ngrams

df = pl.DataFrame({
    "id": [1, 2],
    "words": [
        ["the", "quick", "brown", "fox"],
        ["hello", "world"]
    ]
})

# Generate bigrams
result = df.with_columns(
    bigrams=ngrams(pl.col("words"), n_range=[2])
)
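To make the expected shape of the result concrete, here is a minimal pure-Python sketch of sliding-window bigram generation. It does not call ngram_polars: `sliding_ngrams` is a hypothetical helper, and it assumes that adjacent words are joined with a single space, which matches the documented default delimiter. The plugin's real output has not been verified here.

```python
def sliding_ngrams(words, n, delimiter=" "):
    """Return every run of n consecutive words, joined by delimiter."""
    # A list with fewer than n words yields an empty list.
    return [delimiter.join(words[i:i + n]) for i in range(len(words) - n + 1)]


rows = [["the", "quick", "brown", "fox"], ["hello", "world"]]
bigrams = [sliding_ngrams(row, 2) for row in rows]
print(bigrams)
# [['the quick', 'quick brown', 'brown fox'], ['hello world']]
```

Each row keeps its position; only the contents of the inner list change.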

More advanced examples

# Multiple n-gram sizes
df.with_columns(
    multi_ngrams=ngrams(pl.col("words"), n_range=[1, 2, 3])
)

# Custom delimiter
df.with_columns(
    underscored=ngrams(pl.col("words"), n_range=[2], delimiter="_")
)

# Lazy evaluation
(df.lazy()
   .with_columns(
       ngrams=ngrams(pl.col("words"), n_range=[2, 3])
   )
   .collect()
)
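The following pure-Python sketch illustrates what requesting several sizes and a custom delimiter could produce for a single list. `ngrams_for_sizes` is a hypothetical reference helper, not part of ngram_polars, and the ordering it uses (all n-grams of the first size, then the next size) is an assumption; the plugin may order its output differently.

```python
def ngrams_for_sizes(words, n_range, delimiter=" "):
    """Collect the n-grams of every requested size into one flat list."""
    result = []
    for n in n_range:  # assumption: sizes are emitted in the order given
        for start in range(len(words) - n + 1):
            result.append(delimiter.join(words[start:start + n]))
    return result


words = ["the", "quick", "brown", "fox"]
multi = ngrams_for_sizes(words, [1, 2, 3])
underscored = ngrams_for_sizes(words, [2], delimiter="_")
print(multi)
print(underscored)
# ['the_quick', 'quick_brown', 'brown_fox']
```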

API Reference

ngrams(expr, n_range, delimiter)

Generate n-grams from a list of strings.

Parameters:

  • expr: IntoExpr - Polars expression representing a list of strings
  • n_range: list[int] - List of n-gram sizes to generate (default: [1])
  • delimiter: str - String delimiter between words (default: " ")

Returns:

  • pl.Expr - Expression that generates lists of n-gram strings

Behavior:

  • Returns a new list column containing all generated n-grams
  • Works element-wise on list columns
  • Changes the length of each list, not the number of rows (each input list produces a new list of n-grams, which usually has a different number of elements)
  • Supports both eager and lazy evaluation
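The element-wise behavior can be pictured with a small pure-Python model of a list column. This is a sketch only: `map_ngrams_over_rows` is a hypothetical name, and returning an empty list for rows shorter than n is an assumption about the plugin, which could instead return null for such rows.

```python
def map_ngrams_over_rows(column, n, delimiter=" "):
    """Apply n-gram generation to each row independently."""
    output = []
    for words in column:
        # One output list per input list, so the row count is preserved.
        output.append(
            [delimiter.join(words[i:i + n]) for i in range(len(words) - n + 1)]
        )
    return output


column = [["the", "quick", "brown", "fox"], ["hello", "world"], ["solo"]]
trigram_column = map_ngrams_over_rows(column, 3)

print(len(column), len(trigram_column))  # same number of rows
print([len(row) for row in column])          # [4, 2, 1]
print([len(row) for row in trigram_column])  # [2, 0, 0]
```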

Performance Tips

  • Use Lazy Evaluation: For large datasets, use lazy evaluation to optimize query planning
  • Batch N-Gram Sizes: Generate multiple n-gram sizes in one call when possible
  • Choose Appropriate N-Range: Only generate the n-gram sizes you actually need
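As a rough guide to why the choice of sizes matters, a sliding window over a list of L words yields max(L - n + 1, 0) n-grams of size n. The sketch below adds that up over a set of sizes. `expected_ngram_count` is a hypothetical helper, and it assumes the plugin keeps every window without deduplication.

```python
def expected_ngram_count(num_words, n_range):
    """Number of n-grams a sliding window produces for one list."""
    return sum(max(num_words - n + 1, 0) for n in n_range)


only_bigrams = expected_ngram_count(100, [2])
three_sizes = expected_ngram_count(100, [1, 2, 3])
print(only_bigrams)  # 99
print(three_sizes)   # 297 (100 + 99 + 98)
```

Under these assumptions, each extra size adds roughly one more output string per input word, so dropping sizes you do not need reduces the output proportionally.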

Requirements

  • Python 3.10 through 3.14
  • Polars version requirements are not fully tested; the plugin is tested against the latest Polars release.
  • Compatible with both eager and lazy Polars APIs

Changes

0.1.2: Updated to Rust 1.93.1 and Polars 0.53.0 (dropped Python 3.9, added Python 3.14)

Built Distributions

Version 0.1.2 is published as prebuilt wheels for CPython 3.10, 3.11, 3.12, 3.13, and 3.14. No source distribution is available for this release. Each Python version has a wheel for these platforms:

  • Windows x86-64 (4.4 MB)
  • manylinux, glibc 2.17+, x86-64 (4.2 MB)
  • manylinux, glibc 2.17+, ARM64 (3.7 MB)
  • macOS 11.0+ ARM64 (3.4 MB)
  • macOS 10.12+ x86-64 (3.8 MB)
