High-performance n-gram generation for Polars

ngram_polars - N-Gram Generation for Polars

A high-performance Polars plugin for generating n-grams from text data in Python.

Installation

pip install ngram-polars

Basic example

import polars as pl
from ngram_polars import ngrams

df = pl.DataFrame({
    "id": [1, 2],
    "words": [
        ["the", "quick", "brown", "fox"],
        ["hello", "world"]
    ]
})

# Generate bigrams
result = df.with_columns(
    bigrams=ngrams(pl.col("words"), n_range=[2])
)

More advanced examples

# Multiple n-gram sizes
df.with_columns(
    multi_ngrams=ngrams(pl.col("words"), n_range=[1, 2, 3])
)

# Custom delimiter
df.with_columns(
    underscored=ngrams(pl.col("words"), n_range=[2], delimiter="_")
)

# Lazy evaluation
(df.lazy()
   .with_columns(
       ngrams=ngrams(pl.col("words"), n_range=[2, 3])
   )
   .collect()
)

API Reference

ngrams(expr, n_range, delimiter)

Generate n-grams from a list of strings.

Parameters:

  • expr: IntoExpr - Polars expression representing a list of strings
  • n_range: list[int] - List of n-gram sizes to generate (default: [1])
  • delimiter: str - String delimiter between words (default: " ")

Returns:

  • pl.Expr - Expression that generates lists of n-gram strings

Behavior:

  • Returns a new list column containing all generated n-grams
  • Works element-wise on list columns
  • The output list usually differs in length from the input list (each input list of words produces a new list of n-grams)
  • Supports both eager and lazy evaluation
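
To make the behavior above concrete, here is a plain-Python sketch of what n-gram generation over a single list of words conventionally means. It is an illustration, not the plugin's implementation: the helper name reference_ngrams is hypothetical, and both the output ordering (size by size, following n_range) and the handling of lists shorter than n (no n-grams emitted) are assumptions that this page does not confirm.

```python
from typing import List, Sequence


def reference_ngrams(
    words: Sequence[str],
    n_range: Sequence[int] = (1,),
    delimiter: str = " ",
) -> List[str]:
    """Plain-Python illustration of n-gram generation for one list of words.

    Assumptions (not confirmed by the plugin's documentation): n-grams are
    emitted size by size in the order given by n_range, and a size larger
    than the list simply contributes nothing.
    """
    out: List[str] = []
    for n in n_range:
        if n < 1:
            raise ValueError("n-gram sizes must be positive")
        # Slide a window of n consecutive words across the list.
        for start in range(len(words) - n + 1):
            out.append(delimiter.join(words[start:start + n]))
    return out


row = ["the", "quick", "brown", "fox"]
bigrams = reference_ngrams(row, n_range=[2])
mixed = reference_ngrams(row, n_range=[1, 2], delimiter="_")
```

Under these assumptions, a four-word list yields three bigrams, which is why the output list length generally differs from the input list length.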

Performance Tips

  • Use Lazy Evaluation: For large datasets, use lazy evaluation so that Polars can optimize the query plan
  • Batch N-Gram Sizes: Generate multiple n-gram sizes in one call when possible
  • Choose Appropriate N-Range: Only generate the n-gram sizes you actually need
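
As a rough guide to why the choice of n-gram sizes matters, the sketch below counts how many contiguous n-grams a list of a given length produces for each size, using the standard sliding-window count max(len - n + 1, 0). The helper name ngram_counts is hypothetical, and it is an assumption that the plugin's output sizes follow this count exactly.

```python
from typing import Dict, Sequence


def ngram_counts(list_len: int, n_range: Sequence[int]) -> Dict[int, int]:
    """Number of contiguous n-grams per size for a list of list_len words."""
    return {n: max(list_len - n + 1, 0) for n in n_range}


# A 4-word list with sizes 1, 2 and 3.
counts = ngram_counts(4, [1, 2, 3])
total = sum(counts.values())
```

Each extra size adds close to one more string per input word, so requesting sizes you do not need grows the output column accordingly.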

Requirements

  • Python 3.9 to 3.13
  • Polars version requirements are not fully tested; the plugin has only been tested on the latest Polars version.
  • Compatible with both eager and lazy Polars APIs

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • ngram_polars-0.1.1-cp313-cp313-win_amd64.whl (4.2 MB): CPython 3.13, Windows x86-64
  • ngram_polars-0.1.1-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.1 MB): CPython 3.13, manylinux (glibc 2.17+) x86-64
  • ngram_polars-0.1.1-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.6 MB): CPython 3.13, manylinux (glibc 2.17+) ARM64
  • ngram_polars-0.1.1-cp313-cp313-macosx_11_0_arm64.whl (3.3 MB): CPython 3.13, macOS 11.0+ ARM64
  • ngram_polars-0.1.1-cp313-cp313-macosx_10_12_x86_64.whl (3.7 MB): CPython 3.13, macOS 10.12+ x86-64
  • ngram_polars-0.1.1-cp312-cp312-win_amd64.whl (4.2 MB): CPython 3.12, Windows x86-64
  • ngram_polars-0.1.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.1 MB): CPython 3.12, manylinux (glibc 2.17+) x86-64
  • ngram_polars-0.1.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.6 MB): CPython 3.12, manylinux (glibc 2.17+) ARM64
  • ngram_polars-0.1.1-cp312-cp312-macosx_11_0_arm64.whl (3.3 MB): CPython 3.12, macOS 11.0+ ARM64
  • ngram_polars-0.1.1-cp312-cp312-macosx_10_12_x86_64.whl (3.7 MB): CPython 3.12, macOS 10.12+ x86-64
  • ngram_polars-0.1.1-cp311-cp311-win_amd64.whl (4.2 MB): CPython 3.11, Windows x86-64
  • ngram_polars-0.1.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.1 MB): CPython 3.11, manylinux (glibc 2.17+) x86-64
  • ngram_polars-0.1.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.6 MB): CPython 3.11, manylinux (glibc 2.17+) ARM64
  • ngram_polars-0.1.1-cp311-cp311-macosx_11_0_arm64.whl (3.3 MB): CPython 3.11, macOS 11.0+ ARM64
  • ngram_polars-0.1.1-cp311-cp311-macosx_10_12_x86_64.whl (3.7 MB): CPython 3.11, macOS 10.12+ x86-64
  • ngram_polars-0.1.1-cp310-cp310-win_amd64.whl (4.3 MB): CPython 3.10, Windows x86-64
  • ngram_polars-0.1.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.1 MB): CPython 3.10, manylinux (glibc 2.17+) x86-64
  • ngram_polars-0.1.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.6 MB): CPython 3.10, manylinux (glibc 2.17+) ARM64
  • ngram_polars-0.1.1-cp310-cp310-macosx_11_0_arm64.whl (3.3 MB): CPython 3.10, macOS 11.0+ ARM64
  • ngram_polars-0.1.1-cp310-cp310-macosx_10_12_x86_64.whl (3.7 MB): CPython 3.10, macOS 10.12+ x86-64
  • ngram_polars-0.1.1-cp39-cp39-win_amd64.whl (4.3 MB): CPython 3.9, Windows x86-64
  • ngram_polars-0.1.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.1 MB): CPython 3.9, manylinux (glibc 2.17+) x86-64
  • ngram_polars-0.1.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.6 MB): CPython 3.9, manylinux (glibc 2.17+) ARM64
  • ngram_polars-0.1.1-cp39-cp39-macosx_11_0_arm64.whl (3.3 MB): CPython 3.9, macOS 11.0+ ARM64
  • ngram_polars-0.1.1-cp39-cp39-macosx_10_12_x86_64.whl (3.7 MB): CPython 3.9, macOS 10.12+ x86-64
