A text embedding extension for the Polars DataFrame library.

polars-candle

A polars extension for running candle ML models on polars DataFrames.

Example

Pull any supported model from Huggingface, such as the recently released Snowflake model, and embed text using a simple API:

import polars as pl
import polars_candle  # noqa: F401  (imported only to register the `candle` namespace used below)

df = pl.DataFrame({"s": ["This is a sentence", "This is another sentence"]})

embed_kwargs = {
    "model_repo": "Snowflake/snowflake-arctic-embed-xs",
    "pooling": "mean",
}

df = df.with_columns(
    pl.col("s").candle.embed_text(**embed_kwargs).alias("s_embedding")
)
print(df)
# ┌──────────────────────────┬───────────────────────────────────┐
# │ s                        ┆ s_embedding                       │
# │ ---                      ┆ ---                               │
# │ str                      ┆ array[f32, 384]                   │
# ╞══════════════════════════╪═══════════════════════════════════╡
# │ This is a sentence       ┆ [-0.056457, 0.559411, … -0.20403… │
# │ This is another sentence ┆ [-0.117206, 0.336827, … 0.174078… │
# └──────────────────────────┴───────────────────────────────────┘
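
The s_embedding column above holds one fixed-size float vector per row. A common next step with such vectors is comparing them by cosine similarity. The following is a minimal, dependency-free sketch of that comparison; the helper name `cosine_similarity` and the toy vectors are illustrative assumptions and are not part of polars-candle.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real 384-dimensional embeddings.
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 0.0, 1.0]
v3 = [0.0, 1.0, 0.0]

same = cosine_similarity(v1, v2)        # vectors pointing the same way
orthogonal = cosine_similarity(v1, v3)  # vectors with no shared direction
print(same, orthogonal)
```

With a real DataFrame, the embeddings could be pulled into Python lists first (for example with `df["s_embedding"].to_list()`) and passed to a helper like this one; for large frames a vectorized approach would be preferable.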

Currently, BERT, JinaBERT, and DistilBERT models are supported; more models will be added in the future. See my other repository, wdoppenberg/glowrs, to learn more about the underlying sentence-embedding implementation.
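
As a rough picture of what a "mean" pooling option generally does in sentence-embedding pipelines, the sketch below averages per-token vectors into a single sentence vector and skips padded positions. It illustrates the general technique only: it is not taken from polars-candle or glowrs, and the function name `mean_pool` and the use of an attention mask are assumptions made for illustration.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    if len(token_vectors) != len(attention_mask):
        raise ValueError("need exactly one mask entry per token vector")
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("at least one token must be unmasked")
    dim = len(kept[0])
    # Component-wise mean over the kept token vectors.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Three token vectors; the last one is padding and is masked out.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)
```

The result has the same dimensionality as each token vector, which is why every row in the example output has the same fixed width regardless of sentence length.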

Installation

Make sure you have polars installed; if not, install it with pip install polars. Then install polars-candle:

pip install polars-candle
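
After installing, a quick sanity check can confirm that both packages are importable in the current environment. This is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the import names `polars` and `polars_candle` are the ones used in the example above.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Check the two import names used in the example.
missing = missing_modules(["polars", "polars_candle"])
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("polars and polars_candle are importable")
```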

Note: The macOS ARM wheels of this library come with Metal support out of the box. For CUDA, see the instructions below on building from source.

If you want to install the latest version from the repository, you can use:

pip install git+https://github.com/wdoppenberg/polars-candle.git

Note: You need the Rust toolchain installed on your system to compile the library. See the official Rust installation instructions for how to set it up.

You can set build features using maturin:

maturin develop --release -F <feature>

Where <feature> can be one of the following:

  • metal: Install with Metal acceleration.
  • cuda: Install with CUDA acceleration. Might require additional setup such as installing CUDA libraries.
  • accelerate: Install with the Accelerate framework.
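
Choosing among the features listed above depends on the machine. The sketch below shows one possible heuristic for picking a feature and assembling the matching maturin command. The helper names `suggest_feature` and `build_command` are hypothetical, and the rules (Metal on Apple Silicon, Accelerate on other Macs, CUDA when `nvcc` is on the PATH) are assumptions rather than guidance from this project.

```python
import platform
import shutil


def suggest_feature(system, machine, has_nvcc):
    """Heuristic guess at a build feature; None means a plain CPU build."""
    if system == "Darwin":
        # Assumption: Apple Silicon uses Metal, other Macs use Accelerate.
        return "metal" if machine == "arm64" else "accelerate"
    if has_nvcc:
        # Assumption: an nvcc binary on PATH signals a usable CUDA toolkit.
        return "cuda"
    return None


def build_command(feature):
    """Assemble the maturin invocation, adding -F only when a feature is set."""
    cmd = ["maturin", "develop", "--release"]
    if feature is not None:
        cmd += ["-F", feature]
    return cmd


feature = suggest_feature(
    platform.system(),
    platform.machine(),
    shutil.which("nvcc") is not None,
)
print(" ".join(build_command(feature)))
```

The script only prints the command instead of running it, so the suggestion can be reviewed before building.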

Roadmap

  • Embed text using Bert, JinaBert, and Distilbert models.
  • Add more models.
  • More configuration options for embedding (e.g. pooling strategy, device selection, etc.).
  • Support & test streaming workloads.

Credits

  • Massive thanks to polars & their contributors for providing a blazing fast DataFrame library with the ability to extend it with custom functions using pyo3-polars.
  • Great work so far by Huggingface on candle for providing a simple interface to run ML models.

Note

This is a work in progress and the API might change in the future. Feel free to open an issue if you have any suggestions or improvements.

Download files

Download the file for your platform.

Source Distribution

  • polars_candle-0.1.7.tar.gz (43.0 kB): Source

Built Distributions

  • polars_candle-0.1.7-cp312-none-win_amd64.whl (6.3 MB): CPython 3.12, Windows x86-64
  • polars_candle-0.1.7-cp312-cp312-manylinux_2_34_x86_64.whl (6.9 MB): CPython 3.12, manylinux (glibc 2.34+), x86-64
  • polars_candle-0.1.7-cp312-cp312-macosx_11_0_arm64.whl (6.0 MB): CPython 3.12, macOS 11.0+, ARM64
  • polars_candle-0.1.7-cp311-none-win_amd64.whl (6.3 MB): CPython 3.11, Windows x86-64
  • polars_candle-0.1.7-cp311-cp311-manylinux_2_34_x86_64.whl (6.9 MB): CPython 3.11, manylinux (glibc 2.34+), x86-64
  • polars_candle-0.1.7-cp311-cp311-macosx_11_0_arm64.whl (6.0 MB): CPython 3.11, macOS 11.0+, ARM64
  • polars_candle-0.1.7-cp310-none-win_amd64.whl (6.3 MB): CPython 3.10, Windows x86-64
  • polars_candle-0.1.7-cp310-cp310-manylinux_2_34_x86_64.whl (6.9 MB): CPython 3.10, manylinux (glibc 2.34+), x86-64
  • polars_candle-0.1.7-cp310-cp310-macosx_11_0_arm64.whl (6.0 MB): CPython 3.10, macOS 11.0+, ARM64
  • polars_candle-0.1.7-cp39-none-win_amd64.whl (6.3 MB): CPython 3.9, Windows x86-64
  • polars_candle-0.1.7-cp39-cp39-manylinux_2_34_x86_64.whl (6.9 MB): CPython 3.9, manylinux (glibc 2.34+), x86-64
  • polars_candle-0.1.7-cp39-cp39-macosx_11_0_arm64.whl (6.0 MB): CPython 3.9, macOS 11.0+, ARM64
