Stable non-cryptographic and cryptographic hashing functions for Polars

Project description

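This plugin provides hashing functions whose output stays stable across different Polars versions, unlike the built-in `Expr.hash`, whose results Polars does not guarantee to be stable between versions.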

Examples

Cryptographic Hashers

import polars as pl
import polars_hash as plh

df = pl.DataFrame({
    "foo":["hello_world"]
})

result = df.select(plh.col('foo').chash.sha256())

print(result)

┌──────────────────────────────────────────────────────────────────┐
│ foo                                                              │
│ ---                                                              │
│ str                                                              │
╞══════════════════════════════════════════════════════════════════╡
│ 35072c1ae546350e0bfa7ab11d49dc6f129e72ccd57ec7eb671225bbd197c8f1 │
└──────────────────────────────────────────────────────────────────┘
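
The digest can be cross-checked without the plugin. The sketch below uses only Python's standard `hashlib` and assumes that `chash.sha256()` hashes the UTF-8 bytes of each string and returns the lowercase hex digest; that assumption is not stated on this page, so treat the comparison as a sanity check rather than a specification. The helper name `sha256_hex` is hypothetical.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the lowercase hex SHA-256 digest of the UTF-8 encoding of `text`."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Compare this value with the one printed by the plugin for the same input.
digest = sha256_hex("hello_world")
print(digest)
```

Because SHA-256 is fully specified, the same input bytes give the same digest on any platform or library version, which is the kind of stability the plugin aims for.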

Non-cryptographic Hashers

df = pl.DataFrame({
    "foo":["hello_world"]
})

result = df.select(plh.col('foo').nchash.wyhash())
print(result)
┌──────────────────────┐
│ foo                  │
│ ---                  │
│ u64                  │
╞══════════════════════╡
│ 16737367591072095403 │
└──────────────────────┘

Geo Hashers

df = pl.DataFrame(
    {"coord": [{"longitude": -120.6623, "latitude": 35.3003}]},
    schema={
        "coord": pl.Struct(
            [pl.Field("longitude", pl.Float64), pl.Field("latitude", pl.Float64)]
        ),
    },
)

df.with_columns(
    plh.col('coord').geohash.from_coords().alias('geohash')
)
shape: (1, 2)
┌─────────────────────┬────────────┐
│ coord               ┆ geohash    │
│ ---                 ┆ ---        │
│ struct[2]           ┆ str        │
╞═════════════════════╪════════════╡
│ {-120.6623,35.3003} ┆ 9q60y60rhs │
└─────────────────────┴────────────┘


pl.select(pl.lit('9q60y60rhs').geohash.to_coords().alias('coordinates'))
shape: (1, 1)
┌───────────────────────┐
│ coordinates           │
│ ---                   │
│ struct[2]             │
╞═══════════════════════╡
│ {-120.6623,35.300298} │
└───────────────────────┘
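
For readers unfamiliar with geohashes, the following is a minimal pure-Python sketch of the standard geohash encoding: longitude and latitude bits are interleaved, starting with longitude, and each group of five bits becomes one base-32 character. It illustrates the general algorithm only and is not the plugin's implementation; the function name `geohash_encode` and its `length` parameter are hypothetical.

```python
# Geohash base-32 alphabet (digits plus lowercase letters without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(longitude: float, latitude: float, length: int = 10) -> str:
    """Encode a coordinate as a geohash string of `length` characters."""
    lon_lo, lon_hi = -180.0, 180.0
    lat_lo, lat_hi = -90.0, 90.0
    chars = []
    bits = 0            # bits collected for the current character
    bit_count = 0
    use_longitude = True  # bits alternate: longitude, latitude, longitude, ...

    while len(chars) < length:
        if use_longitude:
            mid = (lon_lo + lon_hi) / 2
            if longitude >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if latitude >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_longitude = not use_longitude
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)


print(geohash_encode(-120.6623, 35.3003, length=5))  # 9q60y
```

Each extra character narrows the cell by five more bits, which is why decoding a geohash returns a coordinate close to, but not exactly equal to, the original one.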

Create hash from multiple columns

df = pl.DataFrame({
    "foo":["hello_world"],
    "bar": ["today"]
})

result = df.select(plh.concat_str('foo','bar').chash.sha256())
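
One thing to keep in mind when hashing several columns as one string: if the values are joined without a separator, different rows can produce the same joined string and therefore the same hash. The sketch below shows the effect with the standard library only; `row_digest` is a hypothetical helper, not part of the plugin. Polars' own `pl.concat_str` accepts a `separator` argument; whether `plh.concat_str` forwards it is an assumption to verify against the plugin's source.

```python
import hashlib


def row_digest(*values: str, separator: str = "") -> str:
    """Join the values with `separator` and return the SHA-256 hex digest."""
    joined = separator.join(values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Without a separator, ("ab", "c") and ("a", "bc") both join to "abc".
same = row_digest("ab", "c") == row_digest("a", "bc")
print(same)  # True

# A separator that cannot occur in the data keeps the rows distinct.
distinct = row_digest("ab", "c", separator="|") != row_digest("a", "bc", separator="|")
print(distinct)  # True
```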

Download files

Source Distribution

polars_hash-0.4.8.tar.gz (22.1 kB): source

Built Distributions

polars_hash-0.4.8-cp38-abi3-win_amd64.whl (3.1 MB): CPython 3.8+, Windows x86-64
polars_hash-0.4.8-cp38-abi3-win32.whl (2.7 MB): CPython 3.8+, Windows x86
polars_hash-0.4.8-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.7 MB): CPython 3.8+, manylinux (glibc 2.17+), x86-64
polars_hash-0.4.8-cp38-abi3-manylinux_2_17_s390x.manylinux2014_s390x.whl (7.1 MB): CPython 3.8+, manylinux (glibc 2.17+), s390x
polars_hash-0.4.8-cp38-abi3-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (9.5 MB): CPython 3.8+, manylinux (glibc 2.17+), ppc64le
polars_hash-0.4.8-cp38-abi3-manylinux_2_17_i686.manylinux2014_i686.whl (6.9 MB): CPython 3.8+, manylinux (glibc 2.17+), i686
polars_hash-0.4.8-cp38-abi3-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (6.1 MB): CPython 3.8+, manylinux (glibc 2.17+), ARMv7l
polars_hash-0.4.8-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (6.1 MB): CPython 3.8+, manylinux (glibc 2.17+), ARM64
polars_hash-0.4.8-cp38-abi3-macosx_11_0_arm64.whl (3.1 MB): CPython 3.8+, macOS 11.0+, ARM64
polars_hash-0.4.8-cp38-abi3-macosx_10_12_x86_64.whl (3.3 MB): CPython 3.8+, macOS 10.12+, x86-64
