Polars plugin providing Map operations on List(Struct({key, value})) columns

polars-map

Polars plugin providing a Map extension type and functions. Maps represent a mapping from unique keys of any type to values, and are stored as List(Struct({key, value})) columns. All functions in the .map namespace can be used on the extension type or on the underlying list.
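
To make the storage layout concrete, here is a minimal pure-Python sketch of one map row as a list of key/value structs. It is only a model of the layout described above, not the plugin's implementation; the helper names entries_to_dict and dict_to_entries are hypothetical, and raising on a duplicate key is an assumption chosen to illustrate the unique-key requirement.

```python
from typing import Any

# One {"key": ..., "value": ...} struct, as stored in the underlying list.
Entry = dict[str, Any]


def entries_to_dict(entries: list[Entry]) -> dict[Any, Any]:
    """Turn one row of key/value structs into a dict, enforcing unique keys."""
    out: dict[Any, Any] = {}
    for entry in entries:
        key = entry["key"]
        if key in out:
            # Assumption for this sketch: a duplicate key is treated as an error.
            raise ValueError(f"duplicate key in map row: {key!r}")
        out[key] = entry["value"]
    return out


def dict_to_entries(mapping: dict[Any, Any]) -> list[Entry]:
    """Inverse direction: a dict becomes a list of key/value structs."""
    return [{"key": k, "value": v} for k, v in mapping.items()]


row = [{"key": "a", "value": 1}, {"key": "b", "value": 2}]
as_dict = entries_to_dict(row)
print(as_dict)                          # {'a': 1, 'b': 2}
print(dict_to_entries(as_dict) == row)  # True
```

Each row of a Map column corresponds to one such list, which is why the same functions can run on the plain list storage.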

Installation

pip install polars-map

Supported operations (.map.*)

Category   | Methods
Accessors  | entries, keys, values, len, get, contains_key
Filtering  | filter, filter_keys, filter_values
Transform  | eval, eval_keys, eval_values
Set ops    | merge, intersection, difference
Conversion | from_entries
Iteration  | __iter__, to_list (Series only)

Arrow conversion

Function               | Description
from_arrow(table)      | Arrow Table/RecordBatch to Polars DataFrame, preserving map<> as Map
from_arrow_array(array) | Arrow Array to Polars Series, preserving map<> as Map
to_arrow(frame)        | Polars DataFrame to Arrow Table, converting Map back to map<>
to_arrow_array(series) | Polars Series to Arrow Array, converting Map back to map<>
scan_arrow(source)     | Lazy scan from an Arrow source with Map preservation

Usage

import polars as pl
import pyarrow as pa
from polars_map import Map, from_arrow, to_arrow, scan_arrow

# Construction
ser = pl.Series(
    "m",
    [
        [{"key": "a", "value": 1}, {"key": "b", "value": 2}],
        [{"key": "x", "value": 10}],
    ],
    dtype=Map(pl.String(), pl.Int64()),
)
df = pl.DataFrame([ser])

# Accessors
df.select(pl.col("m").map.keys())    # [["a", "b"], ["x"]]
df.select(pl.col("m").map.values())  # [[1, 2], [10]]
df.select(pl.col("m").map.len())     # [2, 1]

# Lookup
df.select(pl.col("m").map.get("a"))           # [1, None]
df.select(pl.col("m").map.contains_key("a"))  # [True, False]

# Filtering
df.select(pl.col("m").map.filter(pl.element().struct["value"] > 1))
df.select(pl.col("m").map.filter_keys(pl.element() > "a"))
df.select(pl.col("m").map.filter_values(pl.element() >= 2))

# Transform keys or values
df.select(pl.col("m").map.eval_keys(pl.element().str.to_uppercase()))
df.select(pl.col("m").map.eval_values(pl.element() * 2))

# Merge (right-side wins on key conflict)
left = pl.Series("l", [[{"key": "a", "value": 1}, {"key": "b", "value": 2}]], dtype=Map(pl.String(), pl.Int64()))
right = pl.Series("r", [[{"key": "a", "value": 99}, {"key": "c", "value": 3}]], dtype=Map(pl.String(), pl.Int64()))
pl.DataFrame([left, right]).select(pl.col("l").map.merge(pl.col("r")))
# [{"a": 99, "b": 2, "c": 3}]

# Set operations
pl.DataFrame([left, right]).select(pl.col("l").map.intersection(pl.col("r")))  # keys in both
pl.DataFrame([left, right]).select(pl.col("l").map.difference(pl.col("r")))    # keys only in left

# Convert to/from plain List(Struct)
df.select(pl.col("m").map.entries())        # strip Map type
df.select(pl.col("m").map.from_entries())   # wrap as Map (with deduplication)

# Series iteration yields Python dicts
for d in ser.map:
    print(d)  # {"a": 1, "b": 2}, {"x": 10}

# Arrow table with map column → Polars DataFrame
table = pa.table({"m": pa.array([[("a", 1)]], type=pa.map_(pa.string(), pa.int64()))})
df = from_arrow(table)          # Map(String, Int64) dtype preserved
table2 = to_arrow(df)           # roundtrips back to arrow map<>

# Lazy scanning from an Arrow source
lf = scan_arrow(lambda: [table])
result = lf.collect()
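
The merge and difference semantics noted in the comments above can be modeled in plain Python. This is a sketch of the behavior on a single row, not the plugin's code: merge_entries and difference_entries are hypothetical names, and the output entry order is an assumption of the model. Intersection is left out because the text above does not say which side's values are kept.

```python
from typing import Any

# One {"key": ..., "value": ...} struct from a map row.
Entry = dict[str, Any]


def merge_entries(left: list[Entry], right: list[Entry]) -> list[Entry]:
    """Union of keys; on a conflict the right-hand value replaces the left one."""
    merged = {e["key"]: e["value"] for e in left}
    merged.update({e["key"]: e["value"] for e in right})
    return [{"key": k, "value": v} for k, v in merged.items()]


def difference_entries(left: list[Entry], right: list[Entry]) -> list[Entry]:
    """Keep only the left entries whose key does not appear on the right."""
    right_keys = {e["key"] for e in right}
    return [e for e in left if e["key"] not in right_keys]


left = [{"key": "a", "value": 1}, {"key": "b", "value": 2}]
right = [{"key": "a", "value": 99}, {"key": "c", "value": 3}]

print(merge_entries(left, right))
# [{'key': 'a', 'value': 99}, {'key': 'b', 'value': 2}, {'key': 'c', 'value': 3}]
print(difference_entries(left, right))
# [{'key': 'b', 'value': 2}]
```

The merged row matches the {"a": 99, "b": 2, "c": 3} result shown in the usage example, with the right-hand value winning for the shared key "a".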

Caveats

  • Extension types: used to wrap the underlying List(Struct) storage with a semantic Map dtype. They are not yet stabilized and may change across Polars releases.
  • pl.dtype_of: used to efficiently cast to the extension type after some operations. It is also unstable.
  • GIL: the GIL is required to automatically wrap an expression as the extension type, so operations that could change the underlying key or value types briefly acquire the GIL to perform the cast. This may also prevent the Polars engine from reasoning about the type.
  • LargeMap: Arrow currently supports only Map, with no large (64-bit offset) counterpart. Polars generally uses LargeList, so if a frame is ever converted to Arrow with offsets that don't fit in a 32-bit offset, exporting will error.
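
The offset caveat in the last bullet can be checked before exporting. The sketch below is a hedged pre-flight check in plain Python: fits_map_offsets and MAX_MAP_OFFSET are hypothetical names, and the limit of 2**31 - 1 assumes Arrow's map<> layout uses signed 32-bit offsets. The row lengths could come from a .map.len() column. Whether the plugin splits large columns into chunks on export is not stated here, so treat the result as conservative.

```python
from typing import Iterable

# Assumption: Arrow's map<> layout uses signed 32-bit offsets, so the running
# total of entries in one array must not exceed this value.
MAX_MAP_OFFSET = 2**31 - 1


def fits_map_offsets(row_lengths: Iterable[int], limit: int = MAX_MAP_OFFSET) -> bool:
    """Return True if the cumulative entry count never exceeds `limit`."""
    total = 0
    for n in row_lengths:
        total += n
        if total > limit:
            return False
    return True


print(fits_map_offsets([2, 1]))          # True
print(fits_map_offsets([2**30, 2**30]))  # False
```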

Source Distribution

polars_map-0.2.1.tar.gz (43.8 kB)

Uploaded Source

Built Distribution

polars_map-0.2.1-py3-none-any.whl (13.0 kB)

Uploaded Python 3

File details

Details for the file polars_map-0.2.1.tar.gz.

File metadata

  • Download URL: polars_map-0.2.1.tar.gz
  • Size: 43.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for polars_map-0.2.1.tar.gz
Algorithm Hash digest
SHA256 3b13791f94bfb71c116a930c16377f792c5d630d4075babf7494d5f58795a543
MD5 e3e1ff9ffc128d01362a13df8a909d14
BLAKE2b-256 facb1366e16d2114819393d58e5ca082ba59d27e774057ee050703166da958b2

Provenance

The following attestation bundles were made for polars_map-0.2.1.tar.gz:

Publisher: release.yml on hafaio/polars-map

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file polars_map-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: polars_map-0.2.1-py3-none-any.whl
  • Size: 13.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for polars_map-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 695f5b1a1043446ac66b48cd74539b18a507fe455e40cf43ee67dae33d345ebe
MD5 548da1adfd35d386d7b014fbbe99827b
BLAKE2b-256 f922faef69efca731f27d8e468840c97e4aea4500fc757db3354801da5283a15

Provenance

The following attestation bundles were made for polars_map-0.2.1-py3-none-any.whl:

Publisher: release.yml on hafaio/polars-map
