unpact

A lightweight library for tabulating dictionaries.

Usage

A basic example:

from typing import List

import polars as pl

from unpact import ColumnDef, unwind

columns: List[ColumnDef] = [
    'calendar.year',
    'calendar.date',
    'locations.location',
    'locations.x',
    'locations.y'
]

# Columns that belong to the same child and have the same length are considered 'adjacent'.
# Adjacent columns are zipped together.
# Here, 'x' and 'y' are adjacent.
data = {
    'calendar': {'year': 2022, 'date': 'Aug 14'},
    'locations': [
        {'location': 'Loc1', 'x': [1,2,3,4], 'y': [1,2,3,4]},
        {'location': 'Loc2', 'x': [11,22,33,44], 'y': [11,22,33,44]},
        {'location': 'Loc3', 'x': [11], 'y': [11]},
    ],
    'ignored': "This isn't in the ColumnDefs so won't be included"
}

table = unwind(data, columns)
print(pl.from_dicts(table))

--
shape: (9, 5)
┌──────┬────────┬──────────┬─────┬─────┐
│ year ┆ date   ┆ location ┆ x   ┆ y   │
│ ---  ┆ ---    ┆ ---      ┆ --- ┆ --- │
│ i64  ┆ str    ┆ str      ┆ i64 ┆ i64 │
╞══════╪════════╪══════════╪═════╪═════╡
│ 2022 ┆ Aug 14 ┆ Loc1     ┆ 1   ┆ 1   │
│ 2022 ┆ Aug 14 ┆ Loc1     ┆ 2   ┆ 2   │
│ 2022 ┆ Aug 14 ┆ Loc1     ┆ 3   ┆ 3   │
│ 2022 ┆ Aug 14 ┆ Loc1     ┆ 4   ┆ 4   │
│ 2022 ┆ Aug 14 ┆ Loc2     ┆ 11  ┆ 11  │
│ 2022 ┆ Aug 14 ┆ Loc2     ┆ 22  ┆ 22  │
│ 2022 ┆ Aug 14 ┆ Loc2     ┆ 33  ┆ 33  │
│ 2022 ┆ Aug 14 ┆ Loc2     ┆ 44  ┆ 44  │
│ 2022 ┆ Aug 14 ┆ Loc3     ┆ 11  ┆ 11  │
└──────┴────────┴──────────┴─────┴─────┘
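To make the 'adjacent' rule concrete, here is a minimal plain-Python sketch that builds the same kind of rows by hand for two of the locations. It does not use unpact and is not the library's implementation; the helper name `zip_adjacent` is hypothetical and exists only for this illustration.

```python
from typing import Any, Dict, List


def zip_adjacent(
    parent: Dict[str, Any], record: Dict[str, Any], keys: List[str]
) -> List[Dict[str, Any]]:
    """Zip the equal-length lists under `keys` into one row per position.

    Hypothetical helper for illustration only; not part of unpact.
    """
    columns = [record[key] for key in keys]
    # Everything that is not a zipped list is repeated on every row.
    scalars = {k: v for k, v in record.items() if k not in keys}
    rows = []
    for values in zip(*columns):
        rows.append({**parent, **scalars, **dict(zip(keys, values))})
    return rows


calendar = {"year": 2022, "date": "Aug 14"}
locations = [
    {"location": "Loc1", "x": [1, 2, 3, 4], "y": [1, 2, 3, 4]},
    {"location": "Loc3", "x": [11], "y": [11]},
]

rows = [row for loc in locations for row in zip_adjacent(calendar, loc, ["x", "y"])]
print(len(rows))  # 5: four rows for Loc1 plus one for Loc3
print(rows[-1])
```

Each position in `x` is paired with the same position in `y`, and the scalar values from the parent and the location are repeated on every row.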

A more complex example using ColumnSpecs:

from typing import List

import polars as pl

from unpact import ColumnDef, ColumnSpec, unwind


def format_coordinate_pair(
    coords: list[int], index: int | None
) -> dict:  # Formatter functions must return a dictionary
    # Terminal value is passed to the "formatter" function
    # "index" is optionally injected if the value is a member of a list

    return {"x": coords[0], "y": coords[1], "frame": index} if coords else {"x": None, "y": None, "frame": index}


# You can pass in a 'ColumnSpec' to change the behavior of a column.
# Options include 'formatter', which accepts a callable, and 'name', a string that renames the column.
columns: List[ColumnDef] = [
    ColumnSpec(path="calendar.year", name="Year"),  # You can rename the column using the optional `name` kwarg
    ColumnSpec(path="calendar.date"),  # Otherwise the column will be named after the last part of the path
    ColumnSpec(path="locations.location", name="location name"),
    ColumnSpec(path="locations.coords", formatter=lambda coords: {"x": coords[0], "y": coords[1]}),
    ColumnSpec(path="locations.coords", formatter=format_coordinate_pair),
]

data = {
    "calendar": {"year": 2022, "date": "Aug 14"},
    "locations": [
        {"location": "Loc1", "coords": [[1, 1], [2, 2], [3, 3]]},
        {"location": "Loc2", "coords": [[1, 1], [2, 2], [3, 3]]},
        {"location": "Loc3", "coords": [[1, 1], [2, 2], [3, 3]]},
    ],
    "ignored": "This isn't in the ColumnDefs so won't be included",
}

table = unwind(data, columns)
print(pl.from_dicts(table))


---
shape: (9, 6)
┌──────┬────────┬───────────────┬─────┬─────┬───────┐
│ Year ┆ date   ┆ location name ┆ x   ┆ y   ┆ frame │
│ ---  ┆ ---    ┆ ---           ┆ --- ┆ --- ┆ ---   │
│ i64  ┆ str    ┆ str           ┆ i64 ┆ i64 ┆ i64   │
╞══════╪════════╪═══════════════╪═════╪═════╪═══════╡
│ 2022 ┆ Aug 14 ┆ Loc1          ┆ 1   ┆ 1   ┆ 0     │
│ 2022 ┆ Aug 14 ┆ Loc1          ┆ 2   ┆ 2   ┆ 1     │
│ 2022 ┆ Aug 14 ┆ Loc1          ┆ 3   ┆ 3   ┆ 2     │
│ 2022 ┆ Aug 14 ┆ Loc2          ┆ 1   ┆ 1   ┆ 0     │
│ 2022 ┆ Aug 14 ┆ Loc2          ┆ 2   ┆ 2   ┆ 1     │
│ 2022 ┆ Aug 14 ┆ Loc2          ┆ 3   ┆ 3   ┆ 2     │
│ 2022 ┆ Aug 14 ┆ Loc3          ┆ 1   ┆ 1   ┆ 0     │
│ 2022 ┆ Aug 14 ┆ Loc3          ┆ 2   ┆ 2   ┆ 1     │
│ 2022 ┆ Aug 14 ┆ Loc3          ┆ 3   ┆ 3   ┆ 2     │
└──────┴────────┴───────────────┴─────┴─────┴───────┘
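How a formatter's dictionary turns into columns can be sketched in plain Python. The sketch assumes, based on the output above, that each key of the returned dictionary becomes a column and that `index` is the element's position in its list. The list comprehension is a hand-rolled stand-in, not unpact's code.

```python
from typing import Dict, List, Optional


def format_coordinate_pair(
    coords: List[int], index: Optional[int]
) -> Dict[str, Optional[int]]:
    # Same logic as the formatter in the example above.
    if not coords:
        return {"x": None, "y": None, "frame": index}
    return {"x": coords[0], "y": coords[1], "frame": index}


location = {"location": "Loc1", "coords": [[1, 1], [2, 2], [3, 3]]}

# Stand-in for the expansion the table above suggests: one row per list
# element, with the formatter's keys merged in as columns.
rows = [
    {"location name": location["location"], **format_coordinate_pair(pair, i)}
    for i, pair in enumerate(location["coords"])
]
print(rows[1])
```

An empty coordinate pair still yields a row, with `x` and `y` set to `None` and `frame` carrying the index.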

API Documentation

ColumnSpec

ColumnSpec is a dataclass used to define the specifications for a column in the output dataframe. It includes the following attributes:

  • path (str): Dot-delimited path to the column in the input data.
  • name (Optional[str]): Name of the column in the output dataframe. If not provided, the last segment of the path is used.
  • formatter (Optional[ColumnFormatter]): Formatter to apply to the column data.
  • default (Optional[Any]): Value to use if the column is missing from the input data. If not provided, None is used.
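As a rough illustration of what a dot-delimited `path` with a `default` means, the sketch below walks nested dictionaries one segment at a time. `resolve_path` is a hypothetical helper, not an unpact function. Unlike the library, it does not descend into lists.

```python
from collections.abc import Mapping
from typing import Any


def resolve_path(data: Mapping, path: str, default: Any = None) -> Any:
    """Follow a dot-delimited path through nested mappings.

    Hypothetical helper for illustration; returns `default` as soon as a
    segment is missing or the current value is not a mapping.
    """
    current: Any = data
    for segment in path.split("."):
        if not isinstance(current, Mapping) or segment not in current:
            return default
        current = current[segment]
    return current


data = {"calendar": {"year": 2022, "date": "Aug 14"}}
print(resolve_path(data, "calendar.year"))
print(resolve_path(data, "calendar.month", default="n/a"))
```

The first call finds the value at the end of the path. The second falls back to the supplied default because `month` is absent.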

Methods

  • from_tuple(column_def: Tuple[str, ColumnSpecDict]) -> ColumnSpec: Creates a ColumnSpec instance from a tuple.
  • from_str(path: str) -> ColumnSpec: Creates a ColumnSpec instance from a string path.
  • from_def(column_def: ColumnDef) -> ColumnSpec: Creates a ColumnSpec instance from a ColumnDef which can be a string, tuple, or ColumnSpec instance.
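The three constructors suggest a simple dispatch on the shape of the column definition. The sketch below mimics that idea with a stand-in dataclass, `SketchSpec`, so that it runs without unpact installed. It assumes the dictionary in the tuple form uses the same keys as the attributes listed above, which the documentation here does not confirm.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple, Union


@dataclass
class SketchSpec:
    """Stand-in for ColumnSpec, with the attributes listed above."""

    path: str
    name: Optional[str] = None
    formatter: Optional[Callable[..., Dict[str, Any]]] = None
    default: Any = None


SketchDef = Union[str, Tuple[str, Dict[str, Any]], SketchSpec]


def spec_from_def(column_def: SketchDef) -> SketchSpec:
    # Dispatch on the three accepted shapes: spec instance, string, or tuple.
    if isinstance(column_def, SketchSpec):
        return column_def
    if isinstance(column_def, str):
        return SketchSpec(path=column_def)
    path, options = column_def  # assumed (path, options-dict) layout
    return SketchSpec(path=path, **options)


print(spec_from_def("calendar.year"))
print(spec_from_def(("calendar.year", {"name": "Year"})))
```

Normalizing every definition to one type up front lets the rest of a pipeline work with a single, predictable structure.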
