Skip to main content

No project description provided

Project description

Custom WKT

Rust

Tooling to render WKT values based on specific, custom data layouts.

Given two series of 2-D data, grouped by classes, render into a list of WKT LINESTRINGs.

import custom_wkt
import numpy as np

custom_wkt.linestring(np.linspace(0, 5, 6), np.linspace(0, 5, 6) / 2.0, np.array([0, 0, 0, 0, 1, 1]), 1)
# ['LINESTRING (0.0 0.0, 1.0 0.5, 2.0 1.0, 3.0 1.5)', 'LINESTRING (4.0 2.0, 5.0 2.5)']

Operation

x y c
0.0 0.0 0
1.0 0.5 0
2.0 1.0 0
3.0 1.5 0
4.0 2.0 1
5.0 2.5 1

Where x, y are the elements are np.float64 values for the points and c are the classes for the points as np.int64.

Note that classes must already be consecutive and returned LINESTRINGs will match the classes' order.

LINESTRING (0.0 0.0, 1.0 0.5, 2.0 1.0, 3.0 1.5) for class 0 and LINESTRING (4.0 2.0, 5.0 2.5) for class 1.

Function Signature

Note that this package is type strict - point values must be np.float64, classes must be np.int64, and precision must be an int.

The underlying code is written in Rust and offers a function signature equivalent to:

def linestring(x: np.ndarray, y: np.ndarray, classes: np.ndarray, precision: int) -> List[str]:
    """
    Parameters
    ----------
    x : np.ndarray
        First Point Value - float64
    y : np.ndarray
        Second Point Values - float64
    classes : np.ndarray
        Classes - int64
    precision : int
        Number of places for rendering in string, >= 0

    Returns
    -------
    List[str]
        List of LineStrings
    """

Example Usage

import pandas as pd
import numpy as np
import custom_wkt

df = (
    pd.DataFrame({"x": np.arange(6).astype(np.float64)})
    .assign(y=lambda idf: idf["x"] / 2, c=lambda idf: (idf["x"] > 3).astype(np.int64))
    .sort_index("c")
)
result = custom_wkt.linestring(df["x"].values, df["y"].values, df["c"].values, 1)
# ['LINESTRING (0.0 0.0, 1.0 0.5, 2.0 1.0, 3.0 1.5)', 'LINESTRING (4.0 2.0, 5.0 2.5)']

If classes are available as Pandas series as strings or another type,

str_series = df["c"].astype(str)
int_classes = str_series.astype("category").cat.codes.astype(np.int64).values
# NOTE: Cast to int64. Pandas will return the smallest int dtype to fit number of categories
# but custom_wkt requires strict types

result = custom_wkt.linestring(df["x"].values, df["y"].values, int_classes, 1)
# ['LINESTRING (0.0 0.0, 1.0 0.5, 2.0 1.0, 3.0 1.5)', 'LINESTRING (4.0 2.0, 5.0 2.5)']

Performance

Performance varies greatly depending on size of inputs, size of the groups, and relative precision.

Rough estimates measure 2-3x speedup and decreased memory usage over normal python logic.

Similar Pandas operations are much less efficient computation. The nature of the solution includes groupbys and python function calls, which have richer functionality but negatively impact performance for this operation.

Benchmark Plot

Benchmark Histogram

Memory Rust Memory Python Memory Pandas

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

custom_wkt-0.1.0-cp39-none-win_amd64.whl (129.9 kB view hashes)

Uploaded CPython 3.9 Windows x86-64

custom_wkt-0.1.0-cp39-cp39-manylinux_2_24_x86_64.whl (187.1 kB view hashes)

Uploaded CPython 3.9 manylinux: glibc 2.24+ x86-64

custom_wkt-0.1.0-cp39-cp39-macosx_10_7_x86_64.whl (178.3 kB view hashes)

Uploaded CPython 3.9 macOS 10.7+ x86-64

custom_wkt-0.1.0-cp38-none-win_amd64.whl (129.7 kB view hashes)

Uploaded CPython 3.8 Windows x86-64

custom_wkt-0.1.0-cp38-cp38-manylinux_2_24_x86_64.whl (187.1 kB view hashes)

Uploaded CPython 3.8 manylinux: glibc 2.24+ x86-64

custom_wkt-0.1.0-cp38-cp38-macosx_10_7_x86_64.whl (178.3 kB view hashes)

Uploaded CPython 3.8 macOS 10.7+ x86-64

custom_wkt-0.1.0-cp37-none-win_amd64.whl (129.9 kB view hashes)

Uploaded CPython 3.7 Windows x86-64

custom_wkt-0.1.0-cp37-cp37m-manylinux_2_24_x86_64.whl (187.1 kB view hashes)

Uploaded CPython 3.7m manylinux: glibc 2.24+ x86-64

custom_wkt-0.1.0-cp37-cp37m-macosx_10_7_x86_64.whl (178.5 kB view hashes)

Uploaded CPython 3.7m macOS 10.7+ x86-64

custom_wkt-0.1.0-cp36-none-win_amd64.whl (129.9 kB view hashes)

Uploaded CPython 3.6 Windows x86-64

custom_wkt-0.1.0-cp36-cp36m-manylinux_2_24_x86_64.whl (187.2 kB view hashes)

Uploaded CPython 3.6m manylinux: glibc 2.24+ x86-64

custom_wkt-0.1.0-cp36-cp36m-macosx_10_7_x86_64.whl (178.5 kB view hashes)

Uploaded CPython 3.6m macOS 10.7+ x86-64

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page