Lift special-purpose data into common tabular formats for analytics 💪

💪 Elbow

Elbow is a lightweight and scalable library for getting diverse data out of specialized formats and into common tabular data formats for downstream analytics.

Example

Extract image metadata and pixel values from all JPEG image files under the current directory and save as a Parquet dataset.

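import numpy as np
import pandas as pd
from PIL import Image

from elbow.builders import build_parquet


def extract_image(path: str):
    # Build one record (a dict of column values) for a single image file.
    img = Image.open(path)
    width, height = img.size
    pixel_values = np.asarray(img)
    return {
        "path": path,
        "width": width,
        "height": height,
        "pixel_values": pixel_values,
    }


# Run extract_image on every file matching the glob pattern and save the
# resulting records as a Parquet dataset.
build_parquet(
    source="**/*.jpg",
    extract=extract_image,
    output="images.pqds/",
    workers=8,
)

# Load the dataset back into a DataFrame.
df = pd.read_parquet("images.pqds")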

For a complete example, see here.
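
The extract function in the example is an ordinary Python function that takes a path and returns a dict, so the same shape can be tried on other file types and checked by hand before a full build. The sketch below uses only the standard library and does not call Elbow at all; `extract_text_stats` and `collect_records` are hypothetical names chosen for illustration, and the serial loop is only a stand-in for inspecting records, not a description of how `build_parquet` works internally.

```python
import tempfile
from pathlib import Path


def extract_text_stats(path: str) -> dict:
    """Hypothetical extract function: one record (dict) per text file."""
    text = Path(path).read_text(encoding="utf-8")
    return {
        "path": path,
        "n_lines": len(text.splitlines()),
        "n_chars": len(text),
    }


def collect_records(root: str, pattern: str) -> list:
    """Apply the extract function serially to every file matching the pattern."""
    paths = sorted(Path(root).glob(pattern))
    return [extract_text_stats(str(p)) for p in paths]


# Small self-contained demo on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sub").mkdir()
    (Path(tmp) / "a.txt").write_text("hello\nworld\n", encoding="utf-8")
    (Path(tmp) / "sub" / "b.txt").write_text("elbow", encoding="utf-8")
    records = collect_records(tmp, "**/*.txt")
    # Keep only the file name so the summary does not depend on the temp path.
    summary = [(Path(r["path"]).name, r["n_lines"], r["n_chars"]) for r in records]

print(summary)
```

Since the function has the same signature as `extract_image`, it could presumably be passed as `extract` in the same way, though that is an assumption rather than something this page confirms.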

Installation

pip install elbow

The current development version can be installed with

pip install git+https://github.com/cmi-dair/elbow.git
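
To confirm which version ended up in the environment after either command, one option is to query the installed package metadata. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper and not part of Elbow.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("elbow")
print(f"elbow version: {version}" if version else "elbow is not installed")
```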

Related projects

There are many other high-quality projects for extracting, loading, and transforming data. Some alternative projects focused on somewhat different use cases are:

Contributing

We welcome contributions of any kind! If you'd like to contribute, please feel free to start a conversation in our issues.
