Datastore supporting merql

Project description

I took a career break to build this. If you like it and are looking to hire an ML engineer, please contact me :)

CAUTION: PROTOTYPE! NO PRODUCTION USE, not even development.

Merdb is a data processing library that

  • provides a relational API, similar to SQL, for querying data
  • uses Unix-like pipes to compose operators with the | syntax
  • scales to multiple cores or a cluster (via Modin)
  • processes data too big to fit into memory
  • supports interactive and optimized processing (optimizations are on the roadmap)
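
The pipe-style composition listed above can be illustrated with Python's `__or__` operator overloading. The following is only a minimal sketch, not merdb's actual implementation: the `Op` and `Table` classes and the `map_col` helper are hypothetical names invented for this illustration, and rows are plain dicts rather than DataFrames.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Rows = List[Row]


class Op:
    """Hypothetical operator: wraps a function from rows to rows."""

    def __init__(self, fn: Callable[[Rows], Rows]) -> None:
        self.fn = fn

    def __or__(self, other: "Op") -> "Op":
        # op | op builds a new operator that runs self first, then other.
        return Op(lambda rows: other.fn(self.fn(rows)))


class Table:
    """Hypothetical table: holds rows and feeds them to operators."""

    def __init__(self, rows: Rows) -> None:
        self.rows = rows

    def __or__(self, op: Op) -> "Table":
        # table | op applies the operator and returns a new table.
        return Table(op.fn(self.rows))


def where(pred: Callable[[Row], bool]) -> Op:
    # Keep only the rows for which the predicate is true.
    return Op(lambda rows: [r for r in rows if pred(r)])


def map_col(fn: Callable[[Row], object], col: str) -> Op:
    # Write fn(row) into column `col` of a copy of each row.
    return Op(lambda rows: [{**r, col: fn(r)} for r in rows])


def double_age(row: Row) -> int:
    return row["age"] * 2


# Operators compose without any source data.
quadruple_age = map_col(double_age, "age") | map_col(double_age, "age")

people = Table([{"name": "Raj", "age": 35}, {"name": "Abby", "age": 70}])
result = people | where(lambda r: r["age"] > 35) | quadruple_age
print(result.rows)
```

Because `|` is left-associative, `people | where(...)` is evaluated first and yields a new `Table`, which is then piped into `quadruple_age`. Composing two operators yields another operator, which is why intermediate steps can be named and reused.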

Install

pip install merdb

Example

import pandas as pd
from merdb.interactive import *
# For lazy mode (TBD), use `from merdb.lazy import *` instead

def is_senior(row) -> bool:
    return row['age'] > 35


def double_age(row) -> int:
    return row["age"] * 2


cols = ["name", "age"]
people_df = pd.DataFrame([
    ["Raj", 35],
    ["Sona", 20],
    ["Abby", 70],
    ["Abba", 90],
], columns=cols)

# Operators can be composed without any source data, e.g. quadruple_age
quadruple_age = map(double_age, "age") | map(double_age, "age")

result = (t(people_df) # convert people_df to a merdb table
          | where(is_senior)
          | order_by("name", "asc")
          | quadruple_age # Unix-like pipe syntax makes it easy to factor out intermediate steps
          | select("age")
          | rename({"age": "new_age"})
          )

# Convert to a pandas DataFrame and print
print(result.df())
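
To make each step of the pipeline concrete, here is a hand trace in plain Python (no merdb or pandas required). It rests on assumptions about operator semantics that the example suggests but that are not verified against merdb itself: `where` filters rows, `order_by` sorts them, `map(fn, col)` writes `fn(row)` into `col`, `select` keeps only the named columns, and `rename` renames them.

```python
people = [
    {"name": "Raj", "age": 35},
    {"name": "Sona", "age": 20},
    {"name": "Abby", "age": 70},
    {"name": "Abba", "age": 90},
]

# where(is_senior): keep rows with age > 35 (Raj, at exactly 35, is dropped).
seniors = [row for row in people if row["age"] > 35]

# order_by("name", "asc"): sort the remaining rows by name, ascending.
ordered = sorted(seniors, key=lambda row: row["name"])

# quadruple_age: double the age twice (assumed meaning of the two maps).
quadrupled = [{**row, "age": row["age"] * 2 * 2} for row in ordered]

# select("age") followed by rename({"age": "new_age"}).
renamed = [{"new_age": row["age"]} for row in quadrupled]

print(renamed)  # [{'new_age': 360}, {'new_age': 280}]
```

Under these assumptions, the merdb example would yield a single `new_age` column with the values 360 and 280, in that order.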


