Data lake operations toolkit

Project description

LakeOps

A modern data lake operations toolkit supporting multiple formats (Delta, Iceberg, Parquet) and engines (Spark, Polars).

Features

  • Multi-format support: Delta, Iceberg, Parquet
  • Multiple engine backends: Apache Spark, Polars (default)
  • Storage operations: read, write
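The feature list above describes one set of storage operations served by interchangeable engine backends. As a rough illustration of that pattern only, the sketch below uses hypothetical names (`Engine`, `InMemoryEngine`, `Ops`) and a toy in-memory store. It is not LakeOps's actual implementation or API, just one way a facade can forward `read` and `write` calls to whichever engine it was given.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Tuple


class Engine(ABC):
    """Hypothetical engine interface (an assumption, not the LakeOps API)."""

    @abstractmethod
    def read(self, path: str, format: str) -> Any:
        ...

    @abstractmethod
    def write(self, data: Any, path: str, format: str) -> None:
        ...


class InMemoryEngine(Engine):
    """Toy backend that keeps data in a dict instead of a real data lake."""

    SUPPORTED = {"delta", "iceberg", "parquet"}

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], Any] = {}

    def _check(self, format: str) -> None:
        # Reject formats this toy backend does not know about.
        if format not in self.SUPPORTED:
            raise ValueError(f"unsupported format: {format}")

    def read(self, path: str, format: str) -> Any:
        self._check(format)
        return self._store[(path, format)]

    def write(self, data: Any, path: str, format: str) -> None:
        self._check(format)
        self._store[(path, format)] = data


class Ops:
    """Facade that forwards read/write to the engine passed in."""

    def __init__(self, engine: Engine) -> None:
        self.engine = engine

    def read(self, path: str, format: str) -> Any:
        return self.engine.read(path, format)

    def write(self, data: Any, path: str, format: str) -> None:
        self.engine.write(data, path, format)
```

With this shape, swapping backends means constructing `Ops` with a different `Engine` subclass; the calling code that reads and writes stays the same.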

Quick Start

from pyspark.sql import SparkSession
from lakeops import LakeOps
from lakeops.core.engine import SparkEngine, PolarsEngine

# Choose one engine: Spark or Polars
spark = SparkSession.builder \
    .appName("LakeOps") \
    .config("spark.jars.packages", "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.6.1") \
    .config("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.iceberg.spark.SparkSessionCatalog") \
    .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog") \
    .config("spark.sql.catalog.local.type", "hadoop") \
    .config("spark.sql.catalog.local.warehouse", "/app/data") \
    .getOrCreate()

engine = SparkEngine(spark)
# engine = PolarsEngine()

# Initialize LakeOps with the chosen engine
ops = LakeOps(engine)

# Read data from a table by name
df = ops.read("local.db.test_table", format="iceberg")

# Write data to a table by name
ops.write(df, "local.db.test_table", format="iceberg")

Installation

pip install lakeops

Source Distributions

No source distribution files available for this release.

Built Distribution

lakeops-0.1.1-py3-none-any.whl (4.4 kB)

Uploaded Python 3

File details

Details for the file lakeops-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: lakeops-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 4.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for lakeops-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 ab8a4cd37858b8dbb4f56bddeeb86f095c2e6fdce38433eaa029cfaac5285866
MD5 b9315f37943def289a0f4626daf9819c
BLAKE2b-256 1c7b57ebcf9247d84d0458484d8043f907024a5328b046a61e66f0de1121feec
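A downloaded wheel can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `verify` are illustrative, and the file path passed to them is an assumption about where the wheel was saved.

```python
import hashlib

# SHA256 digest listed for lakeops-0.1.1-py3-none-any.whl
EXPECTED_SHA256 = "ab8a4cd37858b8dbb4f56bddeeb86f095c2e6fdce38433eaa029cfaac5285866"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str) -> bool:
    """Return True if the file's SHA256 digest matches the expected hex string."""
    return sha256_of(path) == expected.strip().lower()
```

For example, assuming the wheel sits in the current directory, `verify("lakeops-0.1.1-py3-none-any.whl", EXPECTED_SHA256)` returns `True` only when the file matches the published digest.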

Provenance

The following attestation bundles were made for lakeops-0.1.1-py3-none-any.whl:

Publisher: publish.yml on hoaihuongbk/lakeops

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
