Skip to main content

A dataframe-like library for Dremio Cloud & Dremio Software

Project description

DremioFrame

DremioFrame is a Python library that provides an Ibis-like dataframe builder interface for interacting with Dremio Cloud & Dremio Software. It allows you to list data, perform CRUD operations, and administer Dremio resources using a familiar API.

Documentation

Installation

pip install dremioframe

Quick Start

Dremio Cloud

from dremioframe.client import DremioClient

# Assumes DREMIO_PAT and DREMIO_PROJECT_ID are set in env
client = DremioClient()

# Query a table
df = client.table("Samples.samples.dremio.com.zips.json").select("city", "state").limit(5).collect()
print(df)

Dremio Software

client = DremioClient(
    hostname="localhost",
    port=32010,
    username="admin",
    password="password123",
    tls=False
)

Features

from dremioframe.client import DremioClient

client = DremioClient(pat="YOUR_PAT", project_id="YOUR_PROJECT_ID")

# List catalog
print(client.catalog.list_catalog())

# Query data
df = client.table("Samples.samples.dremio.com.zips.json").select("city", "state").filter("state = 'MA'").collect()
print(df)

# Calculated Columns
df.mutate(total_pop="pop * 2").show()

# Aggregation
df.group_by("state").agg(avg_pop="AVG(pop)").show()

# Joins
df.join("other_table", on="left_tbl.id = right_tbl.id").show()

# Iceberg Time Travel
df.at_snapshot("123456789").show()

# External Queries
client.external_query("Postgres", "SELECT * FROM users").show()

# API Ingestion
client.ingest_api(
    url="https://api.example.com/users",
    table_name="users",
    mode="merge",
    pk="id"
)

# Charting
df.chart(kind="bar", x="category", y="sales", save_to="sales.png")

# Export
df.to_csv("data.csv")
df.to_parquet("data.parquet")

# Insert Data (Batched)
import pandas as pd
data = pd.DataFrame({"id": [1, 2], "name": ["A", "B"]})
client.table("my_table").insert("my_table", data=data, batch_size=1000)

# SQL Functions
from dremioframe import F

client.table("sales") \
    .select(
        F.col("dept"),
        F.sum("amount").alias("total_sales"),
        F.rank().over(F.Window.order_by("amount")).alias("rank")
    ) \
    .show()

# Merge (Upsert)
client.table("target").merge(
    target_table="target",
    on="id",
    matched_update={"name": "source.name"},
    not_matched_insert={"id": "source.id", "val": "source.val"},
    data=data
)

# Data Quality
df.quality.expect_not_null("city")
df.quality.expect_row_count("pop > 1000000", 5, "ge") # Expect at least 5 cities with pop > 1M

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dremioframe-0.2.0.tar.gz (16.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dremioframe-0.2.0-py3-none-any.whl (18.2 kB view details)

Uploaded Python 3

File details

Details for the file dremioframe-0.2.0.tar.gz.

File metadata

  • Download URL: dremioframe-0.2.0.tar.gz
  • Upload date:
  • Size: 16.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for dremioframe-0.2.0.tar.gz
Algorithm Hash digest
SHA256 c97d5be819bc69d94c789b19ebb9dd14dcaedbd4cb1907654806766ac464afe7
MD5 352fa26a2be0e16d14c8bd99af667150
BLAKE2b-256 a481afffe619648ef55eb262e17be48b91bc3184b6c1bd2621ca1a10b902579b

See more details on using hashes here.

File details

Details for the file dremioframe-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: dremioframe-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 18.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for dremioframe-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b85ecd64c701ee8316494dbd295c923c7c67fc3ab8d7491001b3075cf080a8ec
MD5 c640a416ebf7839948046bce9e87abe6
BLAKE2b-256 68b098586ea0551e299432278f68d27efae8d9171484bfa0958dcb45d9965c42

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page