A dataframe-like library for Dremio Cloud & Dremio Software

DremioFrame

DremioFrame is a Python library that provides an Ibis-like dataframe builder interface for Dremio Cloud and Dremio Software. It lets you browse the catalog, query and modify data (CRUD operations), and administer Dremio resources through a familiar, chainable API.

Documentation

Installation

pip install dremioframe

Quick Start

Dremio Cloud

from dremioframe.client import DremioClient

# Assumes DREMIO_PAT and DREMIO_PROJECT_ID are set in env
client = DremioClient()

# Query a table
df = client.table("Samples.samples.dremio.com.zips.json").select("city", "state").limit(5).collect()
print(df)
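The Dremio Cloud example relies on DREMIO_PAT and DREMIO_PROJECT_ID being present in the environment. A minimal sketch of a pre-flight check is shown below; the helper name missing_dremio_env is hypothetical and not part of DremioFrame, and it only inspects environment variables without contacting Dremio.

```python
import os

# The two variable names come from the Quick Start comment above.
REQUIRED_VARS = ("DREMIO_PAT", "DREMIO_PROJECT_ID")


def missing_dremio_env(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not part of DremioFrame.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_dremio_env() before constructing DremioClient() gives a readable list of what still needs to be exported, instead of relying on whatever error the client raises when a variable is absent.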

Dremio Software

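from dremioframe.client import DremioClient

# Connect to a self-hosted Dremio Software instance.
# The credentials below are placeholders; substitute your own.
client = DremioClient(
    hostname="localhost",
    port=32010,
    username="admin",
    password="password123",
    tls=False
)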

Features

from dremioframe.client import DremioClient

client = DremioClient(pat="YOUR_PAT", project_id="YOUR_PROJECT_ID")

# List catalog
print(client.catalog.list_catalog())

# Query data (df stays a query builder so the examples below can reuse it)
df = client.table("Samples.samples.dremio.com.zips.json").select("city", "state", "pop").filter("state = 'MA'")
print(df.collect())

# Calculated Columns
df.mutate(total_pop="pop * 2").show()

# Aggregation
df.group_by("state").agg(avg_pop="AVG(pop)").show()

# Joins
df.join("other_table", on="left_tbl.id = right_tbl.id").show()

# Iceberg Time Travel
df.at_snapshot("123456789").show()

# API Ingestion
client.ingest_api(
    url="https://api.example.com/users",
    table_name="users",
    mode="merge",
    pk="id"
)

# Charting
df.chart(kind="bar", x="category", y="sales", save_to="sales.png")

# Export
df.to_csv("data.csv")
df.to_parquet("data.parquet")

# Insert Data (Batched)
import pandas as pd
data = pd.DataFrame({"id": [1, 2], "name": ["A", "B"]})
client.table("my_table").insert("my_table", data=data, batch_size=1000)

# SQL Functions
from dremioframe import F

client.table("sales") \
    .select(
        F.col("dept"),
        F.sum("amount").alias("total_sales"),
        F.rank().over(F.Window.order_by("amount")).alias("rank")
    ) \
    .show()

# Merge (Upsert)
client.table("target").merge(
    target_table="target",
    on="id",
    matched_update={"name": "source.name"},
    not_matched_insert={"id": "source.id", "val": "source.val"},
    data=data
)

# Data Quality
df.quality.expect_not_null("city")
df.quality.expect_row_count("pop > 1000000", 5, "ge") # Expect at least 5 cities with pop > 1M
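The last data quality call above expects at least 5 rows matching "pop > 1000000", with "ge" meaning greater than or equal. The sketch below illustrates that counting-and-comparing idea on plain Python dictionaries. It is not DremioFrame's implementation: the function name, the callable predicate, and every comparison name other than "ge" are assumptions, and the sample rows are made up.

```python
import operator

# Only "ge" appears in the example above; the other names are assumed.
COMPARATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
}


def row_count_expectation_holds(rows, predicate, expected, op):
    """Count rows that satisfy predicate and compare the count to expected."""
    matching = sum(1 for row in rows if predicate(row))
    return COMPARATORS[op](matching, expected)


# Made-up sample rows, for illustration only.
sample_rows = [
    {"city": "A", "pop": 2_500_000},
    {"city": "B", "pop": 900_000},
    {"city": "C", "pop": 1_200_000},
]


def is_large(row):
    return row["pop"] > 1_000_000


# Two of the three sample rows exceed one million.
at_least_two = row_count_expectation_holds(sample_rows, is_large, 2, "ge")
at_least_five = row_count_expectation_holds(sample_rows, is_large, 5, "ge")
```

Here at_least_two holds and at_least_five does not, which mirrors how an "at least N" expectation would pass or fail depending on the data.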

Download files

Source Distribution

dremioframe-0.2.1.tar.gz (16.9 kB), uploaded as Source

Built Distribution

dremioframe-0.2.1-py3-none-any.whl (18.4 kB), uploaded as Python 3

File details

Details for the file dremioframe-0.2.1.tar.gz.

File metadata

  • Download URL: dremioframe-0.2.1.tar.gz
  • Upload date:
  • Size: 16.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for dremioframe-0.2.1.tar.gz

Algorithm     Hash digest
SHA256        0175f494b99b2f8b875ffdb55e5a75c2f5d8de38e628f8ac17f2cfb310c3228e
MD5           078cc94e5068dc01992d095bd13c43af
BLAKE2b-256   01ac0520b5b5ca0aa4ce56a67a15dd60115e4d50a0cd5979d35bc65111c9302a
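A published digest is only useful if it is compared against the file you actually downloaded. The following is a minimal sketch of that check using Python's standard hashlib module. The expected value is the SHA256 digest listed for dremioframe-0.2.1.tar.gz; the helper names sha256_of_file and matches_expected are illustrative, not part of any tool mentioned here.

```python
import hashlib

# SHA256 digest listed for dremioframe-0.2.1.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "0175f494b99b2f8b875ffdb55e5a75c2f5d8de38e628f8ac17f2cfb310c3228e"
)


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case and padding."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

With the archive saved in the current directory, matches_expected("dremioframe-0.2.1.tar.gz", EXPECTED_SDIST_SHA256) should return True for an unmodified download. pip can perform the same check automatically when hashes are pinned in a requirements file.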

File details

Details for the file dremioframe-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: dremioframe-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 18.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for dremioframe-0.2.1-py3-none-any.whl

Algorithm     Hash digest
SHA256        08229766a0c41ea27932e113b5aae370f2385f16627116e84a687917e5214580
MD5           d0ee4112d6b8a89fc3ac201afade2cdd
BLAKE2b-256   cb337a741881a2e57389a5617830fff3418adcc9e7ac3e74a2da1145d6358cfe
