Skip to main content

A dataframe-like library for Dremio Cloud

Project description

DremioFrame

DremioFrame is a Python library that provides an Ibis-like dataframe builder interface for interacting with Dremio Cloud. It allows you to list data, perform CRUD operations, and administer Dremio resources using a familiar API.

Documentation

Installation

pip install dremioframe

Quick Start

Dremio Cloud

from dremioframe.client import DremioClient

# Assumes DREMIO_PAT and DREMIO_PROJECT_ID are set in env
client = DremioClient()

# Query a table
df = client.table("Samples.samples.dremio.com.zips.json").select("city", "state").limit(5).collect()
print(df)

Dremio Software

client = DremioClient(
    hostname="localhost",
    port=32010,
    username="admin",
    password="password123",
    tls=False
)

Features

from dremioframe.client import DremioClient

client = DremioClient(pat="YOUR_PAT", project_id="YOUR_PROJECT_ID")

# List catalog
print(client.catalog.list_catalog())

# Query data
df = client.table("Samples.samples.dremio.com.zips.json").select("city", "state").filter("state = 'MA'").collect()
print(df)

# Calculated Columns
df.mutate(total_pop="pop * 2").show()

# Aggregation
df.group_by("state").agg(avg_pop="AVG(pop)").show()

# Joins
df.join("other_table", on="left_tbl.id = right_tbl.id").show()

# Iceberg Time Travel
df.at_snapshot("123456789").show()

# External Queries
client.external_query("Postgres", "SELECT * FROM users").show()

# API Ingestion
client.ingest_api(
    url="https://api.example.com/users",
    table_name="users",
    mode="merge",
    pk="id"
)

# Charting
df.chart(kind="bar", x="category", y="sales", save_to="sales.png")

# Export
df.to_csv("data.csv")
df.to_parquet("data.parquet")

# Insert Data (Batched)
import pandas as pd
data = pd.DataFrame({"id": [1, 2], "name": ["A", "B"]})
client.table("my_table").insert("my_table", data=data, batch_size=1000)

# SQL Functions
from dremioframe import F

client.table("sales") \
    .select(
        F.col("dept"),
        F.sum("amount").alias("total_sales"),
        F.rank().over(F.Window.order_by("amount")).alias("rank")
    ) \
    .show()

# Merge (Upsert)
client.table("target").merge(
    target_table="target",
    on="id",
    matched_update={"name": "source.name"},
    not_matched_insert={"id": "source.id", "val": "source.val"},
    data=data
)

# Data Quality
df.quality.expect_not_null("city")
df.quality.expect_row_count("pop > 1000000", 5, "ge") # Expect at least 5 cities with pop > 1M

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dremioframe-0.1.0.tar.gz (15.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dremioframe-0.1.0-py3-none-any.whl (16.5 kB view details)

Uploaded Python 3

File details

Details for the file dremioframe-0.1.0.tar.gz.

File metadata

  • Download URL: dremioframe-0.1.0.tar.gz
  • Upload date:
  • Size: 15.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for dremioframe-0.1.0.tar.gz
Algorithm Hash digest
SHA256 8107a3f0e13677bc3f80dff254b825d7af681a9b8a48bc04ead5c2c9f079af1b
MD5 70999aa723d5a5ae34256c3b7f195ae2
BLAKE2b-256 5504844d556ffd2cf73cfaab267724768735592b4f9bea67e63f6f3e40b1044e

See more details on using hashes here.

File details

Details for the file dremioframe-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: dremioframe-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 16.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for dremioframe-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 423e3f3d9609c52a31f15f6e9aba7f8c983bc315154683993d27aa6296b695f2
MD5 09c6edbce5b768b3b3d21c220785d44c
BLAKE2b-256 98d41c7c61a29582b3d75ff4a7c1cfedba3ab055fd6e5cf17a0e3b72558574f5

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page