
A Dremio SDK for interacting with one or more Dremio instances


pydremio

Introduction

pydremio is a Python API wrapper for interacting with Dremio.
It allows you to perform operations on datasets and metadata within Dremio via either the HTTP API or Arrow Flight.
Since Arrow Flight offers significantly better performance, it is the recommended method for data operations.

This repository includes the core library, unit tests, and example code to help you get started.

The wrapper is distributed as a Python wheel (.whl) and can be found in the Releases section.
Publishing to PyPI is planned for the near future.

Installation

You need Python 3.13 or higher.

Option 1: Install via pip

pip install --upgrade --force-reinstall https://github.com/continental/pydremio/releases/download/v0.3.1/dremio-0.3.1-py3-none-any.whl

Option 2: Use requirements.txt

python-dotenv == 1.0.1
https://github.com/continental/pydremio/releases/latest/download/dremio-latest-py3-none-any.whl

Install a specific version

pip install https://github.com/continental/pydremio/releases/download/<version>/dremio-<version>-py3-none-any.whl
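The install commands above differ only in the version embedded in the URL. As a minimal sketch, the helper below assembles the wheel URL from a version string. It assumes the layout seen in the pinned 0.3.1 command (release tag `v<version>`, file name `dremio-<version>-py3-none-any.whl`); `wheel_url` is a hypothetical helper and not part of pydremio, and other releases may use a different tag format.

```python
BASE = "https://github.com/continental/pydremio/releases/download"


def wheel_url(version: str) -> str:
    """Build the release wheel URL for a version such as "0.3.1".

    Assumption: release tags are "v<version>", as in the 0.3.1 example.
    """
    version = version.removeprefix("v")  # accept "v0.3.1" as well as "0.3.1"
    return f"{BASE}/v{version}/dremio-{version}-py3-none-any.whl"


print(f"pip install {wheel_url('0.3.1')}")
```

The printed command can be pasted into a shell or written to requirements.txt.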

Getting Started

Logging in

The simplest way to create a logged-in client instance:

from dremio import Dremio

dremio = Dremio("<hostname>", username="<username>", password="<password>")

Replace the placeholders or, preferably, use environment variables (via a .env file) to avoid storing credentials in code.

Example .env file:

DREMIO_USERNAME="your_username@example.com"
DREMIO_PASSWORD="xyz-your-password-or-pat-xyz"
DREMIO_HOSTNAME="https://your.dremio.host.cloud"

You can then use the convenience method:

from dremio import Dremio
from dotenv import load_dotenv

load_dotenv()
dremio = Dremio.from_env()

For more details, see Dremio authentication.
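A missing variable in the .env file is easier to diagnose before the client is created. The sketch below checks for the three variable names shown in the example .env file. It is an illustration only: `missing_dremio_vars` is a hypothetical helper, and whether `Dremio.from_env()` requires all three variables in every setup is an assumption.

```python
import os
from collections.abc import Mapping

# Variable names taken from the example .env file above.
REQUIRED_VARS = ("DREMIO_HOSTNAME", "DREMIO_USERNAME", "DREMIO_PASSWORD")


def missing_dremio_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_dremio_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Call it after `load_dotenv()` and before `Dremio.from_env()` so that values from the .env file are already loaded.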

Examples

Load a dataset

from dremio import Dremio

dremio = Dremio.from_env()

ds = dremio.get_dataset("path.to.vds")
polars_df = ds.run().to_polars()
pandas_df = ds.run().to_pandas()

Create a folder

from dremio import Dremio, NewFolder

dremio = Dremio.from_env()

# The path is a list of segments, one per level.
folder = NewFolder(['<path>', '<to>', '<folder>'])
dremio.create_catalog_item(folder)

Create a folder with access control

from dremio import Dremio, NewFolder, AccessControlList, AccessControl

dremio = Dremio.from_env()

# Grant the SELECT permission to a single user.
ac = AccessControlList(users=[AccessControl('<user_id>', ['SELECT'])])

folder = NewFolder(['<path>', '<to>', '<folder>'])
folder.accessControlList = ac
dremio.create_catalog_item(folder)
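The Methods overview lists `create_catalog_item` as accepting a plain dict as well as a model object. The sketch below builds such a dict, modelled on the folder payload of the Dremio REST catalog API (`entityType`, `path`, `accessControlList`). Whether pydremio expects exactly these keys in the dict form is an assumption, and `folder_payload` is a hypothetical helper.

```python
from typing import Any, Optional


def folder_payload(
    path: list[str],
    user_permissions: Optional[dict[str, list[str]]] = None,
) -> dict[str, Any]:
    """Build a folder payload, optionally with per-user permissions.

    Assumption: keys follow the Dremio REST catalog API.
    """
    payload: dict[str, Any] = {"entityType": "folder", "path": list(path)}
    if user_permissions:
        payload["accessControlList"] = {
            "users": [
                {"id": user_id, "permissions": list(perms)}
                for user_id, perms in user_permissions.items()
            ]
        }
    return payload


example = folder_payload(["space", "reports"], {"<user_id>": ["SELECT"]})
print(example)
```

If the dict form matches the REST shape, `example` could be passed to `create_catalog_item` in place of a `NewFolder`.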

Methods

All models are located in the models/ directory.
Below is an overview of available methods grouped by category.

🔐 Connection

  • login(username: str, password: str) -> str
  • auth(auth: str = None, token: str = None) -> Dremio

📚 Catalog

Retrieval

  • get_catalog_by_id(id: UUID) -> CatalogObject
  • get_catalog_by_path(path: list[str] | str) -> CatalogObject
    • Accepts both list format (["space", "dataset"]) and string format ("space/dataset")
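The two accepted path formats can be reduced to one with a small normalizer. This is a sketch of the idea, not the library's internal implementation: `normalize_path` is a hypothetical helper, and the separator is a parameter because the dataset example earlier uses a dot-separated path ("path.to.vds") while the string format here uses a slash.

```python
from typing import Union


def normalize_path(path: Union[str, list[str]], sep: str = "/") -> list[str]:
    """Return the path as a list of non-empty segments."""
    if isinstance(path, str):
        # Drop empty segments produced by leading, trailing, or doubled separators.
        return [part for part in path.split(sep) if part]
    return list(path)


print(normalize_path("space/dataset"))
print(normalize_path(["space", "dataset"]))
```

A name that itself contains the separator cannot be expressed in the string form, so the list form is the unambiguous choice in that case.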

Creation

  • create_catalog_item(item: NewCatalogObject | dict) -> CatalogObject

Updating

  • update_catalog_item(id: UUID, item: NewCatalogObject | dict) -> CatalogObject
  • update_catalog_item_by_path(path: list[str], item: NewCatalogObject | dict) -> CatalogObject

Deletion

  • delete_catalog_item(id: UUID) -> bool
    • Returns True if successful

Copying

  • copy_catalog_item_by_path(path: list[str], new_path: list[str]) -> CatalogObject

Refreshing

  • refresh_catalog(id: UUID) -> CatalogObject

Exploration

  • get_catalog_tree(id: str = None, path: str | list[str] = None)
    • ⚠️ Expensive operation, intended for exploration and mapping only

📊 Dataset

  • get_dataset(path: list[str] | str | None = None, *, id: UUID | None = None) -> Dataset
  • create_dataset(path: list[str] | str, sql: str | SQLRequest, type: Literal['PHYSICAL_DATASET', 'VIRTUAL_DATASET'] = 'VIRTUAL_DATASET') -> Dataset
  • delete_dataset(path: list[str] | str) -> bool
  • copy_dataset(source_path: list[str] | str, target_path: list[str] | str) -> Dataset
  • reference_dataset(source_path: list[str] | str, target_path: list[str] | str) -> Dataset

🗂️ Folder

  • get_folder(path: list[str] | str | None = None, *, id: UUID | None = None) -> Folder
  • create_folder(path: str | list[str]) -> Folder
  • delete_folder(path: str | list[str], recursive: bool = True) -> bool
  • copy_folder(source_path: list[str] | str, target_path: list[str] | str, *, assume_privileges: bool = True, relative_references: bool = False) -> Folder
  • reference_folder(source_path: list[str] | str, target_path: list[str] | str, *, assume_privileges: bool = True) -> Folder

🤝 Collaboration

Wikis and tags are attached to a catalog item and addressed by that item's ID.
A Tags object holds a list of tags.

  • get_wiki(id: UUID) -> Wiki
  • set_wiki(id: UUID, wiki: Wiki) -> Wiki
  • get_tags(id: str) -> Tags
  • set_tags(id: str, tags: Tags) -> Tags

🧠 SQL

  • sql(sql_request: SQLRequest) -> JobId
  • start_job_on_dataset(id: UUID) -> JobId
  • get_job_info(id: UUID) -> Job
  • cancel_job(id: UUID) -> Job
  • get_job_results(id: UUID) -> JobResult
  • sql_results(sql_request: SQLRequest) -> Job | JobResult
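The methods above suggest a submit-then-poll workflow: `sql` returns a job ID, `get_job_info` reports progress, and `get_job_results` fetches the rows once the job has finished. The sketch below shows only the polling step. How the state is read from the `Job` object is not documented here, so the function takes a callable that returns the state as a string. The terminal state names follow the Dremio REST API; `wait_for_job` is a hypothetical helper.

```python
import time
from collections.abc import Callable

# Terminal job states as named by the Dremio REST API.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELED"}


def wait_for_job(
    get_state: Callable[[], str],
    poll_interval: float = 1.0,
    timeout: float = 60.0,
) -> str:
    """Poll get_state() until a terminal state is seen or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still in state {state!r} after {timeout}s")
        time.sleep(poll_interval)


# Stand-in for repeated get_job_info(...) calls: one state per poll.
fake_states = iter(["RUNNING", "RUNNING", "COMPLETED"])
final_state = wait_for_job(lambda: next(fake_states), poll_interval=0.0)
print(final_state)
```

In real use, the callable would wrap `get_job_info` and extract the state from its result; `get_job_results` would be called only when the returned state is COMPLETED.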

👤 User

  • get_users() -> list[User]
  • get_user(id: UUID) -> User
  • get_user_by_name(name: str) -> User
  • create_user(user: User) -> User
  • update_user(id: UUID, user: User) -> User
  • delete_user(id: UUID, tag: str) -> bool
    • Returns True if deletion was successful

Roadmap

  • Publish to PyPI
  • CLI support

Contributing

Contributions are welcome! Please open issues or pull requests for features, bugs, or improvements.

License

This project is licensed under the BSD License. See the LICENSE file for details.
