Skip to main content

A comprehensive Databricks SQL handler with Pydantic models, OAuth authentication, and connection management

Project description

Databricks SQL Handler

A comprehensive Python package for interacting with Databricks SQL warehouses using Pydantic models, OAuth authentication, and robust connection management.

Features

  • OAuth Authentication: Secure client credentials flow authentication
  • Pydantic Models: Type-safe data models for query results
  • Connection Management: Robust connection handling with automatic retry logic
  • Query Execution: Execute single or multiple queries with structured results
  • Built-in Models: Pre-defined models for common Databricks operations
  • Extensible: Easy to extend with custom data models
  • CLI Interface: Interactive command-line interface for testing and development

Installation

pip install dbxsql

Quick Start

Basic Usage

from dbxsql import QueryHandler, DatabricksSettings, NexsysRecord

# Configure settings (or use environment variables)
settings = DatabricksSettings(
    client_id="your_client_id",
    client_secret="your_client_secret", 
    server_hostname="your_databricks_hostname",
    http_path="/sql/1.0/warehouses/your_warehouse_id"
)

# Use as context manager
with QueryHandler(settings) as handler:
    # Execute a simple query
    result = handler.execute_query("SELECT * FROM my_table LIMIT 10")
    
    # Execute with Pydantic model parsing
    result = handler.execute_query(
        "SELECT * FROM nexsys_table LIMIT 10", 
        NexsysRecord
    )
    
    # Access structured data
    for record in result.data:
        print(f"ID: {record.id}, Name: {record.name}")

Environment Variables

Create a .env file:

DATABRICKS_CLIENT_ID=your_client_id
DATABRICKS_CLIENT_SECRET=your_client_secret
DATABRICKS_SERVER_HOSTNAME=your_hostname.databricks.com
DATABRICKS_HTTP_PATH=/sql/1.0/warehouses/warehouse_id
DATABRICKS_LOG_LEVEL=INFO

CLI Usage

# Interactive mode
dbxsql --interactive

# Run example queries
dbxsql --examples

# Execute a single query
dbxsql --query "SELECT current_timestamp()"

Custom Models

Create your own Pydantic models:

from pydantic import BaseModel
from datetime import datetime
from typing import Optional

class MyCustomModel(BaseModel):
    id: int
    name: str
    created_at: datetime
    amount: Optional[float] = None

# Register the model
from dbxsql import register_model
register_model("my_custom", MyCustomModel)

# Use it in queries
result = handler.execute_query("SELECT * FROM my_table", MyCustomModel)

Configuration

All configuration can be done via environment variables with the DATABRICKS_ prefix or programmatically:

Setting Environment Variable Default Description
client_id DATABRICKS_CLIENT_ID Required OAuth client ID
client_secret DATABRICKS_CLIENT_SECRET Required OAuth client secret
server_hostname DATABRICKS_SERVER_HOSTNAME Required Databricks hostname
http_path DATABRICKS_HTTP_PATH Required SQL warehouse HTTP path
log_level DATABRICKS_LOG_LEVEL INFO Logging level
max_retries DATABRICKS_MAX_RETRIES 3 Query retry attempts
query_timeout DATABRICKS_QUERY_TIMEOUT 300 Query timeout (seconds)

API Reference

QueryHandler

Main class for executing queries and managing connections.

Methods

  • execute_query(query, model_class=None): Execute single query
  • execute_multiple_queries(queries, model_classes=None): Execute multiple queries
  • execute_query_with_retry(query, model_class=None): Execute with automatic retry
  • list_files(path): List files in Databricks path
  • show_tables(database=None): Show tables in database
  • test_connection(): Test database connectivity

Built-in Models

  • NexsysRecord: For NEXSYS system data
  • SalesRecord: For sales transaction data
  • FileInfo: For file listing results
  • TableInfo: For table information
  • GenericRecord: For unknown data structures

Error Handling

The package provides specific exceptions:

from dbxsql import (
    AuthenticationError,
    ConnectionError, 
    QueryExecutionError,
    SyntaxError,
    TimeoutError,
    DataParsingError
)

try:
    result = handler.execute_query("SELECT * FROM table")
except AuthenticationError:
    print("Authentication failed")
except QueryExecutionError as e:
    print(f"Query failed: {e}")

License

MIT License

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dbxsql-1.0.0.tar.gz (15.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dbxsql-1.0.0-py3-none-any.whl (19.4 kB view details)

Uploaded Python 3

File details

Details for the file dbxsql-1.0.0.tar.gz.

File metadata

  • Download URL: dbxsql-1.0.0.tar.gz
  • Upload date:
  • Size: 15.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.13

File hashes

Hashes for dbxsql-1.0.0.tar.gz
Algorithm Hash digest
SHA256 c50c710685eb4fa41f3fa9c1d23718a6488d31158ba64667bdcf062d3bdc3102
MD5 4d5f9e9874bdbdecc80c5707fdd58bab
BLAKE2b-256 b85e466028fe23257e0c1195a50ac7926567bc615070f065c18f7741c11e2e4d

See more details on using hashes here.

File details

Details for the file dbxsql-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: dbxsql-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 19.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.13

File hashes

Hashes for dbxsql-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e476838b7f78c13925a35230a3f8733ab40c8dfb64a63d754faea7ff73e8deae
MD5 b80d8d28e55343c395cbc54678406ebc
BLAKE2b-256 89bc3f3a5e1c62be53f989545c46069f56dcd8097a51ec24f5f9a3fde797c941

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page