Databricks helpers for Pydantic models

Pydantic Databricks

Overview

This library uses Pydantic to simplify generating Databricks SQL for managing tables. You define a table schema as a Pydantic model, and the library generates the corresponding SQL statements.

Installation

pip install pydantic-databricks

Usage

1. Basic Table Creation

from pydantic_databricks import DatabricksModel
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

class Schema(DatabricksModel):
    _table_name = "test"
    _schema_name = "test_schema"
    
    col1: str
    col2: int
    col3: float
    
spark.sql(Schema.create_table())

Generated SQL:

CREATE TABLE test_schema.test (col1 STRING NOT NULL, col2 BIGINT NOT NULL, col3 DOUBLE NOT NULL) USING DELTA; 
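
To make the mapping from model fields to SQL columns concrete, here is a minimal standalone sketch that reproduces the statement above without using the library. The names `PYTHON_TO_SQL`, `ExampleSchema`, and `sketch_create_table` are hypothetical, and the type mapping is only inferred from the generated SQL shown here; it is not the library's actual implementation.

```python
from typing import get_type_hints

# Assumed mapping, inferred from the generated SQL above (not taken from the library).
PYTHON_TO_SQL = {str: "STRING", int: "BIGINT", float: "DOUBLE"}


class ExampleSchema:
    """Plain class standing in for a model; only the annotations matter here."""

    col1: str
    col2: int
    col3: float


def sketch_create_table(schema_name: str, table_name: str, model: type) -> str:
    """Build a CREATE TABLE statement from a class's type annotations."""
    # Every annotated field becomes a NOT NULL column, matching the example output.
    columns = ", ".join(
        f"{name} {PYTHON_TO_SQL[py_type]} NOT NULL"
        for name, py_type in get_type_hints(model).items()
    )
    return f"CREATE TABLE {schema_name}.{table_name} ({columns}) USING DELTA;"


sql = sketch_create_table("test_schema", "test", ExampleSchema)
print(sql)
```

The sketch handles only the three types used in the example; how the library treats optional fields or nested types is not covered here.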

2. Setting Grants

from pydantic_databricks import DatabricksModel, Grant, GrantAction
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

class Schema(DatabricksModel):
    _table_name = "test"
    _schema_name = "default"
    _grants = {
        Grant(action=GrantAction.MODIFY, principal="user1"),
        Grant(action=GrantAction.SELECT, principal="user2"),
    }

    col1: str

for grant in Schema.grant_statements():
    spark.sql(grant)
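
The SQL produced for grants is not shown above, so the following standalone sketch illustrates the kind of statements one might expect, based on the Databricks `GRANT <privilege> ON TABLE <name> TO <principal>` syntax. `SketchAction`, `SketchGrant`, and `sketch_grant_statements` are hypothetical stand-ins, and the exact text the library emits (quoting, ordering, trailing semicolons) may differ.

```python
from dataclasses import dataclass
from enum import Enum


class SketchAction(str, Enum):
    """Hypothetical stand-in for the two grant actions used in the example."""

    SELECT = "SELECT"
    MODIFY = "MODIFY"


@dataclass(frozen=True)  # frozen makes instances hashable, so they can live in a set
class SketchGrant:
    action: SketchAction
    principal: str


def sketch_grant_statements(full_table_name: str, grants: set) -> list:
    """Render one GRANT statement per grant, in a deterministic order."""
    # Sets are unordered, so sort to get stable output.
    ordered = sorted(grants, key=lambda g: (g.principal, g.action.value))
    return [
        f"GRANT {g.action.value} ON TABLE {full_table_name} TO `{g.principal}`"
        for g in ordered
    ]


statements = sketch_grant_statements(
    "default.test",
    {
        SketchGrant(SketchAction.MODIFY, "user1"),
        SketchGrant(SketchAction.SELECT, "user2"),
    },
)
for statement in statements:
    print(statement)
```

Because `_grants` is a set, the order in which the real statements are generated should not be relied on.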

Currently Supported Options

  • _catalog_name: The catalog name for the table. Default is None. If None, a two-part namespace (schema.table) is used.
  • _schema_name: The schema name for the table (required).
  • _table_name: The name of the table (required).
  • _grants: A set of Grant objects. Default is None.
  • _location_prefix: The location prefix for external tables. Default is None. If set, the table is created as an external table, and the full table name is appended to the prefix to form its location.
  • _table_properties: A dictionary of table properties.
  • _table_create_mode: The mode for table creation. Default is CreateMode.CREATE.
  • _table_data_source: The data source for the table. Default is DataSource.DELTA.
  • _partition_columns: A list of partition columns for the table. Default is None.
  • _options: A dictionary of additional options for table creation. Default is None.
  • _comment: A comment for the table. Default is None.
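
The interaction between _catalog_name and _location_prefix can be sketched as follows. This is a standalone illustration with hypothetical helper names (`sketch_full_table_name`, `sketch_location`); in particular, joining the prefix and the table name with a single "/" is an assumption, since the exact separator is not stated above.

```python
from typing import Optional


def sketch_full_table_name(
    schema_name: str, table_name: str, catalog_name: Optional[str] = None
) -> str:
    """Return a two-part name, or a three-part name when a catalog is given."""
    parts = [schema_name, table_name]
    if catalog_name is not None:
        parts.insert(0, catalog_name)
    return ".".join(parts)


def sketch_location(location_prefix: str, full_table_name: str) -> str:
    """Append the full table name to the prefix (the "/" separator is assumed)."""
    return f"{location_prefix.rstrip('/')}/{full_table_name}"


two_part = sketch_full_table_name("default", "test")
three_part = sketch_full_table_name("default", "test", catalog_name="main")
location = sketch_location("s3://bucket/tables/", three_part)

print(two_part)
print(three_part)
print(location)
```

The bucket path and the catalog name `main` are placeholders chosen for the example.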

Coming soon

  • Support for table and column constraints

Contributing

We welcome contributions to pydantic-databricks! If you'd like to contribute, please follow these steps:

  1. Fork the repository.
  2. Create a branch for your feature or bug fix.
  3. Make your changes.
  4. Test your changes thoroughly.
  5. Submit a pull request.

License

  • pydantic-databricks is licensed under the MIT License. See LICENSE for more information.
