Manage Unity Catalog tables with Pydantic models

Unity Catalog Pydantic

Disclaimer: This project is unofficial and not affiliated with or endorsed by the official Unity Catalog team.

Simplifies managing OSS Unity Catalog tables using Pydantic models.

Installation

pip install unitycatalog-pydantic

Examples

Create Table

from unitycatalog.client import ApiClient, TablesApi
from unitycatalog_pydantic import UCModel

class MyTable(UCModel):
    col1: str
    col2: int
    col3: float

# Initialize the API client
catalog_client = ApiClient(...)
tables_api = TablesApi(catalog_client)

# Create the table (`await` must run inside an async function or an asyncio-aware REPL)
table_info = await MyTable.create(
    tables_api=tables_api,
    catalog_name="my_catalog",
    schema_name="my_schema",
    storage_location="s3://my_bucket/my_path",
)

Retrieve Table

table_info = await MyTable.get(
    tables_api=tables_api,
    catalog_name="my_catalog",
    schema_name="my_schema",
)

Delete Table

await MyTable.delete(
    tables_api=tables_api,
    catalog_name="my_catalog",
    schema_name="my_schema",
)
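
The create, retrieve, and delete snippets above use `await`, so in a plain script they need to run inside a coroutine. The following is a minimal sketch of that pattern using `asyncio.run`. `StubTable` is a hypothetical stand-in, not part of unitycatalog-pydantic: it keeps tables in an in-memory dict, omits the `tables_api` argument, and returns plain dicts, so the example runs without a Unity Catalog server.

```python
import asyncio


class StubTable:
    """Hypothetical stand-in for a UCModel subclass; it talks to no server."""

    # In-memory "catalog", keyed by (catalog_name, schema_name).
    _registry: dict = {}

    @classmethod
    async def create(cls, *, catalog_name: str, schema_name: str, storage_location: str) -> dict:
        info = {
            "name": cls.__name__,
            "catalog_name": catalog_name,
            "schema_name": schema_name,
            "storage_location": storage_location,
        }
        cls._registry[(catalog_name, schema_name)] = info
        return info

    @classmethod
    async def get(cls, *, catalog_name: str, schema_name: str) -> dict:
        return cls._registry[(catalog_name, schema_name)]

    @classmethod
    async def delete(cls, *, catalog_name: str, schema_name: str) -> None:
        del cls._registry[(catalog_name, schema_name)]


async def main() -> dict:
    # Same call order as the README sections: create, retrieve, delete.
    await StubTable.create(
        catalog_name="my_catalog",
        schema_name="my_schema",
        storage_location="s3://my_bucket/my_path",
    )
    fetched = await StubTable.get(catalog_name="my_catalog", schema_name="my_schema")
    await StubTable.delete(catalog_name="my_catalog", schema_name="my_schema")
    return fetched


# asyncio.run starts an event loop, runs main() to completion, and closes the loop.
table_info = asyncio.run(main())
print(table_info["name"])
```

With the real library, the body of `main` would hold the `MyTable.create`, `MyTable.get`, and `MyTable.delete` calls shown earlier, each passing `tables_api`.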

Nested Models

from pydantic import BaseModel
from unitycatalog.client import ApiClient, TablesApi
from unitycatalog_pydantic import UCModel

class NestedModel(BaseModel):
    nested_col1: str
    nested_col2: int

class MyTable(UCModel):
    col1: str
    col2: NestedModel

# Initialize the API client
catalog_client = ApiClient(...)
tables_api = TablesApi(catalog_client)

# Create the table
table_info = await MyTable.create(
    tables_api=tables_api,
    catalog_name="my_catalog",
    schema_name="my_schema",
    storage_location="s3://my_bucket/my_path",
)

Using a BaseModel as root model

from pydantic import BaseModel
from unitycatalog.client import ApiClient, TablesApi
from unitycatalog_pydantic import create_table

class NestedModel(BaseModel):
    nested_col1: str
    nested_col2: int

class MyTable(BaseModel):
    col1: str
    col2: NestedModel

# Initialize the API client
catalog_client = ApiClient(...)
tables_api = TablesApi(catalog_client)

# Create the table
table_info = await create_table(
    model=MyTable,
    tables_api=tables_api,
    catalog_name="my_catalog",
    schema_name="my_schema",
    storage_location="s3://my_bucket/my_path",
)

Configuration

  • tables_api: The TablesApi client.
  • catalog_name: The catalog name.
  • schema_name: The schema name.
  • storage_location: The storage location.
  • table_type: The table type (default is TableType.EXTERNAL).
  • data_source_format: The data source format (default is DataSourceFormat.DELTA).
  • comment: A comment for the table. If not provided, the table model's docstring is used.
  • properties: The properties of the table.
  • by_alias: Whether to use field aliases rather than field names as column names (default is True).
  • json_schema_mode: The mode in which to generate the JSON schema (default is validation).
  • alias: The table alias. If not provided, the class name is used.
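
The by_alias and json_schema_mode options share their names with parameters of Pydantic's own `model_json_schema` method. As a minimal sketch, assuming the library forwards them to Pydantic's schema generation (the source does not confirm this), the following shows how `by_alias` changes the property names that Pydantic v2 emits. The `Example` model and its `colOne` alias are made up for illustration.

```python
from pydantic import BaseModel, Field


class Example(BaseModel):
    """Illustrative model; not taken from the library."""

    col_one: str = Field(alias="colOne")
    col_two: int = 0


# by_alias=True: properties are keyed by alias where one is defined.
alias_schema = Example.model_json_schema(by_alias=True, mode="validation")

# by_alias=False: properties are keyed by the Python field names.
name_schema = Example.model_json_schema(by_alias=False, mode="validation")

print(list(alias_schema["properties"]))  # ['colOne', 'col_two']
print(list(name_schema["properties"]))   # ['col_one', 'col_two']
```

If the assumption holds, a model with aliased fields would produce alias-named columns by default, and passing by_alias=False would switch to the field names.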

Caveats

Tested with the Parquet, Delta, and CSV data source formats. Other formats may not work as expected.

  • Integration between Parquet and Unity Catalog types is currently limited. For instance, there is no way to specify the integer type, because Parquet doesn't recognize integer SQL types. The same goes for other types such as DATE and TIMESTAMP. This is an integration issue, not a problem with the library itself.
  • Nested models can't be used with the CSV data source format, because CSV doesn't support nested types. This is a limitation of the data source format, not of the library itself.
  • The latest version of DuckDB doesn't support reading some of the fields required by UC's ColumnInfo model, such as the precision fields. This is an integration issue, not a problem with the library itself.
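
Because nested models don't work with CSV, it can help to check a model before choosing that format. The helper below is a hypothetical sketch and not part of unitycatalog-pydantic. It relies only on Pydantic v2's `model_fields` and the standard `typing` helpers, and it also looks inside generic annotations such as `Optional[...]` or `list[...]`.

```python
from typing import Any, Optional, get_args, get_origin

from pydantic import BaseModel


def _contains_model(annotation: Any) -> bool:
    """Return True if the annotation is, or wraps, a Pydantic model class."""
    if get_origin(annotation) is not None:
        # Generic or union type: inspect its type arguments recursively.
        return any(_contains_model(arg) for arg in get_args(annotation))
    return isinstance(annotation, type) and issubclass(annotation, BaseModel)


def has_nested_models(model: type[BaseModel]) -> bool:
    """Hypothetical helper: True if any field of `model` holds another model."""
    return any(_contains_model(f.annotation) for f in model.model_fields.values())


class NestedModel(BaseModel):
    nested_col1: str
    nested_col2: int


class FlatTable(BaseModel):
    col1: str
    col2: Optional[int] = None


class NestedTable(BaseModel):
    col1: str
    col2: NestedModel


print(has_nested_models(FlatTable))    # False
print(has_nested_models(NestedTable))  # True
```

A caller could run this check first and fall back to a format such as Delta or Parquet when it returns True.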
