
Official Python SDK for Unity Catalog


Unity Catalog Python Client SDK

Welcome to the official Python Client SDK for Unity Catalog!

Unity Catalog is the industry's only universal catalog for data and AI.

  • Multimodal interface supports any format, engine, and asset
    • Multi-format support: It is extensible and supports Delta Lake, Apache Iceberg and Apache Hudi via UniForm, Apache Parquet, JSON, CSV, and many others.
    • Multi-engine support: With its open APIs, data cataloged in Unity can be read by many leading compute engines.
    • Multimodal: It supports all your data and AI assets, including tables, files, functions, and AI models.
  • Open source API and implementation - OpenAPI spec and OSS implementation (Apache 2.0 license). It is also compatible with Apache Hive's metastore API and Apache Iceberg's REST catalog API. Unity Catalog is currently a sandbox project with LF AI and Data Foundation (part of the Linux Foundation).
  • Unified governance for data and AI - Govern and secure tabular data, unstructured assets, and AI assets with a single interface.

Python Client SDK

The Unity Catalog Python SDK provides a convenient Python-native interface to all of the functionality of the Unity Catalog REST APIs. The library includes interfaces for all supported public modules.

This library is generated using the OpenAPI Generator toolkit, providing client interfaces using the aiohttp request handling library.

Installation

The Python Client SDK and associated shared namespace package for Unity Catalog use hatch as their supported build backend.

To ensure that you can install the package, install hatch using any of the options listed in its installation documentation.

To use the unitycatalog-client SDK, you can install directly from PyPI:

pip install unitycatalog-client

To install from source, you will need to fork and clone the unitycatalog repository locally.

To build the Python source locally, you will need to have JDK17 installed and activated.

Once your configuration supports the execution of sbt, you can run the following within the root of the repository to generate the Python Client SDK source:

build/sbt pythonClient/generate

The source code will be generated at unitycatalog/clients/python/target.

You can then install the package in editable mode from the repository root:

pip install -e clients/python/target

Usage

To get started with the Python Client SDK, first ensure that you have a running Unity Catalog server to connect to. If you need one, follow the instructions in the Unity Catalog documentation to set up a local server.

Once the server is running, you can set up the Python client to make async requests to the Unity Catalog server.

For the examples listed here, we will be using a local unauthenticated server for simplicity's sake.

from unitycatalog.client import Configuration


config = Configuration()
config.host = "http://localhost:8080/api/2.1/unity-catalog"

Once we have our configuration, we can create the client that we will use for each request:

from unitycatalog.client import ApiClient


client = ApiClient(configuration=config)

With our client configured and instantiated, we can use any of the Unity Catalog APIs by importing them directly from the client namespace and sending requests to the server.

from unitycatalog.client import CatalogsApi


catalogs_api = CatalogsApi(api_client=client)
my_catalogs = await catalogs_api.list_catalogs()
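The snippets in this section use a bare await, which only works where an event loop is already running (for example a notebook or the asyncio REPL). In a regular script the call needs to be wrapped in a coroutine and started with asyncio.run. The following is a minimal sketch of that pattern; StubCatalogsApi is a hypothetical stand-in so the example runs without a server, and the name attribute on each returned catalog is assumed to match the catalog model.

```python
import asyncio
from types import SimpleNamespace


async def fetch_catalog_names(catalogs_api):
    """Return the catalog names from the first page of list_catalogs()."""
    response = await catalogs_api.list_catalogs()
    # Guard against a response that carries None instead of an empty list.
    return [catalog.name for catalog in response.catalogs or []]


class StubCatalogsApi:
    """Hypothetical stand-in for CatalogsApi; not part of the SDK."""

    async def list_catalogs(self, page_token=None):
        return SimpleNamespace(
            catalogs=[SimpleNamespace(name="unity")],
            next_page_token=None,
        )


if __name__ == "__main__":
    # In a real script, pass CatalogsApi(api_client=client) instead of the stub.
    print(asyncio.run(fetch_catalog_names(StubCatalogsApi())))
```

asyncio.run creates the event loop, runs the coroutine to completion, and closes the loop, so it should be called once at the entry point of the script rather than inside other coroutines.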

Note: When calling APIs that support pagination (such as list_catalogs), use continuation-token logic to assemble the paginated results into a single collection.

A simple example of consuming a paginated response is:

async def list_all_catalogs(catalog_api):
    token = None
    catalogs = []
    while True:
        results = await catalog_api.list_catalogs(page_token=token)
        catalogs += results.catalogs
        if next_token := results.next_page_token:
            token = next_token
        else:
            break
    return catalogs

my_catalogs = await list_all_catalogs(catalogs_api)
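The same continuation-token loop applies to other list endpoints, so it can be factored into one helper. The sketch below is an illustration rather than part of the SDK: collect_all and StubSchemasApi are hypothetical names, and the helper assumes every paginated response exposes next_page_token plus a single attribute holding the items (here assumed to be schemas for list_schemas).

```python
import asyncio
from types import SimpleNamespace


async def collect_all(list_method, items_attr, **kwargs):
    """Call list_method repeatedly, following next_page_token, and return all items."""
    items = []
    token = None
    while True:
        page = await list_method(page_token=token, **kwargs)
        # Guard against a page that carries None instead of an empty list.
        items.extend(getattr(page, items_attr) or [])
        token = page.next_page_token
        if not token:
            return items


class StubSchemasApi:
    """Hypothetical stand-in that serves two pages so the sketch runs offline."""

    _pages = {
        None: SimpleNamespace(schemas=["bronze", "silver"], next_page_token="p2"),
        "p2": SimpleNamespace(schemas=["gold"], next_page_token=None),
    }

    async def list_schemas(self, catalog_name, page_token=None):
        return self._pages[page_token]


all_schemas = asyncio.run(
    collect_all(StubSchemasApi().list_schemas, "schemas", catalog_name="MyNewCatalog")
)
print(all_schemas)  # ['bronze', 'silver', 'gold']
```

With a real client, the call would look like collect_all(catalogs_api.list_catalogs, "catalogs") inside a coroutine, with any endpoint-specific arguments passed as keyword arguments.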

Creating a new catalog with the Python SDK is straightforward:

from unitycatalog.client.models import CreateCatalog, CatalogInfo


async def create_catalog(catalog_name, catalog_api, comment=None):
    new_catalog = CreateCatalog(
        name=catalog_name,
        comment=comment or ""
    )
    return await catalog_api.create_catalog(create_catalog=new_catalog)
        
await create_catalog("MyNewCatalog", catalog_api=catalogs_api, comment="This is a new catalog.")

Adding a new Schema to our created Catalog is similarly simple:

from unitycatalog.client import SchemasApi
from unitycatalog.client import CreateSchema, SchemaInfo


schemas_api = SchemasApi(api_client=client)

async def create_schema(schema_name, catalog_name, schema_api, comment=None):
    new_schema = CreateSchema(
        name=schema_name,
        catalog_name=catalog_name,
        comment=comment or ""
    )
    return await schema_api.create_schema(create_schema=new_schema)

await create_schema(schema_name="MyNewSchema", catalog_name="MyNewCatalog", schema_api=schemas_api, comment="This is a new schema.")

Listing the schemas within our newly created Catalog works the same way (if you expect paginated responses, pass continuation tokens as shown above in list_all_catalogs):

await schemas_api.list_schemas(catalog_name="MyNewCatalog")
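Because the client is asynchronous, independent requests can be issued concurrently instead of one after another. The sketch below uses asyncio.gather to list the schemas of several catalogs at once. It is an illustration under assumptions: list_schemas_by_catalog and StubSchemasApi are hypothetical names, the response is assumed to expose its items as schemas with a name on each, and only the first page per catalog is read.

```python
import asyncio
from types import SimpleNamespace


async def list_schemas_by_catalog(schemas_api, catalog_names):
    """Fetch the first page of schemas for each catalog concurrently."""
    responses = await asyncio.gather(
        *(schemas_api.list_schemas(catalog_name=name) for name in catalog_names)
    )
    # gather returns results in the same order as the awaitables it was given.
    return {
        name: [schema.name for schema in response.schemas or []]
        for name, response in zip(catalog_names, responses)
    }


class StubSchemasApi:
    """Hypothetical stand-in for SchemasApi so the sketch runs without a server."""

    async def list_schemas(self, catalog_name, page_token=None):
        await asyncio.sleep(0)  # yield to the event loop like a real request would
        schema = SimpleNamespace(name=f"{catalog_name}_default")
        return SimpleNamespace(schemas=[schema], next_page_token=None)


result = asyncio.run(
    list_schemas_by_catalog(StubSchemasApi(), ["MyNewCatalog", "unity"])
)
print(result)
```

If one request fails, asyncio.gather propagates the first exception by default; passing return_exceptions=True collects exceptions alongside successful results instead.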

Feedback

Have requests for the Unity Catalog project? Interested in getting involved in the Open Source project?

See the repository on GitHub

Read the documentation for more guidance and examples!
