Official Python library for Unity Catalog AI support
Unity Catalog AI Core library
The Unity Catalog AI Core library provides convenient APIs to interact with Unity Catalog functions, including the creation, retrieval and execution of functions. The library includes clients for interacting with both Unity Catalog servers and Databricks-managed Unity Catalog services, in support of UC functions as tools in agents.
Installation
pip install unitycatalog-ai
If you are using the Databricks-managed version of Unity Catalog, you can install the optional additional Databricks dependencies by providing the option:
pip install unitycatalog-ai[databricks]
Get started
Unity Catalog Function Client
The Unity Catalog (UC) function client is a core component of the Unity Catalog AI Core Library, enabling seamless interaction with a Unity Catalog server. This client allows you to manage and execute UC functions, providing both asynchronous and synchronous interfaces to cater to various application needs. Whether you're integrating UC functions into GenAI workflows or managing them directly, the UC client offers robust and flexible APIs to facilitate your development process.
Key Features
- Asynchronous and Synchronous Operations: Flexibly choose between async and sync methods based on your application's concurrency requirements.
- Comprehensive Function Management: Easily create, retrieve, list, execute, and delete UC functions.
- Wrapped Function Support: In addition to standard single-function creation, you can create wrapped functions that in-line additional helper functions within a function's definition to simplify code reuse and modularity.
- Integration with GenAI: Seamlessly integrate UC functions as tools within Generative AI agents, enhancing intelligent automation workflows.
- Type Safety and Caching: Enforce strict type validation and utilize caching mechanisms to optimize performance and reduce redundant executions.
Caveats
When using the UnitycatalogFunctionClient for UC, be mindful of the following considerations:
- Asynchronous API Usage:
  - The UnitycatalogFunctionClient is built on top of the asynchronous unitycatalog-client SDK, which utilizes aiohttp for REST communication with the UC server.
  - The function client for Unity Catalog offers both asynchronous and synchronous methods. The synchronous methods are wrappers around the asynchronous counterparts, ensuring compatibility with environments that may not support asynchronous operations.
  - Important: Avoid creating additional event loops in environments that already have a running loop (e.g., Jupyter Notebooks) to prevent conflicts and potential runtime errors.
- Security Considerations:
  - WARNING: Function execution occurs locally within the environment where your application is running.
  - Caution: Executing GenAI-generated Python code can pose security risks, especially if the code includes operations like file system access or network requests.
  - Recommendation: Run your application in an isolated and secure environment with restricted permissions to mitigate potential security threats.
- External Dependencies:
  - Ensure that any external libraries required by your UC functions are pre-installed in the execution environment.
  - Best Practice: Import external dependencies within the function body to guarantee their availability during execution.
- Function Overwriting:
  - The create_function, create_function_async, create_wrapped_function and create_wrapped_function_async methods allow overwriting existing functions by setting the replace parameter to True.
  - Warning: Overwriting functions can disrupt workflows that depend on existing function definitions. Use this feature judiciously and ensure that overwriting is intentional.
- Type Validation and Compatibility:
  - The client performs strict type validation based on the defined schemas. Ensure that your function parameters and return types adhere to the expected types to prevent execution errors.
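The event-loop caveat can be illustrated with a minimal sketch that uses only the standard library. The names fetch_value, has_running_loop and run_sync are hypothetical and are not part of the unitycatalog-ai API; the sketch only shows one way to check for a running loop before starting a new one.

```python
import asyncio


async def fetch_value() -> int:
    # Hypothetical stand-in for an asynchronous client call.
    await asyncio.sleep(0)
    return 42


def has_running_loop() -> bool:
    """Return True if the current thread already has a running event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True


def run_sync(coro):
    """Run a coroutine to completion, refusing to start a second event loop."""
    if has_running_loop():
        # Close the coroutine so Python does not warn that it was never awaited.
        coro.close()
        raise RuntimeError("An event loop is already running; await the coroutine instead.")
    return asyncio.run(coro)


result = run_sync(fetch_value())
print(result)
```

In a notebook, where a loop is typically already running, the same coroutine would be awaited directly rather than passed to a helper like run_sync.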
Prerequisites
Before using the UC functions client, ensure that your environment meets the following requirements:
- Python Version: Python 3.10 or higher is recommended to leverage all functionalities, including function creation and execution.
- Dependencies: Install the necessary packages using pip:
  pip install unitycatalog-client unitycatalog-ai
- Unity Catalog Server: Ensure that you have access to a running instance of the open-source Unity Catalog server. Follow the Unity Catalog Installation Guide to set up your server if you haven't already.
Client Initialization
To interact with UC functions, initialize the UnitycatalogFunctionClient as shown below:
import asyncio
from unitycatalog.ai.core.client import UnitycatalogFunctionClient
from unitycatalog.client import ApiClient, Configuration
# Configure the Unity Catalog API client
config = Configuration(
    host="http://localhost:8080/api/2.1/unity-catalog"  # Replace with your UC server URL
)
# Initialize the asynchronous ApiClient
api_client = ApiClient(configuration=config)
# Instantiate the UnitycatalogFunctionClient
uc_client = UnitycatalogFunctionClient(api_client=api_client)
# Example catalog and schema names
CATALOG = "my_catalog"
SCHEMA = "my_schema"
Creating a UC Function
You can create a UC function either by providing a Python callable or by submitting a FunctionInfo object. Below is an example (recommended) of using the create_python_function API that accepts a Python callable (function) as input.
To create a UC function from a Python function, define your function with appropriate type hints and a Google-style docstring:
def add_numbers(a: float, b: float) -> float:
    """
    Adds two numbers and returns the result.

    Args:
        a (float): First number.
        b (float): Second number.

    Returns:
        float: The sum of the two numbers.
    """
    return a + b

# Create the function within the Unity Catalog catalog and schema specified
function_info = uc_client.create_python_function(
    func=add_numbers,
    catalog=CATALOG,
    schema=SCHEMA,
    replace=False,  # Set to True to overwrite if the function already exists
)
print(function_info)
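Since the function definition is derived from the callable, it can help to check locally that every parameter carries a type hint and that a docstring is present before registering it. The following is a minimal sketch using only the standard library; check_signature is a hypothetical helper, not part of the client, and the validation the client actually performs may differ.

```python
import inspect
from typing import get_type_hints


def add_numbers(a: float, b: float) -> float:
    """
    Adds two numbers and returns the result.

    Args:
        a (float): First number.
        b (float): Second number.

    Returns:
        float: The sum of the two numbers.
    """
    return a + b


def check_signature(func) -> list[str]:
    """Return a list of problems found in the callable (hypothetical helper)."""
    problems = []
    hints = get_type_hints(func)
    # Every declared parameter should have a type hint.
    for name in inspect.signature(func).parameters:
        if name not in hints:
            problems.append(f"parameter '{name}' has no type hint")
    if "return" not in hints:
        problems.append("missing return type hint")
    if not inspect.getdoc(func):
        problems.append("missing docstring")
    return problems


problems = check_signature(add_numbers)
print(problems)
```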
Creating a Wrapped UC Function
In addition to standard function creation, you can create wrapped functions. A wrapped function uses a primary function as the interface while in-lining additional helper functions (wrapped functions) into the primary function’s definition. This feature is useful when you want to keep helper logic bundled together with the main function without needing to replicate existing common utilities within your function definitions.
For example, consider the following helper functions and the primary wrapper function that has direct dependencies on the helper functions:
def a(x: int) -> int:
    return x + 1

def b(y: int) -> int:
    return y + 2

def wrapper(x: int, y: int) -> int:
    """
    Wrapper function that in-lines helper functions a and b.

    Args:
        x (int): The first argument.
        y (int): The second argument.

    Returns:
        int: The combined result of a(x) and b(y).
    """
    return a(x) + b(y)
To register this wrapped function as a single UC function, you can call the create_wrapped_function API:
function_info = uc_client.create_wrapped_function(
    primary_func=wrapper,
    functions=[a, b],
    catalog=CATALOG,
    schema=SCHEMA,
    replace=False,  # Set to True to overwrite if the function already exists
)
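Conceptually, in-lining means that the helper definitions travel inside the primary function, so the registered function has no external references. The sketch below is a hand-written illustration of that idea under the assumption that the helpers end up nested in the body; the definition that the client actually generates may be laid out differently.

```python
def wrapper(x: int, y: int) -> int:
    """Hand-written equivalent of a wrapper with helpers a and b in-lined."""

    # Helper definitions are carried inside the primary function's body.
    def a(x: int) -> int:
        return x + 1

    def b(y: int) -> int:
        return y + 2

    return a(x) + b(y)


result = wrapper(1, 2)
print(result)
```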
Retrieving a UC Function
To retrieve details of a specific UC function, use the get_function method with the full function name in the format <catalog>.<schema>.<function_name>:
full_func_name = f"{CATALOG}.{SCHEMA}.add_numbers"
# Retrieve the function information and metadata
function_info = uc_client.get_function(full_func_name)
print(function_info)
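Because the full name must follow the <catalog>.<schema>.<function_name> format, a small local check can catch malformed names before a request is sent. This is a minimal sketch; parse_full_name is a hypothetical helper and is not provided by the client.

```python
def parse_full_name(full_name: str) -> tuple[str, str, str]:
    """Split '<catalog>.<schema>.<function_name>' into its three parts."""
    parts = full_name.split(".")
    # Exactly three non-empty, dot-separated parts are expected.
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"Expected '<catalog>.<schema>.<function_name>', got '{full_name}'"
        )
    return parts[0], parts[1], parts[2]


catalog, schema, function_name = parse_full_name("my_catalog.my_schema.add_numbers")
print(catalog, schema, function_name)
```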
Listing Functions
# List all created functions within a given schema
functions = uc_client.list_functions(
    catalog=CATALOG,
    schema=SCHEMA,
    max_results=10,  # Paginated results will contain a continuation token that can be submitted with additional requests
)
for func in functions.items:
    print(func)
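The continuation-token pattern mentioned in the comment above can be sketched without a server. Here fetch_page is a hypothetical in-memory stand-in for a paginated listing call, and the token format is invented for illustration; the attribute and parameter names used by the real client are not assumed here.

```python
from typing import Optional

# In-memory stand-in for the functions stored in a schema.
DATA = [f"func_{i}" for i in range(7)]


def fetch_page(page_token: Optional[str], max_results: int):
    """Hypothetical paginated call: returns one page and the next token (or None)."""
    start = int(page_token) if page_token else 0
    end = start + max_results
    next_token = str(end) if end < len(DATA) else None
    return DATA[start:end], next_token


def list_all(max_results: int = 3) -> list[str]:
    """Keep requesting pages until no continuation token is returned."""
    items: list[str] = []
    token: Optional[str] = None
    while True:
        page, token = fetch_page(token, max_results)
        items.extend(page)
        if token is None:
            break
    return items


all_items = list_all(max_results=3)
print(len(all_items))
```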
Executing a Function
Note that function execution occurs in the same process from which you call this API. Review the security considerations above regarding the execution of unknown code before calling this API.
full_func_name = f"{CATALOG}.{SCHEMA}.add_numbers"
parameters = {"a": 10.5, "b": 5.5}
# Execute the function synchronously
result = uc_client.execute_function(full_func_name, parameters)
print(result.value) # Outputs: 16.0
Function Parameter Defaults
Defining and executing functions with parameter defaults behaves similarly to standard Python default arguments. If a parameter that has a default value is omitted when calling the execute_function API, the default value is applied to the function invocation.
If you use defaults in your function signatures, ensure that the parameter descriptions are accurate and state the default value so that agents use your function correctly.
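The default-filling behavior described above mirrors what the standard library offers through inspect.signature. The sketch below illustrates the idea in plain Python; greet and call_with_defaults are hypothetical names, and this is not how the client is implemented.

```python
import inspect


def greet(name: str, greeting: str = "Hello") -> str:
    """
    Builds a greeting.

    Args:
        name (str): The name to greet.
        greeting (str): The greeting word. Defaults to "Hello".

    Returns:
        str: The formatted greeting.
    """
    return f"{greeting}, {name}!"


def call_with_defaults(func, parameters: dict):
    """Bind a parameter dictionary to func, filling omitted defaults."""
    bound = inspect.signature(func).bind(**parameters)
    bound.apply_defaults()  # Adds any parameter that was omitted but has a default
    return func(*bound.args, **bound.kwargs)


default_result = call_with_defaults(greet, {"name": "Ada"})
custom_result = call_with_defaults(greet, {"name": "Ada", "greeting": "Hi"})
print(default_result, custom_result)
```

Note how the docstring states the default value explicitly, which is the practice recommended above for agent-facing functions.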
Deleting a Function
To delete a function that you have write authority to, you can use the following API:
full_func_name = f"{CATALOG}.{SCHEMA}.add_numbers"
uc_client.delete_function(full_func_name)
Databricks-managed UC
To use Databricks-managed Unity Catalog with this package, follow the instructions to authenticate to your workspace and ensure that your access token has workspace-level privilege for managing UC functions.
Prerequisites
- [Highly recommended] Use python>=3.10 for accessing all functionalities including function creation and function execution.
- For creating UC functions with a SQL body definition, only serverless compute is supported. Install the databricks-connect package with pip install databricks-connect==15.1.0 to access serverless compute. python>=3.10 is a requirement to install this version of the package.
- For executing the UC functions within Databricks, use either SQL warehouse or Databricks Connect with serverless:
  - SQL warehouse: create a SQL warehouse following this instruction, and use the warehouse id when initializing the client. NOTE: only the serverless SQL warehouse type is supported because of performance concerns.
  - Databricks Connect with serverless: Install the databricks-connect package with pip install databricks-connect==15.1.0. No config needs to be passed when initializing the client.
Client initialization
This example uses serverless compute.
from unitycatalog.ai.core.databricks import DatabricksFunctionClient
client = DatabricksFunctionClient()
Create a UC function
To create a UC function from a SQL string, follow this syntax:
# make sure you have privilege in the corresponding catalog and schema for function creation
CATALOG = "..."
SCHEMA = "..."
func_name = "test"
sql_body = f"""CREATE FUNCTION {CATALOG}.{SCHEMA}.{func_name}(s string)
RETURNS STRING
LANGUAGE PYTHON
AS $$
return s
$$
"""
function_info = client.create_function(sql_function_body=sql_body)
Dependencies and Environments
In Databricks runtime version 17 and higher, you can specify dependencies for a function's execution environment. Earlier runtime versions do not support this feature and will raise an error if the dependencies or environment arguments are submitted with a create_python_function or create_wrapped_function call.
To specify PyPI dependencies to include in your execution environment, see the minimal example below:
# Define a function that requires an external PyPI dependency
def dep_check(x: str) -> str:
    """
    A function to test the dependency support for UC

    Args:
        x: An input string

    Returns:
        A string that reports the dependency support for UC
    """
    import scrapy  # NOTE that you must still import the library to use within the function.

    return scrapy.__version__

# Create the function and supply the dependency in standard PyPI format
client.create_python_function(func=dep_check, catalog=CATALOG, schema=SCHEMA, replace=True, dependencies=["scrapy==2.10.1"])
Retrieve a UC function
The client also provides an API to get the details of a UC function. Note that the function name passed in must be the full name in the format <catalog>.<schema>.<function_name>.
full_func_name = f"{CATALOG}.{SCHEMA}.{func_name}"
client.get_function(full_func_name)
List UC functions
To get a list of functions stored in a catalog and schema, you can use the list API with wildcards.
client.list_functions(catalog=CATALOG, schema=SCHEMA, max_results=5)
Execute a UC function
Parameters passed into execute_function must be a dictionary that maps to the input params defined by the UC function.
result = client.execute_function(full_func_name, {"s": "some_string"})
assert result.value == "some_string"
Function execution arguments configuration
To manage function execution behavior when using the Databricks client under different configurations, we offer the following environment variables:
| Environment Variable | Description | Default Value |
|---|---|---|
| UCAI_DATABRICKS_SESSION_RETRY_MAX_ATTEMPTS | Maximum number of attempts to retry refreshing the session client in case of token expiry. | 5 |
| UCAI_DATABRICKS_SERVERLESS_EXECUTION_RESULT_ROW_LIMIT | Maximum number of rows when executing functions using serverless compute with databricks-connect. | 100 |
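As these are ordinary environment variables, they can be set in the shell or from Python. The sketch below assumes they are read from the process environment at or after client creation, so it sets them beforehand; read_int_env is a hypothetical helper that only mirrors the defaults from the table and is not part of the library.

```python
import os

# Override the retry limit before the client is created (assumption: the
# library reads this variable from the process environment).
os.environ["UCAI_DATABRICKS_SESSION_RETRY_MAX_ATTEMPTS"] = "3"


def read_int_env(name: str, default: int) -> int:
    """Read an integer setting from the environment, falling back to a default."""
    return int(os.environ.get(name, default))


retry_attempts = read_int_env("UCAI_DATABRICKS_SESSION_RETRY_MAX_ATTEMPTS", 5)
row_limit = read_int_env("UCAI_DATABRICKS_SERVERLESS_EXECUTION_RESULT_ROW_LIMIT", 100)
print(retry_attempts, row_limit)
```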
Reminders
- If the function contains a DECIMAL type parameter, it is converted to Python float for execution, and this conversion may lose precision.
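The precision loss can be reproduced in plain Python with the standard decimal module. This sketch only demonstrates the general Decimal-to-float behavior; it does not call the client.

```python
from decimal import Decimal

# A value with more significant digits than a 64-bit float can represent.
original = Decimal("0.1234567890123456789012345678")

as_float = float(original)
# Converting the float's shortest repr back to Decimal shows what survived.
round_trip = Decimal(repr(as_float))

precision_lost = round_trip != original
print(original, as_float, precision_lost)
```

If exact decimal arithmetic matters, one option is to pass such values as strings and convert them inside the function body.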