
Python client library for Avrio

Project description

PyAvrio Library for Avrio Product

PyAvrio lets you query and transform data in the Avrio (Data To AI) platform directly, without downloading the data locally. The library provides seamless access to the Avrio platform, allowing users to execute SQL queries and retrieve metadata such as catalog names, schema names, table names, and column information.

Getting Started

Installation

You can install PyAvrio via pip:

pip install pyavrio

Usage

To start using PyAvrio, you first need to import the PyAvrioFunctions module:

from pyavrio import PyAvrioFunctions

Connecting to Avrio

To connect to the Avrio platform, use the avrio_engine method:

from pyavrio import PyAvrioFunctions

# Define connection parameters
user_email = "your_email@example.com"
password = "your_password"
host = "host"
port = 1234
catalog = "your_catalog"
platform = "data_sources"  # Must be either "data_products" or "data_sources"

# Establish connection to Avrio
engine = PyAvrioFunctions.avrio_engine(
    f"pyavrio://{user_email}:{password}@{host}:{port}/{catalog}?platform={platform}"
)
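Credentials containing characters such as `@` or `:` can break a URL-style connection string. A small helper (hypothetical, not part of PyAvrio) that percent-encodes the credentials before assembling the URL might look like this:

```python
from urllib.parse import quote

def build_avrio_url(user_email, password, host, port, catalog, platform="data_sources"):
    # Percent-encode credentials so characters like '@' or ':' don't break the URL
    return (
        f"pyavrio://{quote(user_email, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{catalog}?platform={platform}"
    )

url = build_avrio_url("your_email@example.com", "p@ss:word", "host", 1234, "your_catalog")
print(url)
# pyavrio://your_email%40example.com:p%40ss%3Aword@host:1234/your_catalog?platform=data_sources
```

The resulting string can then be passed to avrio_engine as shown above.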

Using SQL

You can execute SQL queries using the execute_sql_query method:

sql_query = """
    SELECT column1, column2 FROM table_name LIMIT 10
"""

result = PyAvrioFunctions.execute_sql_query(engine, sql_query)

Replace sql_query with your desired SQL query string.

Querying Data

import pandas as pd

# Execute query and store result in a DataFrame
df = pd.DataFrame(result, columns=['column1', 'column2'])
print(df.head())

# Perform DataFrame operations
# Example: filter the DataFrame
filtered_df = df[df['column1'] > 100]
print(filtered_df.head())

DataFrame Aggregation

# Example: aggregating the DataFrame
aggregated_df = df.groupby('column1').agg({'column2': 'sum'}).reset_index()
print(aggregated_df.head())
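To see what the aggregation produces on concrete values (sample data shown here, not pulled from Avrio):

```python
import pandas as pd

# Two rows share column1 == 1, so their column2 values are summed
df = pd.DataFrame({'column1': [1, 1, 2], 'column2': [10, 20, 30]})
aggregated_df = df.groupby('column1').agg({'column2': 'sum'}).reset_index()
print(aggregated_df)
# column1 == 1 -> column2 sum 30; column1 == 2 -> column2 sum 30
```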

DataFrame Join

sql_query2 = """
    SELECT column1, column4 FROM second_table LIMIT 10
"""
result2 = PyAvrioFunctions.execute_sql_query(engine, sql_query2)
df2 = pd.DataFrame(result2, columns=['column1', 'column4'])

# Join DataFrames on the shared 'column1' key
# (merge requires a column present in both DataFrames)
joined_df = df.merge(df2, on='column1')
print(joined_df.head())
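Note that merge defaults to an inner join, silently dropping keys that appear in only one DataFrame; pass how= to change that. A self-contained illustration on sample data (not from Avrio):

```python
import pandas as pd

left = pd.DataFrame({'common_column': [1, 2], 'a': ['x', 'y']})
right = pd.DataFrame({'common_column': [2, 3], 'b': ['p', 'q']})

inner = left.merge(right, on='common_column')               # keeps only key 2
outer = left.merge(right, on='common_column', how='outer')  # keeps keys 1, 2, 3
print(len(inner), len(outer))
# 1 3
```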

Available Methods

PyAvrio provides the following methods for interacting with the Avrio platform:

  • avrio_engine: Connects to the Avrio platform and returns an engine object.

  • execute_sql_query: Executes SQL queries.

  • get_catalog_names: Retrieves catalog names. For data products (platform=data_products), the catalog name represents the domain and the schema name the subdomain; for data sources (platform=data_sources), catalogs and schemas work as they do in Trino.

  • get_schema_names: Retrieves schema names.

  • get_table_names: Retrieves table names.

  • get_table_columns: Retrieves column information for a specified table.

Each of the get_* methods requires the platform connection parameter (data_products for data products, data_sources for data sources).

# Retrieve catalog names
catalogs = PyAvrioFunctions.get_catalog_names(engine)
print("Catalogs:", catalogs)

# Retrieve schema names
schemas = PyAvrioFunctions.get_schema_names(engine)
print("Schemas:", schemas)

# Retrieve table names
tables = PyAvrioFunctions.get_table_names(engine, schema='schema_name')
print("Tables:", tables)

# Retrieve column information for a table
columns_info = PyAvrioFunctions.get_table_columns(engine, schema='schema_name', table_name='table_name')
print("Columns Information:", columns_info)

Supported Operations

In PyAvrio, DML operations are supported only for Data Sources, not for Data Products; DDL operations are supported for both Data Sources and Data Products.
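A DML example follows below; a DDL statement (sketched here, assuming execute_sql_query accepts DDL statements the same way, with types and table name purely illustrative) would follow the same pattern:

```python
# Hypothetical DDL example: create a table via the same execute_sql_query call
ddl_query = """
    CREATE TABLE new_table (
        column1 INT,
        column2 VARCHAR(100)
    )
"""
print(ddl_query)

# result = PyAvrioFunctions.execute_sql_query(engine, ddl_query)
```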

Example of DML Query

# Example of executing a DML query
dml_query = """
    INSERT INTO table_name (column1, column2) VALUES (value1, value2)
"""

result = PyAvrioFunctions.execute_sql_query(engine, dml_query)
print(result)



Download files

Download the file for your platform.

Source Distribution

pyavrio-20.0.2.tar.gz (41.0 kB)

Uploaded Source

Built Distribution


pyavrio-20.0.2-py3-none-any.whl (44.0 kB)

Uploaded Python 3

File details

Details for the file pyavrio-20.0.2.tar.gz.

File metadata

  • Download URL: pyavrio-20.0.2.tar.gz
  • Size: 41.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.11.4

File hashes

Hashes for pyavrio-20.0.2.tar.gz:

  • SHA256: 4aec4079acbb527fce13bf18d35c189e6a86cd8bf0075a0cee92dcb67f31cc05
  • MD5: 5692f9c51d52a3187097c6347968d0c1
  • BLAKE2b-256: fd1a36d9b5e465952a42482de466055ec1b5c6bfa8d7e02f40eb26742ddccce2


File details

Details for the file pyavrio-20.0.2-py3-none-any.whl.

File metadata

  • Download URL: pyavrio-20.0.2-py3-none-any.whl
  • Size: 44.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.11.4

File hashes

Hashes for pyavrio-20.0.2-py3-none-any.whl:

  • SHA256: a0f92d9e4f79d935db3abfa0ddc64b2406f97b4c7674e9add9ee9e5366b22d4a
  • MD5: 7fd18f5e4adce755b344614a2e7b6c3c
  • BLAKE2b-256: b96e04f9cffacdde834bc35bc387df3f08b0fcaca7ba962a415daa2676be5705

