
Python client library for Avrio


PyAvrio: Python Library for the Avrio Platform

PyAvrio lets you query and transform data on the Avrio (Data To AI) platform directly, without having to download the data locally. The library provides seamless access to the Avrio platform, allowing users to execute SQL queries and retrieve metadata such as catalog names, schema names, table names, and column information.

Getting Started

Installation

You can install PyAvrio via pip:

pip install pyavrio

Usage

To start using PyAvrio, you first need to import the PyAvrioFunctions module:

from pyavrio import PyAvrioFunctions

Connecting to Avrio

To connect to the Avrio platform, use the avrio_engine method:

from pyavrio import PyAvrioFunctions



# Define connection parameters

user_email = "your_email@example.com"

password = "your_password"

host = "host"

port = 1234 

catalog = "your_catalog"

platform = "data_sources"  # Platform should be either "data_products" or "data_sources"



# Establish connection to Avrio

engine = PyAvrioFunctions.avrio_engine(f"pyavrio://{user_email}:{password}@{host}:{port}/{catalog}?platform={platform}")
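If the password or email contains characters such as @, :, or /, the connection URL above will not parse unambiguously. A minimal sketch for percent-encoding the credentials first, using only the Python standard library (the password shown is a hypothetical example; this encoding step is general URL handling, not a documented PyAvrio requirement):

```python
from urllib.parse import quote_plus

user_email = "your_email@example.com"
password = "p@ss:word/123"  # hypothetical password with special characters

# Percent-encode both parts so the URL structure stays unambiguous
safe_user = quote_plus(user_email)
safe_password = quote_plus(password)

url = f"pyavrio://{safe_user}:{safe_password}@host:1234/your_catalog?platform=data_sources"
print(url)
```

The encoded URL can then be passed to avrio_engine as shown above.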

Using SQL

You can execute SQL queries using the execute_sql_query method:

sql_query = """

    SELECT column1, column2 FROM table_name LIMIT 10

"""



result = PyAvrioFunctions.execute_sql_query(engine, sql_query)

Replace sql_query with your desired SQL query string.

Querying Data

import pandas as pd



# Execute query and store result in DataFrame

df = pd.DataFrame(result, columns=['column1', 'column2'])

print(df.head())



# Perform DataFrame operations

# Example: Filter DataFrame

filtered_df = df[df['column1'] > 100]

print(filtered_df.head())

DataFrame Aggregation

# Example: Aggregating DataFrame

aggregated_df = df.groupby('column1').agg({'column2': 'sum'}).reset_index()

print(aggregated_df.head())
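The same pattern extends to several aggregates at once. A small self-contained pandas sketch using named aggregation (sample data and column names are hypothetical, standing in for a query result):

```python
import pandas as pd

# Sample data standing in for rows returned by execute_sql_query
df = pd.DataFrame({
    "column1": ["a", "a", "b"],
    "column2": [10, 20, 5],
})

# Named aggregation: one explicitly named output column per aggregate
summary = df.groupby("column1").agg(
    total=("column2", "sum"),
    rows=("column2", "count"),
).reset_index()

print(summary)
```

Named aggregation avoids the nested column names that the dict form of agg can produce when several functions are applied to the same column.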

DataFrame Join

sql_query2 = """

    SELECT column3, column4 FROM second_table LIMIT 10

"""

result2 = PyAvrioFunctions.execute_sql_query(engine, sql_query2)

df2 = pd.DataFrame(result2, columns=['column3', 'column4'])



# Join DataFrames

joined_df = df.merge(df2, on='common_column')

print(joined_df.head())
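Note that merge defaults to an inner join, which silently drops rows that have no match in the other frame. A short sketch with hypothetical sample frames showing how the how parameter changes that:

```python
import pandas as pd

left = pd.DataFrame({"common_column": [1, 2, 3], "column1": ["x", "y", "z"]})
right = pd.DataFrame({"common_column": [1, 2], "column3": [100, 200]})

inner = left.merge(right, on="common_column")              # drops key 3
kept = left.merge(right, on="common_column", how="left")   # keeps key 3 with NaN

print(len(inner), len(kept))
```

Use how='left' (or 'outer') when unmatched rows from a query result should be preserved for inspection rather than discarded.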

Available Methods

PyAvrio provides the following methods for interacting with the Avrio platform. All of the metadata methods below require the platform connection parameter to be set to data_products for data products or data_sources for data sources:

  • avrio_engine: Connects to the Avrio platform.

  • execute_sql_query: Executes SQL queries.

  • get_catalog_names: Retrieves catalog names. For data products, the catalog name represents the domain name and the schema name represents the subdomain name; for data sources, catalogs and schemas behave like Trino catalogs and schemas.

  • get_schema_names: Retrieves schema names.

  • get_table_names: Retrieves table names for a given schema.

  • get_table_columns: Retrieves column information for a specified table.

# Retrieve catalog names

catalogs = PyAvrioFunctions.get_catalog_names(engine)

print("Catalogs:", catalogs)



# Retrieve schema names

schemas = PyAvrioFunctions.get_schema_names(engine)

print("Schemas:", schemas)



# Retrieve table names

tables = PyAvrioFunctions.get_table_names(engine, schema='schema_name')

print("Tables:", tables)



# Retrieve columns information for a table

columns_info = PyAvrioFunctions.get_table_columns(engine, schema='schema_name', table_name='table_name')

print("Columns Information:", columns_info)

Supported Operations

In PyAvrio, DML operations (such as INSERT, UPDATE, and DELETE) are supported only for Data Sources, not for Data Products; DDL operations are supported for both Data Sources and Data Products.

Example of DML Query

# Example of executing a DML query

dml_query = """

    INSERT INTO table_name (column1, column2) VALUES (value1, value2)

"""



result = PyAvrioFunctions.execute_sql_query(engine, dml_query)

print(result)  
