Python client library for Avrio

Project description

PyAvrio Library for Avrio Product

PyAvrio lets you query and transform data in the Avrio (Data To AI) platform directly, without having to download the data locally. The library provides seamless access to the Avrio platform, allowing users to execute SQL queries and retrieve metadata such as catalog, schema, table, and column information.

Getting Started

Installation

You can install PyAvrio via pip:

pip install pyavrio

Usage

To start using PyAvrio, you first need to import the PyAvrioFunctions module:

from pyavrio import PyAvrioFunctions

Connecting to Avrio

To connect to the Avrio platform, use the avrio_engine method:

from pyavrio import PyAvrioFunctions

# Define connection parameters
user_email = "your_email@example.com"
password = "your_password"
host = "host"
port = 1234
catalog = "your_catalog"
platform = "data_sources"  # Either "data_products" or "data_sources"

# Establish connection to Avrio
engine = PyAvrioFunctions.avrio_engine(
    f"pyavrio://{user_email}:{password}@{host}:{port}/{catalog}?platform={platform}"
)
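Nothing in the DSN above escapes special characters, and an email address always contains @, so credentials containing @, /, or : can make the URL ambiguous to parse. A minimal sketch using only the standard library to guard against this (the credentials below are made up, and this helper is not part of PyAvrio itself):

```python
from urllib.parse import quote_plus

# Hypothetical credentials containing characters that are unsafe in a URL
user_email = "your_email@example.com"
password = "p@ss/word:1"
host = "host"
port = 1234
catalog = "your_catalog"
platform = "data_sources"

# Percent-encode the user and password so the DSN parses unambiguously
dsn = (
    f"pyavrio://{quote_plus(user_email)}:{quote_plus(password)}"
    f"@{host}:{port}/{catalog}?platform={platform}"
)
print(dsn)
```

The encoded string can then be passed to avrio_engine in place of the raw f-string.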

Using SQL

You can execute SQL queries using the execute_sql_query method:

sql_query = """
    SELECT column1, column2 FROM table_name LIMIT 10
"""

result = PyAvrioFunctions.execute_sql_query(engine, sql_query)

Replace sql_query with your desired SQL query string.

Querying Data

import pandas as pd

# Execute query and store the result in a DataFrame
df = pd.DataFrame(result, columns=['column1', 'column2'])
print(df.head())

# Perform DataFrame operations
# Example: filter rows where column1 exceeds 100
filtered_df = df[df['column1'] > 100]
print(filtered_df.head())

DataFrame Aggregation

# Example: aggregate column2 per column1 group
aggregated_df = df.groupby('column1').agg({'column2': 'sum'}).reset_index()
print(aggregated_df.head())
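The groupby works on any rows shaped like a query result; a self-contained sketch with made-up rows standing in for the output of execute_sql_query:

```python
import pandas as pd

# Hypothetical result rows, standing in for execute_sql_query output
rows = [(1, 10), (1, 20), (2, 5)]
df = pd.DataFrame(rows, columns=['column1', 'column2'])

# Sum column2 within each column1 group
aggregated_df = df.groupby('column1').agg({'column2': 'sum'}).reset_index()
print(aggregated_df)
```

With these rows, group 1 sums to 30 and group 2 to 5.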

DataFrame Join

sql_query2 = """
    SELECT column3, column4 FROM second_table LIMIT 10
"""
result2 = PyAvrioFunctions.execute_sql_query(engine, sql_query2)
df2 = pd.DataFrame(result2, columns=['column3', 'column4'])

# Join DataFrames on a shared key column
joined_df = df.merge(df2, on='common_column')
print(joined_df.head())
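Note that merge defaults to an inner join, so rows whose common_column has no match in the other frame are dropped silently. A self-contained sketch with invented data, contrasting the default with how='left':

```python
import pandas as pd

# Invented frames sharing a key column
left = pd.DataFrame({'common_column': [1, 2, 3], 'a': ['x', 'y', 'z']})
right = pd.DataFrame({'common_column': [2, 3, 4], 'b': ['p', 'q', 'r']})

# Inner join keeps only keys present in both frames (2 and 3)
inner = left.merge(right, on='common_column')

# Left join keeps every left-hand row; unmatched rows get NaN in b
left_join = left.merge(right, on='common_column', how='left')
print(len(inner), len(left_join))
```

Pick the join type deliberately: a shrinking row count after merge usually means keys failed to match rather than data being wrong.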

Available Methods

PyAvrio provides the following methods for interacting with the Avrio platform:

  • avrio_engine: Connects to the Avrio platform.

  • execute_sql_query: Executes SQL queries.

  • get_catalog_names: Retrieves catalog names (requires platform=data_products or platform=data_sources). For data products, the catalog name represents the domain and the schema name represents the subdomain; for data sources, catalog and schema follow the usual Trino convention.

  • get_schema_names: Retrieves schema names (requires platform=data_products or platform=data_sources).

  • get_table_names: Retrieves table names (requires platform=data_products or platform=data_sources).

  • get_table_columns: Retrieves column information for a specified table (requires platform=data_products or platform=data_sources).

# Retrieve catalog names
catalogs = PyAvrioFunctions.get_catalog_names(engine)
print("Catalogs:", catalogs)

# Retrieve schema names
schemas = PyAvrioFunctions.get_schema_names(engine)
print("Schemas:", schemas)

# Retrieve table names
tables = PyAvrioFunctions.get_table_names(engine, schema='schema_name')
print("Tables:", tables)

# Retrieve column information for a table
columns_info = PyAvrioFunctions.get_table_columns(engine, schema='schema_name', table_name='table_name')
print("Columns Information:", columns_info)

Supported Operations

In PyAvrio, DML operations are supported only for Data Sources, not for Data Products; DDL operations are supported for both.

Example of DML Query

# Example of executing a DML query
dml_query = """
    INSERT INTO table_name (column1, column2) VALUES (value1, value2)
"""

result = PyAvrioFunctions.execute_sql_query(engine, dml_query)
print(result)
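If value1 or value2 are string literals, they must be single-quoted in the SQL text and any embedded quotes doubled. A minimal quoting helper, sketched here for illustration only (it is not part of PyAvrio, and driver-level parameter binding is preferable where available):

```python
def sql_quote(value):
    """Render a Python value as a SQL literal.

    A minimal sketch: handles None, numbers, and strings only.
    """
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Double any single quotes inside string values
    return "'" + str(value).replace("'", "''") + "'"

name = "O'Brien"
dml_query = (
    "INSERT INTO table_name (column1, column2) "
    f"VALUES ({sql_quote(42)}, {sql_quote(name)})"
)
print(dml_query)
```

Interpolating unquoted user input directly into DML text is a classic injection risk, which is why even a small helper like this is worth the few lines.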
