
Python client library for Avrio


PyAvrio Library for the Avrio Platform

PyAvrio lets you query and transform data on the Avrio (Data To AI) platform directly, without downloading the data locally. The library provides seamless access to the platform, allowing users to execute SQL queries and retrieve metadata such as catalog names, schema names, table names, and column information.

Getting Started

Installation

You can install PyAvrio via pip:

pip install pyavrio

Usage

To start using PyAvrio, you first need to import the PyAvrioFunctions module:

from pyavrio import PyAvrioFunctions

Connecting to Avrio

To connect to the Avrio platform, use the avrio_engine method:

from pyavrio import PyAvrioFunctions

# Define connection parameters
user_email = "your_email@example.com"
password = "your_password"
host = "host"
port = 1234
catalog = "your_catalog"
platform = "data_sources"  # must be either "data_products" or "data_sources"

# Establish a connection to Avrio
engine = PyAvrioFunctions.avrio_engine(
    f"pyavrio://{user_email}:{password}@{host}:{port}/{catalog}?platform={platform}"
)
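One caveat with building the connection string by hand: characters such as "@" or ":" in the e-mail or password will break URL parsing. A minimal sketch that URL-encodes the credentials first (build_avrio_dsn is a hypothetical helper, not part of the library):

```python
from urllib.parse import quote_plus

def build_avrio_dsn(user_email, password, host, port, catalog, platform):
    # URL-encode the credentials so reserved characters survive parsing
    return (
        f"pyavrio://{quote_plus(user_email)}:{quote_plus(password)}"
        f"@{host}:{port}/{catalog}?platform={platform}"
    )

dsn = build_avrio_dsn("your_email@example.com", "p@ss:word",
                      "host", 1234, "your_catalog", "data_sources")
# The encoded DSN can then be passed to PyAvrioFunctions.avrio_engine(dsn)
```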

Using SQL

You can execute SQL queries using the execute_sql_query method:

sql_query = """
    SELECT column1, column2 FROM table_name LIMIT 10
"""

result = PyAvrioFunctions.execute_sql_query(engine, sql_query)

Replace sql_query with your desired SQL query string.

Querying Data

import pandas as pd

# Execute the query and store the result in a DataFrame
df = pd.DataFrame(result, columns=['column1', 'column2'])
print(df.head())

# Example: filter the DataFrame
filtered_df = df[df['column1'] > 100]
print(filtered_df.head())

DataFrame Aggregation

# Example: aggregate the DataFrame
aggregated_df = df.groupby('column1').agg({'column2': 'sum'}).reset_index()
print(aggregated_df.head())
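The same groupby pattern extends to several aggregates at once via pandas named aggregation. A self-contained sketch with toy data (the column names simply mirror the examples above):

```python
import pandas as pd

df = pd.DataFrame({'column1': [1, 1, 2], 'column2': [10, 20, 30]})

# One output column per aggregate, each named explicitly
summary = df.groupby('column1').agg(
    total=('column2', 'sum'),
    rows=('column2', 'size'),
).reset_index()
```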

DataFrame Join

sql_query2 = """
    SELECT column3, column4 FROM second_table LIMIT 10
"""
result2 = PyAvrioFunctions.execute_sql_query(engine, sql_query2)
df2 = pd.DataFrame(result2, columns=['column3', 'column4'])

# Join the DataFrames on a shared key; here column1 and column3
# are assumed to hold matching values
joined_df = df.merge(df2, left_on='column1', right_on='column3')
print(joined_df.head())
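Note that merge performs an inner join by default, dropping rows without a match. To keep every row from the left DataFrame, pass how='left'; a self-contained sketch with toy data:

```python
import pandas as pd

left = pd.DataFrame({'column1': [1, 2, 3], 'column2': ['a', 'b', 'c']})
right = pd.DataFrame({'column3': [1, 2], 'column4': ['x', 'y']})

# Left join: unmatched rows from `left` are kept, filled with NaN
joined = left.merge(right, left_on='column1', right_on='column3', how='left')
```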

Available Methods

PyAvrio provides the following methods for interacting with the Avrio platform:

  • avrio_engine: Connects to the Avrio platform.

  • execute_sql_query: Executes SQL queries.

  • get_catalog_names: Retrieves catalog names. (Requires platform=data_products for data products or platform=data_sources for data sources.) For data products, the catalog name represents the domain and the schema name represents the subdomain; for data sources, the catalog and schema map directly to the underlying Trino catalog and schema.

  • get_schema_names: Retrieves schema names. (Requires platform=data_products for data products or platform=data_sources for data sources)

  • get_table_names: Retrieves table names. (Requires platform=data_products for data products or platform=data_sources for data sources)

  • get_table_columns: Retrieves column information for a specified table. (Requires platform=data_products for data products or platform=data_sources for data sources)
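These metadata helpers compose naturally. As a sketch, a small walker that builds a {schema: tables} inventory — build_inventory is a hypothetical helper, not part of the library, and it takes the lookup functions as parameters (with the signatures used in the examples below) so it can be exercised without a live connection:

```python
def build_inventory(engine, get_schema_names, get_table_names):
    """Map each schema in the connected catalog to its list of table names.

    get_schema_names / get_table_names are expected to behave like
    PyAvrioFunctions.get_schema_names / PyAvrioFunctions.get_table_names.
    """
    return {
        schema: list(get_table_names(engine, schema=schema))
        for schema in get_schema_names(engine)
    }

# With the real client this would be called as:
# inventory = build_inventory(engine,
#                             PyAvrioFunctions.get_schema_names,
#                             PyAvrioFunctions.get_table_names)
```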

# Retrieve catalog names
catalogs = PyAvrioFunctions.get_catalog_names(engine)
print("Catalogs:", catalogs)

# Retrieve schema names
schemas = PyAvrioFunctions.get_schema_names(engine)
print("Schemas:", schemas)

# Retrieve table names
tables = PyAvrioFunctions.get_table_names(engine, schema='schema_name')
print("Tables:", tables)

# Retrieve column information for a table
columns_info = PyAvrioFunctions.get_table_columns(engine, schema='schema_name', table_name='table_name')
print("Columns Information:", columns_info)

Supported Operations

In PyAvrio, DML operations are supported only for Data Sources, not for Data Products; DDL operations are supported for both Data Sources and Data Products.

Example of DML Query

# Example of executing a DML query
dml_query = """
    INSERT INTO table_name (column1, column2) VALUES (value1, value2)
"""

result = PyAvrioFunctions.execute_sql_query(engine, dml_query)
print(result)
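Since DDL is supported on both platforms, a statement such as CREATE TABLE runs through the same execute_sql_query call. A sketch, where the table and column names are placeholders and the types assume a Trino-style SQL dialect:

```python
# Hypothetical DDL statement; names and types are placeholders
ddl_query = """
    CREATE TABLE IF NOT EXISTS table_name (
        column1 BIGINT,
        column2 VARCHAR
    )
"""

# Executed the same way as any other statement:
# result = PyAvrioFunctions.execute_sql_query(engine, ddl_query)
```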
