REFRACT-IO: To read and write dataframe from different connectors.
Project description
Installation:
without any dependencies:
pip install refractio
With all dependencies:
pip install refractio[all]
With snowflake:
pip install refractio[snowflake]
With s3:
pip install refractio[s3]
With azureblob:
pip install refractio[azureblob]
With local:
pip install refractio[local]
With sftp:
pip install refractio[sftp]
With mysql:
pip install refractio[mysql]
With hive:
pip install refractio[hive]
Source code is available at: https://git.lti-aiq.in/refract-sdk/refract-sdk.git
Usage:
To read dataframe with dataset name only -
from refractio import get_dataframe
get_dataframe("dataset_name")
# For reading data from any other RDBMS connection apart from (snowflake, hive and mysql) using connector-backend's package,
# pip install git+https://gitlab+deploy-token-14:myUpFE_XRxShG53Hs6tV@git.lti-aiq.in/mosaic-decisions-2-0/mosaic-connector-python.git
To read dataframe with filename from local storage -
from refracio import get_local_dataframe
get_local_dataframe("local_file_name_with_absolute_path")
To use snowflake related operations -
from refractio import snowflake
# To get snowflake connection object with a default snowflake connection created by the user, if available.
snowflake.get_connection()
# To get snowflake connection object with a specific connection name
snowflake.get_connection(connection_name="snowflake_con_name")
# To read a specific dataset published from a snowflake connection
snowflake.get_dataframe("dataset_name")
# To read a specific dataset published from a snowflake connection with only top few records.
snowflake.get_dataframe("dataset_name", row_count=3)
# To execute a user specific query in snowflake, with the specified connection name.
snowflake.execute_query(query="user_query", database="db_name", schema="schema", connection_name="connection_name")
# To execute a user specific query in snowflake, with the current connection object or with the default connection for the user.
snowflake.execute_query(query="user_query", database="db_name", schema="schema")
# To close snowflake connection, please do close the connection after use!
snowflake.close_connection()
To use mysql related operations -
from refractio import mysql
# To get mysql connection object with a default mysql connection created by the user, if available.
mysql.get_connection()
# To get mysql connection object with a specific connection name
mysql.get_connection(connection_name="mysql_con_name")
# To read a specific dataset published from a mysql connection
mysql.get_dataframe("dataset_name")
# To read a specific dataset published from a mysql connection with only top few records.
mysql.get_dataframe("dataset_name", row_count=3)
# To execute a user specific query in mysql, with the specified connection name.
mysql.execute_query(query="user_query", connection_name="connection_name")
# To execute a user specific query in mysql, with the current connection object or with the default connection for the user.
mysql.execute_query(query="user_query")
# To close mysql connection, please do close the connection after use!
mysql.close_connection()
To use hive related operations -
from refractio import hive
# To get hive connection object with a default hive connection created by the user, if available. User id is required (1001 is default user_id used).
hive.get_connection(user_id=1001)
# To get hive connection object with a specific connection name, User id is required (1001 is default user_id used).
hive.get_connection(connection_name="hive_con_name", user_id=1001)
# To read a specific dataset published from a hive connection. User id is required (1001 is default user_id used).
hive.get_dataframe("dataset_name", user_id="1001")
# To read a specific dataset published from a hive connection with only top few records. User id is required (1001 is default user_id used)
hive.get_dataframe("dataset_name", user_id="1001", row_count=3)
# To execute a user specific query in hive, with the specified connection name. User id is required (1001 is default user_id used).
hive.execute_query(query="user_query", connection_name="connection_name", user_id="1001")
# To execute a user specific query in hive, with the current connection object or with the default connection for the user. User id is required (1001 is default user_id used).
hive.execute_query(query="user_query", user_id="1001")
# To close hive connection, please do close the connection after use!
hive.close_connection()
To use sftp related operations -
from refractio import sftp
# To get sftp connection object with a default sftp connection created by the user, if available.
sftp.get_connection()
# To get sftp connection object with a specific connection name
sftp.get_connection(connection_name="sftp_con_name")
# To read a specific dataset published from a sftp connection
sftp.get_dataframe("dataset_name")
# To read a specific dataset published from a sftp connection with only top few records.
sftp.get_dataframe("dataset_name", row_count=3)
# Use sftp connection object c to do any operation related to sftp like (get, put, listdir etc)
c = sftp.get_connection()
# To close sftp connection, please do close the connection after use!
sftp.close_connection()
To use amazon S3 related operations -
from refractio import s3
# To get s3 connection object with a default s3 connection created by the user, if available.
s3.get_connection()
# To get s3 connection object with a specific connection name
s3.get_connection(connection_name="s3_con_name")
# To read a specific dataset published from a s3 connection
s3.get_dataframe("dataset_name")
# To read a specific dataset published from a s3 connection with only top few records.
s3.get_dataframe("dataset_name", row_count=3)
# Use s3 connection object c to do any operation related to s3.
c = s3.get_connection()
To use azure blob related operations -
from refractio import azure
# To get azure blob connection object with a default azure connection created by the user, if available.
azure.get_connection()
# To get azure blob connection object with a specific connection name
azure.get_connection(connection_name="azureblob_con_name")
# To read a specific dataset published from a azureblob connection
azure.get_dataframe("dataset_name")
# To read a specific dataset published from a azure connection with only top few records.
azure.get_dataframe("dataset_name", row_count=3)
# Use azure connection object c to do any operation related to azure.
c = azure.get_connection()
Note: Usage documentation will be updated in upcoming releases.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
refractio-2.0.1.tar.gz
(10.6 kB
view details)
Built Distribution
refractio-2.0.1-py3-none-any.whl
(15.0 kB
view details)
File details
Details for the file refractio-2.0.1.tar.gz
.
File metadata
- Download URL: refractio-2.0.1.tar.gz
- Upload date:
- Size: 10.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.7
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 0cb7904ce7eb8c5dc64f5deaafda25fa81a724cebc4992877e079950bceeeffc |
|
MD5 | 3dd4c03b8dc5e62c5242aeab114c9c2a |
|
BLAKE2b-256 | e25b8c834e549b60f4295d003c313de96aa5743f766ea0ffb9630d3eae91d32b |
Provenance
File details
Details for the file refractio-2.0.1-py3-none-any.whl
.
File metadata
- Download URL: refractio-2.0.1-py3-none-any.whl
- Upload date:
- Size: 15.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.7
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 9bd1c11b5eb8e4370c9ddbc56ea9d50116346861f891f48749e6de78a8e4c8b8 |
|
MD5 | 07f9f54eb4c4b256ad093cfd732c3dea |
|
BLAKE2b-256 | f1078c9c05e1d2efd0e0415729836ed89f20caa98d02a343b2ef48663d614728 |