REFRACT-IO: To read and write dataframe from different connectors.
Project description
Installation:
without any dependencies:
pip install refractio
With all dependencies:
pip install refractio[all]
With snowflake:
pip install refractio[snowflake]
With s3:
pip install refractio[s3]
With azureblob:
pip install refractio[azureblob]
With local:
pip install refractio[local]
With sftp:
pip install refractio[sftp]
With mysql:
pip install refractio[mysql]
With hive:
pip install refractio[hive]
With sqlserver:
pip install refractio[sqlserver]
With postgres:
pip install refractio[postgres]
Source code is available at: https://git.lti-aiq.in/refract-sdk/refract-sdk.git
Usage:
To read dataframe with dataset name only -
from refractio import get_dataframe
get_dataframe("dataset_name")
# For reading data from any other RDBMS connection apart from snowflake, hive or mysql please pip install mosaic-connector-python package.
To read dataframe with filename from local storage -
from refracio import get_local_dataframe
get_local_dataframe("local_file_name_with_absolute_path")
To use snowflake related operations -
from refractio import snowflake
# To get snowflake connection object with a default snowflake connection created by the user, if available.
snowflake.get_connection()
# To get snowflake connection object with a specific connection name
snowflake.get_connection(connection_name="snowflake_con_name")
# To read a specific dataset published from a snowflake connection
snowflake.get_dataframe("dataset_name")
# To read a specific dataset published from a snowflake connection with only top few records.
snowflake.get_dataframe("dataset_name", row_count=3)
# To execute a user specific query in snowflake, with the specified connection name.
snowflake.execute_query(query="user_query", database="db_name", schema="schema", connection_name="connection_name")
# To execute a user specific query in snowflake, with the current connection object or with the default connection for the user.
snowflake.execute_query(query="user_query", database="db_name", schema="schema")
# To close snowflake connection, please do close the connection after use!
snowflake.close_connection()
To use mysql related operations -
from refractio import mysql
# To get mysql connection object with a default mysql connection created by the user, if available.
mysql.get_connection()
# To get mysql connection object with a specific connection name
mysql.get_connection(connection_name="mysql_con_name")
# To read a specific dataset published from a mysql connection
mysql.get_dataframe("dataset_name")
# To read a specific dataset published from a mysql connection with only top few records.
mysql.get_dataframe("dataset_name", row_count=3)
# To execute a user specific query in mysql, with the specified connection name.
mysql.execute_query(query="user_query", connection_name="connection_name")
# To execute a user specific query in mysql, with the current connection object or with the default connection for the user.
mysql.execute_query(query="user_query")
# To close mysql connection, please do close the connection after use!
mysql.close_connection()
To use sqlserver related operations -
Requires sqlserver driver library
# Create a custom template with the following commands added in "Pre Init Script" section,
# sudo curl -o /etc/yum.repos.d/mssql-release.repo https://packages.microsoft.com/config/rhel/9.0/prod.repo
# sudo ACCEPT_EULA=Y yum install -y msodbcsql18
from refractio import sqlserver
# To get sqlserver connection object with a default sqlserver connection created by the user, if available.
sqlserver.get_connection()
# To get sqlserver connection object with a specific connection name
sqlserver.get_connection(connection_name="sqlserver_con_name")
# To read a specific dataset published from a sqlserver connection
sqlserver.get_dataframe("dataset_name")
# To read a specific dataset published from a sqlserver connection with only top few records.
sqlserver.get_dataframe("dataset_name", row_count=3)
# To execute a user specific query in sqlserver, with the specified connection name.
sqlserver.execute_query(query="user_query", database="db_name", connection_name="connection_name")
# To execute a user specific query in sqlserver, with the current connection object or with the default connection for the user.
sqlserver.execute_query(query="user_query", database="db_name")
# To close sqlserver connection, please do close the connection after use!
sqlserver.close_connection()
To use hive related operations -
from refractio import hive
# To get hive connection object with a default hive connection created by the user, if available. User id is required (1001 is default user_id used).
hive.get_connection(user_id=1001)
# To get hive connection object with a specific connection name, User id is required (1001 is default user_id used).
hive.get_connection(connection_name="hive_con_name", user_id=1001)
# To read a specific dataset published from a hive connection. User id is required (1001 is default user_id used).
hive.get_dataframe("dataset_name", user_id="1001")
# To read a specific dataset published from a hive connection with only top few records. User id is required (1001 is default user_id used)
hive.get_dataframe("dataset_name", user_id="1001", row_count=3)
# To execute a user specific query in hive, with the specified connection name. User id is required (1001 is default user_id used).
hive.execute_query(query="user_query", connection_name="connection_name", user_id="1001")
# To execute a user specific query in hive, with the current connection object or with the default connection for the user. User id is required (1001 is default user_id used).
hive.execute_query(query="user_query", user_id="1001")
# To close hive connection, please do close the connection after use!
hive.close_connection()
To use postgres related operations -
from refractio import postgres
# To get postgres connection object with a default postgres connection created by the user, if available.
postgres.get_connection()
# To get postgres connection object with a specific connection name
postgres.get_connection(connection_name="mysql_con_name")
# To read a specific dataset published from a postgres connection
postgres.get_dataframe("dataset_name")
# To read a specific dataset published from a postgres connection with only top few records.
postgres.get_dataframe("dataset_name", row_count=3)
# To execute a user specific query in postgres, with the specified connection name.
postgres.execute_query(query="user_query", connection_name="connection_name")
# To execute a user specific query in postgres, with the current connection object or with the default connection for the user.
postgres.execute_query(query="user_query")
# To close postgres connection, please do close the connection after use!
postgres.close_connection()
To use sftp related operations -
from refractio import sftp
# To get sftp connection object with a default sftp connection created by the user, if available.
sftp.get_connection()
# To get sftp connection object with a specific connection name
sftp.get_connection(connection_name="sftp_con_name")
# To read a specific dataset published from a sftp connection
sftp.get_dataframe("dataset_name")
# To read a specific dataset published from a sftp connection with only top few records.
sftp.get_dataframe("dataset_name", row_count=3)
# Use sftp connection object c to do any operation related to sftp like (get, put, listdir etc)
c = sftp.get_connection()
# To close sftp connection, please do close the connection after use!
sftp.close_connection()
To use amazon S3 related operations -
from refractio import s3
# To get s3 connection object with a default s3 connection created by the user, if available.
s3.get_connection()
# To get s3 connection object with a specific connection name
s3.get_connection(connection_name="s3_con_name")
# To read a specific dataset published from a s3 connection
s3.get_dataframe("dataset_name")
# To read a specific dataset published from a s3 connection with only top few records.
s3.get_dataframe("dataset_name", row_count=3)
# Use s3 connection object c to do any operation related to s3.
c = s3.get_connection()
To use azure blob related operations -
from refractio import azure
# To get azure blob connection object with a default azure connection created by the user, if available.
azure.get_connection()
# To get azure blob connection object with a specific connection name
azure.get_connection(connection_name="azureblob_con_name")
# To read a specific dataset published from a azureblob connection
azure.get_dataframe("dataset_name")
# To read a specific dataset published from a azure connection with only top few records.
azure.get_dataframe("dataset_name", row_count=3)
# Use azure connection object c to do any operation related to azure.
c = azure.get_connection()
Note: Usage documentation will be updated in upcoming releases.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
refractio-2.0.5.7.tar.gz
(13.1 kB
view details)
Built Distribution
File details
Details for the file refractio-2.0.5.7.tar.gz
.
File metadata
- Download URL: refractio-2.0.5.7.tar.gz
- Upload date:
- Size: 13.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.7
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 6935b39a936179d2c64dd73b3a5b003f38f5fd6c4f041036a726c2851f638339 |
|
MD5 | f8798f66e3fa08a424b469bfec15d59b |
|
BLAKE2b-256 | 39b1e5262c4c81b0380581295b96477ac969316766a186855573dc089f94c745 |
Provenance
File details
Details for the file refractio-2.0.5.7-py3-none-any.whl
.
File metadata
- Download URL: refractio-2.0.5.7-py3-none-any.whl
- Upload date:
- Size: 19.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.7
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 8e5aa6b16c8683a617d4846b6e54244bc19407b5b9fa7273f05a464b20e02d6a |
|
MD5 | 1959d50d68e6b384f6ada0d04410e904 |
|
BLAKE2b-256 | a150cec30688b87a983f12bfa15705a141858119c2fd29f3cc870b66dfd31066 |