
REFRACT-IO: Read and write dataframes from different connectors.

Project description

Installation:

Without any dependencies:

pip install refractio

With all dependencies:

pip install refractio[all]

With snowflake:

pip install refractio[snowflake]

With s3:

pip install refractio[s3]

With azureblob:

pip install refractio[azureblob]

With local:

pip install refractio[local]

With sftp:

pip install refractio[sftp]

With mysql:

pip install refractio[mysql]

With hive:

pip install refractio[hive]
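Multiple extras can usually be combined in a single install using standard pip extras syntax (a general pip feature; the specific combination below is only an illustration, not taken from this documentation):

pip install refractio[snowflake,s3]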

Source code is available at: https://git.lti-aiq.in/refract-sdk/refract-sdk.git

Usage:

To read a dataframe with the dataset name only -

from refractio import get_dataframe
get_dataframe("dataset_name")

# To read data from any other RDBMS connection (apart from snowflake, hive, and mysql), install the connector backend's package:
# pip install git+https://gitlab+deploy-token-14:myUpFE_XRxShG53Hs6tV@git.lti-aiq.in/mosaic-decisions-2-0/mosaic-connector-python.git
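A minimal sketch of working with the value returned by get_dataframe, assuming it returns a pandas DataFrame (the return type is not stated explicitly here):

from refractio import get_dataframe

# Read a published dataset by name; assumed to return a pandas DataFrame.
df = get_dataframe("dataset_name")

# Standard pandas inspection of the result.
print(df.shape)
print(df.head())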

To read a dataframe from a file in local storage -

from refractio import get_local_dataframe
get_local_dataframe("local_file_name_with_absolute_path")

To use snowflake-related operations -

from refractio import snowflake

# To get snowflake connection object with a default snowflake connection created by the user, if available.
snowflake.get_connection()

# To get snowflake connection object with a specific connection name
snowflake.get_connection(connection_name="snowflake_con_name")

# To read a specific dataset published from a snowflake connection
snowflake.get_dataframe("dataset_name")

# To read a specific dataset published from a snowflake connection with only the top few records.
snowflake.get_dataframe("dataset_name", row_count=3)

# To execute a user-specific query in snowflake, with the specified connection name.
snowflake.execute_query(query="user_query", database="db_name", schema="schema", connection_name="connection_name")

# To execute a user-specific query in snowflake, with the current connection object or with the default connection for the user.
snowflake.execute_query(query="user_query", database="db_name", schema="schema")

# To close the snowflake connection. Please close the connection after use!
snowflake.close_connection()
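A sketch tying the snowflake calls above together, with the connection closed in a finally block. The return value of execute_query is not documented here, so treating it as a result object (and the sample query itself) is an assumption:

from refractio import snowflake

try:
    # Use the default snowflake connection for the current user.
    snowflake.get_connection()

    # Read a published dataset (top 100 rows only).
    df = snowflake.get_dataframe("dataset_name", row_count=100)

    # Run an ad-hoc query; what execute_query returns is an assumption.
    result = snowflake.execute_query(
        query="SELECT CURRENT_VERSION()",
        database="db_name",
        schema="schema",
    )
finally:
    # Always close the connection after use.
    snowflake.close_connection()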

To use mysql-related operations -

from refractio import mysql

# To get mysql connection object with a default mysql connection created by the user, if available.
mysql.get_connection()

# To get mysql connection object with a specific connection name
mysql.get_connection(connection_name="mysql_con_name")

# To read a specific dataset published from a mysql connection
mysql.get_dataframe("dataset_name")

# To read a specific dataset published from a mysql connection with only the top few records.
mysql.get_dataframe("dataset_name", row_count=3)

# To execute a user-specific query in mysql, with the specified connection name.
mysql.execute_query(query="user_query", connection_name="connection_name")

# To execute a user-specific query in mysql, with the current connection object or with the default connection for the user.
mysql.execute_query(query="user_query")

# To close the mysql connection. Please close the connection after use!
mysql.close_connection()

To use hive-related operations -

from refractio import hive

# To get a hive connection object with a default hive connection created by the user, if available. A user id is required (1001 is the default user_id).
hive.get_connection(user_id=1001)

# To get a hive connection object with a specific connection name. A user id is required (1001 is the default user_id).
hive.get_connection(connection_name="hive_con_name", user_id=1001)

# To read a specific dataset published from a hive connection. A user id is required (1001 is the default user_id).
hive.get_dataframe("dataset_name", user_id="1001")

# To read a specific dataset published from a hive connection with only the top few records. A user id is required (1001 is the default user_id).
hive.get_dataframe("dataset_name", user_id="1001", row_count=3)

# To execute a user-specific query in hive, with the specified connection name. A user id is required (1001 is the default user_id).
hive.execute_query(query="user_query", connection_name="connection_name", user_id="1001")

# To execute a user-specific query in hive, with the current connection object or with the default connection for the user. A user id is required (1001 is the default user_id).
hive.execute_query(query="user_query", user_id="1001")

# To close the hive connection. Please close the connection after use!
hive.close_connection()

To use sftp-related operations -

from refractio import sftp

# To get sftp connection object with a default sftp connection created by the user, if available.
sftp.get_connection()

# To get sftp connection object with a specific connection name
sftp.get_connection(connection_name="sftp_con_name")

# To read a specific dataset published from an sftp connection
sftp.get_dataframe("dataset_name")

# To read a specific dataset published from an sftp connection with only the top few records.
sftp.get_dataframe("dataset_name", row_count=3)

# Use the sftp connection object c for any sftp-related operation (get, put, listdir, etc.)
c = sftp.get_connection()

# To close the sftp connection. Please close the connection after use!
sftp.close_connection()
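A short sketch of file transfer through the sftp connection object, assuming it exposes the get, put, and listdir methods mentioned above (paramiko-style); all paths are placeholders:

from refractio import sftp

try:
    # Connection object used directly for file operations.
    c = sftp.get_connection()

    # List a remote directory (placeholder path).
    print(c.listdir("/remote/dir"))

    # Download a remote file and upload a local file (placeholder paths).
    c.get("/remote/dir/input.csv", "input.csv")
    c.put("output.csv", "/remote/dir/output.csv")
finally:
    # Always close the connection after use.
    sftp.close_connection()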

To use Amazon S3-related operations -

from refractio import s3

# To get s3 connection object with a default s3 connection created by the user, if available.
s3.get_connection()

# To get s3 connection object with a specific connection name
s3.get_connection(connection_name="s3_con_name")

# To read a specific dataset published from an s3 connection
s3.get_dataframe("dataset_name")

# To read a specific dataset published from an s3 connection with only the top few records.
s3.get_dataframe("dataset_name", row_count=3)

# Use the s3 connection object c for any s3-related operation.
c = s3.get_connection()
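A sketch of using the s3 connection object directly. Whether get_connection returns a boto3 client, a boto3 resource, or something else is not stated here, so the boto3-client-style calls below are an assumption, and the bucket and prefix names are placeholders:

from refractio import s3

# Assumed to behave like a boto3 S3 client; verify the actual type before use.
c = s3.get_connection()

# List objects under a prefix (placeholder bucket and prefix).
response = c.list_objects_v2(Bucket="my-bucket", Prefix="datasets/")
for obj in response.get("Contents", []):
    print(obj["Key"])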

To use Azure Blob-related operations -

from refractio import azure

# To get azure blob connection object with a default azure connection created by the user, if available.
azure.get_connection()

# To get azure blob connection object with a specific connection name
azure.get_connection(connection_name="azureblob_con_name")

# To read a specific dataset published from an azureblob connection
azure.get_dataframe("dataset_name")

# To read a specific dataset published from an azure connection with only the top few records.
azure.get_dataframe("dataset_name", row_count=3)

# Use the azure connection object c for any azure-related operation.
c = azure.get_connection()

Note: Usage documentation will be updated in upcoming releases.

Download files


Source Distribution

refractio-2.0.1.tar.gz (10.6 kB)

Uploaded Source

Built Distribution

refractio-2.0.1-py3-none-any.whl (15.0 kB)

Uploaded Python 3

File details

Details for the file refractio-2.0.1.tar.gz.

File metadata

  • Download URL: refractio-2.0.1.tar.gz
  • Size: 10.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.7

File hashes

Hashes for refractio-2.0.1.tar.gz

  • SHA256: 0cb7904ce7eb8c5dc64f5deaafda25fa81a724cebc4992877e079950bceeeffc
  • MD5: 3dd4c03b8dc5e62c5242aeab114c9c2a
  • BLAKE2b-256: e25b8c834e549b60f4295d003c313de96aa5743f766ea0ffb9630d3eae91d32b



File details

Details for the file refractio-2.0.1-py3-none-any.whl.

File metadata

  • Download URL: refractio-2.0.1-py3-none-any.whl
  • Size: 15.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.7

File hashes

Hashes for refractio-2.0.1-py3-none-any.whl

  • SHA256: 9bd1c11b5eb8e4370c9ddbc56ea9d50116346861f891f48749e6de78a8e4c8b8
  • MD5: 07f9f54eb4c4b256ad093cfd732c3dea
  • BLAKE2b-256: f1078c9c05e1d2efd0e0415729836ed89f20caa98d02a343b2ef48663d614728


