Project description

QAPI SDK

QAPI SDK provides a library of classes for working with Query API in your Python code.

Requirements

  • Python 3.6+
  • You must be connected to the private VPN.

Installation

pip install qapi-sdk 

Environment Variables

  • QAPI_URL: QAPI Base URL.

  • EMAIL: Your email address.

    Optional: If you also set the AWS credentials below, you can use the read_columns method to read in the headers of your CSV file automatically.

  • AWS_ACCESS_KEY_ID: AWS Access Key ID

  • AWS_SECRET_ACCESS_KEY: AWS Secret Access Key

  • AWS_DEFAULT_REGION: AWS Default Region
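As a quick sanity check before running the examples, the sketch below (not part of the SDK, just a convenience) loads these variables from a local .env file with python-dotenv, as the examples in this README do, and reports which optional ones are missing:

import os

from dotenv import load_dotenv

# Load variables from a local .env file into the environment.
load_dotenv()

# Required for every example in this README.
assert os.getenv("QAPI_URL"), "QAPI_URL is not set"
assert os.getenv("EMAIL"), "EMAIL is not set"

# Optional: only needed if you want read_columns to read CSV headers from S3.
for optional_var in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION"):
    if not os.getenv(optional_var):
        print(f"{optional_var} is not set; the read_columns example below may not work.")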

Examples

Query

  • FEED ID: The table must exist in Athena.
  • QUERY ID: A unique identifier for the query. Once you have retrieved your data from S3, it is advised to delete the query.
  • SQL: The SQL query to be executed.
import time

from dotenv import load_dotenv

from qapi_sdk import Query

load_dotenv()

# Step 1: Assign your FEED ID, QUERY ID, and SQL QUERY
feed_id = "[FEED/TABLE NAME]"
query_id = "[QUERY NAME]"
query = f"SELECT * FROM {feed_id}"

# Step 2: Create a Query object
my_query = Query(
    feed_id=feed_id,
    query_id=query_id
)

# Step 3: Execute the query push
my_query.push_query(sql=query)

# Step 4: Wait for the query to complete
while my_query.query_status():
    print("Waiting for query to complete...")
    time.sleep(10)

# Step 5 (Optional): Delete the query
my_query.delete_query()
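The README assumes you retrieve the results from S3 yourself but does not show that step. Assuming you already know the bucket and prefix your results are written to (both placeholders below are hypothetical and depend on your QAPI deployment), a minimal boto3 sketch could look like this:

import os

import boto3

# Hypothetical locations: replace with the bucket and prefix where your query results land.
results_bucket = "[RESULTS BUCKET]"
results_prefix = "path/to/your/query/results/"

s3 = boto3.client("s3")

# List every object under the results prefix and download it next to this script.
response = s3.list_objects_v2(Bucket=results_bucket, Prefix=results_prefix)
for obj in response.get("Contents", []):
    filename = os.path.basename(obj["Key"])
    if filename:
        s3.download_file(results_bucket, obj["Key"], filename)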

Feed

  • FEED ID: The table name you want to create in Athena.
  • PUSH ID: A unique identifier for the push.
  • COLUMNS: The name of the columns that will be pushed to Athena.
import time

from dotenv import load_dotenv

from qapi_sdk import Feed

load_dotenv()

# Step 1: Assign your FEED ID, PUSH ID, and COLUMNS
feed_id = "[FEED/TABLE NAME]"
push_id = "[PUSH ID/PUSH NAME]"

# Step 2: Create a Feed object
my_feed = Feed(feed_id=feed_id, push_id=push_id)

# Step 3: You can manually assign the columns
columns = [
    {
        "name": "email",
        "type": "string"
    },
    {
        "name": "md5email",
        "type": "string"
    },
    {
        "name": "firstname",
        "type": "string"
    }
]

# Step 3a (Optional): If you added AWS credentials, you can use the `read_columns` method to read 
# in the headers of your CSV file automatically.
columns = my_feed.read_columns(
    data_bucket="[DATA BUCKET]",
    data_key="path/to/your/data/dir/ OR path/to/your/data/file.csv",
    delimiter=","
)

# Step 4: Define where to grab the data and the format of the data. Then push the data to Athena.
my_feed.push_feed(
    pull_path_bucket="[DATA BUCKET]",
    pull_path_key="path/to/your/data/dir OR path/to/your/data/file.csv",
    columns=columns,
    separator=","
)

# Step 5: Wait for the push to complete
while my_feed.push_status():
    print("Waiting for push to complete...")
    time.sleep(10)

# Step 6 (Optional): Delete the push
my_feed.delete_push()

# Step 7 (Optional): Delete the feed
my_feed.delete_feed()
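push_feed reads data that already lives in S3, so if your CSV is still local you need to stage it there first. A minimal boto3 sketch, assuming the AWS credentials above and the same placeholder bucket and key used in the example:

import boto3

# Placeholder names: use the same bucket and key you pass to push_feed.
data_bucket = "[DATA BUCKET]"
data_key = "path/to/your/data/file.csv"

s3 = boto3.client("s3")

# Upload the local CSV so push_feed can read it from S3.
s3.upload_file("file.csv", data_bucket, data_key)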

Redshift

  • FEED ID: You must use an existing feed.
  • QUERY ID: A unique identifier for the query. Once you have retrieved your data from S3, it is advised to delete the query.
  • SQL: The SQL query to be executed.
  • If you query an Athena table from Redshift, you must append the Athena schema to the table name.
    • For example: SELECT * FROM [query_api].[TABLE NAME]
  • If you use a LIMIT clause, you must wrap the query in a SELECT * FROM () clause.
    • For example: SELECT * FROM (SELECT * FROM [TABLE NAME] LIMIT 100)
import time

from dotenv import load_dotenv

from qapi_sdk import Redshift

load_dotenv()

# Step 1: Assign your FEED ID, QUERY ID, and SQL QUERY
feed_id = "[EXISTING FEED ID]"
query_id = "[QUERY NAME]"
query = "SELECT * FROM (SELECT * FROM [SCHEMA].[TABLE NAME] LIMIT 10)"

# Step 2: Create a Redshift object
my_query = Redshift(
    feed_id=feed_id,
    query_id=query_id
)

# Step 3: Execute the query push
my_query.push_query(sql=query)

# Step 4: Wait for the query to complete
while my_query.query_status():
    print("Waiting for query to complete...")
    time.sleep(10)

# Step 5 (Optional): Delete the query
my_query.delete_query()

CHANGELOG

[0.3.4] - 2022-06-02

  • Updated README.md to inform the user that the push_feed method can push either a directory of files or a single file.
  • Updated README.md to inform the user that the read_columns method can automatically read CSV headers from either a directory or a single file.
  • Renamed the read_columns parameter data_key_dir to data_key for consistency with the other methods.
  • Reordered the steps in README.md to make it easier to follow.

[0.3.3] - 2022-06-01

  • Updated package to support Python 3.6+.

[0.3.2] - 2022-05-30

  • Updated README.md

[0.3.0] - 2022-05-30

  • Added Redshift object to the SDK.
  • Added delete_query method to Redshift class.
  • Added query_status method to Redshift class.
  • Added push_query method to Redshift class.
  • Updated README.md

[0.2.1] - 2022-05-30

  • Added homepage and repository links to the pyproject.toml file.

[0.2.0] - 2022-05-29

  • Added FEED object to the SDK.
  • Added read_columns method to Feed class.
  • Added delete_push method to Feed class.
  • Added delete_feed method to Feed class.
  • Added push_status method to Feed class.
  • Added push_feed method to Feed class.
  • Updated README.md

[0.1.4] - 2022-05-29

  • Added QUERY object to the SDK.
  • Added delete_query method to Query class.
  • Added query_status method to Query class.
  • Added push_query method to Query class.
  • Added the CHANGELOG section.
  • Updated README.md

Download files

Download the file for your platform.

Source Distribution

qapi-sdk-0.3.4.tar.gz (8.2 kB)

Uploaded Source

Built Distribution

qapi_sdk-0.3.4-py3-none-any.whl (9.0 kB)

Uploaded Python 3

File details

Details for the file qapi-sdk-0.3.4.tar.gz.

File metadata

  • Download URL: qapi-sdk-0.3.4.tar.gz
  • Upload date:
  • Size: 8.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.10.4 Darwin/21.6.0

File hashes

Hashes for qapi-sdk-0.3.4.tar.gz:

  • SHA256: e7554ab87774b84b8b67c940f514a87b5e7a5497cb50bbfc9f5b52165f67fb47
  • MD5: b1361d0b4bf716576c84f48dd9d1e975
  • BLAKE2b-256: 4703a40b8ac142771d3c0aab01461f813f86688af71da19220cd79a615d2cbca

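If you want to verify a downloaded archive against the SHA256 digest above, a small hashlib check is enough:

import hashlib

# Published SHA256 digest for the 0.3.4 source distribution (from the list above).
expected = "e7554ab87774b84b8b67c940f514a87b5e7a5497cb50bbfc9f5b52165f67fb47"

with open("qapi-sdk-0.3.4.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

print("OK" if digest == expected else "MISMATCH")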

File details

Details for the file qapi_sdk-0.3.4-py3-none-any.whl.

File metadata

  • Download URL: qapi_sdk-0.3.4-py3-none-any.whl
  • Upload date:
  • Size: 9.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.13 CPython/3.10.4 Darwin/21.6.0

File hashes

Hashes for qapi_sdk-0.3.4-py3-none-any.whl:

  • SHA256: 536147feb931371795376cf8b7167a40665f91dfcc701260ee4be4d8d7d48914
  • MD5: c1a4e866b740cc2d9437da0067762f98
  • BLAKE2b-256: 0fdf497961056cebf5a6c55174e727d45e91ae1a16f6f8d9e6ad8c5a90f36cbb

