Pyplatform is a data analytics platform built around Google BigQuery. This package provides wrapper functions for interacting with cloud services and creating data pipelines using Google Cloud, Microsoft Azure, O365, and Tableau Server as source and destination.

Project description

The platform architecture:

  • provides a fast, scalable SQL data warehousing service
  • abstracts away the infrastructure by building data pipelines with serverless compute solutions in Python runtime environments
  • simplifies the development environment by using JupyterLab as the main tool

Installation

pip install pyplatform

Setting up development environment

git clone https://github.com/mhadi813/pyplatform
cd pyplatform
conda env create -f pyplatform_dev.yml

Authentication and environment variables

The credential file paths can be set as environment variables in a conda environment activation script or in a bash profile. See the conda documentation on environment variables and environment activation scripts.

import os
# If no env activation script has been created, update the paths to the credential files here.
# See the ``secrets`` folder for credential templates.
# See each function's docstring for the authentication methods available when calling it.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = './secrets/dummy_gcp_service_account_credentials.json'
os.environ['AZURE_CREDENTIALS'] = './secrets/dummy_ms_azure_credentials.json'
os.environ['TABLEAU_SERVER_CREDENTIALS'] = './secrets/dummy_tableau_server_credentials.json'
os.environ['PIVOTAL_CREDENTIALS'] = './secrets/dummy_pivotal_credentials.json'

os.environ['DATASET'] = 'default_bigquery_dataset_name'
os.environ['STORAGE_BUCKET'] = 'default_storage_bucket_id'
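
Because a missing or mistyped credential path usually only surfaces when a pipeline first calls a cloud service, it can help to check the environment up front. The following is a minimal sketch using only the standard library; it is not part of pyplatform. The variable names are taken from the snippet above, while the helper name ``check_environment`` and the assumption that all six variables are required are hypothetical (you may only need the ones for the services you use).

```python
import os
from pathlib import Path

# Variable names copied from the snippet above; treating all of them as
# required is an assumption made for this illustration.
CREDENTIAL_VARS = [
    "GOOGLE_APPLICATION_CREDENTIALS",
    "AZURE_CREDENTIALS",
    "TABLEAU_SERVER_CREDENTIALS",
    "PIVOTAL_CREDENTIALS",
]
SETTING_VARS = ["DATASET", "STORAGE_BUCKET"]


def check_environment(environ=None):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    environ = os.environ if environ is None else environ
    problems = []
    # Credential variables must be set and must point to an existing file.
    for name in CREDENTIAL_VARS:
        path = environ.get(name)
        if not path:
            problems.append(f"{name} is not set")
        elif not Path(path).is_file():
            problems.append(f"{name} points to a missing file: {path}")
    # Plain settings only need to be non-empty.
    for name in SETTING_VARS:
        if not environ.get(name):
            problems.append(f"{name} is not set")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running this at the top of a notebook lists every unset variable or missing file at once instead of failing on the first cloud call.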

Usage

Common usage patterns:

- HTTP sources
- On-prem sources with VPN requirement
- BigQuery integration with Azure Logic Apps
- Event-driven ETL process
- Streaming pipelines

Exploring the modules

from pyplatform.common import *
show_me()

import pyplatform as pyp
show_me(pyp)
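
The behavior of ``show_me`` is not documented here, so the sketch below is not its implementation. It shows one way to get a similar overview of any module with the standard library's ``inspect`` module. The helper name ``list_public_functions`` is hypothetical, and ``json`` stands in for whichever pyplatform module you want to explore.

```python
import inspect
import json  # stand-in module for demonstration; replace with any imported module


def list_public_functions(module):
    """Hypothetical helper: map each public function name to its first docstring line."""
    summary = {}
    for name, obj in inspect.getmembers(module, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private helpers
        doc = inspect.getdoc(obj) or ""
        summary[name] = doc.splitlines()[0] if doc else ""
    return summary


if __name__ == "__main__":
    for name, first_line in list_public_functions(json).items():
        print(f"{name}: {first_line}")
```

Passing a pyplatform submodule instead of ``json`` should list the wrapper functions it exposes, assuming they are plain Python functions.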

Download files

Files for pyplatform, version 2020.12.1
Filename                               Size      File type  Python version
pyplatform-2020.12.1-py3-none-any.whl  14.7 kB   Wheel      py3
pyplatform-2020.12.1.tar.gz            445.3 kB  Source     None