Pyplatform is a data analytics platform built around Google BigQuery. This package provides wrapper functions for interacting with cloud services and creating data pipelines using Google Cloud, Microsoft Azure, O365, and Tableau Server as source and destination.
The platform architecture:
- enables a fast and scalable SQL data warehousing service
- abstracts away the infrastructure by building data pipelines with serverless compute solutions in Python runtime environments
- simplifies the development environment by using JupyterLab as the main tool
Installation
pip install pyplatform
Setting up development environment
git clone https://github.com/mhadi813/pyplatform
cd pyplatform
conda env create -f pyplatform_dev.yml
Authentication and environment variables
Credential file paths can be set as environment variables in the conda env activation script or in bash profiles. Please refer to the conda documentation on environment variables and environment activation scripts.
import os
# if the env activation script has not been created, update the paths to the credential files
# see the ``secrets`` folder for credential templates
# see each function's docstring for the authentication methods available when calling it
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = './secrets/dummy_gcp_service_account_credentials.json'
os.environ['AZURE_CREDENTIALS'] = './secrets/dummy_ms_azure_credentials.json'
os.environ['TABLEAU_SERVER_CREDENTIALS'] = './secrets/dummy_tableau_server_credentials.json'
os.environ['PIVOTAL_CREDENTIALS'] = './secrets/dummy_pivotal_credentials.json'
os.environ['DATASET'] = 'default_bigquery_dataset_name'
os.environ['STORAGE_BUCKET'] = 'default_storage_bucket_id'
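Before calling any wrapper function, it can help to confirm that the credential variables listed above actually point at readable JSON files. The following is a minimal sketch using only the standard library; `check_credentials` and `CREDENTIAL_VARS` are hypothetical names introduced for illustration and are not part of pyplatform, and the sketch assumes each credential file is JSON, as the `.json` extensions above suggest.

```python
import json
import os
from pathlib import Path

# Variable names are taken from the snippet above; this helper is hypothetical.
CREDENTIAL_VARS = (
    "GOOGLE_APPLICATION_CREDENTIALS",
    "AZURE_CREDENTIALS",
    "TABLEAU_SERVER_CREDENTIALS",
    "PIVOTAL_CREDENTIALS",
)


def check_credentials(env=None):
    """Return a dict mapping each credential variable to a status string."""
    env = os.environ if env is None else env
    status = {}
    for name in CREDENTIAL_VARS:
        path = env.get(name)
        if not path:
            status[name] = "not set"
        elif not Path(path).is_file():
            status[name] = "file not found"
        else:
            try:
                # Parse the file only to confirm it is well-formed JSON.
                with open(path, encoding="utf-8") as handle:
                    json.load(handle)
                status[name] = "ok"
            except (json.JSONDecodeError, UnicodeDecodeError):
                status[name] = "not valid JSON"
    return status


if __name__ == "__main__":
    for name, state in check_credentials().items():
        print(f"{name}: {state}")
```

Passing a plain dict as `env` makes the check easy to try without touching the real environment.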
Usage
Common usage patterns:
- HTTP sources
- On-prem sources with VPN requirement
- BigQuery integration with Azure Logic Apps
- Event-driven ETL process
- Streaming pipelines
Exploring the modules
from pyplatform.common import *
show_me()
import pyplatform as pyp
show_me(pyp)
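If pyplatform is not installed yet, a similar overview of any module can be obtained with the standard library. The sketch below is not how `show_me` is implemented; `list_public_functions` is a hypothetical helper, and the built-in `json` module stands in for `pyplatform` so the example runs anywhere.

```python
import inspect
import json  # stand-in module for the demo; swap in pyplatform once installed


def list_public_functions(module):
    """Return the sorted names of public functions exposed by a module."""
    return sorted(
        name
        for name, obj in inspect.getmembers(module, inspect.isfunction)
        if not name.startswith("_")
    )


print(list_public_functions(json))
```

To inspect a single function in more detail, `help(some_function)` prints its docstring, which is where the package documents its authentication methods.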