
Pyplatform provides wrapper functions for using Google BigQuery as a data warehouse and for building data pipelines that use Google Cloud, Microsoft Azure, O365, and Tableau Server as sources and destinations.

Project description

The platform architecture:

  • provides a fast, scalable SQL data warehousing service
  • abstracts away the infrastructure by building data pipelines on serverless compute solutions in Python runtime environments
  • simplifies the development environment by using JupyterLab as the main tool

Installation

pip install pyplatform
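
To confirm that the install succeeded, the package version can be read back from the environment. This is a minimal sketch using only the standard library (Python 3.8 or later); the helper name installed_version is an illustration and not part of pyplatform.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the version string if pyplatform is installed, otherwise None.
print(installed_version("pyplatform"))
```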

Setting up development environment

git clone https://github.com/mhadi813/pyplatform
cd pyplatform
conda env create -f pyplatform_dev.yml

Environment variables

import os

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/default_service_account.json'
os.environ['DATASET'] = 'default_bigquery_dataset_name'
os.environ['STORAGE_BUCKET'] = 'default_storage_bucket_id'
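
A quick check that these three variables are in place can save a failed pipeline run later. The sketch below is an assumption about how such a check might look; the helper names missing_settings and credentials_file_exists are hypothetical and do not come from pyplatform.

```python
import os

# The three variables named in the section above.
REQUIRED_VARS = ("GOOGLE_APPLICATION_CREDENTIALS", "DATASET", "STORAGE_BUCKET")


def missing_settings(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]


def credentials_file_exists(environ=os.environ):
    """Check that the service account path points at an existing file."""
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    return os.path.isfile(path)
```

Calling missing_settings() with no arguments inspects the current process environment; passing a plain dict makes the check easy to try without touching real settings.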

Usage

Common data pipeline architectures:

- HTTP sources

- On-prem servers

- BigQuery integration with Azure Logic Apps

- Event-driven ETL process

- Streaming pipelines
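
As an illustration of the first architecture in the list (an HTTP source feeding BigQuery), here is a rough sketch. It does not use pyplatform's own wrappers, whose signatures are not shown on this page; it calls the google-cloud-bigquery client directly. The function names, the shape of the JSON payload (an array of flat objects), and the use of the DATASET variable for the target dataset are all assumptions made for the example.

```python
import json
import os
from urllib.request import urlopen


def fetch_rows(url):
    """Download a JSON array of flat objects from an HTTP endpoint."""
    with urlopen(url) as response:
        return json.load(response)


def normalize_rows(rows):
    """Give every row the same snake_case keys, filling gaps with None."""
    columns = sorted({key for row in rows for key in row})
    return [
        {col.strip().lower().replace(" ", "_"): row.get(col) for col in columns}
        for row in rows
    ]


def load_rows(rows, table_name):
    """Load rows into a table in the default dataset (needs google-cloud-bigquery)."""
    # Imported lazily so the helpers above work without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # picks up GOOGLE_APPLICATION_CREDENTIALS
    table_id = f"{client.project}.{os.environ['DATASET']}.{table_name}"
    job = client.load_table_from_json(rows, table_id)
    job.result()  # block until the load job finishes
    return table_id
```

A run would chain the three steps, for example load_rows(normalize_rows(fetch_rows(url)), "orders"), where both the URL and the table name are placeholders. Keeping the transformation in a pure function makes it possible to test without network access or credentials.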

Exploring modules

import pyplatform as pyp
pyp.show_me()
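
The output of show_me() is not reproduced on this page. As a generic alternative that works for any installed package, the standard library can list a package's direct submodules. This is a sketch with a hypothetical helper name, list_submodules, and it is not a substitute for the package's own helper.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the dotted names of the modules directly inside a package."""
    package = importlib.import_module(package_name)
    prefix = package.__name__ + "."
    return sorted(info.name for info in pkgutil.iter_modules(package.__path__, prefix))
```

With pyplatform installed, list_submodules("pyplatform") would return its top-level modules. The helper expects a package; a single-file module has no __path__ attribute and raises AttributeError.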

Download files

Source Distribution

pyplatform-2020.7.2.tar.gz (2.5 kB, source)

Built Distribution

pyplatform-2020.7.2-py3-none-any.whl (14.5 kB, Python 3)
