Google Cloud Storage and Azure Storage functions for working with unstructured data
Project description
Pyplatform is a data analytics platform architecture built around Google BigQuery in a hybrid cloud environment.
The platform:
- provides a fast, scalable, and reliable SQL database solution
- abstracts away the infrastructure by building data pipelines with serverless compute solutions in Python runtime environments
- simplifies the development environment by using JupyterLab as the main tool
Installation
pip install pyplatform
Setting up development environment
git clone https://github.com/mhadi813/pyplatform
cd pyplatform
conda env create -f pyplatform_dev.yml
Environment variables
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/default_service_account.json'
os.environ['DATASET'] = 'default_bigquery_dataset_name'
os.environ['STORAGE_BUCKET'] = 'default_storage_bucket_id'
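Before running a pipeline, it can help to confirm that these variables are set. The following is a minimal sketch using only the standard library. `REQUIRED_VARS` and `missing_env_vars` are hypothetical names introduced for illustration and are not part of pyplatform.

```python
import os

# Variable names taken from the "Environment variables" section above.
REQUIRED_VARS = ("GOOGLE_APPLICATION_CREDENTIALS", "DATASET", "STORAGE_BUCKET")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names in `required` that are unset or empty.

    `environ` defaults to os.environ; passing a plain dict makes the
    check easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments checks the current process environment and returns an empty list when all three variables are set.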
Usage
Common data pipeline architectures:
- HTTP sources
- On-prem servers
- BigQuery integration with Azure Logic Apps
- Event-driven ETL process
- Streaming pipelines
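As a rough illustration of the transform-and-load step these pipelines have in common, here is a sketch that uses only the standard library. It does not use pyplatform's API. The function names and the sample record layout are assumptions made for this example. Rows are serialized as newline-delimited JSON, a format that BigQuery load jobs accept.

```python
import json
from datetime import datetime, timezone


def transform(records, loaded_at=None):
    """Normalize column names and stamp each row with a load time.

    Keys are stripped, lowercased, and spaces become underscores so they
    are usable as column names. `loaded_at` defaults to the current UTC time.
    """
    stamp = (loaded_at or datetime.now(timezone.utc)).isoformat()
    rows = []
    for record in records:
        row = {
            key.strip().lower().replace(" ", "_"): value
            for key, value in record.items()
        }
        row["loaded_at"] = stamp
        rows.append(row)
    return rows


def to_ndjson(rows):
    """Serialize rows as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)
```

In a real pipeline the records would come from one of the sources listed above, and the resulting text would be written to a storage bucket or passed to a load job. Both steps are left out here because they depend on the chosen client library.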
Exploring modules
import pyplatform as pyp
pyp.show_me()
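The output of `show_me()` is not described here. As a library-agnostic alternative, the standard library's `pkgutil` can list the submodules of any importable package. `list_submodules` is a hypothetical helper written for this sketch, not a pyplatform function. The example call uses the standard `json` package so that it runs without pyplatform installed.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the sorted names of the direct submodules of a package.

    Plain modules (which have no __path__) yield an empty list.
    """
    package = importlib.import_module(package_name)
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []
    return sorted(info.name for info in pkgutil.iter_modules(search_path))


# Example with a standard-library package
print(list_submodules("json"))
```

After installing the package, `list_submodules("pyplatform")` should work the same way, assuming it is laid out as a regular package.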
Hashes for pyplatform-datalake-0.0.2.tar.gz
Algorithm | Hash digest
---|---
SHA256 | a8d244906852a0c4878e6f27ee001d8a4be26cfabb1a156d9030ec8094dd126d
MD5 | 5265f3e49d8722b78fbb8edb4367b180
BLAKE2b-256 | fc6be5d0db113784c663b76ddddc53fbef69148f693aa4bd90676320115cd18d

Hashes for pyplatform_datalake-0.0.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 987ae554a25f6049786b8e976faac257c19b4c71bacb44f8e4682ccf8a8847ca
MD5 | baafdb0d23d7c3c03b42cd0dc74b0b19
BLAKE2b-256 | a3b14070c1699fae4bfb8feeee70cd75ffd31e05f2ff4be76d59eeb77069af79