BigQuery client wrapper with clean API
BigFlow
Documentation
- What is BigFlow?
- Getting started
- Installing BigFlow
- Help me
- BigFlow tutorial
- CLI
- Configuration
- Project setup and build
- Deployment
- Workflow & Job
- Starter
- Technologies
- Logging
- Roadmap
What is BigFlow?
BigFlow is a Python framework for data processing pipelines on GCP.
The main features are:
- Dockerized deployment environment
- Powerful CLI
- Automated build, deployment, versioning and configuration
- Unified project structure
- Support for the major data processing technologies — Dataproc (Apache Spark), Dataflow (Apache Beam) and BigQuery
- Project starter
Getting started
Start by setting up a development environment. Next, go through the BigFlow tutorial.
Installing BigFlow
Prerequisites. Before you start, make sure you have the following software installed:
- Python == 3.7
- Google Cloud SDK
- Docker Engine
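The prerequisites above can be checked with a short script before installing anything. This is a minimal sketch, not part of BigFlow: the function name `check_prerequisites` is hypothetical, and it assumes that the Google Cloud SDK and Docker Engine expose the `gcloud` and `docker` executables on your `PATH`.

```python
import shutil
import sys
from typing import Dict, Optional, Tuple

# Python version taken from the prerequisites list.
REQUIRED_PYTHON = (3, 7)

# Assumption: these are the executable names the tools put on PATH.
REQUIRED_TOOLS = {
    "Google Cloud SDK": "gcloud",
    "Docker Engine": "docker",
}


def check_prerequisites(
    python_version: Optional[Tuple[int, int]] = None,
) -> Dict[str, bool]:
    """Return a mapping of prerequisite name to whether it looks satisfied."""
    if python_version is None:
        python_version = (sys.version_info[0], sys.version_info[1])

    results = {"Python == 3.7": tuple(python_version) == REQUIRED_PYTHON}
    for label, executable in REQUIRED_TOOLS.items():
        # shutil.which returns None when the executable is not found on PATH.
        results[label] = shutil.which(executable) is not None
    return results


if __name__ == "__main__":
    for label, ok in check_prerequisites().items():
        print("{:<18} {}".format(label, "ok" if ok else "MISSING"))
```

The script only confirms that the executables are discoverable; it does not verify that Docker is running or that `gcloud` is authenticated.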
You can install the bigflow package globally, but we recommend installing it locally with venv, in your project's folder:
python -m venv .bigflow_env
source .bigflow_env/bin/activate
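If you are unsure whether the environment was activated, one way to check is to compare the interpreter prefixes from inside Python. This is a small sketch; the helper name `in_virtualenv` is hypothetical and relies only on the standard `sys` module.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Run it with the same `python` command you intend to use for `pip install`, so that the check applies to the interpreter that will receive the package.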
Install the bigflow PIP package:
pip install bigflow==1.0.dev67
Test it:
bigflow -h
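The same smoke test can be scripted, for example in a setup check. The sketch below is an illustration under assumptions: the helper name `cli_responds` is hypothetical, and it assumes that `bigflow -h` exits with status 0 when the installation works, as help flags conventionally do.

```python
import shutil
import subprocess


def cli_responds(command: str = "bigflow") -> bool:
    """Return True if `command -h` can be found and exits with status 0."""
    # Skip the subprocess call entirely when the executable is not on PATH.
    if shutil.which(command) is None:
        return False

    completed = subprocess.run(
        [command, "-h"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # Assumption: a zero exit status from the help flag means the CLI works.
    return completed.returncode == 0


if __name__ == "__main__":
    print("bigflow CLI available:", cli_responds())
```

A `False` result inside an activated environment usually means the package was installed into a different interpreter than the one on your `PATH`.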
Read more about BigFlow CLI.
Help me
You can ask questions on our Gitter channel or on Stack Overflow.
Download files
Source Distribution
bigflow-1.0.dev72.tar.gz (57.8 kB)
Built Distribution
Hashes for bigflow-1.0.dev72-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | b9e587a6a2c55b2836d3087cac7c50b6bb343237e61a867bee2227dd7cd45725 |
MD5 | 60d20d4f98e280905730fc9a9f8684f1 |
BLAKE2b-256 | 38d568d7ef5ffd27d76d454a95082b666bd8114e294cb90c71ff5b4e0530f39b |