flow cli application
Project description
buildflow
buildflow is a unified batch and streaming framework that turns any python function into a scalable data pipeline that can read from our supported IO resources.
Key Features:
- Production Ready - Ready made IO connectors let users focus on processing data instead of reading and writing data
- Fast - Scalable multiprocessing powered by Ray
- Easy to learn- Get started with 2 lines of code
Quick Start
Install the framework
pip install buildflow
Import the framework.
import buildflow as flow
Add the flow.processor
decorator to your function to attach IO.
QUERY = 'SELECT * FROM `table`'
@flow.processor(input_ref=flow.BigQuery(query=QUERY))
def process(bigquery_row):
...
Use flow.run()
to kick off your pipeline.
flow.run()
Examples
All samples can be found here.
Streaming pipeline reading from Google Pub/Sub and writing to BigQuery.
# Turn your function into a stream processor
@flow.processor(
input_ref=flow.PubSub(subscription='my_subscription'),
output_ref=flow.BigQuery(table_id='project.dataset.table'),
)
def stream_process(pubsub_message):
...
flow.run()
Streaming pipeline reading from / writing to Google Pub/Sub.
# Turn your function into a stream processor
@flow.processor(
input_ref=flow.PubSub(subscription='my_subscription'),
output_ref=flow.PubSub(topic='my_topic'),
)
def stream_process(pubsub_message):
...
flow.run()
Batch pipeline reading and writing to BigQuery.
import buildflow as flow
QUERY = 'SELECT * FROM `project.dataset.input_table`'
@flow.processor(
input_ref=flow.BigQuery(query=QUERY),
output_ref=flow.BigQuery(table_id='project.dataset.output_table'),
)
def process(bigquery_row):
...
flow.run()
Batch pipeline reading from BigQuery and returning output locally.
import buildflow as flow
QUERY = 'SELECT * FROM `table`'
@flow.processor(input_ref=flow.BigQuery(query=QUERY))
def process(bigquery_row):
...
processed_rows = flow.run()
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
buildflow-0.0.2.tar.gz
(22.7 kB
view hashes)
Built Distribution
buildflow-0.0.2-py3-none-any.whl
(32.0 kB
view hashes)
Close
Hashes for buildflow-0.0.2-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 87773a77b892c2447b522b57279f5f5903bfb8d6d1e0539f62349720e9017314 |
|
MD5 | 7f98ee91e67ec144ca509199ee5014fd |
|
BLAKE2b-256 | db5194e19392588ca7e396754c44de2324b4a7df41dc0ff7b887677cb9b6ebd7 |