Python stream processing
Project description
Beavers
Beavers is a Python library for stream processing, optimized for analytics.
It is used at Tradewell Technologies to calculate analytics and serve model predictions in both real-time and batch jobs.
Key Features
- Works in real time (e.g. reading from Kafka) and in replay mode (e.g. reading from Parquet)
- Optimized for analytics: it uses micro-batching instead of processing records one by one
- Similar to Incremental, it updates nodes in a DAG incrementally
- Taking inspiration from Kafka Streams, there are two types of nodes in the DAG:
  - Stream: ephemeral micro-batches of events (cleared after every cycle)
  - State: durable state derived from streams
- Clear separation between the DAG (which contains the business logic) and the IO (where the data comes from and goes to), so the same DAG can run in real-time mode or replay mode, and can be easily tested.
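The stream/state distinction and the split between DAG and IO can be illustrated with a small, self-contained sketch. It does not use the Beavers API: `ToyDag`, `run_cycle`, `replay` and the event fields `symbol` and `qty` are hypothetical names chosen for illustration only. The sketch assumes events arrive as micro-batches (lists of dicts) and that the state is a running total per symbol.

```python
from typing import Dict, Iterable, List

Event = Dict[str, object]


class ToyDag:
    """Hypothetical toy DAG (not the Beavers API): one stream node, one state node."""

    def __init__(self) -> None:
        # Stream node: ephemeral micro-batch, cleared after every cycle.
        self.stream: List[Event] = []
        # State node: durable state derived from the stream.
        self.state: Dict[str, float] = {}

    def run_cycle(self, batch: List[Event]) -> Dict[str, float]:
        """Process one micro-batch and return a snapshot of the state."""
        # Stream node: keep only events with a positive quantity.
        self.stream = [event for event in batch if float(event["qty"]) > 0]
        # State node: update the running total per symbol incrementally.
        for event in self.stream:
            symbol = str(event["symbol"])
            self.state[symbol] = self.state.get(symbol, 0.0) + float(event["qty"])
        snapshot = dict(self.state)
        # The stream is cleared at the end of the cycle; the state persists.
        self.stream = []
        return snapshot


def replay(dag: ToyDag, batches: Iterable[List[Event]]) -> List[Dict[str, float]]:
    """IO layer for replay mode: feed stored batches to the DAG, one cycle each."""
    return [dag.run_cycle(batch) for batch in batches]


# In-memory batches stand in for a file or a message queue.
batches = [
    [{"symbol": "A", "qty": 1.0}, {"symbol": "B", "qty": 2.0}],
    [{"symbol": "A", "qty": 3.0}, {"symbol": "A", "qty": -1.0}],
]
dag = ToyDag()
results = replay(dag, batches)
```

Because `ToyDag` knows nothing about where batches come from, the same object could be driven by a real-time consumer loop or by a unit test that passes hand-written batches, which is the property the last bullet describes.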
Source Distribution
beavers-0.0.1rc0.tar.gz (9.1 kB)
Built Distribution
beavers-0.0.1rc0-py3-none-any.whl
Hashes for beavers-0.0.1rc0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 24a7d0db9d51a251f242e69b6c50ad9e37c7de2ed8ec445bb70108d3412c7128
MD5 | a270d9525104135dc4015a63f1ca2997
BLAKE2b-256 | f56e074db3bc05f12a901ca7139cb5a54a61cec293eec69a71c3f091fa6636aa