Kinesis Consumer in Python
A Kinesis consumer written purely in Python. It is a lightweight wrapper on top of boto3, the AWS SDK for Python. You can also consume records from a Kinesis Data Stream (KDS) via:
- Lambda function: I have a demo, kinesis-lambda-sqs-demo, showing how to consume records in a serverless, real-time way.
- Kinesis Firehose: This is an AWS managed service that easily saves records into different sinks, such as S3, Elasticsearch, and Redshift.
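For comparison with the Lambda option above, here is a minimal sketch of a handler that reads Kinesis records from a Lambda event. It is not taken from kinesis-lambda-sqs-demo; the names `extract_payloads` and `handler` are illustrative. It relies only on the fact that Lambda delivers each record's payload base64-encoded under `record["kinesis"]["data"]`.

```python
import base64


def extract_payloads(event):
    """Return the decoded payload (bytes) of every Kinesis record in the event."""
    payloads = []
    for record in event["Records"]:
        # Lambda passes the record data as a base64-encoded string.
        payloads.append(base64.b64decode(record["kinesis"]["data"]))
    return payloads


def handler(event, context):
    """Illustrative Lambda entry point: print each payload and report the count."""
    payloads = extract_payloads(event)
    for payload in payloads:
        print(payload)
    return {"processed": len(payloads)}
```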
Installation
Install the package via pip:
pip install kcpy
Getting started
from kcpy import StreamConsumer

consumer = StreamConsumer('my_stream_name')
# Iterating over the consumer yields one record dict at a time.
for record in consumer:
    print(record)
The output would look like:
{
    'ApproximateArrivalTimestamp': datetime.datetime(2018, 11, 13, 11, 57, 55, 117807),
    'Data': b'Jessica Walter',
    'PartitionKey': 'Jessica Walter',
    'SequenceNumber': '1'
}
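Note that `Data` arrives as bytes. If your producers write UTF-8 text, a small helper like the one below can decode it. This is a sketch, not part of kcpy: `decode_record` is a hypothetical name, and the UTF-8 encoding is an assumption about your data.

```python
import datetime


def decode_record(record, encoding="utf-8"):
    """Return a copy of a record dict with its 'Data' bytes decoded to text."""
    decoded = dict(record)  # shallow copy so the original record is untouched
    decoded["Data"] = record["Data"].decode(encoding)
    return decoded


# Sample record with the same shape as the output shown above.
sample = {
    "ApproximateArrivalTimestamp": datetime.datetime(2018, 11, 13, 11, 57, 55, 117807),
    "Data": b"Jessica Walter",
    "PartitionKey": "Jessica Walter",
    "SequenceNumber": "1",
}
print(decode_record(sample)["Data"])
```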
Or, you can consume stream data with checkpointing:
from kcpy import StreamConsumer

# consumer_name identifies this consumer; checkpoint=True enables checkpointing.
consumer = StreamConsumer('my_stream_name', consumer_name='my_consumer', checkpoint=True)
for record in consumer:
    print(record)
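To illustrate the idea behind checkpointing, here is a minimal sketch of a SQLite-backed checkpoint store. It is not kcpy's actual implementation: the class name, table name, and schema are assumptions, and keying by shard is an illustrative choice (the Todo list below notes that per-shard checkpoints are not implemented yet).

```python
import sqlite3


class SqliteCheckpoint:
    """Hypothetical store for the last processed sequence number per consumer and shard."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints ("
            "consumer_name TEXT, shard_id TEXT, sequence_number TEXT, "
            "PRIMARY KEY (consumer_name, shard_id))"
        )

    def save(self, consumer_name, shard_id, sequence_number):
        # INSERT OR REPLACE overwrites the previous checkpoint for the same key.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO checkpoints VALUES (?, ?, ?)",
                (consumer_name, shard_id, sequence_number),
            )

    def load(self, consumer_name, shard_id):
        # Returns None when no checkpoint has been saved yet.
        row = self.conn.execute(
            "SELECT sequence_number FROM checkpoints "
            "WHERE consumer_name = ? AND shard_id = ?",
            (consumer_name, shard_id),
        ).fetchone()
        return row[0] if row else None
```

On restart, a consumer could pass the loaded value to boto3's `get_shard_iterator` with `ShardIteratorType='AFTER_SEQUENCE_NUMBER'` and `StartingSequenceNumber` set to the saved sequence number, so reading resumes after the last processed record.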
Features
- Read records from a stream with multiple shards
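For readers curious what reading from multiple shards involves at the boto3 level, here is a rough sketch. It is not kcpy's code: the function name `iter_stream_records` and its parameters are illustrative, and it assumes `client` behaves like `boto3.client('kinesis')`. It ignores pagination of `list_shards` and does no throttling or error handling.

```python
def iter_stream_records(client, stream_name, max_rounds=1, limit=100):
    """Yield records from every shard of a stream, polling each shard in turn.

    `client` is expected to behave like boto3.client('kinesis').
    """
    shards = client.list_shards(StreamName=stream_name)["Shards"]

    # One iterator per shard, starting from the oldest available record.
    iterators = {}
    for shard in shards:
        response = client.get_shard_iterator(
            StreamName=stream_name,
            ShardId=shard["ShardId"],
            ShardIteratorType="TRIM_HORIZON",
        )
        iterators[shard["ShardId"]] = response["ShardIterator"]

    for _ in range(max_rounds):
        for shard_id, iterator in list(iterators.items()):
            if iterator is None:  # shard was closed and fully read
                continue
            response = client.get_records(ShardIterator=iterator, Limit=limit)
            # Keep the next iterator so the following round continues from here.
            iterators[shard_id] = response.get("NextShardIterator")
            for record in response["Records"]:
                yield record
```

With real credentials configured, this could be called as `iter_stream_records(boto3.client('kinesis'), 'my_stream_name')`. Passing the client in as an argument keeps the function easy to exercise with a stub.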
Todo
- Save checkpoint for each shard
- Rebalance when the number of shards changes
- Allow kcpy to run on multiple machines
Changelog
0.1.5
- Add consumer checkpointing with a simple sqlite storage solution.
0.1.4
- Pass aws configurations into boto3 client directly.
0.1.3
- Update the README.
0.1.2
- Add markdown support for long description.
0.1.1
- Add a long description.
0.1.0
- First version of kcpy.
License
Copyright (c) 2018 Hengfeng Li. It is free software, and may be redistributed under the terms specified in the LICENSE file.