stream and (de)serialize s3 objects with no local footprint
s3-streaming: handling (big) S3 files like regular files
Storing, retrieving and using files in S3 is a regular activity, so it should be easy. It should also:
- stream the data
- have an API that looks like Python file I/O
- handle some of the deserialization and compression stuff, because why not
Install
pip install s3-streaming
Streaming S3 objects like regular files
The basics
Opening and reading S3 objects is similar to regular python io. The only difference is that you need to provide a
boto3.session.Session instance to handle the bucket access.
import boto3
from s3streaming import s3_open

with s3_open('s3://bucket/key', boto_session=boto3.session.Session()) as f:
    for next_line in f:
        print(next_line)
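To make "streaming" concrete, here is a minimal sketch of line-by-line iteration over a chunked byte stream using only the standard library. It is not s3-streaming's actual implementation, which this page does not show; the helper name `iter_lines` is hypothetical, and the chunks are assumed to come from something like a boto3 response body read in pieces.

```python
from typing import Iterable, Iterator


def iter_lines(chunks: Iterable[bytes], encoding: str = "utf-8") -> Iterator[str]:
    """Yield decoded lines from byte chunks (hypothetical helper, not part of s3-streaming).

    Only the current chunk and one partial line are held in memory,
    so nothing is written to local disk.
    """
    pending = b""
    for chunk in chunks:
        pending += chunk
        # Everything before the last newline is complete; the tail may be partial.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line.decode(encoding)
    if pending:
        # Final line without a trailing newline.
        yield pending.decode(encoding)


# Lines may be split across chunk boundaries; they are reassembled before decoding.
lines = list(iter_lines([b"first\nsec", b"ond\n", b"third"]))
```

Splitting on the newline byte before decoding keeps multi-byte UTF-8 characters intact even when a chunk boundary falls in the middle of one.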
Injecting deserialization and compression handling in stream
Consider a file that is gzip compressed and contains lines of json. There's some boilerplate in dealing with that,
but why bother? Just handle that in stream.
import boto3
from s3streaming import s3_open, deserialize, compression

reader_settings = dict(
    boto_session=boto3.session.Session(),
    deserializer=deserialize.json_lines,
    compression=compression.gzip,
)

with s3_open('s3://bucket/key.gzip', **reader_settings) as f:
    for next_line in f:
        print(next_line.keys())    # because the file was decompressed ...
        print(next_line.values())  # ... and the json is now a loaded dict!
Other deserialize options include:
- csv
- csv_as_dict
- tsv
- tsv_as_dict
- string
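For comparison, the boilerplate for a gzip-compressed file of JSON lines might look roughly like the sketch below when written by hand with the standard library. The function name `iter_gzip_json_lines` is hypothetical, and the sketch assumes `raw` is any binary file-like object (an in-memory buffer stands in for the S3 object here); it is not how s3-streaming is implemented internally.

```python
import gzip
import io
import json
from typing import BinaryIO, Iterator


def iter_gzip_json_lines(raw: BinaryIO) -> Iterator[dict]:
    """Decompress and parse JSON lines lazily (hypothetical helper).

    Assumes each non-empty line holds one JSON object.
    """
    # gzip.open accepts an existing file object and can decode to text on the fly.
    with gzip.open(raw, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            if line.strip():  # skip blank lines
                yield json.loads(line)


# An in-memory buffer stands in for the remote object.
payload = gzip.compress(b'{"a": 1}\n{"b": 2}\n')
records = list(iter_gzip_json_lines(io.BytesIO(payload)))
```

Passing `deserializer` and `compression` to `s3_open`, as shown earlier, replaces this kind of hand-written wrapper.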
File details
Details for the file s3_streaming-0.0.3-py3-none-any.whl.
File metadata
- Download URL: s3_streaming-0.0.3-py3-none-any.whl
- Upload date:
- Size: 4.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.18.4 setuptools/41.0.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/2.7.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4cb79c5271e58d973fd887c7b7332033cfd0b400e98923676fc25b42d85e91d3 |
| MD5 | 53f47f1b9cd5595c2f460839cc690347 |
| BLAKE2b-256 | 7622ea30383e94b45c333b26f3c1b4515c520466d7f76766134ac12905f08c9f |