Streams the content of an iterator to multiple S3 objects based on a regular expression

Python S3 Upload Split

S3 Upload Split streams the content of an iterator to multiple S3 objects, split according to a regular expression you provide. The iterator must yield dictionaries, typically the result set of a SQL query. Each output file is named data-{pattern}.json, where {pattern} is the match found by your regex.
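To make the naming rule concrete, here is a minimal sketch of how a row could be mapped to an object name. It is not the library's implementation: the helper `object_key` is hypothetical, and two details are assumptions, namely that the regex is searched against the JSON-serialized row and that the first capture group (when there is one) is used as {pattern}.

```python
import json
import re


def object_key(row, regex, prefix=""):
    """Return the object name a row would map to, or None if the regex finds no match."""
    # Assumption: the regex is searched against the JSON-serialized row.
    serialized = json.dumps(row, default=str)
    match = regex.search(serialized)
    if match is None:
        return None
    # Assumption: prefer the first capture group, else fall back to the whole match.
    pattern = match.group(1) if regex.groups else match.group(0)
    return f"{prefix}data-{pattern}.json"


# Hypothetical sample rows standing in for a SQL result set.
rows = [
    {"id": 1, "published": "2021-03-14"},
    {"id": 2, "published": "2021-03-30"},
    {"id": 3, "published": "2021-04-02"},
]
month = re.compile(r"(\d{4}-\d{2})-\d{2}")
keys = [object_key(row, month, "db1/output/dev/") for row in rows]
```

With this sample, the first two rows share one object name and the third gets its own, which is the split-by-match behaviour the description refers to.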

Install

pip install s3-upload-split

Usage

Import

import re
from sqlalchemy import create_engine, text
from s3_upload_split import SplitUploadS3

bucket = 'YOUR_BUCKET_NAME'  # ex: my-bucket
prefix = 'OUTPUT_PATH'  # ex: db1/output/dev/
regex = re.compile(r'YOUR_REGEX')  # ex: \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\D+(\d{4}-\d{2})-\d{2}
engine = create_engine('sqlite:///bookstore.db')  # https://github.com/pranaymethuku/bookstore-database/blob/master/database/bookstore.db

with engine.connect() as con:
    # text() wraps the raw SQL string; SQLAlchemy 2.x requires it and 1.4 accepts it
    iterator = con.execute(text('SELECT * FROM book'))
    # Stream the rows to S3, split according to the regex
    SplitUploadS3(bucket, prefix, regex, iterator).handle_content()

Limitations

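The module creates one thread per distinct value matched by your regex, so choose a pattern that yields a small number of distinct matches. A regex that captures the month of each record is a typical good fit; a regex that matches a different value for nearly every record would start nearly as many threads as there are records.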
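Because the thread count follows the number of distinct matches, it can help to count them on a sample before uploading. The sketch below does only that and does not call the library; the helper `count_matches` and the sample strings are hypothetical, and using the first capture group when present is an assumption.

```python
import re
from collections import Counter


def count_matches(values, regex):
    """Count how many sample strings fall under each distinct regex match."""
    counts = Counter()
    for value in values:
        match = regex.search(value)
        if match:
            # Assumption: prefer the first capture group, else the whole match.
            key = match.group(1) if regex.groups else match.group(0)
            counts[key] += 1
    return counts


# Hypothetical sample of timestamp strings.
sample = ["2021-03-14 10:00:00", "2021-03-30 08:15:00", "2021-04-02 23:59:59"]
by_month = count_matches(sample, re.compile(r"(\d{4}-\d{2})-\d{2}"))
by_second = count_matches(sample, re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"))
```

Here `len(by_month)` is 2 while `len(by_second)` is 3: the month-level pattern groups records together, whereas the second-level pattern produces one distinct match per record in this sample.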

Download files

Source Distribution

s3-upload-split-0.1.6.tar.gz (3.7 kB)

Uploaded Source

Built Distribution

s3_upload_split-0.1.6-py3-none-any.whl (3.7 kB)

Uploaded Python 3

File details

Details for the file s3-upload-split-0.1.6.tar.gz.

File metadata

  • Download URL: s3-upload-split-0.1.6.tar.gz
  • Size: 3.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.11 CPython/3.8.11 Darwin/20.6.0

File hashes

Hashes for s3-upload-split-0.1.6.tar.gz
Algorithm Hash digest
SHA256 ece68ef800b578d67fb631e70df9ee2fd6cdf096a8e4c722ab05d87ce1377492
MD5 3c9af8575b8cc62370712decd411246a
BLAKE2b-256 7f444c67936c7555a6f3620a69f43bd00fc4f81ef642b7b86361e9ef12a126d6

File details

Details for the file s3_upload_split-0.1.6-py3-none-any.whl.

File metadata

  • Download URL: s3_upload_split-0.1.6-py3-none-any.whl
  • Size: 3.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.11 CPython/3.8.11 Darwin/20.6.0

File hashes

Hashes for s3_upload_split-0.1.6-py3-none-any.whl
Algorithm Hash digest
SHA256 18999691b236f8f9d4765a078a4cf4fdad9bd64cc7855ae27f2dc85bae1e8737
MD5 ae6c0b3244793f29861a6e02bf564d7c
BLAKE2b-256 b41241445e229eddd27513839d3dc6c8aecfb6d7fedf79952cd08068535cdb05
