Dataflow splitting system for PyF framework

Introduction

pyf.splitter is a fully independent module that can be used with pyf or in any other project. It has no dependency on pyf.

Purpose

The splitter's purpose is simple and will stay so. It provides an abstraction over a data flow (or any Python iterable) and gives the illusion of manipulating in-memory iterables, when in fact everything is serialized to disk to keep memory consumption low.
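To make the idea concrete, here is a minimal sketch of a disk-backed iterable built only on the standard library. It is not the pyf.splitter API: the class name DiskBackedIterable, the use of pickle, and the ".spool" suffix are all assumptions chosen for illustration.

```python
import os
import pickle
import tempfile


class DiskBackedIterable:
    """Hypothetical helper: spool an iterable to a temp file, replay it lazily."""

    def __init__(self, items):
        fd, self.path = tempfile.mkstemp(suffix=".spool")
        with os.fdopen(fd, "wb") as handle:
            for item in items:
                # One pickle frame per item, so the writer never holds
                # more than a single item in memory.
                pickle.dump(item, handle)

    def __iter__(self):
        # Each iteration reopens the file and yields items one by one.
        with open(self.path, "rb") as handle:
            while True:
                try:
                    yield pickle.load(handle)
                except EOFError:
                    return

    def close(self):
        """Delete the backing file once the data is no longer needed."""
        os.remove(self.path)


# The source is a generator, yet the result can be iterated several times.
flow = DiskBackedIterable(x * x for x in range(5))
squares = list(flow)
squares_again = list(flow)
flow.close()
```

The caller sees an ordinary iterable, while the items themselves live on disk between passes.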

The second and last purpose is to split (hence the name) your data flow according to some simple rules. At the very least, splitting makes it possible to store huge amounts of data on disk without hitting file system limitations (ever tried to store a 600 GB file on a FAT file system?).

It is important to note that we do not encapsulate (i.e. hide) the bucket files. The splitter gives you the names of the bucket files it produced; you then use another function to read those files into another stream.
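The split-then-read workflow described above can be sketched as follows. This is only an illustration under stated assumptions, not the pyf.splitter API: the functions split_to_buckets and read_bucket, the "maximum items per bucket" rule, and the bucket file naming are all hypothetical.

```python
import itertools
import os
import pickle
import tempfile


def split_to_buckets(items, max_items, directory):
    """Hypothetical splitter: write `items` into bucket files of at most
    `max_items` items each and return the bucket file names in order."""
    iterator = iter(items)
    bucket_names = []
    while True:
        # Only one bucket's worth of items is held in memory at a time.
        chunk = list(itertools.islice(iterator, max_items))
        if not chunk:
            break
        name = os.path.join(directory, "bucket_%04d.pkl" % len(bucket_names))
        with open(name, "wb") as handle:
            for item in chunk:
                pickle.dump(item, handle)
        bucket_names.append(name)
    return bucket_names


def read_bucket(name):
    """Hypothetical reader: lazily yield the items stored in one bucket file."""
    with open(name, "rb") as handle:
        while True:
            try:
                yield pickle.load(handle)
            except EOFError:
                return


workdir = tempfile.mkdtemp()
# The splitter hands back plain file names; nothing is hidden from the caller.
names = split_to_buckets(range(10), max_items=4, directory=workdir)
# A separate function turns those files back into a single stream.
restored = list(itertools.chain.from_iterable(read_bucket(n) for n in names))
```

Because the bucket names are returned rather than hidden, the caller is free to move, archive, or process each file independently before reading it back.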

Running tests

To run tests you need to install tox:

pip install tox

Then just launch tox to run the whole test suite, i.e. python2.7, python3.4 and pep8.

To run only one kind of test (e.g. python2.7 only), specify it like so:

tox -e py27

All available test environments are defined in the tox.ini file.

Changes

Oct 9 2015, version 3.1

  • Version 3.0 introduced Python 3 support. Version 3.1 is a bugfix release that adds saner defaults to the separator (with the default separator, datetime objects were not serializable under Python 3).
