Consistent IO interface for reading and writing both local and remote resources (e.g. http, s3)
# iotoolz

`iotoolz` is an improvement over `e2fyi-utils` and is inspired partly by `toolz`.

`iotoolz` is a library that provides a consistent developer experience for interacting with any IO resource. It provides an abstract class `iotoolz.AbcStream` which mimics python's native `open` very closely (with some additional parameters and methods, such as `save`).
API documentation can be found at https://iotoolz.readthedocs.io/en/latest/.
Change logs are available in CHANGELOG.md.
- Python 3.6 and above
- Licensed under Apache-2.0.
## Quickstart

```sh
# install the default packages only (most lightweight)
pip install iotoolz
```
## iotoolz.streams

The helper object `iotoolz.streams.stream_factory` is a default singleton of `iotoolz.streams.Streams` provided to support most of the common use cases.

`iotoolz.streams.as_stream` is a util method provided by the singleton helper to create a stream object. This method accepts the same arguments as python's `open` method, with the following additional parameters:
- `data`: optional str or bytes that will be passed into the stream
- `fileobj`: optional file-like object which will be copied into the stream
- `content_type`: optional mime type information to describe the stream (e.g. application/json)
- `inmem_size`: determines how much memory to allocate to the stream before rolling over to the local file system. Defaults to no limit (may result in `MemoryError`).
- `schema_kwargs`: optional mapping of schemas to their default kwargs
```py
from iotoolz.streams import as_stream

default_schema_kwargs = {
    "https": {"verify": False}  # passed to requests - i.e. don't verify ssl
}

# this will return a stream that reads from the site
http_google = as_stream(
    "https://google.com",
    mode="r",
    schema_kwargs=default_schema_kwargs,
)
html = http_google.read()
content_type = http_google.content_type
encoding = http_google.encoding

# this will write to the https endpoint using the POST method (default is PUT)
with as_stream("https://foo/bar", mode="wb", use_post=True) as stream:
    stream.write(b"hello world")

# this will write to a local path
# save will write the current content to the local file
foo_txt = as_stream(
    "path/to/foo.txt",
    mode="w",
    content_type="text/plain",
    encoding="utf-8",
    data="foo bar",
).save()

# go to the end of the buffer
foo_txt.seek(0, whence=2)

# append more data
foo_txt.write("\nnext line")

# save and close the stream
foo_txt.close()
```
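The "roll over to local file system" behaviour that `inmem_size` controls is similar in spirit to the standard library's `tempfile.SpooledTemporaryFile`, which can be used to sketch the idea (illustration only; this is an assumption about the behaviour described above, not iotoolz's actual implementation):

```py
import tempfile

# Data stays in memory until it exceeds max_size, then spills to a temp file.
buf = tempfile.SpooledTemporaryFile(max_size=8)

buf.write(b"1234")            # 4 bytes <= max_size: still buffered in memory
in_memory = not buf._rolled   # _rolled is a CPython implementation detail

buf.write(b"567890")          # total 10 bytes > max_size: rolled over to disk
on_disk = buf._rolled

buf.close()
print(in_memory, on_disk)  # True True
```

Setting `inmem_size` therefore trades memory for disk IO; leaving it unset keeps everything in memory, which is faster but can raise `MemoryError` on very large resources.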
## Pipe streams

`pipe` is a method to push data to a sink (similar to NodeJS streams, except there is no watermark or buffering).
```py
from iotoolz.streams import as_stream

local_file = as_stream("path/to/google.html", content_type="text/html", mode="w")
temp_file = as_stream("tmp://google.html", content_type="text/html", mode="wb")

# when the source is closed, all sinks will be closed also
with as_stream("https://google.com") as source:
    # writes to a temp file then to a local file in sequence
    source.pipe(temp_file).pipe(local_file)

local_file2 = as_stream("path/to/google1.html", content_type="text/html", mode="w")
local_file3 = as_stream("path/to/google2.html", content_type="text/html", mode="w")

# when the source is closed, all sinks will be closed also
with as_stream("tmp://foo_src", mode="w") as source:
    # writes in a fan-out manner
    source.pipe(local_file2)
    source.pipe(local_file3)
    source.write("hello world")
```
TODO: support transform streams so that `pipe` can be more useful.
## Creating a custom AbcStream class

The abstract class `iotoolz.AbcStream` requires the following methods to be implemented:
```py
# This is the material method to get the data from the actual IO resource.
# It should return an iterable to the data and the corresponding StreamInfo.
# If resources to the data need to be released, you can also return a ContextManager
# to the iterable instead.
def _read_to_iterable(
    self, uri: str, chunk_size: int, **kwargs
) -> Tuple[Union[Iterable[bytes], ContextManager[Iterable[bytes]]], StreamInfo]:
    ...

# This is the material method to write the data to the actual IO resource.
# This method is only triggered when "close" or "save" is called.
# You should use the "file_" parameter (a file-like obj) to write the current data to
# the actual IO resource.
def _write_from_fileobj(
    self, uri: str, file_: IO[bytes], size: int, **kwargs
) -> StreamInfo:
    ...
```
`StreamInfo` is a dataclass that holds various info about the data stream (e.g. content_type, encoding and etag).

Ideally, the implementation of any `AbcStream` class should also provide `supported_schemas` (a `Set[str]`) as a class variable. This class variable will be used in the future to infer which schemas are supported by the class. For example, since `https` and `http` are supported by `iotoolz.HttpStream`, any uri that starts with `https://` or `http://` can be handled by `iotoolz.HttpStream`.
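To make the contract concrete, the toy class below implements both required methods against an in-memory dict. Note that `InMemStream` and the simplified `StreamInfo` here are hypothetical stand-ins written for illustration only; the real `StreamInfo` dataclass and the `AbcStream` base class (with its extra machinery) live in iotoolz.

```py
import io
from dataclasses import dataclass
from typing import IO, Iterable, Tuple


@dataclass
class StreamInfo:
    # Simplified stand-in for iotoolz's StreamInfo (fields assumed from the docs above).
    content_type: str = ""
    encoding: str = ""
    etag: str = ""


class InMemStream:
    """Toy stream backed by a class-level dict keyed by uri (illustration only)."""

    supported_schemas = {"mem"}  # uris starting with "mem://" would map here
    _store: dict = {}

    def _read_to_iterable(
        self, uri: str, chunk_size: int, **kwargs
    ) -> Tuple[Iterable[bytes], StreamInfo]:
        # Return the stored bytes as chunk_size-sized chunks plus the stream info.
        data = self._store.get(uri, b"")
        chunks = (data[i : i + chunk_size] for i in range(0, len(data), chunk_size))
        return chunks, StreamInfo(content_type="application/octet-stream")

    def _write_from_fileobj(
        self, uri: str, file_: IO[bytes], size: int, **kwargs
    ) -> StreamInfo:
        # Persist the buffered data from the file-like object to the "resource".
        self._store[uri] = file_.read(size)
        return StreamInfo()


stream = InMemStream()
stream._write_from_fileobj("mem://greeting", io.BytesIO(b"hello world"), 11)
chunks, info = stream._read_to_iterable("mem://greeting", chunk_size=4)
print(b"".join(chunks))  # b'hello world'
```

In a real subclass you would inherit from `iotoolz.AbcStream` so that `read`, `write`, `save` and friends are provided by the base class on top of these two methods.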