Utilities for iterating file contents.

file-iterator

A tool for iterating over the contents of different file types (plain, zip, or gzip) through the same interface.

file-iterator aims to make file contents easier to access and file-reading code easier to read. It also provides a way to attach handlers (functions) to common file-reading events (start, stop, and end of file reading).

Installing

file-iterator is on PyPI, so all you need to do is:

pip install file-iterator

Testing

Just run:

pytest tests/

Overview - Tutorial

# Let's say we have the same text in 3 file formats.
name_txt = 'file.txt'
name_gzip = 'file.gz'
name_zip = 'file.zip'

# In Python, we could read the text file just like this:
f = open(name_txt, 'r')
for line in f:
    print(line)
f.close()

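# For the GZIP file, the gzip module is required.
# In 'r' mode it yields bytes, not text.
import gzip
f = gzip.open(name_gzip, 'r')
for line_bytes in f:
    print(line_bytes)
f.close()

# The same goes for the ZIP file, which needs the zipfile module.
import zipfile
z = zipfile.ZipFile(name_zip, 'r')
f = z.open(z.namelist()[0], 'r')
for line_bytes in f:
    print(line_bytes)
f.close()
z.close()  # The archive itself must be closed too.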
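To follow along, the three sample files can be created with the standard library alone. This is a minimal sketch: the helper name `make_sample_files`, the sample text, and the archive member name `file.txt` are assumptions made for illustration, not part of file-iterator.

```python
import gzip
import zipfile

# Hypothetical sample content; any text works.
SAMPLE_TEXT = "first line\nsecond line\nthird line\n"


def make_sample_files(name_txt="file.txt", name_gzip="file.gz", name_zip="file.zip"):
    """Write the same text as a plain, a gzip and a zip file."""
    # newline="" keeps the line endings identical on every platform.
    with open(name_txt, "w", encoding="utf-8", newline="") as f:
        f.write(SAMPLE_TEXT)
    with gzip.open(name_gzip, "wt", encoding="utf-8", newline="") as f:
        f.write(SAMPLE_TEXT)
    with zipfile.ZipFile(name_zip, "w") as z:
        # A single member, which is what z.namelist()[0] picks up.
        z.writestr("file.txt", SAMPLE_TEXT)
    return name_txt, name_gzip, name_zip
```

Calling `make_sample_files()` with no arguments writes `file.txt`, `file.gz` and `file.zip` to the current directory.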

With the FileIterator interface, we can iterate over the contents of any of these files the same way. We also don't need to close anything ourselves.

from file_iterator import FileIterator

def print_contents(it):
    for line_bytes in it:
        print(line_bytes)
        
it = FileIterator.get_iter(name_txt, 'plain')
print_contents(it)

it = FileIterator.get_iter(name_gzip, 'gzip')
print_contents(it)

it = FileIterator.get_iter(name_zip, 'zip')
print_contents(it)
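For comparison, the idea of one interface over three formats can be sketched with a plain generator and the standard library. This is not how file-iterator is implemented; `iter_lines` is a hypothetical name, and the sketch only mirrors the type strings used above (`'plain'`, `'gzip'`, `'zip'`).

```python
import gzip
import zipfile


def iter_lines(path, kind):
    """Yield each line of the file as bytes, whatever its container format."""
    if kind == "plain":
        with open(path, "rb") as f:
            yield from f
    elif kind == "gzip":
        with gzip.open(path, "rb") as f:
            yield from f
    elif kind == "zip":
        with zipfile.ZipFile(path) as z:
            # Like the tutorial, read only the first member of the archive.
            with z.open(z.namelist()[0]) as f:
                yield from f
    else:
        # Raised on the first next() call, because this is a generator.
        raise ValueError(f"unknown file type: {kind!r}")
```

The `with` blocks close each file once the generator is exhausted, which is why the caller never has to call `close()`.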

With the FileGroupIterator interface, we can iterate over the contents of all the files through a single iterator.

from file_iterator import FileGroupIterator

names = [name_txt, name_gzip, name_zip]
it = FileGroupIterator(names)
print_contents(it)

A for loop iterates over a copy of the iterator, so the original is not exhausted and we can loop over it multiple times.

print_contents(it)
print_contents(it)
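A common way to get this kind of repeatable iteration in Python is for `__iter__` to hand out a fresh iterator on every call. The sketch below shows that general pattern for a single plain file; the class name `ReiterableLines` is hypothetical, and whether file-iterator works exactly this way is an assumption.

```python
class ReiterableLines:
    """Hypothetical container that a for loop never exhausts."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # __iter__ is a generator function, so each call returns a new
        # generator that reopens the file and starts at the first line.
        with open(self.path, "rb") as f:
            yield from f
```

Because each for loop calls `__iter__` again, two consecutive loops over the same `ReiterableLines` object both see every line.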

We can also iterate using next().

# next() returns None when everything has been read.
line_b = next(it)
while line_b is not None:
    line_b = next(it)

# This iteration does exhaust the iterator object.
print(line_b is None) # Prints True.
for line_b in it:
    pass # Doesn't enter here.
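Note that built-in Python iterators behave differently: they raise `StopIteration` at the end instead of returning None. A comparable None-terminated loop can be written for any iterator by passing a default to `next()`. The helper name `drain` is hypothetical and the sketch uses only the standard library.

```python
def drain(it):
    """Consume an iterator with next() and return how many items were read."""
    count = 0
    # The second argument is returned instead of raising StopIteration.
    item = next(it, None)
    while item is not None:
        count += 1
        item = next(it, None)
    return count
```

After `drain(it)` returns, a for loop over the same plain iterator does not run its body, since the iterator has been consumed.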

FileGroupIterator can also be used as a context manager:

with FileGroupIterator(names) as it:
    print_contents(it)

License

MIT

Todo

  • Upload package to PyPI.
  • Tests: Events.
  • Tutorial: Events usage.
  • Event handlers: Receive a parameter containing info about the sender (the object that triggered the event).
