Simple interface to get reader-like objects for Python 3 and 2.

Project description

get_reader

Get reader objects, like those returned by csv.reader(), from various data sources.

Works on Python 2.6, 2.7, and 3.2 through 3.8:

from get_reader import get_reader

reader = get_reader('myfile.csv')
for row in reader:
    print(', '.join(row))

Supports explicit file handling:

from get_reader import get_reader

with open('myfile.csv', newline='') as csvfile:
    reader = get_reader(csvfile)
    for row in reader:
        print(', '.join(row))

Automatically detects other data sources if supporting packages are installed:

from get_reader import get_reader

# From an Excel file
reader = get_reader('myfile.xlsx')  # requires xlrd package

# From a DataFrame
import pandas as pd
df = pd.DataFrame([...])
reader = get_reader(df)  # requires pandas

# From a DBF file
reader = get_reader('myfile.dbf')  # requires dbfread package

Explicit constructors can be called directly to override auto-detect behavior:

from get_reader import get_reader

# From a tab-delimited text file
reader = get_reader.from_csv('myfile.txt', delimiter='\t')
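
To illustrate what the tab delimiter changes, here is a minimal sketch that uses only the standard library's csv.reader() rather than get_reader itself. It assumes (this page does not confirm it) that from_csv() forwards keyword arguments such as delimiter to the csv module. The sample data is made up for illustration.

```python
import csv
import io

# Made-up sample text standing in for the contents of a tab-delimited file.
sample = io.StringIO("name\tcity\nAda\tLondon\nLinus\tHelsinki\n", newline="")

# Parse the same text twice: once with the default comma delimiter,
# once with an explicit tab delimiter.
comma_rows = list(csv.reader(sample))
sample.seek(0)
tab_rows = list(csv.reader(sample, delimiter="\t"))

print(comma_rows[0])  # a single field; the tab stays inside the string
print(tab_rows[0])    # two separate fields
```

With the default delimiter each line comes back as one field, which is why a .txt file of tab-separated values needs the explicit constructor and delimiter.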

Install

You can install get_reader using pip or you can vendor it directly in your own projects:

pip install get_reader

get_reader has no hard dependencies, although xlrd and dbfread are required to read Excel and DBF files, respectively. It is tested on Python 2.6, 2.7, 3.2 through 3.8, PyPy, PyPy3, and Jython, and is freely available under the Apache License, version 2.

To install with optional extras, use the following:

pip install get_reader[excel,dbf]
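
Before relying on the optional handlers, it can help to check which supporting packages are importable. The following is a small sketch using only the standard library (Python 3). The helper name available_extras is hypothetical and not part of get_reader; the module names are the ones mentioned on this page.

```python
import importlib.util


def available_extras(modules=("xlrd", "dbfread", "pandas")):
    """Return a dict mapping each module name to True if it can be imported."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in modules
    }


status = available_extras()
for name, found in sorted(status.items()):
    print("{0}: {1}".format(name, "installed" if found else "missing"))
```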

Reference

get_reader(obj, *args, **kwds)

Return a reader object which will iterate over records in the given data—like a csv.reader().

The obj type is used to automatically determine the appropriate handler. If obj is a string, it is treated as a file path whose extension determines its content type. Any *args and **kwds are passed to the underlying handler.

Using auto-detection:

from get_reader import get_reader

# CSV file.
reader = get_reader('myfile.csv')

# Excel file.
reader = get_reader('myfile.xlsx', worksheet='Sheet2')

# Pandas DataFrame.
import pandas
df = pandas.DataFrame([...])
reader = get_reader(df)

# DBF file.
reader = get_reader('myfile.dbf')

If the obj type cannot be determined automatically, you can call one of the "from_...()" constructor methods listed below.
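
The extension-based detection described above can be pictured with a short sketch. This is not get_reader's actual implementation: the mapping and the function name guess_constructor are hypothetical, built only from the file types named on this page, and the real library may detect types differently.

```python
import os

# Hypothetical mapping from file extension to constructor name,
# limited to the file types mentioned in this document.
HANDLERS_BY_EXTENSION = {
    ".csv": "from_csv",
    ".xlsx": "from_excel",
    ".xls": "from_excel",
    ".dbf": "from_dbf",
}


def guess_constructor(path):
    """Return the constructor name suggested by the file extension."""
    extension = os.path.splitext(path)[1].lower()
    try:
        return HANDLERS_BY_EXTENSION[extension]
    except KeyError:
        raise ValueError(
            "cannot detect type of {0!r}; call a from_...() "
            "constructor explicitly".format(path)
        )


print(guess_constructor("myfile.xlsx"))
print(guess_constructor("MYFILE.DBF"))
```

In this sketch a path such as 'myfile.txt' raises ValueError, which corresponds to the case where an explicit constructor is needed.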

from_csv(csvfile, encoding='utf-8', **kwds)

Return a reader object which will iterate over lines in the given csvfile. The csvfile can be a string (treated as a file path) or any object which supports the iterator protocol and returns a string each time its __next__() method is called (file objects and list objects are both suitable). If csvfile is a file object, it should be opened with newline=''.

Using a file path and a tab delimiter:

from get_reader import get_reader

reader = get_reader.from_csv('myfile.tab', delimiter='\t')

Using explicit file handling:

from get_reader import get_reader

with open('myfile.csv', newline='') as csvfile:
    reader = get_reader.from_csv(csvfile)
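
The statement that any iterator of strings is acceptable mirrors how the standard library's csv.reader() behaves. The sketch below uses csv.reader() directly, not get_reader, and assumes from_csv() accepts the same kinds of input. The sample rows are made up.

```python
import csv
import io

# A plain list of strings, one string per line.
lines = ["A,B", "1,x", "2,y"]

# A file-like object created with newline='' so that line endings
# reach the csv module untranslated.
stream = io.StringIO("A,B\r\n1,x\r\n2,y\r\n", newline="")

rows_from_list = list(csv.reader(lines))
rows_from_stream = list(csv.reader(stream))

# Both sources produce the same rows of string fields.
print(rows_from_list)
print(rows_from_list == rows_from_stream)
```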

from_dicts(records, fieldnames=None)

Return a reader object which will iterate over the given dictionary records. This can be thought of as converting a csv.DictReader() into a plain, non-dictionary csv.reader().

from get_reader import get_reader

dictrows = [
    {'A': 1, 'B': 'x'},
    {'A': 2, 'B': 'y'},
]
reader = get_reader.from_dicts(dictrows)

This method assumes that record contents are consistent. If the first record is a dictionary, it is assumed that all following records will be dictionaries with matching keys.
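
The conversion from dictionary records to plain rows can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the function name rows_from_dicts is hypothetical, and it assumes a header row of field names is produced first, taken from the first record's keys when fieldnames is not given. Key order follows dictionary insertion order, which is guaranteed from Python 3.7 onward.

```python
def rows_from_dicts(records, fieldnames=None):
    """Yield a header row, then one list of values per dictionary record."""
    iterator = iter(records)
    try:
        first = next(iterator)
    except StopIteration:
        return  # no records, so there is nothing to yield
    if fieldnames is None:
        # Assumption: field names come from the first record's keys.
        fieldnames = list(first.keys())
    yield list(fieldnames)
    yield [first[name] for name in fieldnames]
    for record in iterator:
        # A record with missing keys raises KeyError here, which reflects
        # the "matching keys" assumption stated above.
        yield [record[name] for name in fieldnames]


dictrows = [
    {'A': 1, 'B': 'x'},
    {'A': 2, 'B': 'y'},
]
for row in rows_from_dicts(dictrows):
    print(row)
```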

from_excel(path, worksheet=0)

Return a reader object which will iterate over lines in the given Excel worksheet. path must specify an XLSX or XLS file, and worksheet should specify the index or name of the worksheet to load (defaults to the first worksheet).

Load first worksheet:

from get_reader import get_reader

reader = get_reader.from_excel('mydata.xlsx')

Specific worksheets can be loaded by name (a string) or index (an integer):

reader = get_reader.from_excel('mydata.xlsx', 'Sheet 2')

from_pandas(df, index=True)

Return a reader object which will iterate over records in the pandas.DataFrame df.

from_dbf(filename, encoding=None, **kwds)

Return a reader object which will iterate over lines in the given DBF file (from dBase, FoxPro, etc.).

from_squint(obj, fieldnames=None)

Return a reader object which will iterate over the records returned from a squint Select, Query, or Result. If the fieldnames argument is not provided, this function tries to construct names using the values from the underlying object.

Freely licensed under the Apache License, Version 2.0

Release history

1.0.0 (Dec 29, 2019)

0.0.2 (Sep 22, 2019)

0.0.1 (Aug 12, 2019), this version

Download files

Source Distribution

get_reader-0.0.1.tar.gz (11.3 kB)

Uploaded Aug 12, 2019 Source

Built Distribution

get_reader-0.0.1-py2.py3-none-any.whl (7.4 kB)

Uploaded Aug 12, 2019 Python 2 Python 3

Hashes for get_reader-0.0.1.tar.gz

Algorithm	Hash digest
SHA256	`44b74b3dc90b88d89c8a49ade1f8f2b5f928db52a0818208e520b5e79912af98`
MD5	`0fef08eca9eb05e2159e7a2db18541f9`
BLAKE2b-256	`647e859f0ed361036a446aab5d5de97fe47c89ae2f14110ca61d61e3adf186d0`

Hashes for get_reader-0.0.1-py2.py3-none-any.whl

Algorithm	Hash digest
SHA256	`a708b9b2c1d729b68e869ce4d6c224c9fd79fd53bd278d1b855df479a17e456f`
MD5	`d39e697094b0f63cddbc745d902e9b92`
BLAKE2b-256	`3d071c62b2ff475adde08bf0cad5a7c9a96cedd6d45c8210e4cf921bf52cbf99`