Simple interface to get reader-like objects for Python 3 and 2.

Project description

get_reader

Get reader objects, like those returned by csv.reader(), from various data sources.

Works on Python 2.6, 2.7, and 3.2 through 3.8:

from get_reader import get_reader

reader = get_reader('myfile.csv')
for row in reader:
    print(', '.join(row))

Supports explicit file handling:

from get_reader import get_reader

with open('myfile.csv', newline='') as csvfile:
    reader = get_reader(csvfile)
    for row in reader:
        print(', '.join(row))

Automatically detects other data sources if supporting packages are installed:

from get_reader import get_reader

# From an Excel file
reader = get_reader('myfile.xlsx')  # requires xlrd package

# From a DataFrame
import pandas as pd
df = pd.DataFrame([...])
reader = get_reader(df)  # requires pandas

# From a DBF file
reader = get_reader('myfile.dbf')  # requires dbfread package

Explicit constructors can be called directly to override auto-detect behavior:

from get_reader import get_reader

# From a tab-delimited text file
reader = get_reader.from_csv('myfile.txt', delimiter='\t')

Install

You can install get_reader using pip or you can vendor it directly in your own projects:

pip install get_reader

get_reader has no hard dependencies, although xlrd is required for Excel files and dbfread is required for DBF files. It is tested on Python 2.6, 2.7, 3.2 through 3.8, PyPy, PyPy3, and Jython, and is freely available under the Apache License, version 2.

To install with optional extras, use the following:

pip install get_reader[excel,dbf]

Reference

get_reader(obj, *args, **kwds)

Return a reader object which will iterate over records in the given data—like a csv.reader().

The obj type is used to automatically determine the appropriate handler. If obj is a string, it is treated as a file path whose extension determines its content type. Any *args and **kwds are passed to the underlying handler.

Using auto-detection:

from get_reader import get_reader

# CSV file.
reader = get_reader('myfile.csv')

# Excel file.
reader = get_reader('myfile.xlsx', worksheet='Sheet2')

# Pandas DataFrame.
import pandas
df = pandas.DataFrame([...])
reader = get_reader(df)

# DBF file.
reader = get_reader('myfile.dbf')

If the obj type cannot be determined automatically, you can call one of the "from_...()" constructor methods listed below.
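The extension-based detection described above can be pictured with a small standalone sketch. This is not the package's actual implementation: the mapping table and the guess_handler() helper are hypothetical names introduced only for illustration, and the set of extensions the real package recognizes may differ.

```python
import os

# Hypothetical table mapping file extensions to constructor names.
# The real package's internal lookup is not shown in this README.
_EXTENSION_HANDLERS = {
    '.csv': 'from_csv',
    '.xlsx': 'from_excel',
    '.xls': 'from_excel',
    '.dbf': 'from_dbf',
}


def guess_handler(obj):
    """Return the name of a suitable constructor for obj, or None."""
    if isinstance(obj, str):
        # A string is treated as a file path; its extension decides the type.
        extension = os.path.splitext(obj)[1].lower()
        return _EXTENSION_HANDLERS.get(extension)
    if hasattr(obj, 'read'):
        # Assumption: an open text file is handled as CSV input.
        return 'from_csv'
    return None  # Unknown type: call an explicit from_...() constructor.


print(guess_handler('myfile.xlsx'))
print(guess_handler('myfile.tab'))
```

Under these assumptions an unrecognized extension such as .tab yields None, which corresponds to the case where an explicit constructor like from_csv() is the better choice.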

from_csv(csvfile, encoding='utf-8', **kwds)

Return a reader object which will iterate over lines in the given csvfile. The csvfile can be a string (treated as a file path) or any object which supports the iterator protocol and returns a string each time its __next__() method is called (file objects and list objects are both suitable). If csvfile is a file object, it should be opened with newline=''.

from get_reader import get_reader
reader = get_reader.from_csv('myfile.tab', delimiter='\t')

Using explicit file handling:

from get_reader import get_reader

with open('myfile.csv', newline='') as csvfile:
    reader = get_reader.from_csv(csvfile)
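Since the readers are described as behaving like those returned by csv.reader(), the kinds of input that from_csv() accepts can be illustrated with the standard library alone. The sketch below uses only the csv and io modules, not get_reader itself, so it shows the expected shape of the rows rather than the package's own code path.

```python
import csv
import io

# A list of strings: each item is one line of CSV text.
lines = ['A,B', '1,x', '2,y']
rows_from_list = list(csv.reader(lines))

# A file-like object. Passing newline='' mirrors opening a real file with
# newline='', which leaves line endings for the csv module to interpret.
buffer = io.StringIO('A,B\r\n1,x\r\n2,y\r\n', newline='')
rows_from_file = list(csv.reader(buffer))

# Extra keyword arguments such as delimiter select a different dialect.
rows_from_tabs = list(csv.reader(['A\tB', '1\tx'], delimiter='\t'))

print(rows_from_list)
print(rows_from_list == rows_from_file)
print(rows_from_tabs)
```

All three inputs produce rows as lists of strings, which is the form a loop like "for row in reader" would receive.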

from_dicts(records, fieldnames=None)

Return a reader object which will iterate over the given dictionary records. This can be thought of as converting a csv.DictReader() into a plain, non-dictionary csv.reader().

from get_reader import get_reader

dictrows = [
    {'A': 1, 'B': 'x'},
    {'A': 2, 'B': 'y'},
]
reader = get_reader.from_dicts(dictrows)

This method assumes that record contents are consistent. If the first record is a dictionary, it is assumed that all following records will be dictionaries with matching keys.
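The conversion from dictionary records to plain rows can be sketched as follows. This is an illustration under stated assumptions, not the package's source: rows_from_dicts() is a hypothetical helper, and it assumes that the field names are emitted as a header row before the data rows and that, when fieldnames is omitted, the keys of the first record are used.

```python
def rows_from_dicts(records, fieldnames=None):
    """Yield a header row, then one list of values per dictionary.

    Hypothetical helper: assumes every record has the same keys.
    """
    iterator = iter(records)
    if fieldnames is None:
        try:
            first = next(iterator)
        except StopIteration:
            return  # No records and no field names: nothing to yield.
        fieldnames = list(first.keys())
        yield fieldnames
        yield [first[key] for key in fieldnames]
    else:
        fieldnames = list(fieldnames)
        yield fieldnames
    for record in iterator:
        # A record with a missing key raises KeyError, which reflects the
        # consistency assumption described above.
        yield [record[key] for key in fieldnames]


dictrows = [
    {'A': 1, 'B': 'x'},
    {'A': 2, 'B': 'y'},
]
for row in rows_from_dicts(dictrows):
    print(row)
```

Passing fieldnames explicitly fixes the column order, which matters on Python versions where dictionaries do not preserve insertion order.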

from_excel(path, worksheet=0)

Return a reader object which will iterate over lines in the given Excel worksheet. path must point to an XLSX or XLS file and worksheet should specify the index or name of the worksheet to load (defaults to the first worksheet).

Load first worksheet:

from get_reader import get_reader
reader = get_reader.from_excel('mydata.xlsx')

Specific worksheets can be loaded by name (a string) or index (an integer):

reader = get_reader.from_excel('mydata.xlsx', 'Sheet 2')

from_pandas(df, index=True)

Return a reader object which will iterate over records in the pandas.DataFrame df.

from_dbf(filename, encoding=None, **kwds)

Return a reader object which will iterate over lines in the given DBF file (from dBase, FoxPro, etc.).

from_squint(obj, fieldnames=None)

Return a reader object which will iterate over the records returned from a squint Select, Query, or Result. If the fieldnames argument is not provided, this function tries to construct names using the values from the underlying object.


Freely licensed under the Apache License, Version 2.0

(C) Copyright 2018 -- 2019 Shawn Brown.
