Consistent interface for stream reading and writing tabular data (csv/xls/json/etc)

Release v0.10 changes the exceptions module in a way that is NOT backward compatible.

Features

  • supports various formats: csv/tsv/xls/xlsx/json/ndjson/ods/native/etc

  • reads data from variables, filesystem or Internet

  • streams data instead of using a lot of memory

  • processes data via simple user processors

  • saves data using the same interface

Getting Started

Installation

To get started:

$ pip install tabulator

Example

Open a tabular stream from a csv source:

from tabulator import Stream

with Stream('path.csv', headers=1) as stream:
    print(stream.headers)  # prints the headers read from row 1
    for row in stream:
        print(row)  # prints the row as a list of values

Stream

Stream takes the source argument:

<scheme>://path/to/file.<format>

and uses the corresponding Loader and Parser to open the tabular stream and iterate over it. The scheme and format can also be passed explicitly as constructor arguments. To force Tabulator to open the table with a specific encoding, pass the encoding argument.
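
To make the `<scheme>://path/to/file.<format>` convention concrete, the minimal sketch below shows one way a scheme and format could be inferred from a source string using only the standard library. The helper name `guess_scheme_and_format` and the fallback to `file` for sources without a scheme are assumptions made for illustration; this is not Tabulator's actual detection code.

```python
import os
from urllib.parse import urlparse


def guess_scheme_and_format(source):
    """Illustrative only: split a source string into (scheme, format)."""
    parsed = urlparse(source)
    # Assumption: a source without an explicit scheme is a local file.
    scheme = parsed.scheme or 'file'
    # The format is taken from the file extension of the path component.
    extension = os.path.splitext(parsed.path)[1]
    fmt = extension.lstrip('.').lower() or None
    return scheme, fmt


print(guess_scheme_and_format('http://example.com/source.xls'))
print(guess_scheme_and_format('path.csv'))
```

When the extension is missing or misleading, passing scheme and format explicitly to the constructor avoids any guessing.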

In the example above, a context manager calls stream.open() on enter and stream.close() on exit. An open stream offers the following:

  • a stream can be iterated like a file-like object, returning one row at a time

  • a stream can be iterated manually with the iter(keyed/extended) method

  • a stream can be read into memory with the read(keyed/extended) method, optionally with a row count limit

  • headers can be accessed via the headers property

  • a sample of rows can be accessed via the sample property

  • the stream pointer can be reset to the start via the reset method

  • a stream can be saved to the filesystem using the save method
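
The plain, keyed and extended row shapes mentioned above differ only in how much context travels with each row. The sketch below uses plain Python and hypothetical helper names (`to_extended`, `to_keyed`) rather than Tabulator itself. It shows how a keyed row and an extended row relate to a plain list of values, and how a row limit can be applied lazily. The row numbering is an assumption made for the illustration.

```python
from itertools import islice


def to_extended(rows, headers, start=1):
    """Yield (number, headers, row) tuples; numbering from `start` is an assumption."""
    for number, row in enumerate(rows, start=start):
        yield (number, headers, row)


def to_keyed(extended_rows):
    """Yield one dict per row, mapping each header to its value."""
    for _number, headers, row in extended_rows:
        yield dict(zip(headers, row))


headers = ['id', 'name']
rows = [[1, 'english'], [2, 'chinese'], [3, 'spanish']]

# Take at most two keyed rows without materialising the rest.
first_two = list(islice(to_keyed(to_extended(rows, headers)), 2))
print(first_two)
```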

A more complete example follows:

from tabulator import Stream

def skip_even_rows(extended_rows):
    for number, headers, row in extended_rows:
        if number % 2:
            yield (number, headers, row)

stream = Stream('http://example.com/source.xls',
    headers=1, encoding='utf-8', sample_size=1000,
    post_parse=[skip_even_rows], sheet=1)
stream.open()
print(stream.sample)  # will print sample
print(stream.headers)  # will print headers list
print(stream.read(limit=10))  # will print 10 rows
stream.reset()
for keyed_row in stream.iter(keyed=True):
    print(keyed_row)  # will print row dict
stream.reset()  # rewind so the rows can be iterated again
for extended_row in stream.iter(extended=True):
    print(extended_row)  # will print (number, headers, row)
stream.reset()
stream.save('target.csv')
stream.close()
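
Because a post_parse processor such as skip_even_rows is just a generator over (number, headers, row) tuples, it can be exercised without opening a stream. The sketch below adds a second, hypothetical processor (`strip_strings`) and chains both by hand with a hypothetical helper (`apply_processors`). How Tabulator applies the post_parse list internally is not described here, so the chaining order is an assumption.

```python
def skip_even_rows(extended_rows):
    # Same processor as in the example above: keep odd-numbered rows.
    for number, headers, row in extended_rows:
        if number % 2:
            yield (number, headers, row)


def strip_strings(extended_rows):
    # Hypothetical processor: trim whitespace around string cells.
    for number, headers, row in extended_rows:
        cleaned = [cell.strip() if isinstance(cell, str) else cell for cell in row]
        yield (number, headers, cleaned)


def apply_processors(extended_rows, processors):
    # Assumption: processors run in list order, each wrapping the previous one.
    for processor in processors:
        extended_rows = processor(extended_rows)
    return extended_rows


headers = ['id', 'name']
source = [(1, headers, [1, ' a ']), (2, headers, [2, 'b']), (3, headers, [3, 'c '])]
result = list(apply_processors(iter(source), [skip_even_rows, strip_strings]))
print(result)
```

Testing processors on hand-built tuples like this keeps them independent of any particular source file or network location.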

For the full list of options, see https://github.com/frictionlessdata/tabulator-py/blob/master/tabulator/stream.py#L17

API Reference

Snapshot

Stream(source,
       headers=None,
       scheme=None,
       format=None,
       encoding=None,
       sample_size=None,
       post_parse=None,
       **options)
    closed/open/close/reset
    headers -> list
    sample -> rows
    iter(keyed/extended=False) -> (generator) (keyed/extended)row[]
    read(keyed/extended=False, limit=None) -> (keyed/extended)row[]
    save(target, format=None, encoding=None, **options)
exceptions
~cli

Detailed

Contributing

Please read the contribution guidelines:

How to Contribute

Thanks!
