
Loads data from various formats




Given a source of rowlike data, acts as a generator of OrderedDicts.


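::

    from data_dispenser.sources import Source  # module path assumed

    src = Source('mydata.csv')
    for row in src:
        print(row)  # each row is an OrderedDict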

data-dispenser thus serves as a single API for a variety of data sources.
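
To make "a generator of OrderedDicts" concrete, here is a minimal standard-library sketch of the idea for CSV text. The helper name ``iter_rows`` is hypothetical and this is not data-dispenser's own implementation; it only illustrates the shape of the rows a caller can expect.

```python
import csv
import io
from collections import OrderedDict


def iter_rows(text):
    """Yield one OrderedDict per CSV data row, keyed by the header row.

    Hypothetical helper for illustration; not part of data-dispenser.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # first line supplies the column names
    for values in reader:
        yield OrderedDict(zip(header, values))


rows = list(iter_rows("name,kingdom\nAlice,animal\nFern,plant\n"))
```

Each row keeps its columns in header order, so downstream code can rely on a stable field sequence regardless of where the data came from.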

* Free software: MIT license

Data source types supported

* file names / paths
* open file objects
* pymongo Collection objects
* strings interpretable as data
* URLs beginning with http:// or https://

data-dispenser works most reliably with file names whose extensions indicate
the data format; without such an extension it may guess the input format incorrectly.
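
The reason extensions help can be sketched as a simple lookup. The mapping and the ``guess_format`` function below are assumptions made for illustration, not data-dispenser's actual detection logic; the point is only that a recognizable extension gives an unambiguous answer, while a missing one leaves nothing to go on but the content.

```python
import os

# Hypothetical extension table, for illustration only.
KNOWN_EXTENSIONS = {
    ".csv": "csv",
    ".yaml": "yaml",
    ".yml": "yaml",
    ".json": "json",
    ".pickle": "pickle",
    ".py": "python",
    ".xls": "xls",
    ".xml": "xml",
    ".html": "html",
}


def guess_format(path):
    """Return a format name based on the file extension, or None if unknown."""
    extension = os.path.splitext(path)[1].lower()
    return KNOWN_EXTENSIONS.get(extension)


csv_guess = guess_format("mydata.csv")
no_guess = guess_format("mydata")  # no extension: the format must be inferred another way
```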

Data input formats supported

* csv
* yaml (requires ``pyyaml``)
* json
* pickle
* ``eval``-able Python
* xls
* xml (experimental)
* HTML containing ``<table>`` elements

Multiple files

File paths containing wildcards are expanded, and the matching files are
effectively concatenated into one large data source.
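
The concatenation behaviour can be approximated with the standard library alone. This is a sketch under the assumption that every matching file is a CSV with a header row; ``rows_from_file`` and ``rows_from_pattern`` are hypothetical names, and the order in which data-dispenser visits matching files is not stated here (the sketch sorts them).

```python
import csv
import glob
import itertools
import os
import tempfile


def rows_from_file(path):
    """Yield the rows of one CSV file as dict-like records."""
    with open(path, newline="") as f:
        yield from csv.DictReader(f)


def rows_from_pattern(pattern):
    """Chain the rows of every file matching a wildcard pattern."""
    paths = sorted(glob.glob(pattern))  # sorted only to make the order predictable
    return itertools.chain.from_iterable(rows_from_file(p) for p in paths)


# Build two small files in a temporary directory and read them as one source.
with tempfile.TemporaryDirectory() as tmp:
    for name, body in [("a.csv", "id\n1\n2\n"), ("b.csv", "id\n3\n")]:
        with open(os.path.join(tmp, name), "w", newline="") as f:
            f.write(body)
    ids = [row["id"] for row in rows_from_pattern(os.path.join(tmp, "*.csv"))]
```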

Load limits

Large data sources could overwhelm your system's memory. Passing a ``limit``
keyword to the ``Source`` constructor limits the rows returned from each
source. For file paths with wildcards, the limit applies to each file
source, not to the number of file sources.
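
The per-file semantics of the limit can be illustrated with ``itertools.islice``. This is a sketch, not the library's code: ``limited`` is a hypothetical helper, and plain lists stand in for the individual file sources.

```python
import itertools


def limited(sources, limit=None):
    """Chain several row iterables, taking at most `limit` rows from each.

    Hypothetical helper for illustration; a limit of None means no cap.
    """
    for source in sources:
        yield from itertools.islice(source, limit)


file_a = [{"id": 1}, {"id": 2}, {"id": 3}]
file_b = [{"id": 4}, {"id": 5}, {"id": 6}]

# The cap applies to each source separately, so two sources with limit=2
# yield up to four rows in total.
rows = list(limited([file_a, file_b], limit=2))
```

With a wildcard matching many files, the total number of rows returned can therefore be as large as the limit multiplied by the number of matching files.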


Source and bug tracker


0.1.0 (2014-05-21)

* First release on PyPI.

0.1.1 (2014-05-23)

* Fixed bugs in handling non-listlike YAML files

0.2.0 (2014-07-14)

* Support .xls
* Support URLs
* Support wildcards

0.2.1 (2014-07-27)

* Support .html
