
Extract/Transform Light - a simple library for reading delimited files.



Given a CSV file:

Area id,Male,Female,Area
A12345,34,45,0.25
A12346,108,99,0.32

Define a list of transformations:

transformations = [
    # Map existing fields into a dictionary.
    # For nested dictionaries use dot.delimited.keys.
    # The optional "via" parameter takes a callable returning the transformed value.
    { "from": "Area id", "to": "id" },
    { "from": "Male", "to": "population.male", "via": int },
    { "from": "Female", "to": "population.female", "via": int },
    { "from": "Area", "to": "area", "via": float },

    # You can also add computed values that are not present in the original data source.
    # Computed values take the transformed dictionary as an argument
    # and do not require the "from" parameter:
    {
        "to": "population.total",
        "via": lambda x: x['population']['male'] + x['population']['female']
    },
    # Note that transformations are executed in the order they were defined.
    # This transformation uses the value computed in the previous step:
    {
        "to": "population.density",
        "via": lambda x: round(x['population']['total'] / x['area'])
    }
]

Read the file:

from etlite import delim_reader

with open("mydatafile.csv") as csvfile:
    reader = delim_reader(csvfile, transformations)
    data = [row for row in reader]

This produces a list of dictionaries:

[
    {
        'id': 'A12345',
        'area': 0.25,
        'population': {
            'male': 34,
            'female': 45,
            'total': 79,
            'density': 316
        }
    },
    {
        'id': 'A12346',
        'area': 0.32,
        'population': {
            'male': 108,
            'female': 99,
            'total': 207,
            'density': 647
        }
    }
]
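
To make the ordering and dot-delimited key rules concrete, here is a minimal plain-Python sketch of how one raw row could be turned into the first dictionary above. It is not ETLite's actual implementation: set_path and apply_transformations are hypothetical helper names used only for illustration, and the sketch assumes that a rule without "from" receives the dictionary built so far.

```python
def set_path(target, dotted_key, value):
    """Store value under a dot-delimited key, creating nested dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = target
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def apply_transformations(raw, transformations):
    """Apply rules in definition order (hypothetical helper, not ETLite code)."""
    result = {}
    for rule in transformations:
        via = rule.get("via", lambda value: value)  # identity when "via" is absent
        if "from" in rule:
            value = via(raw[rule["from"]])  # mapped field: transform the raw value
        else:
            value = via(result)  # computed field: sees the dictionary built so far
        set_path(result, rule["to"], value)
    return result


transformations = [
    {"from": "Area id", "to": "id"},
    {"from": "Male", "to": "population.male", "via": int},
    {"from": "Female", "to": "population.female", "via": int},
    {"from": "Area", "to": "area", "via": float},
    {"to": "population.total",
     "via": lambda x: x["population"]["male"] + x["population"]["female"]},
    {"to": "population.density",
     "via": lambda x: round(x["population"]["total"] / x["area"])},
]

raw = {"Area id": "A12345", "Male": "34", "Female": "45", "Area": "0.25"}
record = apply_transformations(raw, transformations)
```

Because the rules run in order, the density rule can read population.total, which the preceding rule has already written.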

delim_reader options

ETLite is a thin wrapper around Python's built-in csv module, so you can pass delim_reader the same options you would pass to csv.reader. For example:

reader = delim_reader(csvfile, transformations, delimiter="\t")
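
For reference, this is what the delimiter option does in the standard library's csv.reader on its own. The sketch uses only the csv and io modules and an in-memory sample, so it shows the option's effect without assuming anything about how ETLite forwards it.

```python
import csv
import io

# A small tab-separated sample held in memory instead of a file on disk.
tsv = io.StringIO("Area id\tMale\tFemale\tArea\nA12345\t34\t45\t0.25\n")

# delimiter="\t" tells csv.reader to split fields on tabs rather than commas.
rows = list(csv.reader(tsv, delimiter="\t"))
```

Other csv.reader formatting parameters, such as quotechar, are passed as keyword arguments in the same way.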

Exception handling

If a transformation cannot be performed, ETLite raises TransformationError. If you do not want to abort data loading, you can pass an error handler to delim_reader.

The error handler must be a function. It will be passed an instance of TransformationError. Note: on_error must be passed as a keyword argument.

from etlite import delim_reader

transformations = [
    # ...
]

def error_handler(err):
    # err is an instance of TransformationError
    print(err) # prints error message
    print(err.record) # prints raw record, prior to transformation

with open('my-data.csv') as stream:
    reader = delim_reader(stream, transformations, on_error=error_handler)
    for row in reader:
        ...  # process each row here
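
Instead of printing, a handler can collect failures for later inspection. The sketch below is one possible pattern, not part of ETLite: make_error_collector is a hypothetical helper, and FakeTransformationError is a stand-in defined here only so the example runs without the library. It relies solely on the two properties described above, namely that str(err) gives the message and err.record holds the raw record; the stand-in's constructor signature is an assumption.

```python
class FakeTransformationError(Exception):
    """Stand-in for etlite's TransformationError, for demonstration only."""

    def __init__(self, message, record):
        super().__init__(message)
        self.record = record  # raw record, prior to transformation


def make_error_collector():
    """Return a list and a handler that appends each error to that list."""
    errors = []

    def handler(err):
        errors.append({"message": str(err), "record": err.record})

    return errors, handler


errors, handler = make_error_collector()

# Simulate what a reader would do when a row fails to transform.
handler(FakeTransformationError("cannot convert 'n/a'", {"Area id": "A1", "Male": "n/a"}))
```

With the real library, the handler would be passed as on_error=handler, and the errors list could be reviewed once loading finishes.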

Download files

Source distribution: etlite-0.1.1.tar.gz (3.2 kB)
Built distribution: etlite-0.1.1-py3-none-any.whl (3.6 kB, py3)
