ETlite
Extract/Transform Light - a simple library for reading delimited files.
Example
Given a CSV file:
Area id,Male,Female,Area
A12345,34,45,0.25
A12346,108,99,0.32
Define a list of transformations:
transformations = [
    # Map existing fields into a dictionary.
    # For nested dictionaries use dot.delimited.keys.
    # The optional "via" parameter takes a callable that returns the transformed value.
    { "from": "Area id", "to": "id" },
    { "from": "Male", "to": "population.male", "via": int },
    { "from": "Female", "to": "population.female", "via": int },
    { "from": "Area", "to": "area", "via": float },
    # You can also add computed values that are not present in the original data source.
    # Computed values take the transformed dictionary as their argument
    # and do not require a "from" parameter:
    {
        "to": "population.total",
        "via": lambda x: x['population']['male'] + x['population']['female']
    },
    # Note that transformations are executed in the order they were defined.
    # This transformation uses the population.total value computed in the previous step:
    {
        "to": 'population.density',
        "via": lambda x: round(x['population']['total'] / x['area']),
    }
]
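Because "via" accepts any callable, it does not have to be a built-in such as int or float. The sketch below shows a user-defined converter; parse_count is a hypothetical helper written for this illustration and is not part of ETlite.

```python
def parse_count(value):
    """Convert a raw CSV cell such as ' 1,234 ' to an int.

    Hypothetical helper: strips surrounding whitespace and thousands separators.
    """
    return int(value.strip().replace(",", ""))


# The same mapping style as above, with the custom callable passed as "via".
transformations = [
    {"from": "Area id", "to": "id"},
    {"from": "Male", "to": "population.male", "via": parse_count},
    {"from": "Female", "to": "population.female", "via": parse_count},
]
```

A named function like this is easier to test on its own than a lambda, which helps when the source data is messy.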
Read the file:
from etlite import delim_reader

with open("mydatafile.csv") as csvfile:
    reader = delim_reader(csvfile, transformations)
    # Consume the reader while the file is still open.
    data = [row for row in reader]
This produces a list of dictionaries:
[
    {
        'id': 'A12345',
        'area': 0.25,
        'population': {
            'male': 34,
            'female': 45,
            'total': 79,
            'density': 316
        }
    },
    {
        'id': 'A12346',
        'area': 0.32,
        'population': {
            'male': 108,
            'female': 99,
            'total': 207,
            'density': 647
        }
    }
]
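To make the ordering and the dot-delimited keys concrete, here is a minimal sketch that reproduces the result above using only the standard library. It is not ETlite's actual implementation; set_nested and apply_transformations are hypothetical names introduced for this illustration.

```python
import csv
import io


def set_nested(target, dotted_key, value):
    """Store value under a dot-delimited key, creating nested dicts as needed."""
    *parents, last = dotted_key.split(".")
    for key in parents:
        target = target.setdefault(key, {})
    target[last] = value


def apply_transformations(record, transformations):
    """Build one output dictionary from a raw record, one step at a time."""
    result = {}
    for step in transformations:
        via = step.get("via")
        if "from" in step:
            # Mapped field: read the raw cell, then convert it if "via" is given.
            raw = record[step["from"]]
            value = via(raw) if via else raw
        else:
            # Computed field: "via" receives the dictionary built so far,
            # which is why the order of the steps matters.
            value = via(result)
        set_nested(result, step["to"], value)
    return result


transformations = [
    {"from": "Area id", "to": "id"},
    {"from": "Male", "to": "population.male", "via": int},
    {"from": "Female", "to": "population.female", "via": int},
    {"from": "Area", "to": "area", "via": float},
    {"to": "population.total",
     "via": lambda x: x["population"]["male"] + x["population"]["female"]},
    {"to": "population.density",
     "via": lambda x: round(x["population"]["total"] / x["area"])},
]

# In-memory copy of the example CSV file.
source = io.StringIO(
    "Area id,Male,Female,Area\n"
    "A12345,34,45,0.25\n"
    "A12346,108,99,0.32\n"
)
data = [apply_transformations(row, transformations)
        for row in csv.DictReader(source)]
```

Moving the population.density step ahead of population.total in this sketch would raise a KeyError, because the total would not exist yet when the density is computed.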
delim_reader options
ETlite is a thin wrapper around Python's built-in csv module, so you can pass delim_reader the same options you would pass to csv.reader. For example:
reader = delim_reader(csvfile, transformations, delimiter="\t")
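As a reminder of what those options do, the sketch below uses csv.reader directly with two of its standard formatting parameters, delimiter and quotechar. It assumes that delim_reader forwards such keyword arguments unchanged, as the text above suggests; the sample data is made up for this illustration.

```python
import csv
import io

# Tab-separated sample data; the second field of the data row
# contains a tab character inside quotes.
source = io.StringIO('id\tname\nA1\t"North\tEast"\n')

# delimiter and quotechar are standard csv.reader formatting parameters.
rows = list(csv.reader(source, delimiter="\t", quotechar='"'))
```

Here the quoted tab stays inside a single field, so the data row has two columns rather than three.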
Exception handling
If a transformation cannot be performed, ETlite raises TransformationError. If you do not want to abort data loading, you can pass an error handler to delim_reader.
The error handler must be a function. It will be passed an instance of TransformationError. Note: on_error must be passed as a keyword argument.
from etlite import delim_reader

transformations = [
    # ...
]

def error_handler(err):
    # err is an instance of TransformationError
    print(err)         # prints the error message
    print(err.record)  # prints the raw record, prior to transformation

with open('my-data.csv') as stream:
    reader = delim_reader(stream, transformations, on_error=error_handler)
    for row in reader:
        # do_something is a placeholder for your own processing
        do_something(row)
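Instead of printing, a handler can collect the failures for later inspection. The sketch below is one way to do that under the assumptions stated above (the handler receives an error with a record attribute). make_error_collector is a hypothetical helper, and FakeTransformationError is a stand-in used only so the example runs without ETlite installed.

```python
def make_error_collector():
    """Return (handler, errors); the handler appends each failure to errors."""
    errors = []

    def handler(err):
        # Keep the message and the raw record, prior to transformation.
        errors.append((str(err), err.record))

    return handler, errors


class FakeTransformationError(Exception):
    """Hypothetical stand-in for TransformationError, for demonstration only."""

    def __init__(self, message, record):
        super().__init__(message)
        self.record = record


on_error, errors = make_error_collector()

# Simulate what the reader would do when a transformation fails.
on_error(FakeTransformationError("bad value", {"Male": "n/a"}))
```

With ETlite, the handler would be passed as delim_reader(stream, transformations, on_error=on_error), and the errors list could be examined once the loop over the reader has finished.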