Cleaning your messy data.
Getting started
Suppose you need to clean up some messy data. Here is a deeply nested dictionary containing unnecessary nesting, tuples and so on.
some_messy_data = {
    "body": {
        "articles": [
            {"title": "Title",
             "published": {"date": ("2014-11-05", "23:00:00")}},
        ],
    },
    "meta": {
        "meta1": {
            "meta3": ('19911105', {"author": "Author name"})
        },
        "meta4": {"assetType": 1}
    }
}
The values we need are 'articles', 'author' and 'assetType'.
Now let the hack begin with dripper. Define a 'declaration' dictionary to drip out the essential data.
declaration = {
    "articles": {
        "__type__": "list",  # 'articles' is a list of dictionaries
        "__source_root__": ['body', 'articles'],  # the root position of 'articles' in the source
        "title": ["title"],  # each dictionary in 'articles' will contain 'title'
        "published": ["published", "date", 0],  # and 'published'; a value is given as the path to it
    },
    "meta": {
        "__source_root__": ["meta", "meta1", "meta3", 1],
        "author": ["author"],
    },
    "asset_type": ["meta", "meta4", "assetType"],
}
Then use it like this.
from dripper import dripper_factory
d = dripper_factory(declaration)
d(some_messy_data) == {
    "articles": [
        {'title': "Title",
         'published': "2014-11-05"},
    ],
    "meta": {
        "author": "Author name",
    },
    "asset_type": 1,
}
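To make the path notation concrete, here is a minimal sketch of how a path such as ["meta", "meta1", "meta3", 1] can be resolved against nested dictionaries, lists and tuples. It does not use dripper and is not taken from its source; the helper name resolve_path is hypothetical and only illustrates the idea of following one key or index at a time.

```python
from functools import reduce


def resolve_path(data, path):
    """Follow each key or index in `path` through nested dicts, lists and tuples.

    Hypothetical helper for illustration only; not part of dripper.
    """
    return reduce(lambda current, key: current[key], path, data)


some_messy_data = {
    "body": {
        "articles": [
            {"title": "Title",
             "published": {"date": ("2014-11-05", "23:00:00")}},
        ],
    },
    "meta": {
        "meta1": {"meta3": ("19911105", {"author": "Author name"})},
        "meta4": {"assetType": 1},
    },
}

# A source root narrows the data first; the remaining path is applied to it.
author_root = resolve_path(some_messy_data, ["meta", "meta1", "meta3", 1])
author = resolve_path(author_root, ["author"])

# A plain path is resolved from the top of the source data.
asset_type = resolve_path(some_messy_data, ["meta", "meta4", "assetType"])

# For a list root, the same relative path is applied to every item.
articles = resolve_path(some_messy_data, ["body", "articles"])
published = [resolve_path(item, ["published", "date", 0]) for item in articles]
```

Strings act as dictionary keys and integers as sequence indices, which is why the same path list can walk through both dictionaries and tuples.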
Advanced
Converting
Use dripper.ValueDripper to pass a converter function.
import dripper
declaration = {
    "title": dripper.ValueDripper(["title"], converter=lambda s: s.lower())
}
d = dripper.dripper_factory(declaration)
d({"title": "TITLE"}) == {"title": "title"}
Technically, each end of the declaration (a list path) is replaced by an instance of dripper.ValueDripper.
Combining
By adding dripper.ValueDripper instances together, the values they extract are combined into the result for that key.
import dripper
declaration = {
    "fullname": (dripper.ValueDripper(["firstname"]) +
                 dripper.ValueDripper(["lastname"]))
}
d = dripper.dripper_factory(declaration)
d({"firstname": "Hiroki", "lastname": "Kiyohara"}) == {"fullname": "HirokiKiyohara"}
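One way such combining could work is by overloading the + operator so that adding two extractors yields a new extractor. The sketch below is an assumption about the general technique, not dripper's actual implementation; the class names PathValue and CombinedValue are hypothetical stand-ins and do not exist in the library.

```python
from functools import reduce


class PathValue:
    """Hypothetical stand-in for a value extractor; not dripper's real class."""

    def __init__(self, path, converter=None):
        self.path = path
        self.converter = converter

    def __call__(self, data):
        # Walk the path one key or index at a time.
        value = reduce(lambda current, key: current[key], self.path, data)
        # Apply the optional converter to the extracted value.
        return self.converter(value) if self.converter else value

    def __add__(self, other):
        return CombinedValue([self, other])


class CombinedValue:
    """Hypothetical: calls each extractor and adds the results together."""

    def __init__(self, extractors):
        self.extractors = extractors

    def __call__(self, data):
        values = [extract(data) for extract in self.extractors]
        return reduce(lambda left, right: left + right, values)

    def __add__(self, other):
        return CombinedValue(self.extractors + [other])


fullname = PathValue(["firstname"]) + PathValue(["lastname"])
result = fullname({"firstname": "Ada", "lastname": "Lovelace"})

lowered = PathValue(["title"], converter=str.lower)({"title": "TITLE"})
```

Because the combined object is itself callable and supports +, more than two extractors can be chained in the same way.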
0.2
Improved error handling.
Added MixDripper.
0.1
Initial version