lightweight, simple, and fast declarative XML and JSON data extraction
Yankee - Simple Declarative Data Extraction from XML and JSON
Yankee is similar to Marshmallow, but it only does deserialization. What it lacks in reversibility, it makes up for in speed: schemas are compiled in advance, so data extraction runs very quickly.
Motivation
I have another package called patent_client. I also do a lot with legal data, some of which is in XML, and some of which is in JSON. But there's a lot of it. And I mean a lot, so speed matters.
Quick Start
There are two main modules: yankee.json.schema and yankee.xml.schema. Those modules support defining class-style deserializers. Both start by subclassing a Schema class, and then defining attributes from the fields submodule.
JSON Deserializer Example
from yankee.json import Schema, fields

class JsonExample(Schema):
    # No path given: the attribute name ("name") is used as the key
    name = fields.String()
    # Explicit key that differs from the attribute name
    birthday = fields.Date("birthdate")
    # Dotted path into nested data; "0" selects the first list item
    deep_data = fields.Int("something.0.many.levels.deep")

obj = {
    "name": "Johnny Appleseed",
    "birthdate": "2000-01-01",
    "something": [
        {"many": {
            "levels": {
                "deep": 123
            }
        }}
    ]
}

JsonExample().deserialize(obj)

# Returns
# {
#     "name": "Johnny Appleseed",
#     "birthday": datetime.date(2000, 1, 1),
#     "deep_data": 123
# }
For JSON, the attributes are filled by pulling values off of the JSON object. If no path is provided, then the attribute name is used. Otherwise, a dotted string can be used to pluck an item from the JSON object.
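The dotted-path lookup can be pictured with a small sketch in plain Python. This is not Yankee's implementation: compile_path and the getter it returns are hypothetical names, and the rule that all-digit segments index into lists is an assumption inferred from the something.0.many.levels.deep path in the example. Splitting the path once up front loosely mirrors the idea of preparing a schema in advance.

```python
from typing import Any, Callable


def compile_path(path: str) -> Callable[[Any], Any]:
    """Hypothetical helper: turn a dotted path into a reusable getter."""
    # Split once, up front, so repeated lookups skip the string parsing.
    # Assumption: all-digit segments are list indexes, the rest are dict keys.
    steps = [int(part) if part.isdigit() else part for part in path.split(".")]

    def getter(obj: Any) -> Any:
        current = obj
        for step in steps:
            try:
                current = current[step]
            except (KeyError, IndexError, TypeError):
                # Missing data yields None here; a real library may differ.
                return None
        return current

    return getter


get_deep = compile_path("something.0.many.levels.deep")
record = {"something": [{"many": {"levels": {"deep": 123}}}]}
print(get_deep(record))             # 123
print(get_deep({"something": []}))  # None
```

The getter is built once and can then be applied to any number of records, which is where a precompiled approach would save time on large data sets.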
XML Deserializer Example
import lxml.etree as ET
from yankee.xml import Schema, fields

class XmlExample(Schema):
    # Each field takes an XPath expression relative to the element
    name = fields.String("./name")
    birthday = fields.Date("./birthdate")
    deep_data = fields.Int("./something/many/levels/deep")

obj = ET.fromstring(b"""
<xmlObject>
    <name>Johnny Appleseed</name>
    <birthdate>2000-01-01</birthdate>
    <something>
        <many>
            <levels>
                <deep>123</deep>
            </levels>
        </many>
    </something>
</xmlObject>
""".strip())

XmlExample().deserialize(obj)

# Returns
# {
#     "name": "Johnny Appleseed",
#     "birthday": datetime.date(2000, 1, 1),
#     "deep_data": 123
# }
For XML, the attributes are filled using XPath expressions. If no path is provided, then the entire object is passed to the field (there are no implicit paths). Any valid XPath expression can be used.
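To make the path-based extraction concrete, here is a rough sketch that uses only the standard library's xml.etree.ElementTree, which supports a limited subset of XPath (lxml, used in the example above, supports full XPath 1.0). The FIELDS mapping and the extract helper are hypothetical stand-ins for what a schema and its fields do; they are not part of Yankee.

```python
import datetime
import xml.etree.ElementTree as ET
from typing import Any, Callable, Dict, Tuple

# Hypothetical stand-in for a schema: output name -> (path, converter).
FIELDS: Dict[str, Tuple[str, Callable[[str], Any]]] = {
    "name": ("./name", str),
    "birthday": ("./birthdate", datetime.date.fromisoformat),
    "deep_data": ("./something/many/levels/deep", int),
}


def extract(root: ET.Element) -> Dict[str, Any]:
    """Evaluate each path against the element and convert the text found."""
    result: Dict[str, Any] = {}
    for key, (path, convert) in FIELDS.items():
        text = root.findtext(path)  # None when the path matches nothing
        result[key] = convert(text) if text is not None else None
    return result


doc = ET.fromstring(
    "<xmlObject>"
    "<name>Johnny Appleseed</name>"
    "<birthdate>2000-01-01</birthdate>"
    "<something><many><levels><deep>123</deep></levels></many></something>"
    "</xmlObject>"
)
print(extract(doc))
```

Each entry pairs a path with a type conversion, which is the same division of labour the field classes express declaratively in the schema above.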