Skip to main content

A declarative, type-safe Python DSL for mapping complex nested JSON to relational database schemas

Project description

etielle: Declarative JSON-to-Relational Mapping in Python

etielle is a simple, powerful Python library for reshaping nested JSON data, typically from an API, into relational tables that fit your database schema. Think of etielle as a “JSON extractor” that you program with clear instructions: “Go here in the JSON, pull this data, and put it in that table.” The library’s name is a play on ETL (“Extract, Transform, Load”), which is the technical term for this set of operations.

Why Use etielle? (For Beginners)

JSON data from APIs (Application Program Interfaces—web services that typically return JSON) is often deeply nested and requires complicated parsing. etielle helps by:

  • Declaring what you want: Write Python code to describe your tables and how to fill them.
  • Traversing nested structures: Walk through arrays-within-dictionaries-within-arrays to any arbitrary depth.
  • Performing arbitrary transformations: Use the provided functions to perform common operations (like getting the key or index of the current item or its parent), or define your own.
  • Building relationships: Use “keys” to link data across different parts of the JSON, like foreign keys in a database.
  • Being beginner-friendly: Everything is type-safe (Python checks your types), composable (build complex things from simple pieces), and easy to debug.

Installation

We recommend using uv for faster installs, but pip works too.

With uv (Recommended for Speed)

For your project:

uv add etielle

For one-off use:

uv pip install etielle

With pip

pip install etielle

Quick Start: Your First Mapping

Let’s start with a simple example. Suppose you have this JSON:

{
  "users": [
    {"id": "u1", "name": "Alice", "posts": [{"id": "p1", "title": "Hello"}, {"id": "p2", "title": "World"}]},
    {"id": "u2", "name": "Bob", "posts": []}
  ]
}

We want two tables: “users” (id, name) and “posts” (id, user_id, title).

Here’s the code:

from etielle.core import MappingSpec, TraversalSpec, TableEmit, Field
from etielle.transforms import get, get_from_parent
from etielle.executor import run_mapping

data = { ... }  # Your JSON here

# Traverse users array
users_traversal = TraversalSpec(
    path=["users"],  # Path to the array
    iterate_items=False,  # Iterate list items (not dict keys)
    emits=[
        TableEmit(
            table="users",
            join_keys=[get("id")],  # Unique key for the row
            fields=[
                Field("id", get("id")),
                Field("name", get("name"))
            ]
        )
    ]
)

# Traverse posts under each user
posts_traversal = TraversalSpec(
    path=["users"],
    iterate_items=False,
    inner_path=["posts"],  # Nested path inside each user
    inner_iterate_items=False,
    emits=[
        TableEmit(
            table="posts",
            join_keys=[get("id")],
            fields=[
                Field("id", get("id")),
                Field("user_id", get_from_parent("id")),  # Link to parent user
                Field("title", get("title"))
            ]
        )
    ]
)

spec = MappingSpec(traversals=[users_traversal, posts_traversal])
result = run_mapping(data, spec)
print(result)  # Outputs dict of tables with rows
{}

Congrats! You’ve mapped your first JSON.

Core Concepts: Breaking It Down

Let’s explain the building blocks like you’re learning for the first time.

1. Context: Your “Location” in the JSON

Imagine traversing a JSON tree—Context is your GPS:

  • root: The entire JSON.
  • node: The current spot (e.g., a user object).
  • path: Directions to get here (e.g., (“users”, 0)).
  • parent: The previous spot (for looking “up”).
  • key/index: If in a dict/list, the current key or index.
  • slots: A notepad for temporary notes.

Contexts are created automatically as you traverse and are immutable (unchangeable) for safety.

2. Transforms: Smart Data Extractors

Transforms are like mini-functions that pull values from Context. They’re “lazy”—they don’t run until needed, and they adapt to the current Context.

Examples:

  • get("name"): Get “name” from current node.
  • get_from_parent("id"): Get “id” from parent.
  • index(): Current list position.
  • concat(literal("user_"), get("id")): Combine strings.

Full list in the Cheatsheet below.

3. TraversalSpec: How to Walk the JSON

This says: “Start here, then go deeper if needed, and do this for each item.”

  • path: Starting path (list of strings, e.g., [“users”]).
  • iterate_items: True for dicts (key-value pairs), False for lists.
  • inner_path: Optional deeper path (e.g., [“posts”] for nesting).
  • emits: What tables to create from each item.

You can have multiple Traversals in one MappingSpec—they run independently.

4. TableEmit and Fields: Building Your Tables

  • table: Name of the table.
  • fields: List of Field(name, transform) – columns and how to compute them.
  • join_keys: List of transforms for unique row IDs (like primary keys). Same keys across traversals merge rows.

5. Executor: Running It All

run_mapping(json_data, spec) executes everything and returns a dict of tables.

Detailed Examples

Example 1: Composite Keys for Merging Data

Merge user info from two parts of JSON:

spec = MappingSpec(traversals=[
    TraversalSpec(  # Basic user data
        path=["users"],
        iterate_items=False,
        emits=[TableEmit(
            table="users",
            join_keys=[get("id")],
            fields=[Field("id", get("id")), Field("name", get("name"))]
        )]
    ),
    TraversalSpec(  # Add email from another section
        path=["profiles"],
        iterate_items=False,
        emits=[TableEmit(
            table="users",  # Same table!
            join_keys=[get("user_id")],  # Matches previous keys
            fields=[Field("email", get("email"))]
        )]
    )
])

Rows with matching keys merge: e.g., add “email” to existing user row.

Example 2: Deep Nesting (Arbitrary Depth)

No limit to depth—use longer inner_path:

spec = MappingSpec(traversals=[
    TraversalSpec(
        path=["servers"],
        iterate_items=False,
        inner_path=["channels", "messages", "reactions"],  # 3 levels deep!
        inner_iterate_items=False,
        emits=[TableEmit(
            table="reactions",
            join_keys=[get_from_parent("id", depth=3), get_from_parent("id", depth=2), get_from_parent("id"), get("id")],
            fields=[
                Field("server_id", get_from_parent("id", depth=3)),
                Field("channel_id", get_from_parent("id", depth=2)),
                Field("message_id", get_from_parent("id")),
                Field("reaction", get("emoji"))
            ]
        )]
    )
])

Transform Cheatsheet

  • get(path): From current node (dot notation or list, e.g., “user.name” or [“user”, 0]).
  • get_from_parent(path, depth=1): From ancestor.
  • get_from_root(path): From top-level JSON.
  • key(): Current dict key.
  • index(): Current list index.
  • literal(value): Constant value.
  • **concat(*parts)**: Join strings.
  • **format_id(*parts, sep=“_“)**: Join non-empty parts with separator.
  • **coalesce(*transforms)**: First non-None value.
  • len_of(inner): Length of a list/dict/string.

Pro Tip: Transforms are lazy—they run in the “context” of where they’re used, making them super flexible.

Advanced Topics

  • Lazy Evaluation: Transforms don’t compute until executed, adapting to the current spot in JSON.
  • Custom Transforms: Define your own functions that take Context and return values.
  • Row Merging Rules: Last write wins for duplicate fields; missing keys skip rows.
  • Performance: Efficient for large JSON; traversals are independent.

Roadmap Ideas

  • Database integrations (e.g., SQLAlchemy).
  • More examples and benchmarks.
  • Visual mapping tools.

License

MIT

Need help? Open an issue on GitHub!

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

etielle-0.2.0.tar.gz (65.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

etielle-0.2.0-py3-none-any.whl (8.5 kB view details)

Uploaded Python 3

File details

Details for the file etielle-0.2.0.tar.gz.

File metadata

  • Download URL: etielle-0.2.0.tar.gz
  • Upload date:
  • Size: 65.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for etielle-0.2.0.tar.gz
Algorithm Hash digest
SHA256 c03cde5ded00c56f95614bff3a3407494aa2bfe5d27637744fdd58b037a95d2c
MD5 5b5377ea526e7863f5cfa7fb7400b62f
BLAKE2b-256 59903299e22c16ca387c3b19544146c091aa3c49984f4f5e6f0de997b66152ff

See more details on using hashes here.

Provenance

The following attestation bundles were made for etielle-0.2.0.tar.gz:

Publisher: release.yml on Promptly-Technologies-LLC/etielle

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file etielle-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: etielle-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 8.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for etielle-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 3270d787794c0e9a61183494158d40bf013342611ac0713b726fe7b077d09b2a
MD5 85f72ee127957f4a30a434bb181dcb10
BLAKE2b-256 86847e3a2797076a4abedb26088fd078bd1dd63ac510404f0b1883ee27c0b5f4

See more details on using hashes here.

Provenance

The following attestation bundles were made for etielle-0.2.0-py3-none-any.whl:

Publisher: release.yml on Promptly-Technologies-LLC/etielle

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page