A declarative, type-safe Python DSL for mapping complex nested JSON to relational database schemas
Project description
Quickstart: Declarative JSON-to-Relational Mapping in Python
etielle is a simple, powerful Python library for reshaping nested
JSON data, typically from an API,
into relational tables that fit your database schema. Think of etielle
as a “JSON extractor” that you program with clear instructions: “Go here
in the JSON, pull this data, and put it in that table.” The library’s
name is a play on ETL (“Extract, Transform,
Load”), which is the technical term for
this set of operations.
- Repository: Promptly-Technologies-LLC/etielle
- PyPI:
etielle - Python: >= 3.13
Why Use etielle? (For Beginners)
JSON data from APIs is often deeply nested and requires complicated
parsing. etielle helps by:
- Traversing nested structures: Walk through arrays-within-dictionaries-within-arrays to any arbitrary depth.
- Performing arbitrary transformations: Use built-in functions or define your own custom transforms.
- Building relationships: Link records across your different output tables automatically.
- Emitting to arbitrary formats: Emit data to Pydantic models, TypedDicts, or ORM objects directly.
- Optionally loading data into a database: Load data into a database using SQLAlchemy or SQLModel.
Learning Path
- Quickstart: Quick and dirty introduction to
etielleand how to use it. - Introduction to ETL: The problem
etielleis solving: JSON data ETL. - Navigation: How to navigate through your
JSON data with
goto()andeach(). - Transforms: Getting and altering values from the JSON data.
- Mapping Tables: Outputting data with
FieldandTempField. - Relationships: Linking tables together
with
link_to(). - Database Loading: Persisting data
with
load()andrun().
Installation
We recommend using uv for faster installs, but pip works too.
uv add etielle
# or
pip install etielle
Optional: Install with ORM adapters
If you plan to bind relationships and flush to your database via SQLAlchemy or SQLModel, install with the optional extra for your ORM:
uv add "etielle[sqlalchemy]"
# or
uv add "etielle[sqlmodel]"
Quick Start: Your First Mapping
Let’s start with a simple example. Suppose you have this JSON:
import json
data = {
"users": [
{"id": "u1", "name": "Alice", "posts": [{"id": "p1", "title": "Hello"}, {"id": "p2", "title": "World"}]},
{"id": "u2", "name": "Bob", "posts": []}
]
}
We want two tables: “users” (id, name) and “posts” (id, user_id, title).
Here’s the code using the fluent API:
from etielle import etl, Field, TempField, get, get_from_parent
# Build and run the pipeline
result = (
etl(data)
# Extract users
.goto("users").each()
.map_to(table="users", fields=[
Field("id", get("id")),
Field("name", get("name")),
])
# Extract posts (nested under each user)
.goto("posts").each()
.map_to(table="posts", fields=[
Field("id", get("id")),
Field("user_id", get_from_parent("id")), # Link to parent user
Field("title", get("title")),
])
.run()
)
# result.tables gives you dict access by table name
out = {table: list(rows.values()) for table, rows in result.tables.items()}
print(json.dumps(out, indent=2))
{
"users": [
{
"id": "u1",
"name": "Alice"
},
{
"id": "u2",
"name": "Bob"
}
],
"posts": [
{
"id": "p1",
"user_id": "u1",
"title": "Hello"
},
{
"id": "p2",
"user_id": "u1",
"title": "World"
}
]
}
Congrats! You’ve mapped your first JSON using the fluent E->T->L pattern.
Core Concepts: Breaking It Down
1. The etl() Entry Point
Everything starts with etl(data). This creates a pipeline builder that
you chain methods onto:
from etielle import etl
result = (
etl(data) # Start with your JSON
.goto("users") # Navigate to a path
.each() # Iterate over items
.map_to(...) # Emit table rows
.run() # Execute and get results
)
2. Navigation: goto() and each()
Navigation tells etielle where to find your data:
goto(path): Navigate to a nested location (supports dot notation:"data.users")each(): Iterate over items in a list or dict
# Navigate to data["response"]["users"] and iterate
.goto("response.users").each()
# Or use list syntax
.goto(["response", "users"]).each()
3. Fields and TempFields
When you call map_to(), you define what data to extract:
Field(name, transform): A column that appears in your outputTempField(name, transform): Used for joins/relationships, but NOT in output
.map_to(table="users", fields=[
Field("id", get("id")), # Output column (also available for joins)
Field("name", get("name")), # Output column
])
4. Transforms: Smart Data Extractors
Transforms are functions that pull values from the current context:
| Transform | Purpose |
|---|---|
get("name") |
Get field from current node |
get_from_parent("id") |
Get field from parent context |
get_from_root("config.version") |
Get field from JSON root |
literal(42) |
Constant value |
key() |
Current dict key |
index() |
Current list index |
concat(a, b, c) |
Join strings |
# Combine transforms
Field("full_id", concat(literal("user_"), get("id"))) # "user_u1"
5. Running the Pipeline
Call .run() to execute and get results:
result = pipeline.run()
# Access tables by name
users = result.tables["users"] # Dict[tuple, dict]
# Or by model class (if using typed models)
users = result.tables[User] # Dict[tuple, User]
# Check for errors
if result.errors:
for table, errs in result.errors.items():
print(f"{table}: {errs}")
Detailed Examples
Example 1: Merging Data from Multiple Paths
Merge user info from two parts of JSON using goto_root():
from etielle import etl, Field, TempField, get
data = {
"users": [{"id": "u1", "name": "Alice"}],
"profiles": [{"user_id": "u1", "email": "alice@example.com"}]
}
result = (
etl(data)
# First path: basic user data
.goto("users").each()
.map_to(table="users", join_on=["id"], fields=[
Field("id", get("id")),
Field("name", get("name")),
])
# Second path: profile data (same table, merged by id)
.goto_root()
.goto("profiles").each()
.map_to(table="users", join_on=["id"], fields=[
Field("email", get("email")),
TempField("id", get("user_id")) # Join key for merging
])
.run()
)
user = list(result.tables["users"].values())[0]
print(user) # Has id, name, AND email merged together
{'id': 'u1', 'name': 'Alice', 'email': 'alice@example.com'}
Example 2: Deep Nesting
Handle deeply nested structures with chained navigation:
# servers -> channels -> messages -> reactions (3 levels deep)
result = (
etl(data)
.goto("servers").each()
.map_to(table="servers", fields=[...])
.goto("channels").each()
.map_to(table="channels", fields=[
Field("server_id", get_from_parent("id", depth=1)),
...
])
.goto("messages").each()
.map_to(table="messages", fields=[
Field("channel_id", get_from_parent("id", depth=1)),
Field("server_id", get_from_parent("id", depth=2)),
...
])
.run()
)
Example 3: Typed Output with Pydantic
Use model classes for validated, typed output:
from pydantic import BaseModel
from etielle import etl, Field, TempField, get
class User(BaseModel):
id: str
name: str
email: str | None = None
data = {"users": [{"id": "u1", "name": "Alice"}]}
result = (
etl(data)
.goto("users").each()
.map_to(table=User, fields=[ # Pass model class, not string
Field("id", get("id")),
Field("name", get("name")),
])
.run()
)
user = list(result.tables[User].values())[0]
print(f"Type: {type(user).__name__}, name: {user.name}")
Type: User, name: Alice
Transform Cheatsheet
| Transform | Purpose | Example |
|---|---|---|
get(path) |
From current node | get("user.name") |
get_from_parent(path, depth=1) |
From ancestor | get_from_parent("id") |
get_from_root(path) |
From JSON root | get_from_root("version") |
key() |
Current dict key | When iterating {"a": 1, "b": 2} |
index() |
Current list index | 0, 1, 2, … |
parent_key() |
Parent’s dict key | Access parent iteration key |
parent_index() |
Parent’s list index | Access parent iteration index |
node() |
Current node value | The whole current object |
literal(value) |
Constant | literal(42) |
concat(*parts) |
Join strings | concat(get("first"), literal(" "), get("last")) |
coalesce(*transforms) |
First non-None | coalesce(get("nickname"), get("name")) |
format_id(*parts, sep="_") |
Join with separator | format_id(get("type"), get("id")) |
len_of(inner) |
Length of list/dict/string | len_of(get("tags")) |
Common Mistakes
- Empty results?
- Check your
goto()path matches the JSON structure exactly - Make sure you called
.each()to iterate
- Check your
- Missing parent data?
- Check the
depthparameter inget_from_parent() - Ensure the parent context exists in your navigation chain
- Check the
- Duplicate or missing rows?
- Verify
TempFieldvalues are unique for each row - Check that join keys don’t contain
Nonevalues
- Verify
Next Steps
- Navigation - Deep dive into
goto(),each(), andgoto_root() - Transforms - All built-in transforms and how to use them
- Mapping Tables -
Field,TempField, merge policies - Relationships - Link tables with
link_to() - Database Loading - Persist with
load().run()
Glossary
- Pipeline: The chain of operations from
etl()torun() - Navigation: Methods like
goto()andeach()that position you in the JSON - Transform: A function that extracts values from the current context
- Field: An output column in your table
- TempField: A field used for joins/linking but not in final output
- Join Key: Values that uniquely identify a row (derived from TempFields)
License
MIT
Need help? Open an issue on GitHub!
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file etielle-3.5.0.tar.gz.
File metadata
- Download URL: etielle-3.5.0.tar.gz
- Upload date:
- Size: 101.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
4bc816875225de6e5f2f416bfebcb5168599f66d1aec080c512b8814fa5d2992
|
|
| MD5 |
90c75a883bb05e7feac2a4b12fbabff5
|
|
| BLAKE2b-256 |
d3da2d2b1f13b5cd070a880f8651927a844c22b82e72b19ed604e53455f8f1a4
|
Provenance
The following attestation bundles were made for etielle-3.5.0.tar.gz:
Publisher:
release.yml on Promptly-Technologies-LLC/etielle
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
etielle-3.5.0.tar.gz -
Subject digest:
4bc816875225de6e5f2f416bfebcb5168599f66d1aec080c512b8814fa5d2992 - Sigstore transparency entry: 738929823
- Sigstore integration time:
-
Permalink:
Promptly-Technologies-LLC/etielle@e4cdb75067bf1cd0e8151b8dc873b39fcf3064c3 -
Branch / Tag:
refs/heads/main - Owner: https://github.com/Promptly-Technologies-LLC
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@e4cdb75067bf1cd0e8151b8dc873b39fcf3064c3 -
Trigger Event:
workflow_run
-
Statement type:
File details
Details for the file etielle-3.5.0-py3-none-any.whl.
File metadata
- Download URL: etielle-3.5.0-py3-none-any.whl
- Upload date:
- Size: 37.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
431c60003c0ec4b3a613b54b80ba0987e03e2b61e5dc38574ff6f82aee388fe5
|
|
| MD5 |
aeee6f6c6c8c3c5dc6050a5f1231619f
|
|
| BLAKE2b-256 |
663cbc6b45fc12015c4c8f1c2e0bfcb080478e798e2c0b4c510a157fb916cada
|
Provenance
The following attestation bundles were made for etielle-3.5.0-py3-none-any.whl:
Publisher:
release.yml on Promptly-Technologies-LLC/etielle
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
etielle-3.5.0-py3-none-any.whl -
Subject digest:
431c60003c0ec4b3a613b54b80ba0987e03e2b61e5dc38574ff6f82aee388fe5 - Sigstore transparency entry: 738929828
- Sigstore integration time:
-
Permalink:
Promptly-Technologies-LLC/etielle@e4cdb75067bf1cd0e8151b8dc873b39fcf3064c3 -
Branch / Tag:
refs/heads/main - Owner: https://github.com/Promptly-Technologies-LLC
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@e4cdb75067bf1cd0e8151b8dc873b39fcf3064c3 -
Trigger Event:
workflow_run
-
Statement type: