A data transformation library for flattening complex nested structures into tabular formats while preserving hierarchical relationships
Transmog
Flatten nested JSON data into tabular formats while preserving parent-child relationships.
Installation
# Standard install (includes Parquet and ORC support)
pip install transmog
# Minimal install (CSV output only)
pip install "transmog[minimal]"
Quick Start
import transmog as tm
data = {"user": "Alice", "orders": [{"id": 101}, {"id": 102}]}
result = tm.flatten(data, name="users")
result.main # Main table
result.tables["users_orders"] # Child tables
result.save("output.csv") # Save to file
How it works: Nested JSON is flattened into related tables with foreign key relationships:
%%{init: {'theme': 'dark', 'themeVariables': {
'primaryColor': '#ff79c6',
'secondaryColor': '#bd93f9',
'tertiaryColor': '#44475a',
'mainBkg': '#282a36',
'nodeBorder': '#ff79c6',
'clusterBkg': '#44475a',
'clusterBorder': '#bd93f9',
'textColor': '#f8f8f2'
}}}%%
flowchart LR
subgraph Input["INPUT"]
JSON["user: Alice
orders: [
• id: 101
• id: 102
]"]
end
Input --> |flatten| ERD
subgraph ERD["OUTPUT"]
direction LR
users["users
━━━━━━━━━━━━━━
_id PK
user
_timestamp"]
users_orders["users_orders
━━━━━━━━━━━━━━━━
_id PK
_parent_id FK
id
_timestamp"]
users -->|1:N| users_orders
end
style Input fill:#44475a,stroke:#ff79c6,stroke-width:3px
style ERD fill:#44475a,stroke:#bd93f9,stroke-width:3px
style JSON fill:#282a36,stroke:#ff79c6,stroke-width:2px,color:#f8f8f2
style users fill:#282a36,stroke:#50fa7b,stroke-width:2px,color:#f8f8f2
style users_orders fill:#282a36,stroke:#8be9fd,stroke-width:2px,color:#f8f8f2
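The relationship in the diagram can be illustrated with a short, library-independent sketch. This is not Transmog's implementation: `flatten_sketch` is a hypothetical helper, and it only handles arrays of objects one level at a time, assuming random UUIDs for `_id` and a shared ISO timestamp for `_timestamp`.

```python
import uuid
from datetime import datetime, timezone


def flatten_sketch(record, name, parent_id=None, tables=None, timestamp=None):
    """Illustrative only: split one nested dict into related tables."""
    if tables is None:
        tables = {}
    if timestamp is None:
        timestamp = datetime.now(timezone.utc).isoformat()

    row = {"_id": str(uuid.uuid4())}
    if parent_id is not None:
        row["_parent_id"] = parent_id  # foreign key to the parent row

    for key, value in record.items():
        is_object_array = (
            isinstance(value, list)
            and value
            and all(isinstance(item, dict) for item in value)
        )
        if is_object_array:
            # Each object in the array becomes a row in a child table.
            for child in value:
                flatten_sketch(child, f"{name}_{key}", row["_id"], tables, timestamp)
        else:
            # Scalars (and anything else) stay on the current row in this sketch.
            row[key] = value

    row["_timestamp"] = timestamp
    tables.setdefault(name, []).append(row)
    return tables


tables = flatten_sketch(
    {"user": "Alice", "orders": [{"id": 101}, {"id": 102}]}, name="users"
)
parent = tables["users"][0]
children = tables["users_orders"]
```

Each row in `users_orders` carries the `_id` of its `users` row in `_parent_id`, so the two tables can be joined back together after export.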
Features
- Flatten nested JSON to CSV, Parquet, or ORC
- Smart array handling: preserves simple arrays and extracts complex arrays to child tables
- Read JSON, JSON Lines, JSON5, HJSON files
- Stream processing for large datasets
- Configurable ID generation strategies
API
flatten(data, name, config) — Flatten data in memory
result = tm.flatten("data.json", name="products")
result = tm.flatten([{"id": 1}, {"id": 2}])
result.save("output.parquet")
flatten_stream(data, output_path, name, output_format) — Stream directly to disk
tm.flatten_stream("large.jsonl", "output/", name="events", output_format="parquet")
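To show why streaming keeps memory use bounded, here is a minimal sketch of reading JSON Lines in fixed-size batches. It does not reflect how Transmog streams internally: `iter_batches` is a hypothetical helper, and tying it to the `batch_size` option is an assumption based only on the option's name.

```python
import io
import json
from itertools import islice


def iter_batches(lines, batch_size=1000):
    """Yield lists of parsed records, at most batch_size at a time."""
    # Parse lazily so only one batch is held in memory at once.
    records = (json.loads(line) for line in lines if line.strip())
    while True:
        batch = list(islice(records, batch_size))
        if not batch:
            return
        yield batch


# A small in-memory stand-in for a JSON Lines file (blank lines are skipped).
jsonl = io.StringIO('{"id": 1}\n{"id": 2}\n\n{"id": 3}\n')
batch_sizes = [len(batch) for batch in iter_batches(jsonl, batch_size=2)]
```

With three records and a batch size of two, the generator yields one full batch and one partial batch; each batch could be flattened and written out before the next is read.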
Configuration
config = tm.TransmogConfig(
array_mode=tm.ArrayMode.SMART, # SMART, SEPARATE, INLINE, SKIP
id_generation="random", # random, natural, hash, or ["field1", "field2"]
id_field="_id",
parent_field="_parent_id",
time_field="_timestamp",
include_nulls=False,
max_depth=100,
batch_size=1000
)
result = tm.flatten(data, config=config)
Array Modes
| Mode | Behavior |
|---|---|
| SMART | Preserve simple arrays, extract complex arrays to child tables |
| SEPARATE | Extract all arrays to child tables |
| INLINE | Serialize arrays as JSON strings |
| SKIP | Omit arrays from output |
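The four modes can be read as a small decision rule. The sketch below is an illustration, not the library's code: `place_array` and `is_simple` are hypothetical names, and treating a "simple" array as one that contains only scalars is an assumption, since the table does not define the term.

```python
import json


def is_simple(array):
    """Assumption: an array is simple when it holds no objects or nested arrays."""
    return not any(isinstance(item, (dict, list)) for item in array)


def place_array(array, mode):
    """Return (destination, value stored on the parent row) for one array field."""
    if mode == "SKIP":
        return ("omitted", None)
    if mode == "INLINE":
        return ("parent", json.dumps(array))  # stored as a JSON string
    if mode == "SEPARATE" or not is_simple(array):
        return ("child_table", None)  # rows go to a child table
    return ("parent", array)  # SMART with a simple array


tags = ["a", "b"]
orders = [{"id": 101}, {"id": 102}]

smart_tags = place_array(tags, "SMART")
smart_orders = place_array(orders, "SMART")
```

Under these assumptions, SMART keeps `tags` on the parent row and sends `orders` to a child table, while SEPARATE would send both to child tables.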
ID Generation
| Strategy | Description |
|---|---|
| random | Generate random UUID (default) |
| natural | Use existing ID field from data |
| hash | Deterministic hash of entire record |
| ["field1", ...] | Deterministic hash of specified fields |
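The difference between random and deterministic IDs can be sketched in a few lines. The hash function and serialization Transmog uses are not stated here, so SHA-256 over canonical JSON is an assumption, and `random_id` and `hash_id` are hypothetical helpers.

```python
import hashlib
import json
import uuid


def random_id():
    """A new UUID on every call, so re-running a job yields different IDs."""
    return str(uuid.uuid4())


def hash_id(record, fields=None):
    """Deterministic ID from the whole record or from selected fields."""
    subset = record if fields is None else {f: record[f] for f in fields}
    # Sorting keys makes the ID independent of key order in the input.
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"sku": "X1", "region": "eu", "price": 10}
b = {"price": 12, "region": "eu", "sku": "X1"}

whole_record_ids_differ = hash_id(a) != hash_id(b)
field_ids_match = hash_id(a, ["sku", "region"]) == hash_id(b, ["sku", "region"])
```

Hashing only `sku` and `region` gives both records the same ID even though `price` changed, which is the usual reason to pick specific fields: the ID stays stable across updates to the other columns.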
Documentation
Full documentation: scottdraper8.github.io/transmog
License
MIT License - see LICENSE file for details.
File details
Details for the file transmog-2.0.1.tar.gz.
File metadata
- Download URL: transmog-2.0.1.tar.gz
- Upload date:
- Size: 23.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d068d2a96ee07b51cde91b2de19c1f45bc0ae7514b0f3e8ff643b348e33449ee |
| MD5 | 620f11f2e0b0c89c8cca609e815b7f7b |
| BLAKE2b-256 | be6e4164852527650ff4c68871a80a635cc5f989e3f42a705c647bb62960b864 |
File details
Details for the file transmog-2.0.1-py3-none-any.whl.
File metadata
- Download URL: transmog-2.0.1-py3-none-any.whl
- Upload date:
- Size: 28.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e2555919717c56d21b7e83fc314bc7fb7bba2277660db58e4d3cc132c657bb9f |
| MD5 | 57fbb5cc42284318010c5d674568ac4f |
| BLAKE2b-256 | d5eb5d281cd594278f515f0a34bdc313cf337c4b00f004f3d86199832cb09fbb |