A data transformation library for flattening complex nested structures into tabular formats while preserving hierarchical relationships

Transmog

Transform nested data into flat tables with a simple, intuitive API.

Overview

Transmog transforms nested JSON data into flat, tabular formats while preserving relationships between parent and child records.

Key Features:

Simple one-function API with smart defaults
Multiple output formats (CSV, Parquet)
Automatic relationship preservation
Memory-efficient streaming for large datasets

Installation

Standard install (includes Parquet support):

pip install transmog

Minimal install (CSV only):

pip install "transmog[minimal]"

Quick Start

import transmog as tm

# Transform nested data into flat tables
data = {"product_id": "PROD-123", "name": "Gaming Laptop", "specs": {"cpu": "i7", "ram": "16GB"}}
result = tm.flatten(data, name="products")

# Access flattened data in memory (list of dicts)
print(result.main)
# [{'product_id': 'PROD-123', 'name': 'Gaming Laptop', 'specs_cpu': 'i7', 'specs_ram': '16GB', '_id': '...', '_timestamp': '...'}]

# Save to files in different formats
result.save("products.csv")        # Single CSV file
result.save("products.parquet")    # Single Parquet file
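The `specs_cpu` and `specs_ram` keys in the output above come from joining each nested key path with an underscore. The following is a minimal, stdlib-only sketch of that naming scheme, not Transmog's actual implementation; `flatten_record` and its `separator` parameter are hypothetical names introduced here for illustration.

```python
def flatten_record(record, separator="_", prefix=""):
    """Collapse nested dicts into one level, joining key paths with `separator`."""
    flat = {}
    for key, value in record.items():
        # Build the full path for this key, e.g. "specs" + "_" + "cpu"
        path = f"{prefix}{separator}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys
            flat.update(flatten_record(value, separator, path))
        else:
            flat[path] = value
    return flat


data = {"product_id": "PROD-123", "name": "Gaming Laptop", "specs": {"cpu": "i7", "ram": "16GB"}}
flat = flatten_record(data)
print(flat)
# {'product_id': 'PROD-123', 'name': 'Gaming Laptop', 'specs_cpu': 'i7', 'specs_ram': '16GB'}
```

The sketch omits the `_id` and `_timestamp` metadata fields that the library adds to each record.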

Example: Nested JSON to Multiple Tables

In smart mode (the default), arrays of simple values stay in place as native arrays, while arrays of objects are split out into child tables:

data = {
    "user": {"name": "Alice", "email": "alice@example.com"},
    "tags": ["premium", "verified"],  # Simple array - kept as native array
    "orders": [  # Complex array - exploded to child table
        {"id": 101, "amount": 99.99, "items": ["laptop", "mouse"]},
        {"id": 102, "amount": 45.50, "items": ["keyboard"]}
    ]
}

result = tm.flatten(data, name="customer")

# Main table - flattened user data with native arrays
print(result.main)
# [
#   {
#     'user_name': 'Alice',
#     'user_email': 'alice@example.com',
#     'tags': ['premium', 'verified'],  # Native array!
#     '_id': '...',
#     '_timestamp': '...'
#   }
# ]

# Complex arrays become separate tables with parent references
print(result.tables["customer_orders"])
# [
#   {'id': 101, 'amount': 99.99, 'items': ['laptop', 'mouse'], '_parent_id': '...', '_id': '...', '_timestamp': '...'},
#   {'id': 102, 'amount': 45.50, 'items': ['keyboard'], '_parent_id': '...', '_id': '...', '_timestamp': '...'}
# ]

# Access all tables in memory
print(f"Created {len(result.all_tables)} tables:")
print(list(result.all_tables.keys()))
# ['customer', 'customer_orders']  (the 'items' arrays hold simple values, so they stay inline)

# Save to different formats for analysis
result.save("analytics/", "csv")       # CSV files for database import
result.save("warehouse/", "parquet")   # Parquet files for data warehouse
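To make the parent/child relationship concrete, here is a rough sketch of how smart-mode splitting could work: arrays of objects become rows in a child table that carry the parent's ID, while arrays of scalars stay inline. This is an illustration under stated assumptions, not the library's code; `split_tables` is a hypothetical helper, and the `_id` / `_parent_id` field names simply mirror the sample output above.

```python
import uuid


def split_tables(record, name, parent_id=None, tables=None):
    """Flatten `record` into tables[name]; arrays of dicts go to child tables."""
    if tables is None:
        tables = {}
    row_id = str(uuid.uuid4())
    row = {}
    # Register the row first so parent tables appear before their children
    tables.setdefault(name, []).append(row)

    def walk(obj, prefix):
        for key, value in obj.items():
            path = f"{prefix}_{key}" if prefix else key
            if isinstance(value, dict):
                walk(value, path)  # nested object: flatten into this row
            elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
                # Array of objects: one child row per element, linked by parent ID
                for child in value:
                    split_tables(child, f"{name}_{path}", row_id, tables)
            else:
                row[path] = value  # scalar or array of scalars: keep inline

    walk(record, "")
    row["_id"] = row_id
    if parent_id is not None:
        row["_parent_id"] = parent_id
    return tables


data = {
    "user": {"name": "Alice", "email": "alice@example.com"},
    "tags": ["premium", "verified"],
    "orders": [
        {"id": 101, "amount": 99.99, "items": ["laptop", "mouse"]},
        {"id": 102, "amount": 45.50, "items": ["keyboard"]},
    ],
}
tables = split_tables(data, "customer")
parent = tables["customer"][0]
for order in tables["customer_orders"]:
    # Every child row points back at the single parent row
    assert order["_parent_id"] == parent["_id"]
```

With this sample data the sketch produces two tables, `customer` and `customer_orders`; joining them on `_parent_id = _id` reconstructs which orders belong to which customer.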

Configuration

Customize processing behavior with TransmogConfig:

# Default configuration
result = tm.flatten(data)

# Include nulls for CSV export (consistent columns)
result = tm.flatten(data, config=tm.TransmogConfig(include_nulls=True))

# Memory-efficient processing (smaller batches)
result = tm.flatten(data, config=tm.TransmogConfig(batch_size=100))

# High-performance processing (larger batches)
result = tm.flatten(data, config=tm.TransmogConfig(batch_size=10000))
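The reason `include_nulls=True` matters for CSV is that dropping null fields leaves records with different key sets, and a CSV file needs one fixed header. The sketch below shows the underlying problem and one generic way to pad missing keys using only the standard library; it does not call Transmog, and the sample rows are made up for illustration.

```python
import csv
import io

# Two flattened records; the second lost its "email" key because the value was null
rows = [
    {"id": 1, "name": "Alice", "email": "alice@example.com"},
    {"id": 2, "name": "Bob"},
]

# Union of all keys in first-seen order gives a stable header
fieldnames = list(dict.fromkeys(key for row in rows for key in row))

buffer = io.StringIO()
# restval fills in any key a row is missing, so every line has the same columns
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Keeping nulls at flatten time avoids this extra pass, since every record already carries every column.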

File Processing:

result = tm.flatten("data.json")

Advanced Configuration

For more control over the flattening process:

# Create custom configuration
config = tm.TransmogConfig(
    # Array handling
    array_mode=tm.ArrayMode.SEPARATE,  # Extract all arrays to child tables
    # Options: SMART (default), SEPARATE, INLINE, SKIP

    # ID management
    id_generation="natural",           # Use existing ID field (options: random, natural, hash, or list)
    id_field="sku",                    # Name of ID field to use/create
    parent_field="_parent",            # Customize parent reference field name
    time_field="_timestamp",           # Add processing timestamp to records


    # Data processing
    include_nulls=False,               # Skip null and empty values (default: False)
    max_depth=100,                     # Maximum nesting depth

    # Performance tuning
    batch_size=5000,                   # Process more records per batch
)

result = tm.flatten(data, name="products", config=config)

# ID generation options
config = tm.TransmogConfig(id_generation="random")              # Always generate new UUIDs (default)
config = tm.TransmogConfig(id_generation="natural")             # Use existing ID field (fail if missing)
config = tm.TransmogConfig(id_generation="hash")                # Hash entire record (deterministic)
config = tm.TransmogConfig(id_generation=["user_id", "date"])   # Composite key (deterministic)

# Customize configuration as needed
config = tm.TransmogConfig(include_nulls=True)  # For consistent CSV columns
config.id_field = "product_id"
result = tm.flatten(data, config=config)
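The difference between the random, hash, and composite-key strategies is whether re-processing the same record yields the same ID. The sketch below illustrates that distinction with standard-library primitives. It is an assumption about how such IDs can be derived, not a description of Transmog's internals; `random_id`, `hash_id`, and `composite_id` are hypothetical names.

```python
import hashlib
import json
import uuid


def random_id():
    """New UUID on every call: not reproducible across runs."""
    return str(uuid.uuid4())


def hash_id(record):
    """Deterministic ID from the whole record, independent of key order."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def composite_id(record, fields):
    """Deterministic ID from selected fields only; other fields may change freely."""
    missing = [field for field in fields if field not in record]
    if missing:
        raise KeyError(f"missing ID fields: {missing}")
    key = "|".join(str(record[field]) for field in fields)
    return str(uuid.uuid5(uuid.NAMESPACE_OID, key))


record = {"user_id": 7, "date": "2025-11-12", "amount": 10}
same_record = {"amount": 10, "date": "2025-11-12", "user_id": 7}
updated = {"user_id": 7, "date": "2025-11-12", "amount": 25}

assert random_id() != random_id()
assert hash_id(record) == hash_id(same_record)   # key order does not matter
assert hash_id(record) != hash_id(updated)       # any field change alters the hash
assert composite_id(record, ["user_id", "date"]) == composite_id(updated, ["user_id", "date"])
```

Deterministic IDs make repeated loads idempotent, which is useful when the flattened tables are upserted into a database; a composite key additionally stays stable when non-key fields are updated.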

Documentation

Complete documentation is available at scottdraper8.github.io/transmog.

Contributing

For contribution guidelines, development setup, and coding standards, see the Contributing Guide in the documentation.

License

MIT License

Release history

2.0.4: Mar 5, 2026
2.0.3: Feb 14, 2026
2.0.2: Feb 2, 2026
2.0.1: Nov 19, 2025
2.0.0 (this version): Nov 12, 2025
1.2.0 (yanked; reason: misprint of version): Apr 25, 2025
1.1.1: Nov 6, 2025
1.1.0: Jul 1, 2025
1.0.6: Jun 3, 2025
1.0.5: Jun 2, 2025
1.0.4: May 27, 2025
1.0.3: May 23, 2025
1.0.2: May 22, 2025
1.0.1: May 19, 2025
1.0.0: May 16, 2025
0.1.2.5: Apr 25, 2025
0.1.2: Apr 25, 2025
0.1.1: Apr 25, 2025
0.1.0: Apr 24, 2025

Download files

Source Distribution

transmog-2.0.0.tar.gz (23.5 kB)

Uploaded Nov 12, 2025 Source

Built Distribution

transmog-2.0.0-py3-none-any.whl (28.2 kB)

Uploaded Nov 12, 2025 Python 3

File details

Details for the file transmog-2.0.0.tar.gz.

File metadata

Download URL: transmog-2.0.0.tar.gz
Upload date: Nov 12, 2025
Size: 23.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for transmog-2.0.0.tar.gz
Algorithm	Hash digest
SHA256	`30619e1fa5c997818061eccb0d31d2889e70162e03517f149392f99779992000`
MD5	`e08725f02cfeea040e592598923b1ce5`
BLAKE2b-256	`dddc296c3cd6cd1b4afd880c66dd5384684c8b7ad84bd6cabc1d20307f183874`
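A downloaded archive can be checked against the SHA256 digest listed above before installing it. The following is a generic standard-library sketch; `sha256_of` and `verify` are hypothetical helper names, and the file path passed in is assumed to point at a local copy of the source distribution.

```python
import hashlib

# SHA256 digest published above for transmog-2.0.0.tar.gz
EXPECTED_SHA256 = "30619e1fa5c997818061eccb0d31d2889e70162e03517f149392f99779992000"


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

Calling `verify("transmog-2.0.0.tar.gz")` in the download directory should return `True` for an intact file. pip can enforce the same check through its hash-checking mode when a requirements file pins `--hash=sha256:...` values.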

File details

Details for the file transmog-2.0.0-py3-none-any.whl.

File metadata

Download URL: transmog-2.0.0-py3-none-any.whl
Upload date: Nov 12, 2025
Size: 28.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.19

File hashes

Hashes for transmog-2.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ed02fee87e54f500a4bc0eff91905987918fb44bf1b2ac61d83954bd6b099db9`
MD5	`c6b51948bbb802f059e30bf444af5d89`
BLAKE2b-256	`a71bf52bbec29213860a559fde21632d429e3c6ce67d2949527be255a4d5b459`
