Transmog
A Python library for transforming complex nested data structures into flat, tabular formats while preserving hierarchical relationships.
Features
- Multiple Input Formats: JSON, JSONL, CSV
- Nested Structure Handling: Flattens deeply nested objects with customizable separators
- Array Processing: Extracts arrays as child tables with parent-child relationships maintained
- Output Options: Python dictionaries, PyArrow tables, JSON, CSV, Parquet
- Performance Features: Chunked processing, streaming output, memory optimization
- Data Integrity: Deterministic ID generation, consistent parent-child linking
- Error Recovery: Configurable strategies for handling malformed data
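To make the "customizable separators" idea concrete, here is a minimal pure-Python sketch of how nested objects can be flattened into one level. It does not use Transmog, and the `flatten` helper is a hypothetical name introduced only for illustration; the library's actual implementation and column naming may differ.

```python
from typing import Any


def flatten(record: dict[str, Any], separator: str = "_", prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into a single level, joining keys with `separator`."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{separator}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the joined path as the prefix.
            flat.update(flatten(value, separator, name))
        elif isinstance(value, list):
            # Arrays are skipped in this sketch; a real tool would emit child tables.
            continue
        else:
            flat[name] = value
    return flat


nested = {"user": {"id": 1, "contact": {"email": "john@example.com"}, "tags": ["a", "b"]}}
print(flatten(nested))                 # keys joined with "_"
print(flatten(nested, separator="."))  # keys joined with "."
```

Changing the separator only changes how the key path is joined, so the same record yields `user_contact_email` or `user.contact.email` depending on the setting.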
Installation
pip install transmog
Optional dependencies:
pip install "transmog[dev]" # Development tools (quotes keep shells such as zsh from expanding the brackets)
Quick Example
import transmog as tm
# Sample nested data
data = {
    "user": {
        "id": 1,
        "name": "John Doe",
        "contact": {
            "email": "john@example.com"
        },
        "orders": [
            {"id": 101, "amount": 99.99},
            {"id": 102, "amount": 45.50}
        ]
    }
}
# Process the data
processor = tm.Processor()
result = processor.process(data)
# Access the data
tables = result.to_dict()
main_table = tables["main"]
orders = tables["user_orders"]
# Export to different formats
result.write_all_json("output/json")
result.write_all_csv("output/csv")
result.write_all_parquet("output/parquet")
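The quick example yields a `main` table and a `user_orders` child table. The following pure-Python sketch illustrates the general technique of splitting arrays into child tables linked back to their parent row. It is not Transmog's code: `split_tables`, the `_id` and `_parent_id` column names, and the sequential IDs are all assumptions made for illustration, and the library's real metadata fields may be named differently.

```python
from itertools import count
from typing import Any, Optional


def split_tables(record: dict[str, Any], main: str = "main", sep: str = "_") -> dict[str, list[dict[str, Any]]]:
    """Flatten `record` into a main table plus one child table per nested array."""
    tables: dict[str, list[dict[str, Any]]] = {}
    ids = count(1)  # sequential IDs, for illustration only

    def add_row(table: str, node: Any, parent_id: Optional[int]) -> None:
        row: dict[str, Any] = {"_id": next(ids)}
        if parent_id is not None:
            row["_parent_id"] = parent_id  # link the child row to its parent
        tables.setdefault(table, []).append(row)
        # Scalar array items are wrapped so they still produce a column.
        fill(row, node if isinstance(node, dict) else {"value": node}, "")

    def fill(row: dict[str, Any], node: dict[str, Any], prefix: str) -> None:
        for key, value in node.items():
            name = f"{prefix}{sep}{key}" if prefix else key
            if isinstance(value, dict):
                fill(row, value, name)  # nested object: keep flattening into this row
            elif isinstance(value, list):
                for item in value:  # array: each item becomes a row in a child table
                    add_row(name, item, row["_id"])
            else:
                row[name] = value

    add_row(main, record, None)
    return tables


data = {
    "user": {
        "id": 1,
        "name": "John Doe",
        "contact": {"email": "john@example.com"},
        "orders": [{"id": 101, "amount": 99.99}, {"id": 102, "amount": 45.50}],
    }
}
tables = split_tables(data)
print(tables["main"])
print(tables["user_orders"])
```

In this sketch the child table is named after the flattened path to the array (`user_orders`), and each order row carries the `_id` of the main row it came from, which is what allows the tables to be joined back together later.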
Configuration
# Use pre-configured modes
config = tm.TransmogConfig.memory_optimized()
# or
config = tm.TransmogConfig.performance_optimized()
# Custom configuration
config = (
    tm.TransmogConfig.default()
    .with_naming(separator=".")
    .with_processing(cast_to_string=True)
    .with_metadata(id_field="custom_id")
    .with_error_handling(max_retries=3)
)
processor = tm.Processor(config=config)
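The metadata settings relate to the deterministic ID generation listed under Features. As a rough illustration of what "deterministic" can mean, the sketch below derives a stable UUID from chosen fields of a record using the standard library's `uuid.uuid5`. The `deterministic_id` helper, the namespace value, and the choice of key fields are assumptions for this example, not Transmog's documented behavior.

```python
import json
import uuid

# Hypothetical namespace for this sketch; any fixed UUID would work.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")


def deterministic_id(record: dict, key_fields: list[str]) -> str:
    """Derive a stable UUID from selected fields of a record."""
    # Serialize with sorted keys so the order of fields in the input does not matter.
    key = json.dumps({field: record.get(field) for field in key_fields}, sort_keys=True)
    return str(uuid.uuid5(NAMESPACE, key))


first = deterministic_id({"id": 101, "amount": 99.99}, ["id"])
second = deterministic_id({"amount": 99.99, "id": 101}, ["id"])
print(first == second)  # the same key fields always give the same ID
```

Because the ID depends only on the record's content, reprocessing the same input produces the same identifiers, which keeps parent-child links consistent across runs.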
Large Dataset Processing
# Memory-optimized processing
processor = tm.Processor.memory_optimized()
# Chunked processing
result = processor.process_chunked(
    "large_data.jsonl",
    entity_name="records",
    chunk_size=1000
)
# Streaming output
processor.stream_process_file(
    "large_data.jsonl",
    entity_name="records",
    output_format="parquet",
    output_destination="output_dir"
)
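For intuition about why chunking bounds memory use, here is a small standard-library sketch that reads JSONL records in fixed-size batches. It is a generic illustration of the technique, not Transmog's internals; `iter_chunks` is a hypothetical helper, and an in-memory `io.StringIO` stands in for a real file.

```python
import io
import json
from itertools import islice
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str], chunk_size: int = 1000) -> Iterator[list[dict]]:
    """Yield lists of at most `chunk_size` parsed JSONL records."""
    # A generator parses lazily, so only one chunk is held in memory at a time.
    records = (json.loads(line) for line in lines if line.strip())
    while True:
        chunk = list(islice(records, chunk_size))
        if not chunk:
            return
        yield chunk


# Five records, read two at a time.
sample = io.StringIO("\n".join(json.dumps({"id": i}) for i in range(5)))
sizes = [len(chunk) for chunk in iter_chunks(sample, chunk_size=2)]
print(sizes)  # [2, 2, 1]
```

A smaller `chunk_size` lowers peak memory at the cost of more per-chunk overhead, so the right value depends on record size and available memory.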
Error Handling
# Skip and log errors
processor = tm.Processor().with_error_handling(recovery_strategy="skip")
# Partial recovery (preserves valid portions)
processor = tm.Processor.with_partial_recovery()
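A "skip" strategy generally means that a malformed record is logged and dropped rather than aborting the whole run. The sketch below shows that idea for JSONL input using only the standard library. The `parse_with_skip` helper and the logger name are hypothetical, and the sketch makes no claim about how Transmog implements its recovery strategies.

```python
import json
import logging

logger = logging.getLogger("skip_sketch")


def parse_with_skip(lines):
    """Parse JSONL lines, logging and skipping any that are malformed."""
    records, skipped = [], 0
    for line_number, line in enumerate(lines, start=1):
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # Record the problem and move on instead of raising.
            skipped += 1
            logger.warning("Skipping line %d: %s", line_number, exc)
    return records, skipped


good, skipped = parse_with_skip(['{"id": 1}', '{"id": 2', '{"id": 3}'])
print(good, skipped)
```

Returning the skipped count alongside the parsed records lets the caller decide whether the loss is acceptable or whether the input needs to be repaired first.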
Documentation
License
MIT License
Download files
File details
Details for the file transmog-1.0.6.tar.gz.
File metadata
- Download URL: transmog-1.0.6.tar.gz
- Upload date:
- Size: 88.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ffde0788e96553b2d2d215a35365119e6a2c909ca2841ac2cf2544b4d603a2a1 |
| MD5 | d579191382734c656284c379f663e164 |
| BLAKE2b-256 | 02451bda770165b91e583bad1113c59668c588659137c20868246ebf03e97378 |
File details
Details for the file transmog-1.0.6-py3-none-any.whl.
File metadata
- Download URL: transmog-1.0.6-py3-none-any.whl
- Upload date:
- Size: 112.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c1ee0abd99d4e2bf65351836ab949e89b5fd7b315e9b6d5b48c5814a6962417e |
| MD5 | 0d08dccdad721be62a19a3a894ed3bc4 |
| BLAKE2b-256 | 52e269f361299e0d842a69d52ad09b767e709e5ea458dada38d405188590ccfc |