Stop writing custom parsers for every data format. Flatten anything.

Flatten Anything 🔨

The Problem

Every data pipeline starts the same way: "I have this nested JSON file, and I need to flatten it." Then next week: "Now it's XML." Then: "The client sent Excel files." Before you know it, you have 200 lines of custom parsing code for each format.

The Solution

from flatten_anything import flatten, ingest

# That's it. That's the whole library.
data = ingest('your_nightmare_file.json')
flat = flatten(data)

It just works. No matter what garbage is in your file.

Installation

Basic Installation

# Core installation (JSON, CSV, YAML, XML, API support)
pip install flatten-anything

With Optional Format Support

# Add Parquet support (quote the extra so shells such as zsh don't expand the brackets)
pip install "flatten-anything[parquet]"

# Add Excel support
pip install "flatten-anything[excel]"

# Install everything
pip install "flatten-anything[all]"

What's Included

Format	Core Install	Optional Install
JSON/JSONL	✅ Included	-
CSV/TSV	✅ Included	-
YAML	✅ Included	-
XML	✅ Included	-
API/URLs	✅ Included	-
Parquet	❌	`pip install "flatten-anything[parquet]"`
Excel	❌	`pip install "flatten-anything[excel]"`

The core package stays lightweight (~35 MB); the Parquet and Excel extras can add 100 MB or more, so install them only if you need them.

Quick Start

Flatten nested JSON

from flatten_anything import flatten, ingest

# Load any supported file format
data = ingest('deeply_nested.json')

# Flatten it
flat = flatten(data)

# Example result (actual keys depend on the file's contents):
# {'user.name': 'John', 'user.address.city': 'NYC', 'user.scores.0': 100}

Real-world example

from flatten_anything import flatten

# Your horrible nested JSON
data = {
    "user": {
        "name": "John",
        "contacts": {
            "emails": ["john@example.com", "john@work.com"],
            "phones": {
                "home": "555-1234",
                "work": "555-5678"
            }
        }
    },
    "metrics": [1, 2, 3]
}

flat = flatten(data)
# {
#     'user.name': 'John',
#     'user.contacts.emails.0': 'john@example.com',
#     'user.contacts.emails.1': 'john@work.com',
#     'user.contacts.phones.home': '555-1234',
#     'user.contacts.phones.work': '555-5678',
#     'metrics.0': 1,
#     'metrics.1': 2,
#     'metrics.2': 3
# }
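
The key scheme in the output above (dots between dict keys, integer positions for list items) can be reproduced in a few lines. This is a minimal sketch for illustration only, not the library's actual implementation; the name flatten_sketch and its prefix and sep parameters are hypothetical. It keeps None values and empty lists as leaf values, matching the behaviour described in the FAQ.

```python
from typing import Any


def flatten_sketch(obj: Any, prefix: str = "", sep: str = ".") -> dict:
    """Illustrative dot-path flattener; NOT flatten_anything's real code."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)  # list positions become path segments
    else:
        return {prefix: obj}  # a bare scalar is returned under its prefix

    out = {}
    for key, value in items:
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, (dict, list)) and value:
            # Non-empty containers are expanded recursively
            out.update(flatten_sketch(value, path, sep))
        else:
            # Scalars, None, and empty containers are kept as leaf values
            out[path] = value
    return out


print(flatten_sketch({"a": {"b": None}, "items": []}))
# {'a.b': None, 'items': []}
```

Keeping empty containers as leaves is a design choice in this sketch: dropping them would lose the fact that the key existed at all.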

Works with any format

# JSON
data = ingest('data.json')

# CSV
data = ingest('data.csv')

# Parquet
data = ingest('data.parquet')

# Excel
data = ingest('data.xlsx')

# XML
data = ingest('data.xml')

# YAML
data = ingest('config.yaml')

# All flatten the same way
flat = flatten(data)

Supported Formats

Format	Extensions	Status
JSON	`.json`	✅ Fully supported
JSONL	`.jsonl`	✅ Fully supported
CSV	`.csv`, `.tsv`	✅ Fully supported
Parquet	`.parquet`, `.parq`	✅ Fully supported
Excel	`.xlsx`, `.xls`	✅ Fully supported
XML	`.xml`	✅ Fully supported
YAML	`.yaml`, `.yml`	✅ Fully supported
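
If you want to check a file before handing it to ingest, the table above translates directly into a lookup. This is a sketch under assumptions: EXTENSION_FORMATS, OPTIONAL_EXTRAS, and detect_format are hypothetical helper names, not part of the flatten_anything API, and the library may detect formats differently.

```python
from pathlib import Path

# Extension-to-format mapping copied from the Supported Formats table
EXTENSION_FORMATS = {
    ".json": "JSON",
    ".jsonl": "JSONL",
    ".csv": "CSV",
    ".tsv": "CSV",
    ".parquet": "Parquet",
    ".parq": "Parquet",
    ".xlsx": "Excel",
    ".xls": "Excel",
    ".xml": "XML",
    ".yaml": "YAML",
    ".yml": "YAML",
}

# Formats that need an optional extra, per the What's Included table
OPTIONAL_EXTRAS = {"Parquet": "parquet", "Excel": "excel"}


def detect_format(path):
    """Return the format name for a path, or None if the extension is unknown."""
    return EXTENSION_FORMATS.get(Path(path).suffix.lower())


fmt = detect_format("exports/Report.XLSX")
print(fmt, OPTIONAL_EXTRAS.get(fmt))
# Excel excel
```

A None result is a cheap signal to skip or log a file instead of letting a pipeline fail midway.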

Why Flatten Anything?

Zero configuration - No schemas, no options, it just works
Production ready - Handles nulls, mixed types, and empty arrays without crashing
Actually tested - Run against real, messy production data, not toy examples
Minimal dependencies - Just the essentials (pandas, pyyaml, etc.)
One job - Flatten data. That's it. No bloat.

Advanced Usage

Control the output structure

# Have multiple records? Each gets flattened
data = ingest('multiple_records.json')  # List of records
flattened_records = [flatten(record) for record in data]
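
Records flattened one by one rarely end up with identical keys. A minimal sketch of lining them up into columns with only the standard library follows; the records are hypothetical literals standing in for flatten() output, and records_to_csv is an illustrative helper, not something the library provides.

```python
import csv
import io

# Hypothetical flattened records with differing keys
records = [
    {"user.name": "John", "user.address.city": "NYC"},
    {"user.name": "Ana", "user.scores.0": 100},
]


def records_to_csv(records, missing=""):
    """Write flat dicts as CSV, using the union of keys as the header."""
    # dict.fromkeys de-duplicates while keeping first-seen order
    columns = list(dict.fromkeys(key for rec in records for key in rec))
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=columns, restval=missing, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)  # keys absent from a record get `missing`
    return buf.getvalue()


print(records_to_csv(records))
# user.name,user.address.city,user.scores.0
# John,NYC,
# Ana,,100
```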

Integrate with pandas

import pandas as pd

from flatten_anything import flatten, ingest

# Flatten and convert to DataFrame
data = ingest('nested_data.json')
flat = flatten(data)
df = pd.DataFrame([flat])

Pipeline ready

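from pathlib import Path

from flatten_anything import flatten, ingest

# Chain with your existing workflow
for filename in Path('data/').glob('*.json'):
    data = ingest(filename)
    flat = flatten(data)
    # Your analysis here (process_data is a placeholder for your own function)
    process_data(flat)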

Use Cases

Data Engineering: Normalize data lakes with mixed formats
ETL Pipelines: Consistent structure regardless of source format
Data Analysis: Flatten nested JSON APIs into DataFrames
Log Processing: Convert nested log formats to flat structures
Config Management: Flatten complex YAML/JSON configs for validation

FAQ

Q: What happens with null values?
A: They're preserved. A JSON null such as {"a": {"b": null}} becomes {'a.b': None} in Python.

Q: What about empty arrays?
A: They're kept. {'items': []} becomes {'items': []}

Q: Can it handle huge files?
A: Not efficiently yet. The whole file is currently loaded into memory; streaming support is coming in v1.1.

Q: What if my JSON has inconsistent structure?
A: It still works. Missing keys are simply not included in the output.
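
For large line-delimited JSON, one possible workaround until streaming lands is to read the file yourself and flatten one record at a time. This is a sketch under assumptions, not a library feature: iter_flat_jsonl is a hypothetical helper, and flatten_fn is whatever flattening callable you pass in (for example, flatten from this package).

```python
import json
from typing import Callable, Iterable, Iterator


def iter_flat_jsonl(
    lines: Iterable[str], flatten_fn: Callable[[dict], dict]
) -> Iterator[dict]:
    """Yield one flattened record per non-blank JSONL line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield flatten_fn(json.loads(line))
```

Because an open file object is itself an iterable of lines, passing it as lines keeps only one record in memory at a time, e.g. iterating over iter_flat_jsonl(f, flatten) inside a with open('big.jsonl') as f: block.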

Contributing

Found a bug, or a file that doesn't flatten? Open an issue and attach a sample file.

PRs welcome, especially for:

More file formats
Performance improvements
Edge case handling

License

MIT - Use it however you want.

Roadmap

✅ v1.0 - Core flattening for common formats
🔄 v1.1 - Streaming support for large files
📋 v1.2 - API endpoint support with pagination
🔮 v1.3 - HDF5 and scientific formats

Built with frustration at writing the same parsing code for the 100th time.

Release history

Version	Released
1.1.1	Sep 24, 2025
1.1.0	Sep 23, 2025
1.0.1 (this version)	Sep 15, 2025
1.0.0	Sep 15, 2025

Download files

Distribution	File	Size	Uploaded
Source	flatten_anything-1.0.1.tar.gz	15.7 kB	Sep 15, 2025
Built (wheel, Python 3)	flatten_anything-1.0.1-py3-none-any.whl	14.9 kB	Sep 15, 2025

File details

Details for the file flatten_anything-1.0.1.tar.gz.

File metadata

Download URL: flatten_anything-1.0.1.tar.gz
Upload date: Sep 15, 2025
Size: 15.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for flatten_anything-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`816514f0f3c39e69c00bd73dc84d64a928c4946a10b711dacee3d6cc51d4dae5`
MD5	`c93dd8145703bbf0ee375e995ccb6e1b`
BLAKE2b-256	`7174fcc345c39abee40efc51be6c97016fedc43acfce3e2a04729927a02a8608`

File details

Details for the file flatten_anything-1.0.1-py3-none-any.whl.

File metadata

Download URL: flatten_anything-1.0.1-py3-none-any.whl
Upload date: Sep 15, 2025
Size: 14.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for flatten_anything-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5ae5ff9ba5edca7ac40cbaccb706a95c7ef135a9c3d374a431012120abe98431`
MD5	`5118b3f106baf6872567af2b25d4d912`
BLAKE2b-256	`b4fb825cf61d665205eaa74064222869a06653ae4ebe2e724f941858f37077ff`
