Flatten Anything 🔨

Stop writing custom parsers for every data format. Flatten anything.

The Problem

Every data pipeline starts the same way: "I have this nested JSON file, and I need to flatten it." Then next week: "Now it's XML." Then: "The client sent Excel files." Before you know it, you have 200 lines of custom parsing code for each format.

The Solution

from flatten_anything import flatten, ingest

# That's it. That's the whole library.
data = ingest('your_nightmare_file.json')
flat = flatten(data)

It works the same way for every supported format, no matter how deeply nested the data is.

What's New in v1.1

🚀 Streaming Support

Process files larger than memory without breaking a sweat:

# Stream a 10 GB CSV file
for chunk in ingest('huge_file.csv', stream=True):
    flat = flatten(chunk)
    # Process each chunk without loading the entire file

🎯 Smarter Flattening

The new records parameter controls how a list of records is flattened:

# Automatically flattens each record separately (new default!)
data = ingest('users.csv')
flat = flatten(data)  # Returns list of flattened records

# Or treat as single structure when needed
flat = flatten(data, records=False)  # Flattens entire structure

Installation

Basic Installation

# Core installation (JSON, CSV, YAML, XML, API support)
pip install flatten-anything

With Optional Format Support

# Quote the extras so shells such as zsh do not expand the brackets

# Add Parquet support
pip install "flatten-anything[parquet]"

# Add Excel support
pip install "flatten-anything[excel]"

# Install everything
pip install "flatten-anything[all]"

Format Support Matrix

Format	Core Install	Optional Install	Streaming
JSON/JSONL	✅ Included	-	✅ JSONL only
CSV/TSV	✅ Included	-	✅ Yes
YAML	✅ Included	-	❌ No
XML	✅ Included	-	❌ No
API/URLs	✅ Included	-	❌ No
Parquet	❌	`pip install "flatten-anything[parquet]"`	✅ Yes
Excel	❌	`pip install "flatten-anything[excel]"`	❌ No

Quick Start

Basic Usage

from flatten_anything import flatten, ingest

# Load any supported file format
data = ingest('data.json')

# Flatten it (automatically handles single vs multiple records)
flat = flatten(data)

Streaming Large Files

# Process huge files in chunks
for chunk in ingest('massive.csv', stream=True, chunk_size=10000):
    flat_records = flatten(chunk)
    # Process each chunk (e.g., write to a database); process_records is a placeholder for your own function
    process_records(flat_records)
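
For intuition about what chunked streaming involves, here is a minimal standard-library sketch that reads a CSV file in fixed-size chunks. It is not the library's implementation: `iter_csv_chunks` is a hypothetical helper name, and its default of 5000 only mirrors the `chunk_size` default listed in the API Reference.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterator, List, TextIO


def iter_csv_chunks(handle: TextIO, chunk_size: int = 5000) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of up to chunk_size rows from an open CSV file."""
    reader = csv.DictReader(handle)
    while True:
        # islice pulls at most chunk_size rows, so only one chunk is in memory.
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# Demo on an in-memory file so the example runs anywhere.
sample = io.StringIO("name,age\nAlice,30\nBob,25\nCara,41\n")
chunk_lengths = [len(chunk) for chunk in iter_csv_chunks(sample, chunk_size=2)]
print(chunk_lengths)  # [2, 1]
```

Because the generator only advances when the caller asks for the next chunk, memory use is bounded by the chunk size rather than the file size.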

Real-world Example

# Your horrible nested JSON
data = {
    "user": {
        "name": "John",
        "contacts": {
            "emails": ["john@example.com", "john@work.com"],
            "phones": {
                "home": "555-1234",
                "work": "555-5678"
            }
        }
    },
    "metrics": [1, 2, 3]
}

flat = flatten(data)
# {
#     'user.name': 'John',
#     'user.contacts.emails.0': 'john@example.com',
#     'user.contacts.emails.1': 'john@work.com',
#     'user.contacts.phones.home': '555-1234',
#     'user.contacts.phones.work': '555-5678',
#     'metrics.0': 1,
#     'metrics.1': 2,
#     'metrics.2': 3
# }
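
The dot-separated keys above can be produced by a short recursive walk. The following is a minimal sketch of that idea, not the library's actual code: `flatten_sketch` is a hypothetical name, and keeping empty containers as leaf values is an assumption chosen to match the default behavior described under Control Empty Lists.

```python
from typing import Any, Dict


def flatten_sketch(data: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts and lists into one dict with dot-separated keys."""
    if isinstance(data, dict):
        items = data.items()
    elif isinstance(data, list):
        items = enumerate(data)
    else:
        # A bare scalar is stored under whatever key has been built so far.
        return {prefix: data}

    flat: Dict[str, Any] = {}
    for key, value in items:
        full_key = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, (dict, list)) and value:
            # Recurse, passing the accumulated key down as the new prefix.
            flat.update(flatten_sketch(value, full_key))
        else:
            # Scalars and empty containers are kept as leaf values.
            flat[full_key] = value
    return flat


nested = {"user": {"name": "John", "tags": ["a", "b"]}, "metrics": [1, 2]}
result = flatten_sketch(nested)
print(result)
# {'user.name': 'John', 'user.tags.0': 'a', 'user.tags.1': 'b', 'metrics.0': 1, 'metrics.1': 2}
```

List indices become key segments in the same way dict keys do, which is why `metrics` expands to `metrics.0`, `metrics.1`, and so on.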

Multiple Records Handling

# Multiple records, e.g. rows loaded from a CSV file
users = [
    {"name": "Alice", "age": 30, "city": "NYC"},
    {"name": "Bob", "age": 25, "city": "LA"}
]

# Default: flatten each record (records=True)
flat = flatten(users)
# [
#     {"name": "Alice", "age": 30, "city": "NYC"},
#     {"name": "Bob", "age": 25, "city": "LA"}
# ]

# Flatten as single structure (records=False)
flat = flatten(users, records=False)
# {
#     "0.name": "Alice", "0.age": 30, "0.city": "NYC",
#     "1.name": "Bob", "1.age": 25, "1.city": "LA"
# }
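
Since the default mode returns a list of flat dicts, the result maps naturally onto tabular output. The sketch below writes such records to CSV with the standard library, taking the union of keys as columns because flattened records do not always share the same keys. `write_flat_records` and the sample records are hypothetical and only illustrate the shape of the data.

```python
import csv
import io
from typing import Any, Dict, List, TextIO


def write_flat_records(records: List[Dict[str, Any]], handle: TextIO) -> List[str]:
    """Write flat dicts as CSV rows and return the column order used."""
    # dict.fromkeys keeps first-seen order while dropping duplicate keys.
    columns = list(dict.fromkeys(key for record in records for key in record))
    writer = csv.DictWriter(handle, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(records)  # missing keys are filled with restval
    return columns


flat_records = [
    {"name": "Alice", "contact.email": "alice@example.com"},
    {"name": "Bob", "contact.phone": "555-1234"},
]
buffer = io.StringIO()
columns = write_flat_records(flat_records, buffer)
lines = buffer.getvalue().splitlines()
print(lines[0])  # name,contact.email,contact.phone
```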

Advanced Usage

Integrate with pandas

import pandas as pd
from flatten_anything import flatten, ingest

# Method 1: Load entire file
data = ingest('data.csv')
flat = flatten(data)
df = pd.DataFrame(flat)

# Method 2: Stream large files
# Note: every chunk is kept, so final_df must still fit in memory
dfs = []
for chunk in ingest('huge.csv', stream=True, chunk_size=5000):
    flat_chunk = flatten(chunk)
    dfs.append(pd.DataFrame(flat_chunk))
final_df = pd.concat(dfs, ignore_index=True)

Control Empty Lists

data = {"items": [], "count": 0}

# Preserve empty lists (default)
flatten(data, preserve_empty_lists=True)
# {"items": [], "count": 0}

# Remove empty lists
flatten(data, preserve_empty_lists=False)
# {"count": 0}

Memory-Efficient Pipeline

from pathlib import Path

from flatten_anything import flatten, ingest

# Process directory of large files without memory issues
for filepath in Path('data/').glob('*.csv'):
    for chunk in ingest(filepath, stream=True):
        flat = flatten(chunk)
        # Process and immediately discard to save memory
        send_to_database(flat)
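
The `send_to_database` call above is a placeholder for your own sink. One possible shape, sketched here with the standard-library `sqlite3` module, stores each flat record as a JSON-encoded row. The table name, the single `payload` column, and the extra connection argument are all assumptions made for illustration, not part of flatten-anything.

```python
import json
import sqlite3
from typing import Any, Dict, List


def send_to_database(conn: sqlite3.Connection, records: List[Dict[str, Any]]) -> int:
    """Insert each flat record as one JSON-encoded row; return rows written."""
    with conn:  # commits on success, rolls back if an insert fails
        conn.execute("CREATE TABLE IF NOT EXISTS flat_records (payload TEXT NOT NULL)")
        conn.executemany(
            "INSERT INTO flat_records (payload) VALUES (?)",
            [(json.dumps(record),) for record in records],
        )
    return len(records)


# Demo with an in-memory database so nothing is written to disk.
conn = sqlite3.connect(":memory:")
written = send_to_database(conn, [{"user.name": "Alice"}, {"user.name": "Bob"}])
row_count = conn.execute("SELECT COUNT(*) FROM flat_records").fetchone()[0]
print(written, row_count)  # 2 2
```

Storing the record as JSON sidesteps the question of how dotted keys map to column names; a real pipeline might prefer one column per key.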

API Reference

ingest()

ingest(source, format=None, stream=False, chunk_size=5000, **kwargs)

source: File path or URL to ingest
format: Optional format override; auto-detected if not specified
stream: Enable streaming for large files (supported formats only; see the Format Support Matrix)
chunk_size: Records per chunk when streaming (default 5000)
Returns: A list of records, or a generator that yields chunks of records when stream=True
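
The description above does not say how auto-detection works. A common approach is to look at the file extension, and the sketch below shows that approach under stated assumptions: `guess_format`, the extension table, and the format names (including `"api"` for extensionless URLs) are hypothetical and are not taken from the library's source.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse

# Hypothetical extension-to-format table, for illustration only.
EXTENSION_FORMATS = {
    ".json": "json",
    ".jsonl": "jsonl",
    ".csv": "csv",
    ".tsv": "tsv",
    ".yaml": "yaml",
    ".yml": "yaml",
    ".xml": "xml",
    ".parquet": "parquet",
    ".xlsx": "excel",
    ".xls": "excel",
}


def guess_format(source: str, format: Optional[str] = None) -> str:
    """Return the explicit format if given, else infer one from the source."""
    if format is not None:
        return format.lower()  # an explicit override always wins
    parsed = urlparse(source)
    is_url = parsed.scheme in ("http", "https")
    # For URLs, inspect only the path so query strings do not hide the suffix.
    suffix = Path(parsed.path if is_url else source).suffix.lower()
    if suffix in EXTENSION_FORMATS:
        return EXTENSION_FORMATS[suffix]
    if is_url:
        return "api"  # assumed fallback for URLs without a known extension
    raise ValueError(f"Cannot infer format for {source!r}; pass format= explicitly")


print(guess_format("data.JSON"))                           # json
print(guess_format("https://example.com/export.csv?x=1"))  # csv
print(guess_format("https://example.com/v1/users"))        # api
```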

flatten()

flatten(data, prefix="", preserve_empty_lists=True, records=True)

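data: Data structure to flatten
prefix: Key prefix (used internally for recursion)
preserve_empty_lists: Keep empty lists in the output (True, default) or drop them (False)
records: Treat a list as multiple records (True, default) or as a single structure (False)
Returns: A list of flat dicts when data is a list and records=True; otherwise a single flat dict with dot-separated keys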

Release history

1.1.1 (this version): Sep 24, 2025
1.1.0: Sep 23, 2025
1.0.1: Sep 15, 2025
1.0.0: Sep 15, 2025
