nestfind

Powerful deep search for nested dict/list structures in Python.

Traverses arbitrarily nested dict/list data using a flexible path-based syntax, supporting fallback paths, multiple sources, wildcard matching, predicate filtering, and more.

Installation

pip install nestfind

Quick Start

from nestfind import deep_search

data = {
    "user": {
        "profile": {
            "name": "Alice",
            "email": "alice@example.com"
        }
    }
}

deep_search(data, "user", "profile", "name")   # → "Alice"
deep_search(data, "email")                      # → "alice@example.com"  (wide search)

Path Segment Types

Segment	Description	Example
`str`	Wide search key — finds key anywhere in nested structure	`"name"`
`str + "!"`	Condition key — returns the parent dict containing this key	`"uri!"`
`str + "?"`	Optional key — exact match only, skips wide search if not found	`"nickname?"`
`"*"`	Wildcard — matches ALL keys/items at this level	`"*"`
`int`	List index — exact positional access, supports negative	`0`, `-1`
`callable`	Predicate filter — include item only if callable returns truthy	`lambda u: u.get("active")`
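To make the wide-search behaviour of a plain `str` segment concrete, here is a minimal pure-Python sketch of the idea. It is not nestfind's implementation: the helper name `iter_key` and the depth-first visiting order are assumptions made for illustration.

```python
from typing import Any, Iterator


def iter_key(data: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key`, at any depth (hypothetical helper)."""
    if isinstance(data, dict):
        for k, v in data.items():
            if k == key:
                yield v
            # Keep descending so deeper occurrences are also found.
            yield from iter_key(v, key)
    elif isinstance(data, list):
        for item in data:
            yield from iter_key(item, key)


data = {"user": {"profile": {"name": "Alice", "email": "alice@example.com"}}}
first_email = next(iter_key(data, "email"), None)
all_names = list(iter_key(data, "name"))
```

Taking only the first yielded value corresponds loosely to a "first match" search, while exhausting the generator corresponds to collecting all matches.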

Modes

Single path

deep_search(data, "a", "b", "c")

Fallback mode — tries paths in order, returns first non-empty result

deep_search(data, ["uri"], ["browser_native_hd_url"])

Multi-source mode — each list is `[source, *keys]`

deep_search([source1, "key1"], [source2, "key2", "key3"])

Parameters

Parameter	Type	Default	Description
`return_first`	`bool`	`True`	Return only the first match (`True`) or a list of all matches (`False`)
`default`	`Any`	`None`	Value to return if nothing found
`type_filter`	`type` or `tuple`	`None`	Only return results of this type
`value_filter`	`callable`	`None`	Only return results where `value_filter(v)` is truthy
`transform`	`callable`	`None`	Apply function to each result before returning
`max_depth`	`int`	`None`	Maximum nesting depth for wide search
`exclude_keys`	`list[str]`	`None`	Skip these keys during wide search
`strict`	`bool`	`False`	Disable wide search — exact path traversal only
`with_path`	`bool`	`False`	Return `(value, path)` tuples instead of bare values
`debug`	`bool`	`False`	Enable debug logging
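One way to picture how `type_filter`, `value_filter`, `transform`, `return_first`, and `default` interact is as a post-processing pipeline over raw matches. The order used below (type filter, then value filter, then transform) is an assumption for illustration, since the table does not specify it, and `post_process` is a hypothetical helper rather than part of nestfind.

```python
from typing import Any, Callable, Iterable, Optional, Tuple, Union


def post_process(
    matches: Iterable[Any],
    *,
    return_first: bool = True,
    default: Any = None,
    type_filter: Optional[Union[type, Tuple[type, ...]]] = None,
    value_filter: Optional[Callable[[Any], bool]] = None,
    transform: Optional[Callable[[Any], Any]] = None,
) -> Any:
    """Filter and reshape raw matches (hypothetical helper, assumed ordering)."""
    results = []
    for value in matches:
        if type_filter is not None and not isinstance(value, type_filter):
            continue
        if value_filter is not None and not value_filter(value):
            continue
        results.append(transform(value) if transform is not None else value)
        if return_first:
            break  # one surviving match is enough
    if not results:
        return default
    return results[0] if return_first else results


raw_matches = ["12400", None, "837", 7]
first = post_process(raw_matches, type_filter=str, transform=int)
everything = post_process(raw_matches, return_first=False, type_filter=str, transform=int)
nothing = post_process(raw_matches, type_filter=float, default=-1)
```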

Examples

from nestfind import deep_search, DeepSearch

data = {
    "users": [
        {"id": 1, "name": "Alice", "active": True},
        {"id": 2, "name": "Bob",   "active": False},
    ]
}

# Get all names using wildcard
deep_search(data, "users", "*", "name", return_first=False)
# → ["Alice", "Bob"]

# Filter with predicate
deep_search(data, "users", lambda u: u.get("active"), "name")
# → "Alice"

# Return with path
deep_search(data, "name", with_path=True)
# → ("Alice", ["users", 0, "name"])

# Type filter
deep_search(data, "id", type_filter=int)
# → 1

# Class wrapper — bind config once, reuse
ds = DeepSearch(exclude_keys=["metadata"], max_depth=5)
ds(data, "users", "*", "name", return_first=False)
# → ["Alice", "Bob"]

`DeepSearch` class

Bind configuration once and reuse across calls:

class FacebookMapper:
    deep_search = DeepSearch(exclude_keys=["metadata"])

    def map(self, raw):
        return self.deep_search(raw, "user", "name")
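The bind-once pattern itself is ordinary Python. As a rough, library-independent sketch, a callable object can store default keyword arguments and merge per-call overrides on top. `BoundCall` and `describe` below are hypothetical names and are not part of nestfind.

```python
from typing import Any, Callable


class BoundCall:
    """Store default keyword arguments for a function (hypothetical sketch)."""

    def __init__(self, func: Callable[..., Any], **defaults: Any) -> None:
        self._func = func
        self._defaults = defaults

    def __call__(self, *args: Any, **overrides: Any) -> Any:
        # Per-call keyword arguments win over the bound defaults.
        return self._func(*args, **{**self._defaults, **overrides})


def describe(value: Any, *, prefix: str = "", upper: bool = False) -> str:
    """Stand-in for a configurable function such as a search call."""
    text = f"{prefix}{value}"
    return text.upper() if upper else text


bound = BoundCall(describe, prefix="user:")
plain = bound("alice")
loud = bound("alice", upper=True)
overridden = bound("alice", prefix="id:")
```

An object like this is not a function and defines no `__get__`, so storing it as a class attribute does not turn it into a bound method: `self` is not passed implicitly when it is called through an instance.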

Advanced Examples

Parsing inconsistent API responses

Real-world APIs often return the same data under different keys depending on the endpoint or version. Use fallback mode to handle all variants transparently:

# Instagram-style response — video URL can live under many keys
media = {
    "video_versions": [
        {"type": 101, "url": "https://cdn.example.com/video_hd.mp4"},
        {"type": 102, "url": "https://cdn.example.com/video_sd.mp4"},
    ]
}

url = deep_search(
    media,
    ["video_versions", 0, "url"],       # preferred: first video version
    ["video_dash_manifest"],             # fallback 1
    ["browser_native_hd_url"],           # fallback 2
    ["browser_native_sd_url"],           # fallback 3
)
# → "https://cdn.example.com/video_hd.mp4"

Multi-source with priority

When you have multiple raw payloads and want the first one that has a given value:

post    = {"media": {"image_versions": {"candidates": [{"url": "https://img.example.com/post.jpg"}]}}}
story   = {}   # empty / missing
reel    = {"image_versions": {"candidates": [{"url": "https://img.example.com/reel.jpg"}]}}

thumbnail = deep_search(
    [story,  "image_versions", "candidates", 0, "url"],
    [post,   "media", "image_versions", "candidates", 0, "url"],
    [reel,   "image_versions", "candidates", 0, "url"],
)
# → "https://img.example.com/post.jpg"  (story was empty, post matched first)
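The `[source, *keys]` convention maps naturally onto Python's starred unpacking. The following sketch shows the priority logic with exact traversal only. `first_from_sources` is a hypothetical helper, and skipping `None` values is an assumption about what counts as "not found".

```python
from typing import Any


def first_from_sources(*specs: list, default: Any = None) -> Any:
    """Each spec is [source, *keys]; return the first value found (hypothetical)."""
    for source, *keys in specs:
        current = source
        try:
            for key in keys:
                current = current[key]
        except (KeyError, IndexError, TypeError):
            continue  # this source does not have the path; try the next one
        if current is not None:
            return current
    return default


post = {"media": {"image_versions": {"candidates": [{"url": "https://img.example.com/post.jpg"}]}}}
story = {}
reel = {"image_versions": {"candidates": [{"url": "https://img.example.com/reel.jpg"}]}}

thumbnail = first_from_sources(
    [story, "image_versions", "candidates", 0, "url"],
    [post, "media", "image_versions", "candidates", 0, "url"],
    [reel, "image_versions", "candidates", 0, "url"],
)
```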

Wildcard + predicate chaining

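Collect the `video_url` of every video item in a feed that has more than 1M views. The predicate is placed directly after `"items"` and is evaluated against each element of that list: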

feed = {
    "items": [
        {"media_type": 2, "view_count": 1_500_000, "video_url": "https://cdn.example.com/a.mp4"},
        {"media_type": 1, "view_count": 3_000_000, "image_url": "https://cdn.example.com/b.jpg"},
        {"media_type": 2, "view_count": 800_000,   "video_url": "https://cdn.example.com/c.mp4"},
        {"media_type": 2, "view_count": 2_200_000, "video_url": "https://cdn.example.com/d.mp4"},
    ]
}

viral_videos = deep_search(
    feed,
    "items",
    lambda item: item.get("media_type") == 2 and item.get("view_count", 0) > 1_000_000,
    "video_url",
    return_first=False,
)
# → ["https://cdn.example.com/a.mp4", "https://cdn.example.com/d.mp4"]
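For comparison, the same selection can be written as a plain list comprehension. This only illustrates what the predicate step selects; it says nothing about how nestfind evaluates it internally. The name `is_viral_video` is introduced here for readability.

```python
feed = {
    "items": [
        {"media_type": 2, "view_count": 1_500_000, "video_url": "https://cdn.example.com/a.mp4"},
        {"media_type": 1, "view_count": 3_000_000, "image_url": "https://cdn.example.com/b.jpg"},
        {"media_type": 2, "view_count": 800_000, "video_url": "https://cdn.example.com/c.mp4"},
        {"media_type": 2, "view_count": 2_200_000, "video_url": "https://cdn.example.com/d.mp4"},
    ]
}


def is_viral_video(item: dict) -> bool:
    """Predicate: a video (media_type 2) with more than one million views."""
    return item.get("media_type") == 2 and item.get("view_count", 0) > 1_000_000


viral_videos = [
    item["video_url"]
    for item in feed.get("items", [])
    if is_viral_video(item) and "video_url" in item
]
```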

Condition key `"!"` — grab the parent dict

Useful when you need the whole object that contains a specific key, not just the value at that key:

story = {
    "reel": {
        "items": [
            {
                "id": "abc123",
                "media": {
                    "uri": "https://cdn.example.com/story.mp4",
                    "width": 1080,
                    "height": 1920,
                }
            }
        ]
    }
}

# Get the entire media dict that contains "uri", not just the uri value
media_obj = deep_search(story, "media", "uri!")
# → {"uri": "https://cdn.example.com/story.mp4", "width": 1080, "height": 1920}

# Now you can access sibling keys directly
print(media_obj["width"], media_obj["height"])   # 1080 1920
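The idea behind a condition key can be sketched in a few lines of plain Python: search for the first dict that directly contains the key and return that dict instead of the value. `find_parent` is a hypothetical helper, and the depth-first order is an assumption.

```python
from typing import Any, Optional


def find_parent(data: Any, key: str) -> Optional[dict]:
    """Return the first dict that directly contains `key` (hypothetical helper)."""
    if isinstance(data, dict):
        if key in data:
            return data
        children = data.values()
    elif isinstance(data, list):
        children = data
    else:
        return None
    for child in children:
        found = find_parent(child, key)
        if found is not None:
            return found
    return None


story = {
    "reel": {
        "items": [
            {"id": "abc123", "media": {"uri": "https://cdn.example.com/story.mp4", "width": 1080}}
        ]
    }
}
media_obj = find_parent(story, "uri")
```

The returned object is the original nested dict, not a copy, so sibling keys such as `width` are available on it directly.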

Optional key `"?"` — graceful missing fields

Skip a segment silently when it may or may not exist, without falling back to wide search:

user_a = {"profile": {"display_name": "Alice",  "nickname": "ali"}}
user_b = {"profile": {"display_name": "Bob"}}   # no nickname

# "nickname?" won't error or wide-search if missing — just moves on
for user in [user_a, user_b]:
    label = deep_search(
        user,
        "profile", "nickname?",     # use nickname if present …
        default=deep_search(user, "profile", "display_name"),  # … else display_name
    )
    print(label)
# → "ali"
# → "Bob"
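For reference, the equivalent lookup in plain Python is a short chain of `dict.get` calls. `display_label` is a hypothetical helper written only to show the intended result of the loop above.

```python
from typing import Optional


def display_label(user: dict) -> Optional[str]:
    """Prefer profile.nickname, else profile.display_name (plain-Python equivalent)."""
    profile = user.get("profile") or {}
    # `or` also treats an empty nickname as missing, which may or may not be wanted.
    return profile.get("nickname") or profile.get("display_name")


user_a = {"profile": {"display_name": "Alice", "nickname": "ali"}}
user_b = {"profile": {"display_name": "Bob"}}
labels = [display_label(u) for u in (user_a, user_b)]
```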

`with_path` — audit where a value came from

When debugging deeply nested structures, knowing where a value was found is as important as the value itself:

config = {
    "services": {
        "auth": {
            "database": {
                "host": "db-auth.internal",
                "port": 5432,
            }
        },
        "api": {
            "database": {
                "host": "db-api.internal",
                "port": 5432,
            }
        }
    }
}

results = deep_search(config, "host", return_first=False, with_path=True)
# → [
#     ("db-auth.internal", ["services", "auth", "database", "host"]),
#     ("db-api.internal",  ["services", "api",  "database", "host"]),
# ]

for value, path in results:
    print(" → ".join(str(p) for p in path), "=", value)
# services → auth → database → host = db-auth.internal
# services → api  → database → host = db-api.internal
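Path tracking of this kind is usually done by carrying the keys visited so far down the recursion. The sketch below shows one way to do that. `iter_key_with_path` is a hypothetical helper, not nestfind's code, and recording list positions as integers mirrors the `["users", 0, "name"]` shape shown earlier.

```python
from typing import Any, Iterator, List, Tuple


def iter_key_with_path(
    data: Any, key: str, _prefix: Tuple[Any, ...] = ()
) -> Iterator[Tuple[Any, List[Any]]]:
    """Yield (value, path) for every occurrence of `key` (hypothetical helper)."""
    if isinstance(data, dict):
        for k, v in data.items():
            if k == key:
                yield v, [*_prefix, k]
            yield from iter_key_with_path(v, key, (*_prefix, k))
    elif isinstance(data, list):
        for index, item in enumerate(data):
            # List positions are recorded as integers in the path.
            yield from iter_key_with_path(item, key, (*_prefix, index))


config = {
    "services": {
        "auth": {"database": {"host": "db-auth.internal"}},
        "api": {"database": {"host": "db-api.internal"}},
    }
}
hits = list(iter_key_with_path(config, "host"))
```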

`transform` + `value_filter` — extract and reshape in one pass

raw = {
    "stats": {
        "impressions": "12400",   # string from API
        "clicks":      "837",
        "spend":       "42.50",
    }
}

# Pull all numeric-looking strings and cast to float in one call
values = deep_search(
    raw,
    "stats",
    "*",
    return_first=False,
    value_filter=lambda v: isinstance(v, str) and v.replace(".", "", 1).isdigit(),
    transform=float,
)
# → [12400.0, 837.0, 42.5]
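The digit-based check above rejects negative numbers and exponent notation such as `"-3.5"` or `"1e3"`. If those can occur, one alternative is to let `float` itself decide. `is_float_like` below is a hypothetical helper that could be passed as `value_filter`. Note that it also accepts strings like `"nan"` and `"inf"`, which `float` parses.

```python
def is_float_like(value: object) -> bool:
    """Return True if `value` is a string that float() can parse."""
    if not isinstance(value, str):
        return False
    try:
        float(value)
    except ValueError:
        return False
    return True


stats = {"impressions": "12400", "clicks": "837", "spend": "42.50", "note": "n/a", "delta": "-3.5"}
numbers = [float(v) for v in stats.values() if is_float_like(v)]
```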

`exclude_keys` + `max_depth` — scoped search in large payloads

Prevent the wide search from wandering into noisy or irrelevant subtrees:

response = {
    "data": {
        "user": {"id": 1, "name": "Alice"},
    },
    "metadata": {
        "user": {"id": 999, "name": "__system__"},   # should be ignored
    },
    "debug": {
        "trace": {"user": {"id": -1}}                # deep noise, also ignored
    }
}

name = deep_search(
    response,
    "user", "name",
    exclude_keys=["metadata", "debug"],
    max_depth=3,
)
# → "Alice"  (metadata and debug subtrees are skipped entirely)
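A scoped wide search can be sketched as a recursive walk that prunes excluded keys and stops below a depth limit. This is an illustration only. `scoped_iter_key` is hypothetical, and counting depth in container levels from the root (root = 0) is an assumption that may differ from how nestfind counts `max_depth`.

```python
from typing import Any, Collection, Iterator, Optional


def scoped_iter_key(
    data: Any,
    key: str,
    *,
    exclude_keys: Collection[str] = (),
    max_depth: Optional[int] = None,
    _depth: int = 0,
) -> Iterator[Any]:
    """Wide search that skips excluded keys and stops at max_depth (hypothetical)."""
    if max_depth is not None and _depth > max_depth:
        return
    if isinstance(data, dict):
        for k, v in data.items():
            if k in exclude_keys:
                continue  # prune the whole subtree under this key
            if k == key:
                yield v
            yield from scoped_iter_key(
                v, key, exclude_keys=exclude_keys, max_depth=max_depth, _depth=_depth + 1
            )
    elif isinstance(data, list):
        for item in data:
            yield from scoped_iter_key(
                item, key, exclude_keys=exclude_keys, max_depth=max_depth, _depth=_depth + 1
            )


response = {
    "data": {"user": {"id": 1, "name": "Alice"}},
    "metadata": {"user": {"id": 999, "name": "__system__"}},
    "debug": {"trace": {"user": {"id": -1}}},
}
unscoped = list(scoped_iter_key(response, "name"))
scoped = list(scoped_iter_key(response, "name", exclude_keys={"metadata", "debug"}))
shallow = list(scoped_iter_key(response, "name", max_depth=1))
```

Without scoping, the search also returns the `"__system__"` name from `metadata`. With the exclusions, only `"Alice"` remains. With `max_depth=1`, nothing is found because every `name` key sits two container levels below the root.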

`strict=True` — exact path, no surprises

When you know the exact structure and want to disable wide search for performance or correctness:

data = {
    "a": {
        "b": {
            "c": 42,
            "extra": {"c": 999}   # would be found by wide search
        }
    }
}

deep_search(data, "a", "b", "c")                # → 42  (exact path matches; no wide search needed)
deep_search(data, "a", "b", "c", strict=True)   # → 42  (exact path only)
deep_search(data, "b", "c", strict=True)        # → None (strict: "b" is not a top-level key, and wide search is disabled)

Reusable mapper class with `DeepSearch`

Bind a shared configuration at the class level and override per-call as needed:

from nestfind import DeepSearch

class InstagramMediaMapper:
    ds = DeepSearch(exclude_keys=["debug", "logging"], max_depth=8)

    def map(self, raw: dict) -> dict:
        return {
            "id":        self.ds(raw, "pk"),
            "shortcode": self.ds(raw, "code"),
            "type":      self.ds(raw, "media_type", type_filter=int),
            "url":       self.ds(
                             raw,
                             ["video_versions", 0, "url"],
                             ["image_versions", "candidates", 0, "url"],
                         ),
            "width":     self.ds(raw, "original_width",  type_filter=int),
            "height":    self.ds(raw, "original_height", type_filter=int),
            "owner_id":  self.ds(raw, "owner", "pk"),
            "timestamp": self.ds(raw, "taken_at",        type_filter=int),
        }

License

MIT


Release history

0.1.1 (this version): May 11, 2026
0.1.0: May 11, 2026

