Find files by describing them in natural language, with the generated filter sandboxed in Docker.

nfind

Find files by describing them in natural language.

The name is short for natural-find: find, but driven by a natural-language description instead of a filter expression. nfind takes a plain-English description, asks an LLM to write a small filter function for it, in Python (filter_paths) or Node.js (filterPaths), and runs that function against your file tree to print the matching paths.

The generated code is never executed on your machine directly. It runs inside a disposable, hardened Docker container with the search directory bind-mounted read-only, networking disabled, all Linux capabilities dropped, and CPU, memory, and process limits applied. The container can therefore inspect file contents and metadata to answer richer questions, but it cannot modify your files or reach the network.
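As a rough illustration of that isolation, the sketch below assembles a `docker run` command line from standard Docker flags. It is not nfind's actual invocation: the helper name `build_docker_argv`, the tmpfs size, and the exact flag order are assumptions, while the image name and resource defaults are taken from elsewhere in this README.

```python
from pathlib import Path


def build_docker_argv(
    search_root: str,
    image: str = "nfind-search-paths:latest",
    memory: str = "256m",
    cpus: float = 1.0,
    pids_limit: int = 64,
) -> list[str]:
    """Assemble a hardened `docker run` command (hypothetical helper)."""
    host_root = Path(search_root).resolve()
    return [
        "docker", "run", "--rm",
        "--network", "none",                    # no networking
        "--cap-drop", "ALL",                    # drop every Linux capability
        "--security-opt", "no-new-privileges",  # block privilege escalation
        "--read-only",                          # read-only root filesystem
        "--tmpfs", "/tmp:rw,size=16m",          # scratch space; size is an assumption
        "--memory", memory,
        "--cpus", str(cpus),
        "--pids-limit", str(pids_limit),
        "--volume", f"{host_root}:/data:ro",    # search root, mounted read-only
        image,
    ]


argv = build_docker_argv(".")
print(" ".join(argv))
```

Building the command as a list rather than a shell string avoids quoting problems when the search root contains spaces.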

Why nfind?

nfind sits in a gap no other file-search tool fills. It combines three things at once:

  1. Natural language — you describe what you want, not a query grammar or a find incantation.
  2. A real generated program, not a one-liner — the LLM writes an actual Python/Node filter, so it can express structural, relational, and computed questions (e.g. "directories that contain only audio files", "Python files and their line counts") that a glob or a single find predicate can't.
  3. Local, sandboxed execution that reads file contents — the program runs over your real tree in a read-only, no-network container, so it can open and inspect files — yet your file list and contents never leave the machine (only your prompt is sent to the model).
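To make the second point concrete, here is the kind of filter a model might write for "directories that contain only audio files". Only the function name `filter_paths` comes from this README; the signature (a list of path strings in, a list of matching paths out), the extension set, and the rule that an empty directory does not match are assumptions made for this sketch.

```python
import os
import tempfile

# Assumed set of audio extensions; a real generated filter may choose differently.
AUDIO_EXTENSIONS = {".mp3", ".flac", ".wav", ".ogg", ".m4a"}


def filter_paths(paths: list[str]) -> list[str]:
    """Keep directories whose direct entries are all audio files (at least one)."""
    matches = []
    for path in paths:
        if not os.path.isdir(path):
            continue
        entries = [os.path.join(path, name) for name in os.listdir(path)]
        if entries and all(
            os.path.isfile(entry)
            and os.path.splitext(entry)[1].lower() in AUDIO_EXTENSIONS
            for entry in entries
        ):
            matches.append(path)
    return matches


# Small demonstration on a throwaway tree.
with tempfile.TemporaryDirectory() as root:
    album = os.path.join(root, "album")
    mixed = os.path.join(root, "mixed")
    os.makedirs(album)
    os.makedirs(mixed)
    for name in ("01.mp3", "02.flac"):
        open(os.path.join(album, name), "w").close()
    for name in ("song.mp3", "notes.txt"):
        open(os.path.join(mixed, name), "w").close()
    demo_matches = [os.path.basename(p) for p in filter_paths([root, album, mixed])]

print(demo_matches)  # ['album']
```

The question is relational (it depends on every entry in a directory), which is why a single glob cannot express it.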

Each neighbouring category has only part of this:

Tool category Natural language Reads contents / structure Runs locally
find / fd / ripgrep, Spotlight (mdfind) partial
fselect / osquery (SQL over files)
NL→command helpers (sgpt, gh copilot) ✗ (just a one-liner)
Send-the-file-list-to-an-LLM tools (e.g. lfind) ✗ (filenames only)
nfind

In one line: nfind is like asking an analyst to write and run a one-off script against a folder — safely, and without your files leaving your machine. See docs/comparison.md for the full breakdown.

Requirements

  • Python 3.11+
  • Docker installed and running
  • An API key for your provider — OPENAI_API_KEY by default, or the matching key for another provider

Install

uv tool install nfind
# or
pip install nfind

To install from a local checkout:

uv tool install .
# or
pip install .

Usage

export OPENAI_API_KEY=sk-...

# Search the current directory
nfind "directories that contain only audio files"

# Search a specific directory
nfind "Python files that import requests" ./src

# Search specific files (a root may be a file, not just a directory)
nfind "files that define a class" ./src/app.py ./src/models.py

# Help (both forms work)
nfind -h
nfind --help

Output modes

By default nfind prints one path per line, like find. When your prompt asks for extra per-file information, the generated filter attaches it to each result and you can surface it:

# Default: clean, pipeable list of paths
nfind "Python files that import os"

# Verbose: path plus any extra fields the prompt produced
nfind "Python files, and for each the number of lines" --verbose
# /path/to/a.py	lines=42

# JSON: machine-readable records (path plus extra fields) with a count
nfind "Python files, and for each the number of lines" --json
# { "count": 2, "results": [ { "path": "...", "lines": 42 }, ... ] }

--json and --verbose are mutually exclusive. The richer output appears only when the prompt asks for it; otherwise every mode just lists paths.
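A script that consumes the --json output might look like the following sketch. The record shape (a "count" plus a "results" list of objects with "path" and extra keys) follows the example above; the sample values and the helper name `load_records` are made up for illustration.

```python
import json

# Sample payload in the documented shape; the paths and counts are invented.
sample_output = (
    '{"count": 2, "results": ['
    '{"path": "/src/a.py", "lines": 42}, '
    '{"path": "/src/b.py", "lines": 7}]}'
)


def load_records(raw: str) -> list[dict]:
    """Parse nfind-style JSON and check that "count" matches the results."""
    payload = json.loads(raw)
    records = payload["results"]
    if payload["count"] != len(records):
        raise ValueError("count does not match the number of results")
    return records


records = load_records(sample_output)
longest = max(records, key=lambda record: record.get("lines", 0))
print(longest["path"])  # /src/a.py
```

In a pipeline, the same function could read `sys.stdin.read()` instead of the sample string.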

Runtimes

The model picks the runtime per prompt — Python (default) or Node.js, when the JS/TS ecosystem fits better (e.g. parsing TypeScript with ts-morph). nfind runs the filter in the matching sandbox image; both run under the same isolation. See docs/runtimes.md.

nfind "TypeScript files that export a default, using ts-morph" ./src

Dependencies

Some prompts need a library (reading MP3 tags, image sizes, PDF text). The generated filter declares the packages it imports (pip for Python, npm for Node), and nfind installs them into a derived sandbox image, but only if they are approved. A built-in default list (per runtime) installs without asking; new packages are confirmed and then remembered.

nfind "MP3 files whose title tag contains 'live', using mutagen" ~/Music   # prompts if new
nfind "images larger than 4000px on a side" ~/Photos --yes                 # approve without asking
nfind "files containing TODO" . --no-deps                                  # standard library only

The Python defaults include tree-sitter and per-language grammar wheels (tree-sitter-python, -go, -rust, …), so a filter can parse source structure — functions, imports, classes — without a dedicated runtime (the Node.js runtime is reserved for type-aware TS/JS analysis). Packages are installed at image-build time (which needs network); the container that runs the filter still has no network. See docs/dependencies.md.
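The approval rule in this section can be sketched as a small decision function. Everything here is hypothetical: the name `approve_packages`, the exception type, the example package sets, and the choice to remember packages approved through --yes are assumptions, not nfind's implementation.

```python
from typing import Callable


class PackageRejected(Exception):
    """Raised when a requested package may not be installed."""


def approve_packages(
    requested: list[str],
    defaults: set[str],
    remembered: set[str],
    *,
    assume_yes: bool = False,   # models --yes
    no_deps: bool = False,      # models --no-deps
    ask: Callable[[str], bool] = lambda name: False,
) -> list[str]:
    """Return the packages cleared for installation, or raise PackageRejected."""
    if no_deps and requested:
        raise PackageRejected(f"third-party packages are disabled: {requested}")
    approved = []
    for name in requested:
        if name in defaults or name in remembered:
            approved.append(name)       # already trusted, no prompt
        elif assume_yes or ask(name):
            remembered.add(name)        # assumption: approval is remembered
            approved.append(name)
        else:
            raise PackageRejected(name)
    return approved


# Example sets are invented for the demonstration.
remembered_packages: set[str] = set()
cleared = approve_packages(
    ["mutagen", "pillow"], {"pillow"}, remembered_packages, ask=lambda name: True
)
print(cleared, remembered_packages)
```

Passing the prompt as a callable keeps the decision logic testable without a terminal.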

macOS metadata

On macOS, --macos-meta exposes a small slice of macOS-specific metadata — Finder tags and download provenance (the quarantine flag and "where from" URLs) — to the filter. These live on the host and aren't visible inside the Linux sandbox, so nfind reads them host-side (read-only) and passes them in. This unlocks queries that combine macOS metadata with file contents — something neither Spotlight nor a container-only filter can do alone:

nfind "PDFs I downloaded from the web that mention 'invoice', using pypdf" ~/Downloads --macos-meta
nfind "files tagged Red whose contents contain a TODO" ~/Projects --macos-meta

For pure-metadata lookups ("everything tagged Red"), Spotlight (mdfind) is faster. The flag is a no-op off macOS. See docs/macos-metadata.md.

Reviewing the generated code

The filter is generated by an LLM, so you may want to see it before it runs:

# Print the generated filter (to stderr) before running it
nfind "files with no extension" --show-code

# Save the generated filter to a file
nfind "files with no extension" --save filter.py

# Replay a saved filter through the sandbox (no LLM call, no network)
nfind --run filter.py
nfind --run filter.py ./other-directory   # different search root

# Show the code and ask for confirmation before running (aborts on "no")
nfind "files with no extension" -i        # or --confirm

The code is printed to stderr, so stdout stays a clean, pipeable list of paths even with --show-code. On a terminal it is syntax-highlighted with Pygments; the highlighting is disabled when NO_COLOR is set or when stderr is redirected.
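The highlighting rule above amounts to a two-part check, sketched here with the standard library only. The function name `should_highlight` is hypothetical, and treating an empty NO_COLOR value as "not set" follows the common NO_COLOR convention rather than anything this README states.

```python
import io
import os
import sys
from typing import Mapping, TextIO


def should_highlight(
    stream: TextIO = sys.stderr,
    env: Mapping[str, str] = os.environ,
) -> bool:
    """Return True only when colour output is appropriate for `stream`."""
    if env.get("NO_COLOR"):          # set and non-empty: the user opted out
        return False
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())  # redirected streams are not terminals


# A redirected stream (modelled by StringIO) never gets colour.
print(should_highlight(io.StringIO(), {}))  # False
```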

If the model's reply doesn't validate (malformed JSON, wrong function shape, an invalid package name), nfind feeds the error back and retries a few times before giving up; --verbose reports when a retry happens.
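A validate-and-retry loop of this kind could be sketched as follows. The reply schema (a JSON object with "code" and "packages" keys), the helper names, and the attempt limit of three are all assumptions; the README only says that nfind retries "a few times".

```python
import json
from typing import Callable, Optional


def validate_reply(reply: str) -> dict:
    """Check a model reply against a hypothetical filter contract."""
    payload = json.loads(reply)  # JSONDecodeError is a ValueError subclass
    if not isinstance(payload, dict):
        raise ValueError("reply must be a JSON object")
    code = payload.get("code")
    if not isinstance(code, str) or "def filter_paths" not in code:
        raise ValueError("reply must define filter_paths")
    return payload


def generate_with_retries(
    ask_model: Callable[[str, Optional[str]], str],
    prompt: str,
    max_attempts: int = 3,
) -> dict:
    """Call the model, feeding each validation error into the next attempt."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        reply = ask_model(prompt, error)
        try:
            return validate_reply(reply)
        except ValueError as exc:
            error = str(exc)
    raise RuntimeError(f"no valid filter after {max_attempts} attempts: {error}")
```

Here `ask_model` stands in for whatever function sends the prompt (and the previous error, if any) to the provider.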

The first run builds the worker image for the chosen runtime (nfind-search-paths:latest for Python, nfind-search-node:latest for Node.js); later runs reuse it. Pass --rebuild to force a fresh build.

Useful options

Option          Default      Purpose
--model         gpt-4o-mini  Model used to generate the filter; provider/model for non-OpenAI (see Providers)
--timeout       10.0         Seconds the filter may run before it is killed
--memory        256m         Worker container memory limit
--cpus          1.0          Worker container CPU limit
--pids-limit    64           Max processes inside the worker
--rebuild       off          Rebuild the worker image first
--exclude GLOB               Skip matching names/paths during enumeration (repeatable)
--no-ignore     off          Include default ignored directories such as .git and node_modules
--max-depth N   unlimited    Descend at most N levels below each search path
--verbose / -v  off          Show extra per-path fields alongside each path
--json          off          Output records (path + extra fields) as JSON
--print0 / -0   off          Separate result paths with NUL bytes for xargs -0
--yes / -y      off          Approve any requested packages without prompting
--no-deps       off          Reject third-party packages (standard library only)
--macos-meta    off          macOS: expose Finder tags and download metadata to the filter
--show-code     off          Print the generated filter before running
--save                       Write the generated filter as a replayable script
--run                        Replay a saved filter through the sandbox without an LLm call
--confirm / -i  off          Show the code and confirm before running
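The enumeration options (--exclude, --no-ignore, --max-depth) interact in a way that a short sketch may clarify. This is an illustration built on `os.walk` and `fnmatch`, not nfind's code: the function name, the two-entry ignore set (the README names only .git and node_modules as examples), and matching globs against both the name and the full path are assumptions.

```python
import fnmatch
import os
import tempfile

DEFAULT_IGNORED = {".git", "node_modules"}  # examples named in this README


def enumerate_paths(root, excludes=(), no_ignore=False, max_depth=None):
    """List files and directories under root, honouring the three options."""
    root = os.path.abspath(root)
    ignored = set() if no_ignore else DEFAULT_IGNORED

    def skipped(name, full):
        if name in ignored:
            return True
        return any(
            fnmatch.fnmatch(name, glob) or fnmatch.fnmatch(full, glob)
            for glob in excludes
        )

    found = []
    for current, dirs, files in os.walk(root):
        rel = os.path.relpath(current, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if max_depth is not None and depth >= max_depth:
            dirs[:] = []  # children would be deeper than allowed
            continue
        # Pruning dirs in place stops os.walk from descending into them.
        dirs[:] = sorted(d for d in dirs if not skipped(d, os.path.join(current, d)))
        kept = sorted(f for f in files if not skipped(f, os.path.join(current, f)))
        found.extend(os.path.join(current, name) for name in dirs + kept)
    return found


# Demonstration on a throwaway tree.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, ".git"))
    os.makedirs(os.path.join(tmp, "src"))
    for rel_path in ("a.py", ".git/config", "src/b.py", "src/c.log"):
        open(os.path.join(tmp, *rel_path.split("/")), "w").close()

    def relative(paths):
        base = os.path.abspath(tmp)
        return {os.path.relpath(p, base).replace(os.sep, "/") for p in paths}

    default_listing = relative(enumerate_paths(tmp, excludes=["*.log"]))
    shallow_listing = relative(enumerate_paths(tmp, max_depth=1))
    full_listing = relative(enumerate_paths(tmp, no_ignore=True))

print(sorted(default_listing))  # ['a.py', 'src', 'src/b.py']
```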

Providers

By default nfind uses OpenAI. To use another provider, pass --model provider/model; nfind reuses the OpenAI SDK against that provider's OpenAI-compatible endpoint, so there is no extra dependency to install — just set the provider's API key.

nfind "files with no extension"                          # OpenAI (OPENAI_API_KEY)
nfind "..." --model anthropic/claude-sonnet-4-6          # ANTHROPIC_API_KEY
nfind "..." --model gemini/gemini-2.5-flash              # GEMINI_API_KEY
nfind "..." --model groq/llama-3.3-70b-versatile         # GROQ_API_KEY
nfind "..." --model openrouter/<vendor>/<model>          # OPENROUTER_API_KEY (near-universal)
nfind "..." --model ollama/llama3.1                      # local, no key

Supported prefixes: openai, anthropic, gemini, groq, mistral, deepseek, xai, openrouter, ollama, lmstudio. Each reads its own *_API_KEY (local servers need none). nfind handles providers without strict JSON mode automatically. Capable models follow the filter contract best; weaker ones may need a retry or a stronger model.
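The prefix convention can be illustrated with a small parser. The function names are hypothetical, and deriving the variable name as the upper-cased prefix plus _API_KEY is an assumption based on the examples above; treating ollama and lmstudio as the key-free local servers is likewise inferred from the text.

```python
LOCAL_PROVIDERS = {"ollama", "lmstudio"}
KNOWN_PROVIDERS = {
    "openai", "anthropic", "gemini", "groq",
    "mistral", "deepseek", "xai", "openrouter",
} | LOCAL_PROVIDERS


def split_model(spec: str) -> tuple[str, str]:
    """Split "provider/model" on the first slash; bare names mean OpenAI."""
    provider, sep, model = spec.partition("/")
    if sep and provider in KNOWN_PROVIDERS:
        return provider, model
    return "openai", spec


def api_key_variable(provider: str):
    """Return the environment variable to read, or None for local servers."""
    if provider in LOCAL_PROVIDERS:
        return None
    return f"{provider.upper()}_API_KEY"


print(split_model("openrouter/vendor/model"))  # ('openrouter', 'vendor/model')
print(api_key_variable("anthropic"))           # ANTHROPIC_API_KEY
```

Splitting only on the first slash matters for OpenRouter, whose model names contain a slash of their own.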

Example prompts

  • "directories that contain only audio files"
  • "files that have no extension"
  • "directories that contain more than 50 files"
  • "Python virtual environments (directories with a pyvenv.cfg directly inside)"
  • "initialized Terraform root modules"

Library use

from nfind import search

# Returns a list of records, each a dict with at least a "path" key (a host path).
# When the prompt asks for extra per-file values, they appear as additional keys.
records = search(".", "directories that contain only audio files")
paths = [record["path"] for record in records]

Safety model

  • Search roots are mounted read-only under /data; results are mapped back to host paths afterward.
  • The worker container runs with --network none, --cap-drop ALL, --security-opt no-new-privileges, a read-only root filesystem, and a small tmpfs for scratch space.
  • The host validates that the filter returns only paths it was given, so generated code cannot inject arbitrary paths into the output.
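The first and last points can be sketched together: check every returned path against the set that was offered, then translate it from the /data mount back to the host. The function name `map_results` and the single-root mapping are simplifying assumptions; with several search roots the real mapping would need one mount point per root.

```python
from pathlib import PurePosixPath


def map_results(returned, offered, host_root, mount="/data"):
    """Reject unknown paths, then map container paths back to host paths."""
    offered_set = set(offered)
    host_paths = []
    for container_path in returned:
        if container_path not in offered_set:
            # The filter tried to emit a path it was never given.
            raise ValueError(f"unexpected path in filter output: {container_path}")
        relative = PurePosixPath(container_path).relative_to(mount)
        host_paths.append(str(PurePosixPath(host_root) / relative))
    return host_paths


offered_paths = ["/data/a.py", "/data/docs/b.md"]
print(map_results(["/data/docs/b.md"], offered_paths, "/home/me/project"))
# ['/home/me/project/docs/b.md']
```

Because the check is a set membership test on the host, it holds regardless of what the generated code does inside the container.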
