Grep-like tool for dataframes using Narwhals

nwgrep

Grep-like tool for dataframes that works with pandas, polars, and any other backend supported by Narwhals.

Installation

uv add nwgrep

With specific backends:

# Quote the extras so shells such as zsh do not treat the brackets as a glob
uv add "nwgrep[pandas]"  # pandas support
uv add "nwgrep[polars]"  # polars support
uv add "nwgrep[dask]"    # dask support
uv add "nwgrep[cudf]"    # cuDF (GPU) support
uv add "nwgrep[all]"     # all major backends

Or using pip:

pip install nwgrep

Usage

Method 1: Direct Function Call

from nwgrep import nwgrep
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Eve"],
    "status": ["active", "locked", "active"],
})

result = nwgrep(df, "active")

print(result)
    name  status
0  Alice  active
2    Eve  active

Method 2: Using .pipe() (Pandas/Polars Style)

# Works with any backend!
result = df.pipe(nwgrep, "active")

# Chain with further steps (sort_values is pandas-specific; polars uses .sort())
result = (
    df
    .pipe(nwgrep, "active")
    .pipe(lambda x: x.sort_values("name", ascending=False))
)

print(result)
    name  status
2    Eve  active
0  Alice  active
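
For readers new to .pipe(): the call df.pipe(f, *args, **kwargs) is equivalent to f(df, *args, **kwargs), which is why any function that takes a dataframe as its first argument can be chained this way. The sketch below shows that contract with a toy stand-in class. TinyFrame and keep_rows_containing are hypothetical names used only for illustration; they are not part of nwgrep, pandas, or polars.

```python
from typing import Any, Callable, Dict, List


class TinyFrame:
    """Toy stand-in for a dataframe: a list of row dicts (illustration only)."""

    def __init__(self, rows: List[Dict[str, Any]]) -> None:
        self.rows = rows

    def pipe(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        # Same contract as pandas/polars .pipe(): pass self as the first argument.
        return func(self, *args, **kwargs)


def keep_rows_containing(frame: TinyFrame, text: str) -> TinyFrame:
    """Hypothetical grep-like filter: keep rows where any value contains `text`."""
    kept = [row for row in frame.rows if any(text in str(v) for v in row.values())]
    return TinyFrame(kept)


frame = TinyFrame([
    {"name": "Alice", "status": "active"},
    {"name": "Bob", "status": "locked"},
    {"name": "Eve", "status": "active"},
])

# Both spellings run the same function on the same data.
direct = keep_rows_containing(frame, "active")
piped = frame.pipe(keep_rows_containing, "active")
names = [row["name"] for row in piped.rows]
```

Because the piped and direct calls are interchangeable, the choice between Method 1 and Method 2 is purely about readability in a chain.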

Method 3: Accessor Pattern (df.grep)

from nwgrep import register_grep_accessor
import pandas as pd

# Register once at start of your script/notebook
register_grep_accessor()

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Eve"],
    "status": ["active", "locked", "active"],
})

# Now you can use .grep() directly!
result = df.grep("active")
result = df.grep("ACTIVE", case_sensitive=False)
result = df.grep("active", columns=["status"])

Works with both pandas and polars:

import polars as pl
from nwgrep import register_grep_accessor

register_grep_accessor()

df = pl.DataFrame({
    "name": ["Alice", "Bob", "Eve"],
    "status": ["active", "locked", "active"],
})

# Same syntax!
result = df.grep("active")

Method 4: Narwhals Plugin Integration

nwgrep is fully Narwhals-compliant. This means you can use it directly with Narwhals objects, and it will be auto-discovered as a plugin by other Narwhals-based tools.

import narwhals as nw
import pandas as pd
from nwgrep import nwgrep

# nwgrep handles Narwhals objects natively and returns Narwhals objects
df = nw.from_native(pd.DataFrame({"col": ["foo", "bar"]}))
result = nwgrep(df, "foo")

print(type(result)) # <class 'narwhals.dataframe.DataFrame'>

All Search Options

# Case-insensitive search
df.grep("ACTIVE", case_sensitive=False)

# Invert match (like grep -v)
df.grep("active", invert=True)

# Search specific columns only
df.grep("example.com", columns=["email"])

# Regex search
df.grep(r".*@example\.com", regex=True)

# Multiple patterns (OR logic)
df.grep(["Alice", "Bob"])

# Whole word matching
df.grep("active", whole_word=True)
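
To make the interaction of these options concrete, here is a pure-Python model of the behaviour described above, written with the standard re module over a list of row dicts. It is a sketch under stated assumptions and not nwgrep's implementation: grep_rows is a hypothetical helper, a row is assumed to match when any searched column contains the pattern, and whole-word matching is assumed to mean regex word boundaries (\b).

```python
import re
from typing import Dict, List, Optional, Union

Row = Dict[str, object]


def grep_rows(
    rows: List[Row],
    pattern: Union[str, List[str]],
    columns: Optional[List[str]] = None,
    case_sensitive: bool = True,
    regex: bool = False,
    invert: bool = False,
    whole_word: bool = False,
) -> List[Row]:
    """Pure-Python model of the documented options; not nwgrep's implementation."""
    patterns = [pattern] if isinstance(pattern, str) else list(pattern)
    # Literal patterns are escaped so regex metacharacters match themselves.
    sources = [p if regex else re.escape(p) for p in patterns]
    if whole_word:
        # Assumption: "whole word" means the match is delimited by word boundaries.
        sources = [rf"\b(?:{s})\b" for s in sources]
    flags = 0 if case_sensitive else re.IGNORECASE
    # Several patterns combine with OR.
    compiled = re.compile("|".join(f"(?:{s})" for s in sources), flags)

    def row_matches(row: Row) -> bool:
        names = columns if columns is not None else list(row)
        return any(compiled.search(str(row[name])) for name in names)

    # invert=True keeps the rows that do NOT match (like grep -v).
    return [row for row in rows if row_matches(row) != invert]


rows = [
    {"name": "Alice", "status": "active"},
    {"name": "Bob", "status": "inactive"},
    {"name": "Eve", "status": "ACTIVE"},
]

substring = [r["name"] for r in grep_rows(rows, "active")]
whole = [r["name"] for r in grep_rows(rows, "active", whole_word=True)]
ignore_case = [r["name"] for r in grep_rows(rows, "active", case_sensitive=False)]
inverted = [r["name"] for r in grep_rows(rows, "active", invert=True)]
either = [r["name"] for r in grep_rows(rows, ["Alice", "Bob"])]
```

In this model a plain search for "active" also keeps Bob, because "inactive" contains the substring; adding whole_word=True narrows the result to Alice only. That is the typical reason to reach for whole-word matching.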

Command Line

# Basic search
uv run nwgrep "error" logfile.parquet

# Case insensitive
uv run nwgrep -i "warning" data.feather

# Invert match
uv run nwgrep -v "success" data.parquet

# Regex search
uv run nwgrep -E "err(or|!)?" data.parquet

# Search specific columns
uv run nwgrep --columns name,email "alice" users.feather

# Limit output rows
uv run nwgrep -n 10 "pattern" large_file.parquet

# Output as NDJSON (Streams lazily if polars is installed!)
uv run nwgrep "pattern" data.parquet --format ndjson

# Output as CSV
uv run nwgrep --format csv "pattern" data.parquet > results.csv
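
NDJSON output is convenient to consume from another program because each line is a complete JSON object. The following is a minimal sketch of a downstream reader using only the standard library. The helper name iter_ndjson and the sample field names are made up for illustration; the actual fields depend on the columns of the file you search.

```python
import io
import json
from typing import Dict, Iterable, Iterator


def iter_ndjson(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one dict per non-empty NDJSON line, without loading everything at once."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Stand-in for piped CLI output; in a real script, pass sys.stdin instead.
sample = io.StringIO(
    '{"name": "Alice", "status": "active"}\n'
    '{"name": "Eve", "status": "active"}\n'
)
records = list(iter_ndjson(sample))
```

Since the reader is a generator, it processes records one at a time as they arrive, which pairs naturally with streamed output.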

Which Method Should I Use?

Method                 When to Use
nwgrep(df, ...)        Simple scripts, maximum compatibility
df.pipe(nwgrep, ...)   Data pipelines, functional style
df.grep(...)           Interactive use (notebooks), cleanest syntax
Narwhals native        When working within a Narwhals data-agnostic pipeline

Features

  • 🚀 Backend agnostic: Works with pandas, polars, daft, pyarrow
  • 🔍 Multiple search modes: Literal, regex, case-sensitive/insensitive
  • 📊 Column filtering: Search all columns or specific ones
  • Lazy evaluation: Efficient with large datasets when using polars/daft
  • 🎯 Familiar interface: grep-like flags and behavior
  • 🔧 Type safe: Full type hints with ty (Red Knot) type checking
  • 🎨 Flexible API: Function, pipe, or method - your choice!

API Reference

nwgrep(df, pattern, **kwargs)

Parameters:

  • df: DataFrame or LazyFrame (Native or Narwhals)
  • pattern: str or list of str - Search pattern(s)
  • columns: list of str, optional - Specific columns to search
  • case_sensitive: bool, default True
  • regex: bool, default False - Treat pattern as regex
  • invert: bool, default False - Return non-matching rows
  • whole_word: bool, default False - Match whole words only

Returns: Same type as input (Native or Narwhals DataFrame/LazyFrame)

register_grep_accessor()

Registers a .grep() method on pandas and polars DataFrames. Call it once at the start of your script or notebook.

Examples

See the examples directory for complete scripts using both pandas and polars.

License

MIT License - see LICENSE file for details
