Grep-like tool for dataframes using Narwhals
nwgrep
Grep-like tool for dataframes that works with pandas, polars, and any other backend supported by Narwhals.
Installation
uv add nwgrep
With specific backends:
uv add "nwgrep[pandas]" # pandas support (quotes keep shells like zsh from expanding the brackets)
uv add "nwgrep[polars]" # polars support
uv add "nwgrep[dask]" # dask support
uv add "nwgrep[cudf]" # cuDF (GPU) support
uv add "nwgrep[all]" # all major backends
Or using pip:
pip install nwgrep
Usage
Method 1: Direct Function Call
from nwgrep import nwgrep
import pandas as pd
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Eve"],
    "status": ["active", "locked", "active"],
})
result = nwgrep(df, "active")
print(result)
    name  status
0  Alice  active
2    Eve  active
Method 2: Using .pipe() (Pandas/Polars Style)
# Works with any backend!
result = df.pipe(nwgrep, "active")
# Chains cleanly with other steps (sort_values is pandas-specific; polars uses .sort)
result = (
    df
    .pipe(nwgrep, "active")
    .pipe(lambda x: x.sort_values('name', ascending=False))
)
print(result)
    name  status
2    Eve  active
0  Alice  active
Method 3: Accessor Pattern (df.grep)
from nwgrep import register_grep_accessor
import pandas as pd
# Register once at start of your script/notebook
register_grep_accessor()
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Eve"],
    "status": ["active", "locked", "active"],
})
# Now you can use .grep() directly!
result = df.grep("active")
result = df.grep("ACTIVE", case_sensitive=False)
result = df.grep("active", columns=["status"])
Works with both pandas and polars:
import polars as pl
from nwgrep import register_grep_accessor
register_grep_accessor()
df = pl.DataFrame({
    "name": ["Alice", "Bob", "Eve"],
    "status": ["active", "locked", "active"],
})
# Same syntax!
result = df.grep("active")
Method 4: Narwhals Plugin Integration
nwgrep is fully Narwhals-compliant. This means you can use it directly with Narwhals objects, and it will be auto-discovered as a plugin by other Narwhals-based tools.
import narwhals as nw
import pandas as pd
from nwgrep import nwgrep
# nwgrep handles Narwhals objects natively and returns Narwhals objects
df = nw.from_native(pd.DataFrame({"col": ["foo", "bar"]}))
result = nwgrep(df, "foo")
print(type(result)) # <class 'narwhals.dataframe.DataFrame'>
All Search Options
# Case-insensitive search
df.grep("ACTIVE", case_sensitive=False)
# Invert match (like grep -v)
df.grep("active", invert=True)
# Search specific columns only
df.grep("example.com", columns=["email"])
# Regex search
df.grep(r".*@example\.com", regex=True)
# Multiple patterns (OR logic)
df.grep(["Alice", "Bob"])
# Whole word matching
df.grep("active", whole_word=True)
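To make the interaction between these options concrete, here is a minimal pure-Python sketch that models them over a list of dict rows. It is a toy model, not nwgrep's implementation: the helper name grep_rows, substring matching as the default, and the use of \b word boundaries for whole_word are all assumptions made for illustration, and the sketch ignores nulls and backend-specific behavior.

```python
import re
from typing import Iterable, Optional, Sequence, Union


def grep_rows(
    rows: Sequence[dict],
    pattern: Union[str, Sequence[str]],
    columns: Optional[Iterable[str]] = None,
    case_sensitive: bool = True,
    regex: bool = False,
    invert: bool = False,
    whole_word: bool = False,
) -> list:
    """Hypothetical helper that mimics the options above on plain dicts."""
    patterns = [pattern] if isinstance(pattern, str) else list(pattern)
    # Literal patterns are escaped so characters like "." match themselves.
    parts = [p if regex else re.escape(p) for p in patterns]
    if whole_word:
        # Assumption: "whole word" means bounded by regex word boundaries.
        parts = [rf"\b(?:{p})\b" for p in parts]
    # Several patterns are combined with OR.
    flags = 0 if case_sensitive else re.IGNORECASE
    matcher = re.compile("|".join(f"(?:{p})" for p in parts), flags)

    cols = None if columns is None else list(columns)
    selected = []
    for row in rows:
        names = cols if cols is not None else list(row)
        # A row is a hit if any searched cell contains a match.
        hit = any(matcher.search(str(row[name])) for name in names)
        # invert flips the selection, like grep -v.
        if hit != invert:
            selected.append(row)
    return selected


rows = [
    {"name": "Alice", "status": "active"},
    {"name": "Bob", "status": "locked"},
    {"name": "Eve", "status": "inactive"},
]

substring = grep_rows(rows, "active")                   # Alice and Eve
whole = grep_rows(rows, "active", whole_word=True)      # Alice only
inverted = grep_rows(rows, "active", invert=True)       # Bob only
either = grep_rows(rows, ["Alice", "Bob"])              # Alice and Bob
print([r["name"] for r in whole])
```

The third sample row uses "inactive" on purpose: under the substring assumption it matches "active", while whole_word=True excludes it. If nwgrep's actual matching rules differ, the library's behavior takes precedence over this sketch.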
Command Line
# Basic search
uv run nwgrep "error" logfile.parquet
# Case insensitive
uv run nwgrep -i "warning" data.feather
# Invert match
uv run nwgrep -v "success" data.parquet
# Regex search
uv run nwgrep -E "err(or|!)?" data.parquet
# Search specific columns
uv run nwgrep --columns name,email "alice" users.feather
# Limit output rows
uv run nwgrep -n 10 "pattern" large_file.parquet
# Output as NDJSON (Streams lazily if polars is installed!)
uv run nwgrep "pattern" data.parquet --format ndjson
# Output as CSV
uv run nwgrep --format csv "pattern" data.parquet > results.csv
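NDJSON output is convenient to consume from another program because each line is a standalone JSON document. The following sketch shows one way a downstream Python script might read it. The helper name read_ndjson is hypothetical, and the sample records (one object per row, keyed by column name) are an assumption about the output layout rather than something the source specifies.

```python
import io
import json
from typing import Iterator, TextIO


def read_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed object per non-empty line of an NDJSON stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Stand-in for the CLI output; in a real pipeline, pass sys.stdin instead.
sample = io.StringIO(
    '{"name": "Alice", "status": "active"}\n'
    '{"name": "Eve", "status": "active"}\n'
)
records = list(read_ndjson(sample))
print(records[0]["name"])
```

Because the generator handles one line at a time, it can process rows as they arrive when the script sits at the end of a shell pipe, for example after uv run nwgrep "pattern" data.parquet --format ndjson.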
Which Method Should I Use?
| Method | When to Use |
|---|---|
| nwgrep(df, ...) | Simple scripts, maximum compatibility |
| df.pipe(nwgrep, ...) | Data pipelines, functional style |
| df.grep(...) | Interactive use (notebooks), cleanest syntax |
| Narwhals native | When working within a Narwhals data-agnostic pipeline |
Features
- 🚀 Backend agnostic: Works with pandas, polars, daft, pyarrow
- 🔍 Multiple search modes: Literal, regex, case-sensitive/insensitive
- 📊 Column filtering: Search all columns or specific ones
- ⚡ Lazy evaluation: Efficient with large datasets when using polars/daft
- 🎯 Familiar interface: grep-like flags and behavior
- 🔧 Type safe: Full type hints with ty (Red Knot) type checking
- 🎨 Flexible API: Function, pipe, or method - your choice!
API Reference
nwgrep(df, pattern, **kwargs)
Parameters:
- df: DataFrame or LazyFrame (Native or Narwhals)
- pattern: str or list of str - Search pattern(s)
- columns: list of str, optional - Specific columns to search
- case_sensitive: bool, default True
- regex: bool, default False - Treat pattern as regex
- invert: bool, default False - Return non-matching rows
- whole_word: bool, default False - Match whole words only
Returns: Same type as input (Native or Narwhals DataFrame/LazyFrame)
register_grep_accessor()
Registers a .grep() method on pandas and polars DataFrames. Call it once at the start of your script or notebook.
Examples
Check the examples directory for complete scripts using both Pandas and Polars.
License
MIT License - see LICENSE file for details
File details
Details for the file nwgrep-0.1.0.tar.gz.
File metadata
- Download URL: nwgrep-0.1.0.tar.gz
- Upload date:
- Size: 9.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.9.27 {"installer":{"name":"uv","version":"0.9.27","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2f272784f580d9f72dfcddce316972d60a43fa6bf182de166a5220ec2c6d903f |
| MD5 | 3e2207a19ba84365d77548d64e989e24 |
| BLAKE2b-256 | df76df817aeac5005837ce430fee06489f9c76e751faa10df518baaf1bbba33f |
File details
Details for the file nwgrep-0.1.0-py3-none-any.whl.
File metadata
- Download URL: nwgrep-0.1.0-py3-none-any.whl
- Upload date:
- Size: 10.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.9.27 {"installer":{"name":"uv","version":"0.9.27","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a5292129f9795632116363dfbe91543f8ee944a4feb98d77592020825a640803 |
| MD5 | ad3123c5ade424df17549118aa5e7dcd |
| BLAKE2b-256 | 1707e2ed3e08c9635bf8d66ceac78ac19e80d50dfbb1cedce67b2898365dc2d7 |