A Python wrapper for ripgrep

rpygrep: A Pythonic ripgrep Wrapper

rpygrep is a Python library that provides a convenient and type-safe interface for interacting with the ripgrep command-line utility. It allows you to programmatically construct ripgrep commands for finding files and searching content, and then parse the structured JSON output into Python objects.

Used in the Filesystem Operations MCP Server.

Features

  • Fluent API: Build ripgrep commands using method chaining.
  • File Finding & Content Searching: Dedicated classes (RipGrepFind and RipGrepSearch) for different ripgrep modes.
  • Synchronous & Asynchronous Execution: Supports both run() and arun() methods for flexible integration.
  • Structured Output: Automatically parses ripgrep's JSON output into rich Python data classes, making results easy to access and manipulate.
  • Comprehensive Options: Control over a wide range of ripgrep options, including case sensitivity, glob patterns, file types, context lines, maximum depth, and more.
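To make the structured-output feature concrete: `rg --json` emits one JSON object per line, each carrying a `type` (`begin`, `match`, `context`, `end`, or `summary`) and a `data` payload. The sketch below parses a hand-written sample `match` message using only the standard library. `MatchLine` and `parse_match` are hypothetical names used for illustration; they are not rpygrep's own classes, and the sample line is not captured output.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hand-written sample in the shape ripgrep emits with --json (an assumption,
# not captured output from a real run).
SAMPLE = (
    '{"type":"match","data":{"path":{"text":"code_with_hello_world.py"},'
    '"lines":{"text":"    print(\'hello, world!\')\\n"},"line_number":2,'
    '"absolute_offset":12,'
    '"submatches":[{"match":{"text":"hello"},"start":11,"end":16}]}}'
)


@dataclass
class MatchLine:
    """Hypothetical stand-in for a parsed match; not an rpygrep class."""

    path: str
    line_number: int
    text: Optional[str]  # None when ripgrep reports the line as base64 "bytes"


def parse_match(raw: str) -> Optional[MatchLine]:
    """Return a MatchLine for a "match" message, or None for any other type."""
    message = json.loads(raw)
    if message.get("type") != "match":
        return None
    data = message["data"]
    return MatchLine(
        path=data["path"].get("text", ""),
        line_number=data["line_number"],
        text=data["lines"].get("text"),
    )
```

Calling `parse_match(SAMPLE)` yields an object whose `path`, `line_number`, and `text` fields can be read directly instead of indexing into nested dictionaries, which is the convenience the feature list describes.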

Example

# Excerpt from inside a class; self._ripgrep_find and self._ripgrep_search
# are assumed to be RipGrepFind and RipGrepSearch instances.

# File finding (synchronous)
ripgrep = (
    self._ripgrep_find.include_types(included_types or [])
    .exclude_types(excluded_types or [])
    .include_globs(included_globs or [])
    .exclude_globs(excluded_globs or [])
    .max_depth(max_depth)
)

for path in ripgrep.run():
    print(path)

# Content search (asynchronous; must run inside an async function)
ripgrep = (
    self._ripgrep_search.include_types(included_types or [])
    .add_safe_defaults()
    .exclude_types(excluded_types or [])
    .include_globs(included_globs or [])
    .exclude_globs(excluded_globs or [])
    .before_context(before_context)
    .after_context(after_context)
    .add_patterns(patterns)
    .max_depth(max_depth)
    .max_count(matches_per_file)
    .case_sensitive(case_sensitive)
)

async for result in ripgrep.arun():
    print(result.path)

Prerequisites

rpygrep requires the ripgrep command-line tool (rg) to be installed on your system. Installation instructions are available in the ripgrep project's README.
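A quick way to confirm the prerequisite before using the library is to look for the `rg` executable on `PATH`. The following is a minimal sketch using only the standard library; `find_ripgrep` is a hypothetical helper, and whether rpygrep performs a similar check itself is not stated here.

```python
import shutil
from typing import Optional


def find_ripgrep(executable: str = "rg") -> Optional[str]:
    """Return the full path to the ripgrep binary, or None if it is not on PATH."""
    return shutil.which(executable)


if __name__ == "__main__":
    location = find_ripgrep()
    if location is None:
        print("ripgrep (rg) was not found on PATH; install it before using rpygrep.")
    else:
        print(f"ripgrep found at {location}")
```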

Installation

uv add rpygrep

Usage Examples

The following examples demonstrate common use cases for rpygrep. They assume a directory structure similar to the one used in the project's tests:

.
├── code_with_hello_world.py
├── data.json
├── should_be_ignored.env
├── test_with_Hello_World.txt
├── .hidden
├── .git/
└── subdir/
    ├── nested.txt
    ├── script_with_hello.sh
    └── should_be_ignored.env
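To try the examples without the project's test fixtures, a comparable tree can be generated with a short script. This is a sketch under assumptions: the file contents below are invented so that they line up with the expected outputs shown in the examples, and the real fixtures may differ. `create_example_tree` and `EXAMPLE_FILES` are hypothetical names.

```python
from pathlib import Path
from typing import Dict, List

# File contents are assumptions chosen to match the expected outputs in the
# examples; the project's actual test fixtures may contain something else.
EXAMPLE_FILES: Dict[str, str] = {
    "code_with_hello_world.py": "def main():\n    print('hello, world!')\n",
    "data.json": '{"greeting": "hi"}\n',
    "should_be_ignored.env": "SECRET=1\n",
    "test_with_Hello_World.txt": "Hello, World!\n",
    ".hidden": "hidden file\n",
    "subdir/nested.txt": "nested\n",
    "subdir/script_with_hello.sh": "#!/bin/sh\necho 'Hello'\n",
    "subdir/should_be_ignored.env": "SECRET=2\n",
}


def create_example_tree(root: Path) -> List[Path]:
    """Create the example files under root and return the paths written."""
    # Empty directory standing in for a real Git repository.
    (root / ".git").mkdir(parents=True, exist_ok=True)
    created = []
    for relative, content in EXAMPLE_FILES.items():
        target = root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        created.append(target)
    return created
```

Pointing `create_example_tree` at a directory from `tempfile.TemporaryDirectory()` and passing that directory as `working_directory` keeps experiments away from real project files.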

Basic File Finding (RipGrepFind)

Find all Python files in the current directory and its subdirectories:

from pathlib import Path
from rpygrep import RipGrepFind

# Assuming your script is run from the root of the example directory structure
current_dir = Path(".") 

rg_find = RipGrepFind(working_directory=current_dir)
rg_find.include_types(["py"])

print("Python files found:")
for path in rg_find.run():
    print(path)
# Expected output:
# code_with_hello_world.py

Basic Content Search (RipGrepSearch)

Search for "hello" (case-insensitive) in all files and print matching lines:

from pathlib import Path
from rpygrep import RipGrepSearch

current_dir = Path(".") 

rg_search = RipGrepSearch(working_directory=current_dir)
rg_search.add_pattern("hello")
rg_search.case_sensitive(False) # Make it case-insensitive

print("\nSearch results for 'hello' (case-insensitive):")
for result in rg_search.run():
    print(f"Found in {result.path}:")
    for match in result.matches:
        # .text might be None if lines.bytes is used, so check for it
        line_content = match.data.lines.text.strip() if match.data.lines.text else "Binary content"
        print(f"  Line {match.data.line_number}: {line_content}")
# Expected output (similar to):
# Found in test_with_Hello_World.txt:
#   Line 1: Hello, World!
# Found in code_with_hello_world.py:
#   Line 2:     print('hello, world!')
# Found in subdir/script_with_hello.sh:
#   Line 2: echo 'Hello'

Asynchronous Usage

Both RipGrepFind and RipGrepSearch support asynchronous execution using their arun() methods.

import asyncio
from pathlib import Path
from rpygrep import RipGrepSearch

async def main():
    current_dir = Path(".") 

    rg_search = RipGrepSearch(working_directory=current_dir)
    rg_search.add_pattern("World")

    print("\nAsynchronous search results for 'World':")
    async for result in rg_search.arun():
        print(f"Async found in {result.path}:")
        for match in result.matches:
            line_content = match.data.lines.text.strip() if match.data.lines.text else "Binary content"
            print(f"  Line {match.data.line_number}: {line_content}")

if __name__ == "__main__":
    asyncio.run(main())
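For readers curious how asynchronous execution of a command-line tool can work in general, the sketch below streams a subprocess's output line by line with `asyncio`. It is not rpygrep's implementation: `stream_lines` and `collect_demo` are hypothetical names, and the Python interpreter stands in for ripgrep so the sketch runs even where `rg` is not installed.

```python
import asyncio
import sys
from typing import AsyncIterator, List, Sequence


async def stream_lines(command: Sequence[str]) -> AsyncIterator[str]:
    """Run command and yield its stdout one decoded line at a time."""
    process = await asyncio.create_subprocess_exec(
        *command,
        stdout=asyncio.subprocess.PIPE,
    )
    assert process.stdout is not None  # guaranteed because stdout=PIPE
    finished = False
    try:
        async for raw_line in process.stdout:
            yield raw_line.decode("utf-8", errors="replace").rstrip("\r\n")
        finished = True
    finally:
        # If the consumer stopped early, terminate the child so wait() cannot hang.
        if not finished and process.returncode is None:
            try:
                process.kill()
            except ProcessLookupError:
                pass  # the child exited on its own in the meantime
        await process.wait()


async def collect_demo() -> List[str]:
    # The Python interpreter is a stand-in for ripgrep in this sketch.
    command = [sys.executable, "-c", "print('first'); print('second')"]
    return [line async for line in stream_lines(command)]
```

Because lines are yielded as they arrive, a caller using `async for` can begin handling early results while the child process is still running.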

Excluding File Types

Exclude common binary and data file types from your search:

from pathlib import Path
from rpygrep import RipGrepFind, DEFAULT_EXCLUDED_TYPES

current_dir = Path(".") 

rg_find = RipGrepFind(working_directory=current_dir)
rg_find.exclude_types(DEFAULT_EXCLUDED_TYPES) # Exclude common binary/data types

print("\nFiles found excluding default binary/data types:")
for path in rg_find.run():
    print(path)

Limiting Search Depth

Search only the current directory, without recursing into subdirectories:

from pathlib import Path
from rpygrep import RipGrepFind

current_dir = Path(".") 

rg_find = RipGrepFind(working_directory=current_dir)
rg_find.max_depth(1) # Only search current directory, no subdirectories

print("\nFiles found with max depth 1:")
for path in rg_find.run():
    print(path)
# Expected output (may vary based on actual files at depth 1; ripgrep skips
# hidden files by default, so .hidden appears only if hidden files are included):
# code_with_hello_world.py
# data.json
# should_be_ignored.env
# test_with_Hello_World.txt
# .hidden

Adding Context Lines

Include lines before and after a match:

from pathlib import Path
from rpygrep import RipGrepSearch

current_dir = Path(".") 

rg_search = RipGrepSearch(working_directory=current_dir)
rg_search.add_pattern("print")
rg_search.before_context(1) # 1 line before
rg_search.after_context(1)  # 1 line after

print("\nSearch results for 'print' with context:")
for result in rg_search.run():
    print(f"Found in {result.path}:")
    # Context lines are also in result.context
    for context_line in result.context:
        line_content = context_line.data.lines.text.strip() if context_line.data.lines.text else "Binary content"
        print(f"  Context Line {context_line.data.line_number}: {line_content}")
    for match in result.matches:
        line_content = match.data.lines.text.strip() if match.data.lines.text else "Binary content"
        print(f"  Match Line {match.data.line_number}: {line_content}")
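The example above prints all context lines first and then all matches. To display them in file order instead, the two lists can be merged by line number. The sketch below works on plain `(line_number, text)` tuples so that it runs without ripgrep; `interleave` is a hypothetical helper, and the sample lines are invented. The `:` and `-` markers follow the grep convention of marking matching lines with a colon and context lines with a hyphen.

```python
from typing import Iterable, List, Tuple

NumberedLine = Tuple[int, str]  # (line_number, text)


def interleave(
    matches: Iterable[NumberedLine],
    context: Iterable[NumberedLine],
) -> List[str]:
    """Merge match and context lines into file order.

    Matches are marked with ':' and context lines with '-'.
    """
    tagged = [(number, ":", text) for number, text in matches]
    tagged += [(number, "-", text) for number, text in context]
    tagged.sort(key=lambda item: item[0])  # order by line number
    return [f"{number}{marker} {text}" for number, marker, text in tagged]


# Invented sample data standing in for one file's results.
sample_matches = [(2, "print('hello, world!')")]
sample_context = [(1, "def main():"), (3, "return None")]
merged = interleave(sample_matches, sample_context)
```

With results shaped like those in the example, the tuples could be built from `match.data.line_number` and `match.data.lines.text` for matches, and from the same attributes on each entry in `result.context`.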

API Reference (Key Classes)

rpygrep.RipGrepFind

Used for finding files based on various criteria.

  • include_glob(glob: str): Includes files matching the given glob pattern.
  • exclude_glob(glob: str): Excludes files matching the given glob pattern.
  • include_type(ripgrep_type: RIPGREP_TYPE_LIST): Includes files of a specific ripgrep file type (e.g., "py", "ts", "md").
  • exclude_type(ripgrep_type: RIPGREP_TYPE_LIST): Excludes files of a specific ripgrep file type.
  • max_depth(depth: int): Limits the search to a specified number of subdirectory levels.
  • sort(by: Literal["none", "path", "name", "size", "accessed", "created", "modified"], ascending: bool = True): Sorts the results.
  • run() -> Iterator[Path]: Executes the command synchronously and yields Path objects.
  • arun() -> AsyncIterator[Path]: Executes the command asynchronously and yields Path objects.
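As a rough illustration of what a fluent builder for file finding does, the sketch below records options through chained method calls and produces a ripgrep argument list. It is a hypothetical stand-in, not rpygrep's code: the class name `FindCommandSketch` and the `build()` method are invented, and the mapping to the ripgrep flags `--files`, `--type`, `--type-not`, `--glob`, and `--max-depth` is an assumption about how such a wrapper could translate its options.

```python
from typing import List


class FindCommandSketch:
    """Hypothetical builder: each method records an option and returns self."""

    def __init__(self) -> None:
        # "rg --files" lists the files ripgrep would search, without searching them.
        self._args: List[str] = ["rg", "--files"]

    def include_type(self, file_type: str) -> "FindCommandSketch":
        self._args += ["--type", file_type]
        return self

    def exclude_type(self, file_type: str) -> "FindCommandSketch":
        self._args += ["--type-not", file_type]
        return self

    def include_glob(self, glob: str) -> "FindCommandSketch":
        self._args += ["--glob", glob]
        return self

    def exclude_glob(self, glob: str) -> "FindCommandSketch":
        # A leading "!" negates a glob in ripgrep.
        self._args += ["--glob", f"!{glob}"]
        return self

    def max_depth(self, depth: int) -> "FindCommandSketch":
        self._args += ["--max-depth", str(depth)]
        return self

    def build(self) -> List[str]:
        """Return a copy of the argument list accumulated so far."""
        return list(self._args)


command = (
    FindCommandSketch()
    .include_type("py")
    .exclude_glob("*.env")
    .max_depth(1)
    .build()
)
```

Returning `self` from every option method is what allows the calls to be chained; the resulting list can be passed to `subprocess.run` without shell quoting concerns.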

rpygrep.RipGrepSearch

Used for searching content within files.

  • add_pattern(pattern: str): Adds a regular expression pattern to search for. Can be called multiple times for multiple patterns.
  • case_sensitive(case_sensitive: bool): Sets whether the search should be case-sensitive (True) or case-insensitive (False).
  • before_context(context: int): Specifies the number of lines of context to include before a match.
  • after_context(context: int): Specifies the number of lines of context to include after a match.
  • max_count(count: int): Sets the maximum number of matches to return per file.
  • max_file_size(size: int): Sets the maximum file size (in bytes) to search.
  • as_json(): Configures ripgrep to output results in JSON format (automatically called by run/arun).
  • run() -> Iterator[RipGrepSearchResult]: Executes the command synchronously and yields RipGrepSearchResult objects.
  • arun() -> AsyncIterator[RipGrepSearchResult]: Executes the command asynchronously and yields RipGrepSearchResult objects.
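The search options listed above correspond naturally to ripgrep command-line flags. The function below sketches one plausible translation; `build_search_args` is a hypothetical helper, and the exact arguments rpygrep generates may differ. The flags themselves (`--json`, `--case-sensitive`, `--ignore-case`, `--before-context`, `--after-context`, `--max-count`, `--max-filesize`, `--regexp`) are standard ripgrep options.

```python
from typing import List, Optional, Sequence


def build_search_args(
    patterns: Sequence[str],
    case_sensitive: bool = True,
    before_context: int = 0,
    after_context: int = 0,
    max_count: Optional[int] = None,
    max_file_size: Optional[int] = None,
) -> List[str]:
    """Translate search options into a ripgrep argument list (a sketch)."""
    args = ["rg", "--json"]
    args.append("--case-sensitive" if case_sensitive else "--ignore-case")
    if before_context > 0:
        args += ["--before-context", str(before_context)]
    if after_context > 0:
        args += ["--after-context", str(after_context)]
    if max_count is not None:
        args += ["--max-count", str(max_count)]
    if max_file_size is not None:
        # A bare number is interpreted by ripgrep as a size in bytes.
        args += ["--max-filesize", str(max_file_size)]
    for pattern in patterns:
        # --regexp may be repeated to search for several patterns at once.
        args += ["--regexp", pattern]
    return args
```

For example, `build_search_args(["hello"], case_sensitive=False, before_context=1, after_context=1)` produces an argument list equivalent to running `rg --json --ignore-case --before-context 1 --after-context 1 --regexp hello`.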

rpygrep.RipGrepSearchResult

A dataclass representing a single search result for a file.

  • path: Path: The path to the file where matches were found.
  • begin: RipGrepBegin: Information about the beginning of the file's search results.
  • matches: list[RipGrepMatch]: A list of RipGrepMatch objects, each representing a found match.
  • context: list[RipGrepContext]: A list of RipGrepContext objects, representing context lines around matches.
  • end: RipGrepEnd: Information about the end of the file's search results, including statistics.
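The fields above mirror the message sequence that `rg --json` emits for each file: one `begin`, any number of `match` and `context` messages, then one `end`. The sketch below groups such a stream into one object per file. It is an assumption-laden illustration rather than rpygrep's parser: `FileResultSketch` and `group_by_file` are hypothetical names, and message payloads are kept as plain dictionaries instead of typed classes.

```python
import json
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional


@dataclass
class FileResultSketch:
    """Hypothetical per-file container; not rpygrep's RipGrepSearchResult."""

    path: str
    matches: List[dict] = field(default_factory=list)
    context: List[dict] = field(default_factory=list)
    end: Optional[dict] = None


def group_by_file(lines: Iterable[str]) -> Iterator[FileResultSketch]:
    """Yield one result per begin...end block in a stream of rg --json lines."""
    current: Optional[FileResultSketch] = None
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        message = json.loads(line)
        kind, data = message["type"], message["data"]
        if kind == "begin":
            current = FileResultSketch(path=data["path"].get("text", ""))
        elif current is None:
            continue  # outside a file block, e.g. the trailing "summary"
        elif kind == "match":
            current.matches.append(data)
        elif kind == "context":
            current.context.append(data)
        elif kind == "end":
            current.end = data
            yield current
            current = None
```

Yielding only when the `end` message arrives means each result is complete, including its statistics, by the time the caller sees it.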

Other Important Types

  • RIPGREP_TYPE_LIST: A Literal type listing all supported ripgrep file types.
  • DEFAULT_EXCLUDED_TYPES: A list of RIPGREP_TYPE_LIST values that are commonly excluded (e.g., binary files, large data files).
