A Python native binding for ripgrep

ripgrep-python

A Python native binding for ripgrep, a fast recursive search tool written in Rust.

This library integrates natively with ripgrep's Rust API rather than wrapping the rg executable in a subprocess, which avoids process-spawn overhead and returns results directly as Python lists and dictionaries.

Features

  • Fast recursive search using ripgrep's core engine
  • Native integration (no subprocess overhead)
  • Rich search options (case sensitivity, file type filtering, glob patterns, etc.)
  • Multiple output modes (content with line numbers, file lists, match counts)
  • ripgrep-like interface that closely mirrors the original ripgrep command-line experience
  • Full regex support with multiline capabilities
  • Context support (before/after lines)
  • Pythonic API with full type hints and IDE support
  • Type-safe with complete .pyi stub files for static analysis

Installation

Prebuilt wheels (CPython 3.8+) are published for the following platforms:

  • Linux: x86_64, aarch64, i686, armv7, s390x, ppc64le
  • Windows: x64, x86
  • macOS: Intel (x86_64), Apple Silicon (aarch64)

From PyPI

pip install ripgrep-python

From Source

You'll need Rust and Cargo installed (see Rust installation guide).

# Clone the repository
git clone https://github.com/LinXueyuanStdio/ripgrep-python.git
cd ripgrep-python

# Build and install the package
pip install maturin
maturin develop

Quick Start

The interface provides a Grep class that closely mirrors ripgrep's command-line interface:

import pyripgrep

# Create a grep instance
grep = pyripgrep.Grep()

# Basic search - find files containing 'pattern'
files = grep.search("pattern")
print(files)  # ['file1.py', 'file2.rs', ...]

# Search with content and line numbers
content = grep.search("pattern", output_mode="content", n=True)
print(content)  # ['file1.py:42:matching line', ...]

# Count matches per file
counts = grep.search("pattern", output_mode="count")
print(counts)  # {'file1.py': 5, 'file2.rs': 12, ...}

# Advanced filtering
results = grep.search(
    "struct",
    path="src/",
    type="rust",
    output_mode="content",  # context lines (C) are shown in content output
    i=True,
    C=2,
    head_limit=10
)

API Documentation

The main interface is the Grep class, which provides a unified search method with various options:

from typing import Dict, List, Literal, Optional, Union

def search(
    self,
    pattern: str,                                           # Required: regex pattern to search for
    path: Optional[str] = None,                            # Path to search (default: current directory)
    glob: Optional[str] = None,                            # Glob pattern for file filtering (e.g., "*.py")
    output_mode: Optional[Literal["content", "files_with_matches", "count"]] = None,  # Output format
    B: Optional[int] = None,                               # Lines before match (-B flag)
    A: Optional[int] = None,                               # Lines after match (-A flag)
    C: Optional[int] = None,                               # Lines before and after match (-C flag)
    n: Optional[bool] = None,                              # Show line numbers (-n flag)
    i: Optional[bool] = None,                              # Case insensitive search (-i flag)
    type: Optional[str] = None,                            # File type filter (e.g., "rust", "python")
    head_limit: Optional[int] = None,                      # Limit number of results
    multiline: Optional[bool] = None                       # Enable multiline mode (-U flag)
) -> Union[List[str], Dict[str, int]]:
    """
    Search for pattern in files with various options.

    Returns:
        - List[str]: When output_mode is "files_with_matches" (default) or "content"
        - Dict[str, int]: When output_mode is "count" (filename -> match count)
    """

Output Modes

files_with_matches (default)

Returns list of file paths containing matches:

files = grep.search("TODO")
# Returns: ['src/main.rs', 'docs/readme.md', ...]

content

Returns matching lines with optional context and line numbers:

# Basic content search
lines = grep.search("function", output_mode="content")
# Returns: ['src/app.js:function myFunc() {', ...]

# With line numbers and context
lines = grep.search("error", output_mode="content",
                   n=True, C=2)
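
Content results are plain strings, so code that needs the file, line number, and text separately has to split them. The following is a minimal sketch, assuming the path:line_number:text shape shown in Quick Start for n=True; Match and parse_content_line are hypothetical names, not part of the library, and the exact format of context lines is not documented here, so anything that does not fit the shape is returned as None.

```python
from typing import NamedTuple, Optional


class Match(NamedTuple):
    """One matching line, split into its parts (hypothetical helper type)."""
    path: str
    line_number: int
    text: str


def parse_content_line(line: str) -> Optional[Match]:
    """Split 'path:line_number:text' into a Match.

    Assumes the n=True format shown in Quick Start. Returns None when the
    line does not fit that shape, e.g. a separator or a differently
    formatted context line.
    """
    # Split on the first two colons only, so colons in the text survive.
    parts = line.split(":", 2)
    if len(parts) != 3 or not parts[1].isdecimal():
        return None
    return Match(parts[0], int(parts[1]), parts[2])
```

Paths that themselves contain a colon (for example Windows drive letters) do not fit this simple split and also come back as None.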

count

Returns match counts per file:

counts = grep.search("import", output_mode="count")
# Returns: {'src/main.py': 15, 'src/utils.py': 8, ...}
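
Since count mode returns an ordinary dictionary, ranking files by match count needs nothing beyond the standard library. A small sketch; top_files is a hypothetical helper name, not part of the library:

```python
from typing import Dict, List, Tuple


def top_files(counts: Dict[str, int], limit: int = 5) -> List[Tuple[str, int]]:
    """Return up to `limit` (path, count) pairs, highest count first."""
    # Sort by descending count, then by path so ties come out in a fixed order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]
```

For example, top_files(grep.search("import", output_mode="count"), limit=3) would list the three files with the most matches.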

Usage Examples

Basic Search

import pyripgrep

grep = pyripgrep.Grep()

# Find all files containing "TODO"
files = grep.search("TODO")
for file in files:
    print(file)

# Show actual matching lines
content = grep.search("TODO", output_mode="content", n=True)
for line in content[:5]:  # First 5 matches
    print(line)

File Type Filtering

# Search only in Rust files
rust_files = grep.search("struct", type="rust")

# Search only in Python files
py_files = grep.search("def", type="python")

# Supported: rust, python, javascript, typescript, java, c, cpp, go, etc.

Advanced Filtering

# Use glob patterns
js_files = grep.search("function", glob="*.js")

# Case insensitive search
files = grep.search("ERROR", i=True)

# Search in specific directory with context
results = grep.search(
    "impl",
    path="src/",
    output_mode="content",
    C=3,
    n=True,
    head_limit=10
)

Regular Expressions

# Find function definitions
functions = grep.search(r"fn\s+\w+", output_mode="content", type="rust")

# Find import statements
imports = grep.search(r"^(import|from)\s+", output_mode="content", type="python")

# Multiline matching
structs = grep.search(r"struct\s+\w+\s*\{", multiline=True, output_mode="content")

Performance and Statistics

import time

# Time a search operation (perf_counter is a monotonic clock meant for intervals)
start = time.perf_counter()
results = grep.search("pattern", path="large_directory/")
duration = time.perf_counter() - start

print(f"Found {len(results)} files in {duration:.3f} seconds")

# Get detailed match counts
counts = grep.search("pattern", output_mode="count")
total_matches = sum(counts.values())
print(f"Total matches: {total_matches} across {len(counts)} files")
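
If several searches need timing, the pattern above can be wrapped in a small helper. This is a sketch only; timed is a hypothetical name and works with any callable, not just grep.search:

```python
import time
from typing import Any, Callable, Tuple


def timed(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()  # monotonic clock suited to measuring intervals
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start
```

Usage would look like: results, seconds = timed(grep.search, "pattern", path="large_directory/").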

Comparison with ripgrep CLI

ripgrep command          Python equivalent
rg pattern               grep.search("pattern")
rg pattern -l            grep.search("pattern", output_mode="files_with_matches")
rg pattern -n            grep.search("pattern", output_mode="content", n=True)
rg pattern -c            grep.search("pattern", output_mode="count")
rg pattern -i            grep.search("pattern", i=True)
rg pattern -A 3          grep.search("pattern", A=3, output_mode="content")
rg pattern -B 3          grep.search("pattern", B=3, output_mode="content")
rg pattern -C 3          grep.search("pattern", C=3, output_mode="content")
rg pattern -t py         grep.search("pattern", type="python")
rg pattern -g "*.js"     grep.search("pattern", glob="*.js")
rg pattern -U            grep.search("pattern", multiline=True)
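
The mapping in this table is regular enough to express in code. The sketch below translates the flags listed above into keyword arguments for search; rg_flags_to_kwargs is a hypothetical helper, it covers only these flags, and passing type names other than py through unchanged is an assumption rather than documented behavior.

```python
from typing import Any, Dict, Iterator, List

# Flags without a value, mapped to the keyword arguments from the table.
_SWITCHES: Dict[str, Dict[str, Any]] = {
    "-l": {"output_mode": "files_with_matches"},
    "-c": {"output_mode": "count"},
    "-i": {"i": True},
    "-U": {"multiline": True},
}
_CONTEXT = {"-A": "A", "-B": "B", "-C": "C"}
# The table maps "-t py" to type="python"; other names pass through (assumption).
_TYPE_ALIASES = {"py": "python"}


def _value(it: Iterator[str], flag: str) -> str:
    """Fetch the argument that follows a flag, or fail with a clear error."""
    try:
        return next(it)
    except StopIteration:
        raise ValueError(f"{flag} requires a value") from None


def rg_flags_to_kwargs(flags: List[str]) -> Dict[str, Any]:
    """Translate the rg flags from the comparison table into search() kwargs."""
    kwargs: Dict[str, Any] = {}
    it = iter(flags)
    for flag in it:
        if flag in _SWITCHES:
            kwargs.update(_SWITCHES[flag])
        elif flag == "-n":
            kwargs["n"] = True
            kwargs.setdefault("output_mode", "content")
        elif flag in _CONTEXT:
            kwargs[_CONTEXT[flag]] = int(_value(it, flag))
            kwargs.setdefault("output_mode", "content")
        elif flag == "-g":
            kwargs["glob"] = _value(it, flag)
        elif flag == "-t":
            name = _value(it, flag)
            kwargs["type"] = _TYPE_ALIASES.get(name, name)
        else:
            raise ValueError(f"unsupported flag: {flag}")
    return kwargs
```

A call such as grep.search("pattern", **rg_flags_to_kwargs(["-i", "-C", "3"])) then corresponds to rg pattern -i -C 3.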

Type Annotations

This library provides full type hints for a better IDE experience and static type checking:

Type-Safe API

from typing import Dict, List
import pyripgrep

# Create typed instance
grep: pyripgrep.Grep = pyripgrep.Grep()

# Type inference for different output modes
files: List[str] = grep.search("pattern")  # files_with_matches mode
content: List[str] = grep.search("pattern", output_mode="content")  # content mode
counts: Dict[str, int] = grep.search("pattern", output_mode="count")  # count mode
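
When a result reaches code that does not know which output_mode produced it, the Union return type can be narrowed with isinstance. A minimal sketch, assuming only the return shapes documented above; matched_files is a hypothetical helper name, not part of the library:

```python
from typing import Dict, List, Union

SearchResult = Union[List[str], Dict[str, int]]


def matched_files(result: SearchResult) -> List[str]:
    """Return the file paths from a "files_with_matches" or "count" result.

    Hypothetical helper. "content" results are lists of matching lines
    rather than paths, so they should not be passed here.
    """
    if isinstance(result, dict):
        # "count" mode: keys are file names, values are match counts.
        return list(result.keys())
    # "files_with_matches" mode: already a list of paths.
    return list(result)
```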

IDE Support

The library includes complete .pyi stub files providing:

  • IntelliSense: Full autocompletion in VS Code, PyCharm, etc.
  • Type checking: Works with mypy, pyright, and other static analyzers
  • Method overloads: Different return types based on output_mode parameter
  • Parameter hints: Detailed documentation for all parameters
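
To illustrate how overloads can tie the return type to output_mode, here is a simplified sketch. GrepLike is a hypothetical protocol written for this example, not the library's actual stub: it lists only three parameters and makes the optional ones keyword-only, so the real .pyi file will differ.

```python
from typing import Dict, List, Literal, Optional, Protocol, overload


class GrepLike(Protocol):
    """Hypothetical protocol showing overloads keyed on output_mode."""

    @overload
    def search(
        self,
        pattern: str,
        *,
        output_mode: Literal["count"],
        path: Optional[str] = None,
    ) -> Dict[str, int]: ...

    @overload
    def search(
        self,
        pattern: str,
        *,
        output_mode: Optional[Literal["content", "files_with_matches"]] = None,
        path: Optional[str] = None,
    ) -> List[str]: ...


def total_matches(grep: GrepLike, pattern: str, path: Optional[str] = None) -> int:
    """Sum the per-file counts for a pattern."""
    # A type checker resolves this call to the Dict[str, int] overload.
    counts = grep.search(pattern, output_mode="count", path=path)
    return sum(counts.values())
```

With overloads like these, a checker such as mypy or pyright can accept sum(counts.values()) without a cast, because the literal "count" selects the dictionary-returning signature.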

Example with Type Annotations

from typing import Dict, List
import pyripgrep

def analyze_codebase(pattern: str, directory: str) -> Dict[str, int]:
    """Analyze codebase for pattern occurrences with full type safety."""
    grep: pyripgrep.Grep = pyripgrep.Grep()

    # Type checker knows this returns Dict[str, int]
    counts: Dict[str, int] = grep.search(
        pattern,
        path=directory,
        output_mode="count",
        i=True
    )

    return counts

# Usage with type checking
results: Dict[str, int] = analyze_codebase("TODO", "src/")
total_todos: int = sum(results.values())

For more examples, see examples/typed_usage_demo.py and docs/TYPE_ANNOTATIONS.md.

Development

Building from Source

Prerequisites

  1. Install Rust toolchain:

    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    source ~/.cargo/env
    
  2. Install Python development dependencies:

    pip install maturin
    

Build and Install

# Clone the repository
git clone https://github.com/LinXueyuanStdio/ripgrep-python.git
cd ripgrep-python

# Development build (creates importable module)
maturin develop

# Or build wheel for distribution
maturin build --release

Running Examples

# Run the new interface demo
python examples/new_interface_demo.py

# Run basic usage example
python examples/basic_usage.py

# Run performance tests
python examples/test_new_interface.py

Testing

# Run Python tests
python -m pytest tests/ -v

# Test the module import
python -c "import pyripgrep; print('Import successful')"

Publishing

For maintainers and contributors:

# Build for current platform
make build

# Build for all platforms
make build-all

# Publish to TestPyPI
make publish-test

# Publish to PyPI
make publish-prod

See Publishing Guide for detailed cross-platform build instructions.

Migration Guide

If you're using the old interface (RipGrep class), here's how to migrate to the new Grep class:

Old Interface

import pyripgrep

rg = pyripgrep.RipGrep()
options = pyripgrep.SearchOptions()
options.case_sensitive = False

results = rg.search("pattern", ["."], options)
files = rg.search_files("pattern", ["."], options)
counts = rg.count_matches("pattern", ["."], options)

New Interface

import pyripgrep

grep = pyripgrep.Grep()

# Search with content (equivalent to old search method)
results = grep.search("pattern", i=True, output_mode="content")

# Search for files (equivalent to old search_files method)
files = grep.search("pattern", i=True, output_mode="files_with_matches")

# Count matches (equivalent to old count_matches method)
counts = grep.search("pattern", i=True, output_mode="count")
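
For codebases that still call the old method names, a thin adapter can ease the transition. The sketch below is hypothetical and covers only what the migration example shows: a list of paths and a case-sensitivity setting, passed here as a plain case_sensitive argument instead of a SearchOptions object. It returns the new interface's lists and dictionaries, not the old result objects.

```python
from typing import Any, Dict, List


class LegacyAdapter:
    """Hypothetical shim exposing the old method names on top of Grep.search."""

    def __init__(self, grep: Any) -> None:
        self._grep = grep  # any object with a Grep-style search() method

    def _collect(self, pattern: str, paths: List[str],
                 case_sensitive: bool, output_mode: str) -> List[str]:
        # The new interface takes a single path, so search each one in turn.
        merged: List[str] = []
        for path in paths:
            merged.extend(self._grep.search(
                pattern, path=path, i=not case_sensitive, output_mode=output_mode))
        return merged

    def search(self, pattern: str, paths: List[str],
               case_sensitive: bool = True) -> List[str]:
        return self._collect(pattern, paths, case_sensitive, "content")

    def search_files(self, pattern: str, paths: List[str],
                     case_sensitive: bool = True) -> List[str]:
        return self._collect(pattern, paths, case_sensitive, "files_with_matches")

    def count_matches(self, pattern: str, paths: List[str],
                      case_sensitive: bool = True) -> Dict[str, int]:
        merged: Dict[str, int] = {}
        for path in paths:
            # update() keeps one entry per file even if paths overlap.
            merged.update(self._grep.search(
                pattern, path=path, i=not case_sensitive, output_mode="count"))
        return merged
```

It would be constructed as LegacyAdapter(pyripgrep.Grep()) and is meant as a stopgap while call sites move to grep.search directly.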

Performance

This library provides native Rust performance through direct API integration:

  • No subprocess overhead - direct Rust function calls
  • Optimized file walking - uses ripgrep's ignore crate for .gitignore support
  • Binary detection - automatically skips binary files
  • Parallel processing - leverages Rust's concurrency for large searches

The project's own benchmarks report a 10-50x speedup over subprocess-based solutions on large codebases; actual gains depend on the codebase and the search pattern.

Troubleshooting

Import Errors

If you get import errors, make sure the maturin build completed successfully:

maturin develop --release
python -c "import pyripgrep; print('Success!')"
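
The same check can be made from Python without triggering an ImportError, which is convenient in setup scripts. A minimal sketch using the standard library; module_available is a hypothetical helper name:

```python
import importlib.util


def module_available(name: str = "pyripgrep") -> bool:
    """Return True if a top-level module called `name` can be located."""
    # find_spec returns None when no importable module with that name exists.
    return importlib.util.find_spec(name) is not None
```

If module_available() returns False, the extension is not installed in the active environment, and rerunning maturin develop from that environment is the likely fix.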

Rust Toolchain Issues

Update Rust if you encounter build issues:

rustup update

Performance Issues

For very large searches, consider using head_limit to restrict results:

# Limit to first 1000 results
results = grep.search("pattern", head_limit=1000)

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Run maturin develop to test locally
  6. Submit a pull request

License

MIT License - see LICENSE file for details

Acknowledgments

  • BurntSushi for the original ripgrep tool
  • PyO3 for Rust-Python bindings
  • Maturin for building and packaging
