
Python API for ripgrep-style file walking and searching

rgapi

rgapi is a Python API for ripgrep-style walking and search. It is meant for Python code that wants fd-style file discovery or rg-style searching without shelling out.

It uses the same ignore, grep-regex, and grep-searcher crates that ripgrep uses for walking, regex matching, and file scanning. Walking and searching run in parallel by default, and most of the expensive work stays in Rust.

Overview

For common file discovery and search:

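from rgapi import fd, rg, rg_iter

# Python files under ".", leaving out test_*.py
fd(".", ext="py", exclude="test_*.py")
# Lazily yield match rows with two lines of context on each side
for row in rg_iter("TODO", ".", include="*.py", context=2):
    print(row.asdict())
# Unique paths of matching files, pruning the .venv subtree
rg("TODO", ".", ext="py", skip_dir=".venv", paths=True)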

For cell-aware search of Jupyter notebooks (see Notebooks):

from rgapi import nbrg

nbrg("read_csv", ".", cell_context=1)

For direct access to the regex, search, and walk pieces:

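# Note: this import shadows Python's builtin compile() in the importing module
from rgapi import compile, search_path, search_text, walk

matcher = compile("TODO")
matcher.is_match("TODO")
matcher.finditer("TODO TODO")

walk(".")
# Search an in-memory string; path is only the label shown on the rows
search_text(matcher, "alpha\nTODO\nomega\n", path="memory.txt", context=1)
# Search a single file on disk
search_path(matcher, "src/lib.rs", display_path="src/lib.rs")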

Install

pip install rgapi

Semantics

fd and walk return slash-separated paths relative to root. They use the ignore crate, so .gitignore, .ignore, and the usual ripgrep filters apply by default. .rgignore files are also honored and take precedence over .gitignore. Hidden files are skipped unless hidden=True. Pass ignore=False to disable all ignore filtering (including .rgignore). Symlinks are not followed unless follow_links=True; same_file_system=True avoids crossing filesystem boundaries. Traversal is parallel, and result order is not guaranteed; use sorted(...) if order matters. root arguments accept str or pathlib.Path and expand ~; search_path also accepts path-like file paths. Display labels such as display_path are stringified without expansion.

fd adds fd-like filtering on top of walk: pattern is a substring match on the relative path, and include/exclude use glob syntax. glob= is accepted as an alias for include=. A basename glob such as *.py also matches recursively, so it finds src/app.py. Use ext="py" or ext=["py", "rs"] for extension filters, min_depth=/max_depth= to bound recursion, and max_filesize= to skip files above a byte limit.
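To make the filter rules above concrete, here is a pure-Python sketch of how a basename glob and an ext filter could be applied to slash-separated relative paths. It does not call rgapi: the helper names glob_matches and keep are hypothetical, and fnmatch only approximates the gitignore-style globs that the ignore crate implements.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def glob_matches(rel_path, pattern):
    """Approximate include/exclude matching on a slash-separated relative path."""
    if "/" not in pattern:
        # A basename glob such as *.py is tested against the file name only,
        # so it matches at any depth.
        return fnmatchcase(PurePosixPath(rel_path).name, pattern)
    # A glob containing a slash is tested against the whole relative path.
    return fnmatchcase(rel_path, pattern)


def keep(rel_path, include=None, exclude=None, ext=None):
    """Return True if rel_path passes the include, exclude, and ext filters."""
    exts = [ext] if isinstance(ext, str) else list(ext or [])
    if exts and PurePosixPath(rel_path).suffix.lstrip(".") not in exts:
        return False
    if include and not glob_matches(rel_path, include):
        return False
    if exclude and glob_matches(rel_path, exclude):
        return False
    return True


paths = ["app.py", "src/app.py", "src/test_app.py", "src/lib.rs", "README.md"]
py_only = [p for p in paths if keep(p, ext="py", exclude="test_*.py")]
py_or_rs = [p for p in paths if keep(p, ext=["py", "rs"])]
recursive = [p for p in paths if keep(p, include="*.py")]
```

Here py_only keeps app.py and src/app.py, and the bare *.py include also reaches files inside src/, which is the recursive behaviour described above.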

path_re and skip_path_re are regex filters on slash-separated relative paths. They filter returned paths or searched files, but do not control traversal. skip_dir uses glob syntax to prune matching directory subtrees, and skip_dir_re does the same with regex.
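The difference between filtering and pruning can be illustrated with a serial os.walk. This is only a sketch of the idea, not rgapi's parallel Rust traversal. The function walk_files is hypothetical, and matching skip_dir against the bare directory name is an assumption made for the example.

```python
import os
import re
import tempfile
from fnmatch import fnmatchcase


def walk_files(root, skip_dir=None, skip_path_re=None):
    """Return (relative file paths, directories entered) for a serial walk."""
    found, entered = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root).replace(os.sep, "/")
        entered.append(rel_dir)
        if skip_dir:
            # Pruning: drop matching directories in place so os.walk never descends.
            dirnames[:] = [d for d in dirnames if not fnmatchcase(d, skip_dir)]
        for name in filenames:
            rel = name if rel_dir == "." else f"{rel_dir}/{name}"
            # Filtering: the file was still visited; it is only left out of the result.
            if skip_path_re and re.search(skip_path_re, rel):
                continue
            found.append(rel)
    return sorted(found), sorted(entered)


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, ".venv", "lib"))
    os.makedirs(os.path.join(root, "src"))
    for rel in (".venv/lib/site.py", "src/app.py"):
        with open(os.path.join(root, *rel.split("/")), "w") as f:
            f.write("# TODO\n")
    pruned_files, pruned_dirs = walk_files(root, skip_dir=".venv")
    filtered_files, filtered_dirs = walk_files(root, skip_path_re=r"^\.venv/")
```

Both calls return the same files, but only the pruned walk avoids entering .venv at all, which is why a directory-pruning filter is the cheaper choice for large ignored subtrees.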

rg and rg_iter return structured rows rather than raw CLI text. They accept the same include, exclude, glob, ext, path_re, skip_path_re, skip_dir, skip_dir_re, min_depth, max_depth, max_filesize, follow_links, and same_file_system filters as fd. Each row is a SearchLine with:

kind         'match', 'before', 'after', or 'context'
path         path relative to root
line_number  1-based line number
line         line text without the trailing newline
matches      list of (start, end) byte offsets for match rows
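Because matches holds byte offsets rather than str indices, slicing line directly can be off by a few characters when the line contains non-ASCII text. A minimal sketch of the conversion, assuming the line is UTF-8 (the helper names are hypothetical and not part of rgapi):

```python
def match_text(line, span):
    """Slice a (start, end) byte-offset span out of a line of text."""
    start, end = span
    return line.encode("utf-8")[start:end].decode("utf-8")


def byte_span_to_str_span(line, span):
    """Convert a byte-offset span to str indices usable as line[a:b]."""
    data = line.encode("utf-8")
    start, end = span
    return len(data[:start].decode("utf-8")), len(data[:end].decode("utf-8"))


line = "café TODO"
span = (6, 10)  # "é" takes two bytes in UTF-8, so "TODO" starts at byte 6
text = match_text(line, span)
str_span = byte_span_to_str_span(line, span)
```

For pure-ASCII lines the two kinds of offset coincide, so the conversion only matters when a multi-byte character precedes the match.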

rg, search_text, and search_path return SearchResults by default, a list subclass whose str() and notebook pretty display are rg-style multiline text. rg_iter yields rows lazily.

SearchLine has a structured repr and an rg-style str, and SearchLine.asdict() returns the row fields as a plain Python dict. rg(..., paths=True) returns unique matched paths, and rg(..., count=True) returns the total number of match spans. paths and count cannot both be set.

before_context, after_context, and context are like rg -B, rg -A, and rg -C. Files containing NUL bytes or invalid UTF-8 are skipped.
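The context options can be pictured as a window around each matching line. The sketch below uses Python's re on an in-memory string and labels every non-matching row 'context'; it is an illustration of the -B/-A/-C idea, not rgapi's searcher, and rows_with_context is a hypothetical name.

```python
import re


def rows_with_context(text, pattern, before=0, after=0):
    """Return (kind, line_number, line) rows, rg -B/-A style, for one text."""
    lines = text.splitlines()
    rx = re.compile(pattern)
    hits = [i for i, line in enumerate(lines) if rx.search(line)]
    keep = set()
    for i in hits:
        # Each match pulls in `before` lines above and `after` lines below it.
        keep.update(range(max(0, i - before), min(len(lines), i + after + 1)))
    hit_set = set(hits)
    return [
        ("match" if i in hit_set else "context", i + 1, lines[i])
        for i in sorted(keep)
    ]


text = "alpha\nTODO\nomega\nend\n"
both = rows_with_context(text, "TODO", before=1, after=1)  # like context=1
after_only = rows_with_context(text, "TODO", after=2)      # like after_context=2
```

Line numbers are 1-based and the trailing newline is dropped from each line, mirroring the row fields listed earlier.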

Search is case-sensitive by default, matching rg. Use smart_case=True for rg --smart-case behavior, or case_sensitive=False to force case-insensitive matching.
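As a rough model of these options, the following sketch picks Python re flags from the same two settings. It simplifies smart case to "ignore case unless the pattern contains an uppercase character", and letting smart_case take precedence when both options are given is an assumption of the sketch, not documented behaviour. The function names are hypothetical.

```python
import re


def search_flags(pattern, case_sensitive=True, smart_case=False):
    """Pick re flags following the case options described above."""
    if smart_case:
        # Smart case: ignore case only when the pattern has no uppercase letter.
        return 0 if any(c.isupper() for c in pattern) else re.IGNORECASE
    return 0 if case_sensitive else re.IGNORECASE


def is_match(pattern, text, **opts):
    """Return True if pattern is found in text under the given case options."""
    return re.search(pattern, text, search_flags(pattern, **opts)) is not None


default = is_match("todo", "TODO")
smart_lower = is_match("todo", "TODO", smart_case=True)
smart_mixed = is_match("Todo", "todo", smart_case=True)
forced = is_match("Todo", "todo", case_sensitive=False)
```

With the default settings a lowercase pattern does not match uppercase text, while smart case makes the lowercase pattern case-insensitive and leaves the mixed-case pattern strict.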

Notebooks

nbrg searches Jupyter .ipynb files cell-by-cell, so results are cells rather than raw JSON lines, and each match is identified by its cell id (the nbformat cell/message id) rather than a line number. Searching a notebook with plain rg matches the escaped JSON text (including outputs and metadata) and reports meaningless JSON line numbers; nbrg instead searches each cell's reconstructed source and reports the cell id, which is stable across edits and points at the actual unit you work with.

from rgapi import nbrg

nbrg("read_csv", ".")                  # cells whose source matches, across all notebooks under "."
nbrg("read_csv", ".", cell_context=1)  # also include neighbouring cells as context

Notebooks are walked, parsed, and matched together in one parallel Rust pass, using the same regex engine as rg, so regex behaviour and the case_sensitive/smart_case flags match rg. Only cell source is searched, not outputs or metadata. nbrg accepts the same discovery filters as fd/rg (include, exclude, glob, hidden, max_depth, skip_dir, …).

nbrg returns NbResults, a list of NbCell. Each NbCell has:

path         notebook path relative to root
cell_index   0-based position of the cell in the notebook
cell_id      nbformat cell id (falls back to the cell index for notebooks without ids)
cell_type    'code', 'markdown', or 'raw'
kind         'match' or 'context'
source       full cell source
matches      list of SearchLine rows for the matched lines within the cell

NbCell.asdict() returns those fields as a plain dict (with matches as SearchLine dicts). str()/pretty display is one truncated, newline-escaped line per cell, keyed by cell_id rather than a line number: path:cell_id:source for matches and path:cell_id-source for context cells. A cell with several matches appears once, with every hit collected in matches.
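The one-line display described above can be sketched as follows. The separator rule (":" for matches, "-" for context cells) comes from the text, while the 60-character limit and the "..." marker are assumptions chosen for illustration. format_cell is a hypothetical helper, not an rgapi function.

```python
def format_cell(path, cell_id, source, kind="match", width=60):
    """Render one cell as a single rg-style line (width is an assumed limit)."""
    sep = ":" if kind == "match" else "-"
    flat = source.replace("\n", "\\n")  # escape newlines so the cell stays on one line
    if len(flat) > width:
        flat = flat[: width - 3] + "..."
    return f"{path}:{cell_id}{sep}{flat}"


match_line = format_cell("nb.ipynb", "a1b2", "import pandas as pd\npd.read_csv('x.csv')")
context_line = format_cell("nb.ipynb", "c3d4", "x = 1", kind="context")
long_line = format_cell("nb.ipynb", "e5f6", "a" * 100, width=10)
```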

cell_context=N includes the N cells before and after each matching cell as kind="context" rows (deduplicated per notebook).
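The selection rule can be sketched in a few lines of plain Python. The function select_cells is hypothetical and works on cell indexes only; returning rows in cell order is an assumption of the sketch.

```python
def select_cells(n_cells, match_indexes, cell_context=0):
    """Return (cell_index, kind) pairs for one notebook, in cell order."""
    matched = set(match_indexes)
    selected = set(matched)
    for i in matched:
        lo = max(0, i - cell_context)
        hi = min(n_cells, i + cell_context + 1)
        selected.update(range(lo, hi))  # the set removes overlaps between windows
    return [(i, "match" if i in matched else "context") for i in sorted(selected)]


rows = select_cells(n_cells=6, match_indexes=[1, 3], cell_context=1)
```

Cell 2 sits inside the context window of both matching cells but appears only once in rows, which is the per-notebook deduplication described above.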

Parsing uses a lean model that reads only each cell's id, cell_type, and source and skips outputs and metadata, so large embedded outputs (images, plots) are never materialized. search_nb(pattern, path, ...) searches a single notebook file the same way.
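To show which fields the lean model looks at, here is a sketch using the standard json module. Unlike the Rust parser described above, json.loads does load outputs into memory, so this illustrates only the fields read and the source reconstruction, not the memory behaviour. read_cells is a hypothetical helper, and returning the fallback index as a string is an assumption.

```python
import json


def read_cells(nb_json):
    """Yield (cell_id, cell_type, source) for each cell, ignoring everything else."""
    nb = json.loads(nb_json)
    for index, cell in enumerate(nb.get("cells", [])):
        source = cell.get("source", "")
        if isinstance(source, list):
            # nbformat allows source as a list of line strings; join them back.
            source = "".join(source)
        # Fall back to the cell index when the notebook has no cell ids.
        cell_id = cell.get("id", str(index))
        yield cell_id, cell.get("cell_type"), source


nb_json = json.dumps({
    "cells": [
        {
            "cell_type": "code",
            "id": "a1b2",
            "source": ["import pandas as pd\n", "pd.read_csv('x.csv')"],
            "outputs": [{"data": {"image/png": "<base64 image data>"}}],
            "metadata": {},
        },
        {"cell_type": "markdown", "source": "# Notes", "metadata": {}},
    ],
    "nbformat": 4,
    "nbformat_minor": 5,
})
cells = list(read_cells(nb_json))
```

Searching the joined source rather than the raw file is what keeps escaped JSON text and output payloads out of the results.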

Benchmarks

tools/bench.py compares the rg CLI with in-process rgapi. Run it against a release build. The figures below come from one run on this machine, using the best time from seven repeats:

fixture                         rg        rgapi
6 x 2 MB files, 2 matches       6.54 ms   1.44 ms
800 x 1.5 KB files, 2 matches   13.90 ms  10.94 ms
tiny dir, repeated 30x          5.92 ms   2.14 ms
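A best-of-seven measurement of this kind can be reproduced with the standard timeit module. This is a generic sketch and not the contents of tools/bench.py: best_of and workload are hypothetical names, and the workload is a stand-in to be replaced with the search call being timed.

```python
import timeit


def best_of(fn, repeats=7):
    """Return the fastest wall-clock time, in milliseconds, over `repeats` runs."""
    times = timeit.repeat(fn, number=1, repeat=repeats)
    return min(times) * 1000.0


def workload():
    # Stand-in workload; replace with the in-process search you want to time.
    return sum(i * i for i in range(10_000))


best_ms = best_of(workload)
print(f"best of 7: {best_ms:.2f} ms")
```

Taking the minimum rather than the mean reduces the influence of scheduler noise and cold caches on short runs.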


Source Distribution

rgapi-0.1.10.tar.gz (27.3 kB)

Built Distributions

rgapi-0.1.10-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB): CPython 3.13, manylinux (glibc 2.17+), x86-64
rgapi-0.1.10-cp313-cp313-macosx_11_0_arm64.whl (1.5 MB): CPython 3.13, macOS 11.0+, ARM64
rgapi-0.1.10-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB): CPython 3.12, manylinux (glibc 2.17+), x86-64
rgapi-0.1.10-cp312-cp312-macosx_11_0_arm64.whl (1.5 MB): CPython 3.12, macOS 11.0+, ARM64
rgapi-0.1.10-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB): CPython 3.11, manylinux (glibc 2.17+), x86-64
rgapi-0.1.10-cp311-cp311-macosx_11_0_arm64.whl (1.5 MB): CPython 3.11, macOS 11.0+, ARM64
rgapi-0.1.10-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB): CPython 3.10, manylinux (glibc 2.17+), x86-64
rgapi-0.1.10-cp310-cp310-macosx_11_0_arm64.whl (1.5 MB): CPython 3.10, macOS 11.0+, ARM64

File details

Details for the file rgapi-0.1.10.tar.gz.

File metadata

  • Download URL: rgapi-0.1.10.tar.gz
  • Upload date:
  • Size: 27.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for rgapi-0.1.10.tar.gz
Algorithm Hash digest
SHA256 c6396de435a46fb64963938703e9b26fa4f6aaa80109ed5d54cfb5d0df85ecee
MD5 b421de1c11e16532feea32fec373226b
BLAKE2b-256 865c283f7fdd9017e6b624a839ea2d8d48eec286b7251a4dc7b16e35fcc0731d

Provenance

The following attestation bundles were made for rgapi-0.1.10.tar.gz:

Publisher: ci.yml on AnswerDotAI/rgapi

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

