Grep-shaped CLI search powered by DSPy RLM


rlmgrep

Grep-shaped search and question answering powered by DSPy RLM. It accepts a natural-language query, scans the files you point at, and prints matching lines in a grep-like format. Use --answer to get a narrative response grounded in the selected files/directories.

Use -v/--verbose to see every RLM iteration and its reasoning.

Quickstart

uv tool install --python 3.11 rlmgrep
# or from GitHub:
# uv tool install --python 3.11 git+https://github.com/halfprice06/rlmgrep.git

export OPENAI_API_KEY=...  # or set keys in ~/.rlmgrep/config.toml

rlmgrep --answer "What does this repo do and where are the entry points?" .

(Screenshot: quickstart answer mode)

rlmgrep -C 2 "Where is retry/backoff configured and what are the defaults?" .

(Screenshot: quickstart context mode)

Requirements

Python 3.11+
Deno runtime (DSPy RLM uses a Deno-based interpreter)
API key for your chosen provider (OpenAI, Anthropic, Gemini, etc.)

Non-text Files (PDF + Office + Media)

One of rlmgrep’s most useful features is that it can “grep” PDFs and Office files by converting them into text before the RLM search runs.

How it works:

PDFs are parsed with pypdf. Each page gets a marker line like ===== Page N =====, and output lines include a page=N suffix. Line numbers refer to the extracted text (not PDF coordinates).
Office & binary docs (.docx, .pptx, .xlsx, .html, .zip, etc.) are converted to Markdown via MarkItDown. This happens during ingestion, so rlmgrep can search them like any other text file.
Images can be described by a vision model (OpenAI/Anthropic/Gemini) through MarkItDown and then searched. Enable and configure this in config.toml.
Audio transcription is supported through OpenAI. Enable and configure it in config.toml.
MarkItDown is loaded lazily. Text-only runs skip MarkItDown import entirely; non-text conversion is only enabled when candidate non-text files are present.
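
The page-marker scheme described above can be sketched in a few lines. This is an illustration only, not rlmgrep's actual code: flatten_pages and format_match are hypothetical names, the page texts are assumed to have been extracted already (for example by pypdf), and the exact placement of the page=N suffix in the output is an assumption.

```python
def flatten_pages(page_texts):
    """Join per-page text into one list of lines with page markers.

    Returns (lines, page_of_line), where page_of_line[i] is the 1-based
    page number that lines[i] came from.
    """
    lines, page_of_line = [], []
    for page_no, text in enumerate(page_texts, start=1):
        # Marker line in the format the README describes.
        lines.append(f"===== Page {page_no} =====")
        page_of_line.append(page_no)
        for line in text.splitlines():
            lines.append(line)
            page_of_line.append(page_no)
    return lines, page_of_line


def format_match(line_no, lines, page_of_line):
    """Render one match; line_no is 1-based and refers to the extracted text."""
    text = lines[line_no - 1]
    page = page_of_line[line_no - 1]
    # Assumed layout: the page suffix is appended after the matched text.
    return f"{line_no}:\t{text} page={page}"
```

Because the marker lines are part of the extracted text, they shift the line numbers: the line number in a match is an index into this flattened text, not a position in the PDF.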

Sidecar caching:

For images/audio, converted text is cached next to the original file as <original>.<ext>.md and reused on later runs.
Use -a/--text if you want to treat binary files as raw text (UTF‑8 with replacement) and skip conversion.
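
As a rough sketch of the sidecar convention, assuming the cache is reused whenever the sidecar file exists (whether rlmgrep also checks timestamps is not stated here). The names sidecar_path, load_converted, and read_as_raw_text are hypothetical, and convert stands in for whatever image or audio converter is configured.

```python
from pathlib import Path


def sidecar_path(original: Path) -> Path:
    """photo.png -> photo.png.md, stored next to the original file."""
    return original.with_name(original.name + ".md")


def load_converted(original: Path, convert) -> str:
    """Reuse the sidecar if it exists; otherwise convert once and cache."""
    sidecar = sidecar_path(original)
    if sidecar.exists():
        return sidecar.read_text(encoding="utf-8")
    text = convert(original)  # hypothetical converter callback
    sidecar.write_text(text, encoding="utf-8")
    return text


def read_as_raw_text(path: Path) -> str:
    """Approximation of -a/--text: decode as UTF-8, replacing invalid bytes."""
    return path.read_bytes().decode("utf-8", errors="replace")
```

With errors="replace", each undecodable byte becomes U+FFFD, so binary files produce noisy but searchable text.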

Install Deno

DSPy's default implementation of RLM requires the Deno runtime. Install it with the official scripts:

macOS/Linux:

curl -fsSL https://deno.land/install.sh | sh

Windows PowerShell:

irm https://deno.land/install.ps1 | iex

Verify it is on your PATH:

deno --version
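
The same check can be scripted, for example in a setup step that runs before rlmgrep is invoked. This is a minimal sketch using only the standard library; deno_version is a hypothetical helper name.

```python
import shutil
import subprocess


def deno_version():
    """Return the first line of `deno --version`, or None if Deno is not on PATH."""
    exe = shutil.which("deno")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = deno_version()
    print(version or "deno not found on PATH; install it before running rlmgrep")
```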

Usage

rlmgrep [options] "query" [paths...]

Common options:

--answer return a narrative answer before the grep output
--answer-only return only the narrative answer (no grep output)
--signature "..." custom DSPy output fields (outputs only), printed as sectioned text
--signature-json "..." custom DSPy output fields (outputs only), printed as JSON
-C N context lines before/after (grep-style)
-A N context lines after
-B N context lines before
-m N max matching lines per file
-g GLOB include files matching glob (repeatable, comma-separated)
--type T include file types (repeatable, comma-separated)
--hidden include hidden files and directories
--no-ignore do not respect .gitignore
--no-recursive do not recurse directories
-a, --text treat binary files as text
-y, --yes skip file count confirmation
--paths-from-stdin treat stdin as newline-delimited file paths (e.g., rg -l). Without this flag, piped stdin is treated as content.
--model, --sub-model override model names
--api-key, --api-base, --model-type override provider settings
--max-iterations, --max-llm-calls cap RLM search effort
-v, --verbose show verbose RLM output

Examples:

# Natural-language query over a repo
rlmgrep -C 2 "Where is retry/backoff configured and what are the defaults?" .

# Restrict to Python files
rlmgrep "Where do we parse JWTs and enforce expiration?" --type py .

# Glob filters (repeatable or comma-separated)
rlmgrep "How do we map external API errors into internal error codes?" -g "**/*.py" -g "**/*.ts" .

# Read file content from stdin
cat README.md | rlmgrep --answer "What is this tool for and how is it used?"

# Use rg/grep to find candidate files, then rlmgrep over that list
rg -l "token" . | rlmgrep --paths-from-stdin --answer "What does this token control and where is it validated?"

# Custom structured output (sectioned text)
rlmgrep --signature 'summary: str, severities: list[Literal["low","medium","high"]], findings: list[dict[str,str]]' "Audit auth and summarize issues" .

# Custom structured output (JSON)
rlmgrep --signature-json 'summary: str, findings: list[dict[str,str]]' "Audit auth and summarize issues" .

Input selection

Directories are searched recursively by default. Use --no-recursive to stop recursion.
Hidden files are skipped and ignore files (.gitignore, .ignore, .rgignore) are respected by default. Use --hidden to include hidden files and --no-ignore to bypass ignore files.
--type uses built-in type mappings (e.g., py, js, md); unknown values are treated as file extensions.
-g/--glob matches path globs against normalized paths (forward slashes).
Paths are printed relative to the current working directory when possible.
If no paths are provided and stdin is a TTY, rlmgrep defaults to . (the current directory). If stdin is piped, it reads content from stdin and uses the synthetic path <stdin>.
rlmgrep asks for confirmation when more than 1000 files would be loaded (use -y/--yes to skip), and aborts when more than 5000 files would be loaded.
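
The two file-count limits behave differently: the first can be bypassed with -y/--yes, the second cannot. A minimal sketch of that logic follows, using the default values of file_warn_threshold and file_hard_max; check_file_count and its confirm callback are hypothetical, and raising RuntimeError is a stand-in for however rlmgrep actually aborts.

```python
FILE_WARN_THRESHOLD = 1000  # ask for confirmation above this count
FILE_HARD_MAX = 5000        # abort above this count


def check_file_count(n_files, assume_yes=False, confirm=lambda prompt: False):
    """Return True to proceed, False if the user declines.

    Raises RuntimeError past the hard cap, which -y/--yes does not bypass.
    """
    if n_files > FILE_HARD_MAX:
        raise RuntimeError(f"{n_files} files exceeds the hard max of {FILE_HARD_MAX}")
    if n_files > FILE_WARN_THRESHOLD and not assume_yes:
        return bool(confirm(f"{n_files} files would be loaded. Continue?"))
    return True
```

If a run trips either limit, narrowing the input with --type, -g, or --paths-from-stdin is usually a better fix than raising the limits.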

Output contract (stable for agents)

Matches are written to stdout; warnings go to stderr.
Output uses rg-style headings by default:
- A file header line like ./path/to/file
- Then line:\ttext for matches, line-\ttext for context lines
Line numbers are always included and are 1-based.
When context ranges are disjoint, a -- line separates groups.
Exit codes:
- 0 = at least one match
- 1 = no matches
- 2 = usage/config/error
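
For agents that consume this output, a parser might look like the sketch below. It is not part of rlmgrep: Row and parse_output are hypothetical names, and the code assumes that any line that is not a numbered row is a file header, that blank lines between files can be ignored, and that the page=N suffix on PDF matches does not need to be split out.

```python
import re
from dataclasses import dataclass

# "12:\ttext" is a match row, "12-\ttext" is a context row.
_ROW = re.compile(r"^(\d+)([:-])\t(.*)$")


@dataclass
class Row:
    path: str
    line: int       # 1-based line number
    is_match: bool  # False for context lines
    text: str


def parse_output(stdout: str) -> list[Row]:
    """Parse rg-style heading output into a flat list of rows."""
    rows = []
    current_path = None
    for raw in stdout.splitlines():
        if raw in ("", "--"):
            continue  # blank line or group separator
        m = _ROW.match(raw)
        if m and current_path is not None:
            rows.append(
                Row(current_path, int(m.group(1)), m.group(2) == ":", m.group(3))
            )
        else:
            current_path = raw  # file header line
    return rows
```

When calling rlmgrep from a script, treat exit code 1 as "no matches" rather than as a failure; only exit code 2 signals an error.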

Custom signature mode:

Use --signature or --signature-json with output fields only (for example: summary: str, findings: list[str]).
Do not include inputs or ->; inputs are fixed internally as directory: dict, file_map: str, query: str.
Supported output types: str, int, float, bool, list[T], dict[str, T], Literal[...].
JSON mapping is: str -> JSON string, int/float -> JSON number, bool -> JSON boolean, list[T] -> JSON array, dict[str, T] -> JSON object, Literal[...] -> JSON scalar.
--signature emits sectioned text with clear headers.
--signature-json emits one compact JSON object to stdout (status/progress/warnings stay on stderr).
In custom signature mode, successful execution returns exit code 0.
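
A script that drives --signature-json might be structured like this sketch. The helper names build_command and parse_result are hypothetical; the only rlmgrep behavior relied on is what is described above (output fields only, one JSON object on stdout).

```python
import json


def build_command(signature: str, query: str, path: str = ".") -> list[str]:
    """Build the argv for a --signature-json run."""
    if "->" in signature:
        # Inputs are fixed internally, so only output fields are allowed.
        raise ValueError("pass output fields only, without inputs or '->'")
    return ["rlmgrep", "--signature-json", signature, query, path]


def parse_result(stdout: str) -> dict:
    """Decode the single JSON object that --signature-json writes to stdout."""
    result = json.loads(stdout)
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object")
    return result
```

With rlmgrep installed, passing build_command(...) to subprocess.run with capture_output=True and text=True would supply the stdout string for parse_result. Status and warnings arrive on stderr, so they should not be mixed into the JSON stream.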

Regex-style queries (best effort)

rlmgrep can interpret traditional regex-style patterns inside a natural-language prompt. The RLM may use Python (including re) in its internal REPL to approximate regex logic, but it is not guaranteed to behave exactly like grep/rg.

Example (best-effort regex semantics + extra context):

rlmgrep "Find Python functions that look like `def test_\\w+` and are marked as slow or flaky in nearby comments." .

If you need strict, deterministic regex behavior, use rg/grep.

Configuration

rlmgrep creates a default config automatically if none exists. The config path is:

~/.rlmgrep/config.toml

Default config values (from rlmgrep/config.py):

model = "openai/gpt-5.2"
sub_model = "openai/gpt-5-mini"
api_base = "https://api.openai.com/v1"
model_type = "responses"
# openrouter example:
# model = "openrouter/anthropic/claude-3.5-sonnet"
# api_base = "https://openrouter.ai/api/v1"
# api_key = ""
temperature = 1.0
max_tokens = 64000
max_iterations = 10
max_llm_calls = 20
file_warn_threshold = 1000
file_hard_max = 5000
markitdown_max_concurrency = 4
# markitdown_enable_images = false
# markitdown_image_llm_model = "gpt-5-mini"
# markitdown_image_llm_provider = "openai"
# markitdown_image_llm_api_key = ""
# markitdown_image_llm_api_base = ""
# markitdown_image_llm_prompt = ""
# markitdown_enable_audio = false
# markitdown_audio_model = "gpt-4o-mini-transcribe-2025-12-15"
# markitdown_audio_provider = "openai"
# markitdown_audio_api_key = ""
# markitdown_audio_api_base = ""

CLI flags override config values. API keys for the model and sub-model are resolved in this order:

1. CLI flags (--api-key, --sub-api-key)
2. Config values (api_key, sub_api_key)
3. Provider env vars inferred from the model name:
- OPENAI_API_KEY
- ANTHROPIC_API_KEY
- GEMINI_API_KEY
- OPENROUTER_API_KEY

If more than one provider key is set and the model name does not make the provider obvious, rlmgrep emits a warning and requires an explicit --api-key.
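
The resolution order can be sketched as follows. Several details are assumptions made for illustration: the provider is inferred from the prefix before the first "/" in the model name, a single available provider key is used when the model name is not conclusive, and the ambiguous case raises an error instead of printing a warning. resolve_api_key is a hypothetical name.

```python
import os

PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def resolve_api_key(model, cli_key=None, config_key=None, env=None):
    """Resolve a key: CLI flag first, then config value, then provider env var."""
    env = os.environ if env is None else env
    if cli_key:
        return cli_key
    if config_key:
        return config_key
    # Assumed inference rule: "openai/gpt-5.2" -> provider "openai".
    provider = model.split("/", 1)[0] if "/" in model else None
    if provider in PROVIDER_ENV_VARS:
        return env.get(PROVIDER_ENV_VARS[provider])
    present = [name for name in PROVIDER_ENV_VARS.values() if env.get(name)]
    if len(present) > 1:
        raise ValueError("multiple provider keys set; pass --api-key explicitly")
    return env[present[0]] if present else None
```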

Skill (Anthropic-style)

A ready-to-copy skill lives in:

skills/rlmgrep-usage/SKILL.md

Install it by copying the folder into your agent’s skills directory (for example, ~/.claude/skills/rlmgrep-usage/), then invoke it as $rlmgrep-usage in prompts. This is a lightweight, documentation-only skill meant to guide when to use rlmgrep vs rg/grep.

Development

Install locally: pip install -e . or uv tool install .
Run: rlmgrep "query" .
Run tests: python3.11 -m pytest


