grub

A ridiculously simple search engine factory.

Point grub at anything — a folder, a codebase, a Python package, a website, a pile of notes — and get back a working search engine in one line. No servers, no indexes to babysit, no configuration.

pip install grub

from grub import grub

grub('./my_notes', 'where did I write about retirement savings')

The AI-first way (start here)

You probably shouldn't be calling grub yourself at all.

grub ships with agent skills — instruction files that teach an AI coding agent (Claude Code, Cursor, and friends) how to drive grub on your behalf. The skills live in .claude/skills/:

Skill	What it lets the agent do
`grub-search`	Build a search index over a folder, codebase, module, website, or list of strings, and answer questions against it.
`grub-extend`	Wire grub up to custom embedding providers (OpenAI, Cohere, …), new backends, or new data sources.

With the skills in place, you stop writing code and start asking:

"Search my ./docs folder and tell me which file explains the deployment process."

"Index this codebase and find where rate limiting is implemented."

"Use semantic search over my meeting notes to find anything about the Q3 budget."

"Search these three documentation URLs and summarize what they say about authentication."

The agent reads the skill, picks the right source type, the right search method (lexical, semantic, or hybrid), chunks long documents when it helps, and hands you the answer. You never see a SearchStore constructor. You never tune a vectorizer. You describe the outcome you want, in English, and it happens.

Why AI-first?

Because the interface to software is changing, and grub is built for the change.

For decades, "using a tool" meant learning the tool — its API, its flags, its mental model — and then translating your intent into its vocabulary. That translation tax was unavoidable. It is not anymore.

An AI agent is a universal adapter between human intent and machine capability. It already knows grub's vocabulary; you don't have to. So the job of a well-designed library is no longer "expose a clever API to humans" — it's "expose powerful, composable capabilities, and ship the knowledge an agent needs to wield them." That knowledge is the skill files.

grub leans all the way into this:

The skills are the primary interface. They are documentation an agent executes, not documentation a human reads and then forgets.
The Python API is the substrate. It stays clean, small, and honest — because an agent calling it deserves the same good design a human would.
You operate at the level of intent. "Find the doc about X" instead of "instantiate, configure, fit, query, parse."

The future of tooling is not humans memorizing more APIs. It's humans stating goals and agents composing capabilities. grub is a small tool, so it's a small example — but the shape is the same all the way up.

For the dinosaurs who want to operate with code directly 🦖

No judgment. Sometimes you are the agent, and a REPL is the fastest path. The Python API is built to be a pleasure to use directly.

One function does it all

from grub import grub

search = grub('./docs')                     # build a searcher
results = grub('./docs', 'how to deploy')    # ...or search in one call

grub() figures out what you handed it:

grub('./docs')                       # a folder of files
grub('src/**/*.py')                  # a glob
grub(some_module)                    # a Python package's source
grub('https://example.com/guide')    # a web page (HTML stripped to text)
grub({'intro': '...', 'faq': '...'}) # a dict of documents
grub(['first doc', 'second doc'])    # a list of strings
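
As a rough illustration of how one entry point can accept so many kinds of input, here is a minimal sketch of type-based dispatch. The `classify_source` helper and its labels are hypothetical, written for this example; grub's own detection logic may well differ.

```python
import json  # used below only as an example of a module object
from collections.abc import Mapping
from pathlib import Path
from types import ModuleType


def classify_source(source):
    """Return a label for the kind of source (hypothetical helper, not grub's API)."""
    if isinstance(source, ModuleType):
        return "module"
    if isinstance(source, Mapping):
        return "mapping"
    if isinstance(source, (list, tuple)):
        return "strings"
    if isinstance(source, (str, Path)):
        text = str(source)
        # Check URLs first: a query string may contain "?", which also looks like a glob.
        if text.startswith(("http://", "https://")):
            return "url"
        if any(ch in text for ch in "*?["):
            return "glob"
        return "path"
    raise TypeError(f"unsupported source type: {type(source).__name__}")


examples = ["./docs", "src/**/*.py", json, "https://example.com/guide",
            {"intro": "..."}, ["first doc", "second doc"]]
for example in examples:
    print(classify_source(example))
```

Each label would then pick a loader that reads the source into plain text documents.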

Results that explain themselves

results = grub('./docs', 'configure logging')

for hit in results:
    print(hit.score, hit.key, hit.snippet)

results.keys        # ['logging.md', 'setup.md', ...]  best-first
results.scores      # [0.71, 0.33, ...]
print(results.show())            # a tidy ranked rendering
search = grub('./docs')          # a searcher, built without a query
print(search['logging.md'])      # the full original text of a hit

Every hit carries a score and a snippet — the line that shows you why it matched.
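
One plausible way to choose such a snippet is to pick the line that shares the most words with the query. The sketch below does exactly that; `tokenize` and `best_snippet` are names invented for this example, and grub's real snippet logic is an open question here.

```python
import re


def tokenize(text):
    """Lowercase the text and return its distinct alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_snippet(document, query):
    """Return the non-empty line sharing the most distinct words with the query."""
    query_words = tokenize(query)
    lines = [line.strip() for line in document.splitlines() if line.strip()]
    if not lines:
        return ""
    # max() keeps the first line on ties, so earlier lines win.
    return max(lines, key=lambda line: len(tokenize(line) & query_words))


doc = """Installation notes
To configure logging, set the LOG_LEVEL variable.
Deployment is covered elsewhere."""

print(best_snippet(doc, "configure logging"))
```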

Three ways to search

grub(src, query, method='tfidf')     # lexical: shared words (default, fast)
grub(src, query, method='semantic')  # embeddings: shared *meaning*
grub(src, query, method='hybrid')    # a blend of both
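
To make "a blend of both" concrete, here is one common way to combine two score sets: rescale each to the range 0 to 1, then take a weighted average. This is a sketch under that assumption; the `min_max` and `blend` helpers and the `weight` parameter are hypothetical, and grub's hybrid backend may combine scores differently.

```python
def min_max(scores):
    """Rescale a {key: score} dict to the range [0, 1]."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {key: 0.0 for key in scores}
    return {key: (value - low) / (high - low) for key, value in scores.items()}


def blend(lexical, semantic, weight=0.5):
    """Rank keys by a weighted average of normalized scores.

    weight=0 uses only the lexical scores, weight=1 only the semantic ones.
    """
    lex, sem = min_max(lexical), min_max(semantic)
    combined = {
        key: (1 - weight) * lex.get(key, 0.0) + weight * sem.get(key, 0.0)
        for key in set(lex) | set(sem)
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


lexical = {"a.md": 2.0, "b.md": 0.0, "c.md": 1.0}
semantic = {"a.md": 0.2, "b.md": 0.8, "c.md": 0.6}
print(blend(lexical, semantic)[0][0])  # the document that does well on both
```

Normalizing first matters because lexical and embedding scores usually live on different scales.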

Semantic search finds "automobile" when you searched for "car". It needs embeddings: either install a local sentence-transformers model with pip install 'grub[semantic]', or supply your own provider:

grub('./docs', method='semantic', embed=my_openai_embedding_function)
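
The exact signature grub expects from embed is not spelled out above, so the following sketch assumes the simplest shape: a callable that takes one string and returns a list of floats. `toy_embed` is a deliberately silly stand-in for a real model, used only to show how embedding-based ranking works with cosine similarity.

```python
import math


def toy_embed(text):
    """Hypothetical stand-in for a real embedding model: one dimension per concept."""
    concepts = {
        "vehicle": ("car", "automobile", "truck"),
        "food": ("bread", "cheese", "soup"),
    }
    words = text.lower().split()
    return [float(sum(words.count(word) for word in group)) for group in concepts.values()]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(docs, query, embed):
    """Return (score, key) pairs, best first, for a {key: text} mapping."""
    query_vector = embed(query)
    scored = [(cosine(embed(text), query_vector), key) for key, text in docs.items()]
    return sorted(scored, reverse=True)


docs = {
    "garage": "my automobile needs new tires",
    "kitchen": "bread and cheese for lunch",
}
print(rank(docs, "car", toy_embed))
```

The query "car" never appears in either document, yet "garage" ranks first because both map to the same concept dimension. A real provider would replace `toy_embed` with a call to its embedding model.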

Long documents, chunking, and persistence

grub('./book.txt', chunk=1500)       # split into passages, not whole files
grub('./src', extensions=['.py'])    # filter what gets indexed
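
For intuition about chunking, here is a minimal sketch that packs paragraphs into passages of at most `size` characters. Whether chunk=1500 counts characters, tokens, or something else is not stated above, so treat the unit as an assumption; `chunk_text` is a hypothetical helper, not grub's implementation.

```python
def chunk_text(text, size=1500):
    """Split text into passages of at most `size` characters, preferring paragraph breaks."""
    if size <= 0:
        raise ValueError("size must be positive")
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # Hard-split any paragraph that is itself longer than one chunk.
        pieces = [paragraph[i:i + size] for i in range(0, len(paragraph), size)]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= size:
                current = candidate  # still fits: keep growing this passage
            else:
                chunks.append(current)  # passage is full: start a new one
                current = piece
    if current:
        chunks.append(current)
    return chunks


text = "\n\n".join(["a" * 40, "b" * 40, "c" * 40])
passages = chunk_text(text, size=100)
print([len(passage) for passage in passages])
```

Each passage can then be indexed as its own document, so a hit points at the relevant part of a long file rather than the whole thing.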

from grub import Searcher
grub('./big_codebase').save('code.grub')   # build once
Searcher.load('code.grub')                 # reload instantly
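
The idea behind "build once, reload instantly" is to write the expensive, derived parts of an index to disk and read them back instead of recomputing them. The sketch below shows that pattern with a toy class and JSON; `TinyIndex` is hypothetical, and the on-disk format of a .grub file is not described here.

```python
import json
import tempfile
from pathlib import Path


class TinyIndex:
    """Toy index: the documents plus a precomputed vocabulary (hypothetical)."""

    def __init__(self, store, vocabulary=None):
        self.store = dict(store)
        if vocabulary is None:
            # The "expensive" step that persistence lets us skip on reload.
            vocabulary = sorted({w for text in self.store.values() for w in text.lower().split()})
        self.vocabulary = vocabulary

    def save(self, path):
        payload = {"store": self.store, "vocabulary": self.vocabulary}
        Path(path).write_text(json.dumps(payload), encoding="utf-8")
        return self

    @classmethod
    def load(cls, path):
        payload = json.loads(Path(path).read_text(encoding="utf-8"))
        return cls(payload["store"], payload["vocabulary"])


with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "notes.json"
    TinyIndex({"a": "hello world"}).save(path)
    restored = TinyIndex.load(path)

print(restored.vocabulary)
```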

From the command line

grub ./docs "how do I configure logging"
grub ./src --extensions .py --snippets "retry with backoff"
grub https://example.com/guide --semantic "getting started"
grub ./docs                                  # interactive prompt

The legacy API still works

The original SearchStore and friends are unchanged and still exported, so existing code keeps running:

import os

import sklearn
from grub import SearchStore

search = SearchStore(os.path.dirname(sklearn.__file__) + '/{}.py')
search('how to calibrate the estimates of my classifier')

How it works

grub is a thin, honest pipeline:

source ──to_store──▶ store ──backend──▶ scores ──▶ SearchResults

to_store turns any source into a Mapping[str, str].
A backend (TF-IDF, embeddings, or a hybrid) scores every document against your query.
Results come back ranked, scored, and annotated with snippets.

Every stage is swappable — see the grub-extend skill or grub/backends.py. That's the whole trick: simple things stay simple, powerful things stay possible.
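
To show the whole pipeline in miniature, here is a self-contained sketch of the middle stage: a store (a plain dict standing in for a Mapping[str, str]) goes in, and ranked (key, score) pairs come out. It uses a hand-rolled TF-IDF with smoothed IDF and cosine similarity. This is an illustration of the technique, not grub's backend code, and `tfidf_search` is a name invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphanumeric words, in order."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_search(store, query, top=3):
    """Rank the documents in `store` against `query`, best first."""
    doc_counts = {key: Counter(tokenize(text)) for key, text in store.items()}
    n_docs = len(doc_counts)

    # Document frequency: in how many documents does each word appear?
    df = Counter()
    for counts in doc_counts.values():
        df.update(counts.keys())

    def idf(word):
        # Smoothed inverse document frequency; rare words weigh more.
        return math.log((1 + n_docs) / (1 + df[word])) + 1

    def vector(counts):
        return {word: count * idf(word) for word, count in counts.items()}

    def cosine(u, v):
        dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
        norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
        return dot / norm if norm else 0.0

    query_vector = vector(Counter(tokenize(query)))
    scored = [(cosine(vector(counts), query_vector), key) for key, counts in doc_counts.items()]
    scored.sort(reverse=True)
    # Drop documents that share no words with the query.
    return [(key, round(score, 3)) for score, key in scored[:top] if score > 0]


store = {
    "logging.md": "Configure logging with the LOG_LEVEL setting.",
    "deploy.md": "Deploy the service with one command.",
    "faq.md": "Frequently asked questions about billing.",
}
print(tfidf_search(store, "configure logging"))
```

Swapping the scoring function for an embedding-based one, while keeping the same store in and ranked pairs out, is the kind of substitution the pipeline is designed around.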

Install

pip install grub               # core (TF-IDF / lexical search)
pip install 'grub[semantic]'   # adds local embedding-based search

License

Apache-2.0

Release history

Version	Date
0.1.7 (this version)	May 20, 2026
0.1.6	May 13, 2026
0.1.5	May 12, 2026
0.1.4	Apr 29, 2025
0.1.3	Jun 17, 2022
0.1.2	Jun 17, 2022
0.1.1	Dec 1, 2021
0.1.0	Sep 17, 2021
0.0.13	Aug 30, 2021
0.0.12	Aug 9, 2021
0.0.11	Feb 26, 2021
0.0.10	Dec 23, 2020
0.0.9	Dec 22, 2020
0.0.8	Dec 22, 2020
0.0.7	Dec 22, 2020
0.0.6	Oct 9, 2020
0.0.5	Oct 9, 2020
0.0.4	Oct 6, 2020
0.0.3	Sep 24, 2020
0.0.2	Sep 24, 2020
0.0.1	Sep 24, 2020

Download files

Source Distribution

grub-0.1.7.tar.gz (43.2 kB)

Uploaded May 20, 2026 Source

Built Distribution

grub-0.1.7-py3-none-any.whl (33.0 kB)

Uploaded May 20, 2026 Python 3

File details

Details for the file grub-0.1.7.tar.gz.

File metadata

Download URL: grub-0.1.7.tar.gz
Upload date: May 20, 2026
Size: 43.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.15

File hashes

Hashes for grub-0.1.7.tar.gz
Algorithm	Hash digest
SHA256	`9ef5f49cae2fb6682dd1fd6c60150e31d51089a84a7c0922572c73e430c5c9b4`
MD5	`8edec0ff3ac2f5ee3313e4c705b034cb`
BLAKE2b-256	`5d4ecfb136b9d3fb035ff32adf4a0361aceac684a149e637ba6771ccdc8e1561`

File details

Details for the file grub-0.1.7-py3-none-any.whl.

File metadata

Download URL: grub-0.1.7-py3-none-any.whl
Upload date: May 20, 2026
Size: 33.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.15

File hashes

Hashes for grub-0.1.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ea3a0ff369eb525f54c1f9bf0e78d01147fd162174f23857510fc6a9c1f22cbf`
MD5	`8ce85c74772ac75d7a641900582e0f4b`
BLAKE2b-256	`6d654702858a038ece5dfa4ed2e826bf723affce71348778718454cb4bf56d94`

