Hybrid file search — semantic + keyword matching

This project has been archived.

The maintainers of this project have marked this project as archived. No new releases are expected.

hygrep (hhg)

pip install hygrep
hhg build ./src
hhg "authentication flow" ./src

What it does

Search code and text using natural language. Combines semantic understanding with keyword matching (BM25) for accurate results:

$ hhg build ./src                    # Build index first
Found 40 files (0.0s)
Indexed 646 blocks from 40 files (34.2s)

$ hhg "error handling" ./src         # Then search
api_handlers.ts:127 function errorHandler
  function errorHandler(err: Error, req: Request, res: Response, next: NextFunc...

errors.rs:7 class AppError
  pub enum AppError {
      Database(DatabaseError),

2 results (0.52s)

Why hhg over grep?

grep finds exact text. hhg understands what you're looking for.

Query              grep finds                   hhg finds
"error handling"   Comments mentioning it       errorHandler(), AppError
"authentication"   Strings containing "auth"    login(), verify_token()
"database"         Config files, comments       Connection, query(), Db

Hybrid search combines semantic understanding (finds related concepts) with BM25 keyword matching (finds exact terms), so a single query can surface both conceptually related code and literal term matches.
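To make the hybrid idea concrete, here is a minimal sketch of one way to merge a keyword ranking with a semantic ranking. It is not hygrep's implementation: the README does not say how the two signals are fused, so reciprocal rank fusion (RRF) is an assumption chosen for illustration, and the 2-D vectors are hand-made stand-ins for real model embeddings. All function names are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rrf(rankings, k=60):
    """Reciprocal rank fusion: each ranking contributes 1 / (k + rank)."""
    fused = Counter()
    for ranking in rankings:
        for rank, name in enumerate(ranking, start=1):
            fused[name] += 1.0 / (k + rank)
    return [name for name, _ in fused.most_common()]


def hybrid_rank(query_terms, query_vec, docs, vecs):
    """Rank document names by fusing BM25 order with cosine-similarity order."""
    names = list(docs)
    kw = dict(zip(names, bm25_scores(query_terms, [docs[n] for n in names])))
    sem = {n: cosine(query_vec, vecs[n]) for n in names}
    by_keyword = sorted(names, key=kw.get, reverse=True)
    by_meaning = sorted(names, key=sem.get, reverse=True)
    return rrf([by_keyword, by_meaning])


# Toy corpus: tokens per code block, plus made-up 2-D "embeddings".
docs = {
    "errorHandler": ["function", "error", "handler", "err", "request", "response"],
    "AppError": ["enum", "app", "error", "database"],
    "login": ["def", "login", "user", "password"],
}
vecs = {"errorHandler": [0.9, 0.2], "AppError": [0.8, 0.3], "login": [0.1, 0.9]}

ranked = hybrid_rank(["error", "handling"], [1.0, 0.1], docs, vecs)
print(ranked)
```

In this toy run both error-related blocks outrank `login`, which matches neither the keywords nor the query vector. RRF works on ranks rather than raw scores, so BM25 and cosine values never have to be put on a common scale.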

Use grep/ripgrep for exact strings (TODO, FIXME, import statements). Use hhg when you want implementations, not mentions.

Install

Requires Python 3.11-3.13 (onnxruntime lacks 3.14 support).

pip install hygrep
# or
uv tool install hygrep --python 3.13
# or
pipx install hygrep

Models are downloaded from HuggingFace on first use (~40MB).

Usage

hhg build [path]                # Build/update index (required first)
hhg "query" [path]              # Semantic search
hhg status [path]               # Check index status
hhg list [path]                 # List all indexes under path
hhg clean [path]                # Delete index
hhg clean [path] -r             # Delete index and all sub-indexes

# Options
hhg -n 5 "error handling" .     # Limit results
hhg --json "auth" .             # JSON output for scripts/agents
hhg -l "config" .               # List matching files only
hhg -t py,js "api" .            # Filter by file type
hhg --exclude "tests/*" "fn" .  # Exclude patterns

# Model
hhg model                       # Check if model is installed
hhg model install               # Download model (auto-downloads on first use)

Note: Options must come before positional arguments.
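When driving hhg from a script, the ordering rule is easy to enforce by building the argument list in one place. The helper below is a hypothetical sketch, not part of hygrep. It uses only flags listed above (`-n`, `--json`, `-t`, `--exclude`), and it assumes these flags can be combined in a single call, which the README does not state explicitly.

```python
def build_hhg_command(query, path=".", limit=None, as_json=False,
                      file_types=None, exclude=None):
    """Return an argv list with all options placed before the positionals."""
    cmd = ["hhg"]
    if limit is not None:
        cmd += ["-n", str(limit)]
    if as_json:
        cmd.append("--json")
    if file_types:
        cmd += ["-t", ",".join(file_types)]
    if exclude:
        cmd += ["--exclude", exclude]
    # Positional arguments (query, then path) always come last.
    cmd += [query, path]
    return cmd


command = build_hhg_command("error handling", "./src", limit=5, as_json=True)
print(command)
```

The resulting list can be passed directly to `subprocess.run`. That avoids shell quoting issues for queries containing spaces.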

Output

Default:

src/auth.py:42 function login
  def login(user, password):
      """Authenticate user and create session."""
      ...

JSON (--json):

[
  {
    "file": "src/auth.py",
    "type": "function",
    "name": "login",
    "line": 42,
    "end_line": 58,
    "content": "def login(user, password): ...",
    "score": 0.87
  }
]

Compact JSON (--json --compact): Same fields without content.
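As a sketch of consuming this output from a script, the snippet below parses a result list shaped like the sample above and filters it by score. The `Hit` class and `parse_hits` function are hypothetical names. The sample string is copied from the example rather than produced by running hhg, and `content` is treated as optional so that compact output parses too.

```python
import json
from dataclasses import dataclass


@dataclass
class Hit:
    file: str
    type: str
    name: str
    line: int
    end_line: int
    score: float
    content: str = ""  # missing in compact output


def parse_hits(raw, min_score=0.0):
    """Parse hhg-style JSON and return hits at or above min_score, best first."""
    hits = []
    for item in json.loads(raw):
        hit = Hit(
            file=item["file"],
            type=item["type"],
            name=item["name"],
            line=item["line"],
            end_line=item["end_line"],
            score=item["score"],
            content=item.get("content", ""),
        )
        if hit.score >= min_score:
            hits.append(hit)
    return sorted(hits, key=lambda h: h.score, reverse=True)


SAMPLE = """
[
  {
    "file": "src/auth.py",
    "type": "function",
    "name": "login",
    "line": 42,
    "end_line": 58,
    "content": "def login(user, password): ...",
    "score": 0.87
  }
]
"""

for hit in parse_hits(SAMPLE, min_score=0.5):
    print(f"{hit.file}:{hit.line} {hit.type} {hit.name} ({hit.score:.2f})")
```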

How it Works

Query → Embed → Hybrid search (semantic + BM25) → Results
                        ↓
             Requires 'hhg build' first (.hhg/)
             Auto-updates stale files on search
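The README does not describe how stale files are detected. One common approach, shown here purely as an assumption, is to store a content hash per file at build time and compare it against the current tree before searching. The function names and manifest format are hypothetical. Only the `.hhg` directory name comes from the diagram above.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def snapshot(root):
    """Map each file under root (relative path) to its content digest."""
    root = Path(root)
    result = {}
    for p in sorted(root.rglob("*")):
        rel = p.relative_to(root)
        # Skip the index directory itself so it never counts as stale input.
        if p.is_file() and ".hhg" not in rel.parts:
            result[str(rel)] = file_digest(p)
    return result


def stale_files(root, manifest):
    """Return (changed_or_new, deleted) paths relative to a stored manifest."""
    current = snapshot(root)
    changed = sorted(p for p, d in current.items() if manifest.get(p) != d)
    deleted = sorted(p for p in manifest if p not in current)
    return changed, deleted
```

A search step could call `stale_files` first, re-index only the changed paths, and drop entries for deleted ones. Hashing contents is slower than comparing modification times, but it is not fooled by files that were touched without being edited.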

Supported Files

Code (22 languages): Bash, C, C++, C#, Elixir, Go, Java, JavaScript, JSON, Kotlin, Lua, Mojo, PHP, Python, Ruby, Rust, Svelte, Swift, TOML, TypeScript, YAML, Zig

Text: Markdown, plain text, RST — smart chunking with header context for docs, blog posts, research papers
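To illustrate what chunking with header context can look like for Markdown, here is a minimal sketch. It is an assumption about the general technique, not hygrep's chunker: each block of body text is stored together with the trail of headings above it, so a chunk deep in a document still carries its section titles. The function name and output shape are hypothetical.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(text):
    """Split Markdown into chunks, each tagged with its heading trail."""
    chunks = []
    trail = []  # stack of (level, title) for the headings currently in scope
    body = []   # lines collected since the last heading

    def flush():
        content = "\n".join(body).strip()
        if content:
            context = " > ".join(title for _, title in trail)
            chunks.append({"context": context, "text": content})
        body.clear()

    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Leave any sections at the same or deeper level.
            while trail and trail[-1][0] >= level:
                trail.pop()
            trail.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return chunks


doc = """# Guide
Intro text.
## Install
Run pip.
## Usage
Call hhg.
"""

for chunk in chunk_markdown(doc):
    print(chunk["context"], "|", chunk["text"])
```

This simple version treats any line starting with `#` as a heading, including lines inside fenced code blocks. A real chunker would need to track fences.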

Development

git clone https://github.com/nijaru/hygrep && cd hygrep
pixi install && pixi run build-ext && pixi run test

License

MIT

Download files

Download the file for your platform.

Source Distributions

No source distribution files available for this release.

Built Distributions

hygrep-0.0.14-cp313-cp313-manylinux_2_17_x86_64.whl (102.4 kB)
  CPython 3.13, manylinux: glibc 2.17+ x86-64

hygrep-0.0.14-cp313-cp313-macosx_11_0_arm64.whl (92.2 kB)
  CPython 3.13, macOS 11.0+ ARM64

hygrep-0.0.14-cp312-cp312-manylinux_2_17_x86_64.whl (102.3 kB)
  CPython 3.12, manylinux: glibc 2.17+ x86-64

hygrep-0.0.14-cp312-cp312-macosx_11_0_arm64.whl (92.2 kB)
  CPython 3.12, macOS 11.0+ ARM64

hygrep-0.0.14-cp311-cp311-manylinux_2_17_x86_64.whl (102.4 kB)
  CPython 3.11, manylinux: glibc 2.17+ x86-64

hygrep-0.0.14-cp311-cp311-macosx_11_0_arm64.whl (92.2 kB)
  CPython 3.11, macOS 11.0+ ARM64

File details

Provenance: attestation bundles for every file listed here were made by the publisher release.yml on nijaru/hygrep. Attestation values reflect the state when the release was signed and may no longer be current.

Hashes for hygrep-0.0.14-cp313-cp313-manylinux_2_17_x86_64.whl
Algorithm    Hash digest
SHA256       4ed5c3538793a91108762ab3095216ca185990125043ab0447e166c17b7a4618
MD5          4ed32e995696aeca02dc3917f9256ef7
BLAKE2b-256  5bb088c428900e49188a097c20750eb681cdf6546114dc2811c86ad14cbf088d

Hashes for hygrep-0.0.14-cp313-cp313-macosx_11_0_arm64.whl
Algorithm    Hash digest
SHA256       f3db869d1a04fe636efe4e88f9057f4f780dc8d906c298a809e48c564857703b
MD5          5baafcb59eae93a408a838999cc9e37b
BLAKE2b-256  7479df2d6c2a27ae99c77b53573e625eeb1b6f1a97f1937445ca654ccae46443

Hashes for hygrep-0.0.14-cp312-cp312-manylinux_2_17_x86_64.whl
Algorithm    Hash digest
SHA256       c564e5bcac9f733c7b88c79580179a310bb74c078ad780f84c4b201e786be216
MD5          4441b9d558e08798e5e207b9b69fec01
BLAKE2b-256  d51591f52b0c1de0fc215769fff4928a3aa924c6307e4cc937d161370cd6bda8

Hashes for hygrep-0.0.14-cp312-cp312-macosx_11_0_arm64.whl
Algorithm    Hash digest
SHA256       7840939cd3801509778d17f13af3d7d5782071fafeca7a9376297e174d481e07
MD5          608d0930ffff894f59d0081937d07212
BLAKE2b-256  00094e4dbe9651a44f6bafd27898ed68ea51d5b20d33d88db4f412bbf1893d7b

Hashes for hygrep-0.0.14-cp311-cp311-manylinux_2_17_x86_64.whl
Algorithm    Hash digest
SHA256       c895717454b41e0c7d236da7a7c9656d67058568e09b8cbafb8c0ffbad774fb8
MD5          d8ac31d34f664526df29399f9590ada9
BLAKE2b-256  7c8e40c0ba60bf1d31d0ef290ea959f917cc20a5131daa190add9083c5837461

Hashes for hygrep-0.0.14-cp311-cp311-macosx_11_0_arm64.whl
Algorithm    Hash digest
SHA256       a64203554596443d96d3bae6ec691fd7bd8886105ae6d30a3bebb30cffbe46a6
MD5          598e313f5c8f2bd69c7d0279b00aa5f2
BLAKE2b-256  c6966ac0a2e798fb949cf47528375e1b8d5e2adbebabafcf37d5356639e90fc4
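A downloaded wheel can be checked against the SHA256 digests listed for each file. The sketch below uses only the standard library. The function names are hypothetical, and the file path in the commented usage is an assumption about where the wheel was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute a file's SHA-256 hex digest, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_wheel(path, expected_sha256):
    """Return True if the file's SHA-256 matches the published digest."""
    return sha256_of(path) == expected_sha256.strip().lower()


# Digest published for the CPython 3.13 manylinux wheel.
EXPECTED = "4ed5c3538793a91108762ab3095216ca185990125043ab0447e166c17b7a4618"

# Example call, assuming the wheel sits in the current directory:
# verify_wheel("hygrep-0.0.14-cp313-cp313-manylinux_2_17_x86_64.whl", EXPECTED)
```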
