Hybrid file search — semantic + keyword matching

This project has been archived by its maintainers. No new releases are expected.

hygrep (hhg)

pip install hygrep
hhg build ./src
hhg "authentication flow" ./src

What it does

Search code and text using natural language. Combines semantic understanding with keyword matching (BM25) for accurate results:

$ hhg build ./src                    # Build index first
Found 40 files (0.0s)
Indexed 646 blocks from 40 files (34.2s)

$ hhg "error handling" ./src         # Then search
api_handlers.ts:127 function errorHandler
  function errorHandler(err: Error, req: Request, res: Response, next: NextFunc...

errors.rs:7 class AppError
  pub enum AppError {
      Database(DatabaseError),

2 results (0.52s)

Why hhg over grep?

grep matches exact text. hhg also ranks results by meaning, so it can surface code related to the query even when the wording differs.

Query              grep finds                  hhg finds
"error handling"   Comments mentioning it      errorHandler(), AppError
"authentication"   Strings containing "auth"   login(), verify_token()
"database"         Config files, comments      Connection, query(), Db

Hybrid search combines semantic understanding (finds related concepts) with BM25 keyword matching (finds exact terms). Best of both worlds.

Use grep/ripgrep for exact strings (TODO, FIXME, import statements). Use hhg when you want implementations, not mentions.
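
To make the idea of combining keyword and semantic rankings concrete, here is a minimal sketch. It is not hhg's implementation: the fusion method (reciprocal rank fusion), the toy two-dimensional "embeddings", and every function name below are assumptions chosen for illustration. A real system would get its vectors from an embedding model.

```python
import math
from collections import Counter, defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def bm25_ranking(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[int]:
    """Return indexes of docs with a non-zero BM25 score, best first."""
    tokenized = [tokenize(doc) for doc in docs]
    avg_len = sum(len(toks) for toks in tokenized) / len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    scores: dict[int, float] = {}
    for idx, toks in enumerate(tokenized):
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if tf[term] == 0:
                continue
            idf = math.log(1 + (len(docs) - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[idx] = score
    return sorted(scores, key=scores.get, reverse=True)


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_ranking(query_vec: list[float], doc_vecs: list[list[float]]) -> list[int]:
    """Return all doc indexes ordered by cosine similarity, best first."""
    return sorted(range(len(doc_vecs)), key=lambda i: cosine(query_vec, doc_vecs[i]), reverse=True)


def reciprocal_rank_fusion(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Fuse rankings: each list contributes 1 / (k + rank) per document."""
    fused: dict[int, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical corpus and hand-written vectors, for illustration only.
DOCS = [
    "def verify_token(token): check the session signature",
    "# authentication is documented in the wiki",
    "def login(user): authentication entry point, creates session",
    "def parse_config(path): read settings from disk",
]
DOC_VECS = [[0.8, 0.2], [0.5, 0.5], [0.9, 0.1], [0.0, 1.0]]
QUERY_VEC = [1.0, 0.0]

keyword_order = bm25_ranking("authentication", DOCS)
semantic_order = semantic_ranking(QUERY_VEC, DOC_VECS)
hybrid_order = reciprocal_rank_fusion([keyword_order, semantic_order])
```

In this toy data the comment that merely mentions "authentication" wins on keywords alone, while the fused ranking puts the `login` implementation first because it scores well on both signals.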

Install

Requires Python 3.11-3.13 (onnxruntime lacks 3.14 support).

pip install hygrep
# or
uv tool install hygrep --python 3.13
# or
pipx install hygrep

The embedding model (jina-code-int8) downloads on first use (~154MB).

Usage

hhg build [path]                # Build/update index (required first)
hhg "query" [path]              # Semantic search
hhg status [path]               # Check index status
hhg list [path]                 # List all indexes under path
hhg clean [path]                # Delete index
hhg clean [path] -r             # Delete index and all sub-indexes

# Options
hhg -n 5 "error handling" .     # Limit results
hhg --json "auth" .             # JSON output for scripts/agents
hhg -l "config" .               # List matching files only
hhg -t py,js "api" .            # Filter by file type
hhg --exclude "tests/*" "fn" .  # Exclude patterns
hhg --exclude "*.md" "api" .   # Code only (exclude docs)

# Model
hhg model                       # Check if model is installed
hhg model install               # Download model (auto-downloads on first use)

Note: Place options before the positional arguments, or put them after a -- separator:

hhg --exclude "*.md" "api" .   # Options first
hhg "api" . -- --exclude "*.md" # Or use -- separator

Output

Default:

src/auth.py:42 function login
  def login(user, password):
      """Authenticate user and create session."""
      ...

JSON (--json):

[
  {
    "file": "src/auth.py",
    "type": "function",
    "name": "login",
    "line": 42,
    "end_line": 58,
    "content": "def login(user, password): ...",
    "score": 0.87
  }
]

Compact JSON (--json --compact): Same fields without content.

How it Works

Query → Embed → Hybrid search (semantic + BM25) → Results
                        ↓
             Requires 'hhg build' first (.hhg/)
             Auto-updates stale files on search
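
The source does not say how stale files are detected. One common approach, sketched here purely as an assumption, is to store a content digest per file at build time and compare it against the current tree at search time. All names below are hypothetical and do not describe the contents of .hhg/.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, as a hex string."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its content digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (files to re-index, files to drop from the index)."""
    changed = sorted(name for name, digest in new.items() if old.get(name) != digest)
    removed = sorted(name for name in old if name not in new)
    return changed, removed
```

With this scheme only the files in `changed` need to be re-embedded, which keeps an incremental update much cheaper than a full rebuild.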

Supported Files

Code (22 languages): Bash, C, C++, C#, Elixir, Go, Java, JavaScript, JSON, Kotlin, Lua, Mojo, PHP, Python, Ruby, Rust, Svelte, Swift, TOML, TypeScript, YAML, Zig

Text: Markdown, plain text, RST — smart chunking with header context for docs, blog posts, research papers
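
"Chunking with header context" can be pictured as splitting a document at its headings and tagging each chunk with the trail of headings above it. The sketch below illustrates that idea for Markdown only. It is an assumption about the general technique, not hhg's chunker, and it deliberately ignores details such as headings inside fenced code blocks.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(text: str) -> list[dict[str, str]]:
    """Split Markdown at headings; each chunk records its heading trail."""
    chunks: list[dict[str, str]] = []
    trail: list[tuple[int, str]] = []  # (heading level, heading title)
    body: list[str] = []

    def flush() -> None:
        content = "\n".join(body).strip()
        if content:
            context = " > ".join(title for _, title in trail)
            chunks.append({"context": context, "text": content})
        body.clear()

    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or a deeper level before adding this one.
            while trail and trail[-1][0] >= level:
                trail.pop()
            trail.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return chunks


DOC = "# Guide\nIntro text.\n## Install\nRun pip.\n## Usage\nCall it.\n# FAQ\nNone yet."
chunks = chunk_markdown(DOC)
```

Embedding the context string together with the chunk text lets a short paragraph such as "Run pip." still match a query about installation.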

Development

git clone https://github.com/nijaru/hygrep && cd hygrep
pixi install && pixi run build-ext && pixi run test

License

MIT

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

hygrep-0.0.16-cp313-cp313-manylinux_2_17_x86_64.whl (102.4 kB)
Uploaded: CPython 3.13, manylinux: glibc 2.17+ x86-64

hygrep-0.0.16-cp313-cp313-macosx_11_0_arm64.whl (92.4 kB)
Uploaded: CPython 3.13, macOS 11.0+ ARM64

hygrep-0.0.16-cp312-cp312-manylinux_2_17_x86_64.whl (102.5 kB)
Uploaded: CPython 3.12, manylinux: glibc 2.17+ x86-64

hygrep-0.0.16-cp312-cp312-macosx_11_0_arm64.whl (92.4 kB)
Uploaded: CPython 3.12, macOS 11.0+ ARM64

hygrep-0.0.16-cp311-cp311-manylinux_2_17_x86_64.whl (102.4 kB)
Uploaded: CPython 3.11, manylinux: glibc 2.17+ x86-64

hygrep-0.0.16-cp311-cp311-macosx_11_0_arm64.whl (92.4 kB)
Uploaded: CPython 3.11, macOS 11.0+ ARM64

File details

File hashes

hygrep-0.0.16-cp313-cp313-manylinux_2_17_x86_64.whl
SHA256 11ef89efe21d4e477f41bfb7127cedc74f1f9f11c0502051bb5282dff133fb42
MD5 fc8c849d4d295c6298ef6c137bca9d63
BLAKE2b-256 95e60d29b31cf7132aca882ddf17c349edecc52d6ed67cb77f9d4a2100d263cf

hygrep-0.0.16-cp313-cp313-macosx_11_0_arm64.whl
SHA256 7a649aa06d852b09146cc8816f0fc354df2bc6ace7ec76c505572a2f42517e50
MD5 c95a4ca3a141c4216222ac4f820ef143
BLAKE2b-256 6d91a9551b3ac68008d8978b4df6551d194ce69b334a518daebb909b12a401f4

hygrep-0.0.16-cp312-cp312-manylinux_2_17_x86_64.whl
SHA256 3f1d39fe887494b026a5659fb43a011726d6e3dfbf68e6745822c9054d300db5
MD5 86b19931b3b84888a57cb3236e955734
BLAKE2b-256 965c03bb3b45da050a509b2bb05c241eb59eceedeb4b727228a09cd22cd30898

hygrep-0.0.16-cp312-cp312-macosx_11_0_arm64.whl
SHA256 3e8c365c38d810df1c1ae907cdc62fc7eff0f7e4e37c9eb5547f1a4849f85cfc
MD5 57ef0d04b892a41573e92c6eb0da131f
BLAKE2b-256 a8926adfa6776c3aac1a31354209d9202993573131d4dac5014cb42044e1eabc

hygrep-0.0.16-cp311-cp311-manylinux_2_17_x86_64.whl
SHA256 6e1c595331e076ab84f3b579d4059d95d2eecfc24caed7a5d4dd03fc05d124b6
MD5 6d06e223d13592b044130a7f48cb5332
BLAKE2b-256 f955c14df577c9e6c0003517215510a818cd068550da1fc414e36973cead626e

hygrep-0.0.16-cp311-cp311-macosx_11_0_arm64.whl
SHA256 4adbf8b4b6a6a48a54ee963489e91824f6052d1b1f3589ffa3d5abb1547bebd1
MD5 1d75507219fbed3612f407ef3515cee1
BLAKE2b-256 f53e39e3e9a0f5d7b4fcd831dc9e0505e8134874cff3c0ae8b46cb2844f8e47a

Provenance

Attestation bundles were made for each of the six files listed above.

Publisher: release.yml on nijaru/hygrep

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
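
A downloaded wheel can be checked against its published SHA256 digest with the standard library. The helper names below are hypothetical. The sketch reads the file in blocks so large files do not need to fit in memory.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with the expected value from the listing."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For example, `matches_published_hash(Path("hygrep-0.0.16-cp313-cp313-manylinux_2_17_x86_64.whl"), "11ef89efe21d4e477f41bfb7127cedc74f1f9f11c0502051bb5282dff133fb42")` should return True for an unmodified download of that file.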
