
Hybrid file search — semantic + keyword matching

This project has been archived.

The maintainers of this project have marked this project as archived. No new releases are expected.


hhg (hybrid grep)


pip install hhg
hhg build ./src
hhg "authentication flow" ./src

What it does

Search code and text using natural language. Combines semantic understanding with keyword matching (BM25) for accurate results:

$ hhg build ./src                    # Build index first
Found 40 files (0.0s)
Indexed 646 blocks from 40 files (34.2s)

$ hhg "error handling" ./src         # Then search
api_handlers.ts:127 function errorHandler
  function errorHandler(err: Error, req: Request, res: Response, next: NextFunc...

errors.rs:7 class AppError
  pub enum AppError {
      Database(DatabaseError),

2 results (0.52s)

Why hhg over grep?

grep finds exact text. hhg understands what you're looking for.

Query               grep finds                   hhg finds
"error handling"    Comments mentioning it       errorHandler(), AppError
"authentication"    Strings containing "auth"    login(), verify_token()
"database"          Config files, comments       Connection, query(), Db

Hybrid search combines semantic understanding (finds related concepts) with BM25 keyword matching (finds exact terms), so a single query can match on both meaning and literal terms.
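One common way to merge a semantic ranking with a BM25 ranking is reciprocal rank fusion (RRF). This README does not say how hhg combines the two signals, so the sketch below only illustrates the general idea. The function name `reciprocal_rank_fusion`, the constant `k=60`, and the sample rankings are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, not an hhg setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings for the query "error handling".
semantic_ranking = ["errorHandler", "AppError", "retry_request"]
keyword_ranking = ["AppError", "log_error", "errorHandler"]

fused = reciprocal_rank_fusion([semantic_ranking, keyword_ranking])
print(fused)
```

Items that rank well in both lists (here `AppError` and `errorHandler`) rise above items that appear in only one, which is the behavior a hybrid search aims for.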

Use grep/ripgrep for exact strings (TODO, FIXME, import statements). Use hhg when you want implementations, not mentions.

Install

Requires Python 3.11-3.13 (onnxruntime lacks 3.14 support).

pip install hhg
# or
uv tool install hhg
# or
pipx install hhg

The embedding model (jina-code-int8) downloads on first use (~154MB).

Usage

hhg build [path]                # Build/update index (required first)
hhg "query" [path]              # Semantic search
hhg status [path]               # Check index status
hhg list [path]                 # List all indexes under path
hhg clean [path]                # Delete index
hhg clean [path] -r             # Delete index and all sub-indexes

# Options
hhg -n 5 "error handling" .     # Limit results
hhg --json "auth" .             # JSON output for scripts/agents
hhg -l "config" .               # List matching files only
hhg -t py,js "api" .            # Filter by file type
hhg --exclude "tests/*" "fn" .  # Exclude patterns
hhg --exclude "*.md" "api" .   # Code only (exclude docs)

# Model
hhg model                       # Check if model is installed
hhg model install               # Download model (auto-downloads on first use)

Note: Options go before positional args, or use -- separator:

hhg --exclude "*.md" "api" .   # Options first
hhg "api" . -- --exclude "*.md" # Or use -- separator
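When calling hhg from a script, the ordering rule above is easy to enforce by building the argument list in one place. The following is a minimal sketch that uses only the flags listed in this README (`-n`, `--json`, `-t`, `--exclude`); the helper name `build_hhg_command` and its parameters are hypothetical and not part of hhg.

```python
import shlex


def build_hhg_command(query, path=".", limit=None, json_output=False,
                      file_types=(), exclude=None):
    """Return an argv list for hhg with every option placed before the query."""
    argv = ["hhg"]
    if limit is not None:
        argv += ["-n", str(limit)]
    if json_output:
        argv.append("--json")
    if file_types:
        argv += ["-t", ",".join(file_types)]
    if exclude is not None:
        argv += ["--exclude", exclude]
    # Positional arguments come last: the query, then the search path.
    argv += [query, path]
    return argv


command = build_hhg_command("api", ".", limit=5, exclude="*.md")
print(shlex.join(command))  # shell-quoted form, handy for logging
```

With hhg installed and an index built, the list can be passed directly to `subprocess.run(command, check=True)`; passing a list rather than a string avoids shell quoting problems with patterns such as `*.md`.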

Output

Default:

src/auth.py:42 function login
  def login(user, password):
      """Authenticate user and create session."""
      ...

JSON (--json):

[
  {
    "file": "src/auth.py",
    "type": "function",
    "name": "login",
    "line": 42,
    "end_line": 58,
    "content": "def login(user, password): ...",
    "score": 0.87
  }
]

Compact JSON (--json --compact): Same fields without content.
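A script or agent can consume the JSON output with the standard library. The sketch below parses the sample payload shown above and groups hits by file. The helper `group_by_file` and its `min_score` threshold are hypothetical, and treating a higher `score` as a better match is an assumption, since the README does not define the scale.

```python
import json
from collections import defaultdict

# Sample payload copied from the JSON output shown above.
sample_output = """
[
  {
    "file": "src/auth.py",
    "type": "function",
    "name": "login",
    "line": 42,
    "end_line": 58,
    "content": "def login(user, password): ...",
    "score": 0.87
  }
]
"""


def group_by_file(raw_json, min_score=0.0):
    """Group results by file, keeping those with score >= min_score."""
    grouped = defaultdict(list)
    for hit in json.loads(raw_json):
        if hit["score"] >= min_score:
            # .get() tolerates compact output, which omits "content".
            grouped[hit["file"]].append(
                (hit["line"], hit["type"], hit["name"], hit.get("content"))
            )
    return dict(grouped)


for path, hits in group_by_file(sample_output, min_score=0.5).items():
    for line, kind, name, _content in hits:
        print(f"{path}:{line} {kind} {name}")
```

In practice `raw_json` would be the standard output of a command such as `hhg --json "auth" .` captured by the calling script.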

How it Works

Query → Embed → Hybrid search (semantic + BM25) → Results
                        ↓
             Requires 'hhg build' first (.hhg/)
             Auto-updates stale files on search
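The README says stale files are re-indexed on search but does not describe how staleness is detected. One plausible approach is to store a content hash per file at build time and compare it later. The sketch below shows that idea only; `file_digest`, `find_stale_files`, and the manifest layout are assumptions, not hhg's actual index format.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_stale_files(root, manifest):
    """Compare files under root with a {relative_path: digest} manifest.

    Returns (changed_or_new, deleted) as sorted lists of relative paths.
    """
    root = Path(root)
    current = {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }
    changed = sorted(rel for rel, digest in current.items()
                     if manifest.get(rel) != digest)
    deleted = sorted(rel for rel in manifest if rel not in current)
    return changed, deleted


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp)
    (src / "auth.py").write_text("def login(): ...\n")
    (src / "db.py").write_text("def query(): ...\n")
    # Pretend this manifest was saved by an earlier index build.
    manifest = {rel: file_digest(src / rel) for rel in ("auth.py", "db.py")}

    (src / "auth.py").write_text("def login(user): ...\n")  # edited
    (src / "db.py").unlink()                                 # deleted
    (src / "api.py").write_text("def handler(): ...\n")      # added

    changed, deleted = find_stale_files(src, manifest)
    print(changed, deleted)
```

Only the paths in `changed` would need re-embedding, and entries in `deleted` could be dropped from the index, which keeps an incremental update much cheaper than a full rebuild.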

Supported Files

Code (22 languages): Bash, C, C++, C#, Elixir, Go, Java, JavaScript, JSON, Kotlin, Lua, Mojo, PHP, Python, Ruby, Rust, Svelte, Swift, TOML, TypeScript, YAML, Zig

Text: Markdown, plain text, RST — smart chunking with header context for docs, blog posts, research papers
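"Header context" is not specified further in this README. A simple reading is that each text chunk carries the headings it sits under, so a chunk still makes sense when retrieved alone. The sketch below shows that reading for Markdown with `#`-style headings only; `chunk_markdown` and the output shape are hypothetical, and it does not handle fenced code blocks or RST.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def chunk_markdown(text):
    """Split Markdown into chunks, each tagged with its heading trail."""
    chunks = []
    trail = []   # (level, title) pairs for the currently open headings
    body = []

    def flush():
        content = "\n".join(body).strip()
        if content:
            context = " > ".join(title for _, title in trail)
            chunks.append({"context": context, "text": content})
        body.clear()

    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Close headings at the same or a deeper level.
            while trail and trail[-1][0] >= level:
                trail.pop()
            trail.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks


doc = """# Guide
Intro text.

## Install
Run the installer.

## Usage
### Search
Type a query.
"""

for chunk in chunk_markdown(doc):
    print(f"[{chunk['context']}] {chunk['text']}")
```

Embedding the context string together with the chunk text lets a query such as "how to install" match the "Install" section even when the body itself never uses that word.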

Development

git clone https://github.com/nijaru/hygrep && cd hygrep
pixi install && pixi run build-ext && pixi run test

License

MIT

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

hhg-0.0.20-cp313-cp313-manylinux_2_17_x86_64.whl (104.6 kB): CPython 3.13, manylinux (glibc 2.17+), x86-64
hhg-0.0.20-cp313-cp313-macosx_11_0_arm64.whl (94.5 kB): CPython 3.13, macOS 11.0+, ARM64
hhg-0.0.20-cp312-cp312-manylinux_2_17_x86_64.whl (104.6 kB): CPython 3.12, manylinux (glibc 2.17+), x86-64
hhg-0.0.20-cp312-cp312-macosx_11_0_arm64.whl (94.5 kB): CPython 3.12, macOS 11.0+, ARM64
hhg-0.0.20-cp311-cp311-manylinux_2_17_x86_64.whl (103.4 kB): CPython 3.11, manylinux (glibc 2.17+), x86-64
hhg-0.0.20-cp311-cp311-macosx_11_0_arm64.whl (94.5 kB): CPython 3.11, macOS 11.0+, ARM64

