Hybrid file search — semantic + keyword matching

This project has been archived by its maintainers. No new releases are expected.

Project description

hygrep (hhg)

pip install hygrep
hhg build ./src
hhg "authentication flow" ./src

What it does

Search code and text using natural language. hhg combines embedding-based semantic matching with BM25 keyword matching, so results include both conceptually related code and exact-term hits:

$ hhg build ./src                    # Build index first
Found 40 files (0.0s) Indexed 646 blocks from 40 files (34.2s)

$ hhg "error handling" ./src         # Then search
api_handlers.ts:127 function errorHandler
  function errorHandler(err: Error, req: Request, res: Response, next: NextFunc...

errors.rs:7 class AppError
  pub enum AppError {
      Database(DatabaseError),

2 results (0.52s)

Why hhg over grep?

grep finds exact text. hhg understands what you're looking for.

Query              grep finds                   hhg finds
"error handling"   Comments mentioning it       errorHandler(), AppError
"authentication"   Strings containing "auth"    login(), verify_token()
"database"         Config files, comments       Connection, query(), Db

Hybrid search combines semantic understanding (finds related concepts) with BM25 keyword matching (finds exact terms). Best of both worlds.
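
To make the idea concrete, here is a minimal sketch of one common way to merge a keyword ranking with a semantic ranking: reciprocal rank fusion (RRF). This is not hygrep's actual implementation, which this description does not specify. The names `bm25_scores` and `rrf_fuse`, the toy documents, and the hand-written `semantic` values (standing in for embedding similarities) are all illustrative assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def rrf_fuse(score_lists, k=60):
    """Fuse several score lists by summing 1 / (k + rank) per document."""
    fused = [0.0] * len(score_lists[0])
    for scores in score_lists:
        ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        for rank, idx in enumerate(ranking, start=1):
            fused[idx] += 1.0 / (k + rank)
    return fused


docs = [
    "function errorHandler err req res next",  # implementation, no literal "error handling"
    "comment about error handling todo",       # a mention only
    "function login user password",            # unrelated
]
keyword = bm25_scores("error handling", docs)
# Hypothetical semantic similarities, standing in for embedding cosine scores.
semantic = [0.91, 0.55, 0.10]
fused = rrf_fuse([keyword, semantic])
print(keyword, fused)
```

In this toy data the implementation chunk gets a BM25 score of zero because it never contains the literal query words, yet the fused score still ranks it above the unrelated chunk thanks to the semantic signal.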

Use grep/ripgrep for exact strings (TODO, FIXME, import statements). Use hhg when you want implementations, not mentions.

Install

Requires Python 3.11-3.13 (onnxruntime lacks 3.14 support).

pip install hygrep
# or
uv tool install hygrep --python 3.13
# or
pipx install hygrep

Models are downloaded from Hugging Face on first use (~40 MB).
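
The Python version requirement stated above can be checked before installing. This is a minimal sketch; the `SUPPORTED` constant and `is_supported` helper are illustrative names and are not part of hygrep.

```python
import sys

# Inclusive (major, minor) range taken from the requirement stated above.
SUPPORTED = ((3, 11), (3, 13))


def is_supported(version=None):
    """Return True if the given (or running) Python version is in range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED[0] <= major_minor <= SUPPORTED[1]


if not is_supported():
    print("hygrep needs Python 3.11-3.13; consider: uv tool install hygrep --python 3.13")
```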

Usage

hhg build [path]                # Build/update index (required first)
hhg "query" [path]              # Semantic search
hhg status [path]               # Check index status
hhg clean [path]                # Delete index

# Options
hhg -n 5 "error handling" .     # Limit results
hhg --json "auth" .             # JSON output for scripts/agents
hhg -l "config" .               # List matching files only
hhg -t py,js "api" .            # Filter by file type
hhg --exclude "tests/*" "fn" .  # Exclude patterns

# Model management
hhg model                       # Check model status
hhg model install               # Download/reinstall models

Note: Options must come before positional arguments, e.g. hhg -n 5 "query" . rather than hhg "query" . -n 5.
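
When calling hhg from a script, the ordering rule is easy to enforce by building the argument list in one place. A minimal sketch follows; `build_hhg_command` is a hypothetical helper, and it uses only the flags listed above (-n, --json, -t, --exclude). That these flags can be combined freely is an assumption.

```python
def build_hhg_command(query, path=".", limit=None, json_output=False,
                      file_types=None, exclude=None):
    """Return an argv list for hhg with every option ahead of the positionals."""
    cmd = ["hhg"]
    if limit is not None:
        cmd += ["-n", str(limit)]
    if json_output:
        cmd.append("--json")
    if file_types:
        cmd += ["-t", ",".join(file_types)]
    if exclude:
        cmd += ["--exclude", exclude]
    # Positional arguments (query, then path) always go last.
    cmd += [query, path]
    return cmd


cmd = build_hhg_command("error handling", "./src", limit=5, json_output=True)
print(cmd)
```

The resulting list can be passed directly to subprocess.run(cmd), which avoids shell quoting issues for multi-word queries.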

Output

Default:

src/auth.py:42 function login
  def login(user, password):
      """Authenticate user and create session."""
      ...

JSON (--json):

[
  {
    "file": "src/auth.py",
    "type": "function",
    "name": "login",
    "line": 42,
    "end_line": 58,
    "content": "def login(user, password): ...",
    "score": 0.87
  }
]

Compact JSON (--json --compact): Same fields without content.
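
For scripts and agents, the JSON output can be consumed with the standard library. The sketch below is built on assumptions: that hhg writes only the JSON array to stdout when --json is given, that -n and --json can be combined, and that a higher score means a better match. `parse_results` and `search` are hypothetical helper names.

```python
import json
import subprocess


def parse_results(raw, min_score=0.0):
    """Parse hhg --json output and keep results at or above min_score."""
    results = json.loads(raw)
    return [r for r in results if r.get("score", 0.0) >= min_score]


def search(query, path=".", limit=10):
    """Run hhg and return parsed results (needs hhg installed and a built index)."""
    proc = subprocess.run(
        ["hhg", "-n", str(limit), "--json", query, path],
        capture_output=True, text=True, check=True,
    )
    return parse_results(proc.stdout)


# Sample taken from the JSON output shown above.
sample = '''[{"file": "src/auth.py", "type": "function", "name": "login",
  "line": 42, "end_line": 58, "content": "def login(user, password): ...",
  "score": 0.87}]'''
for r in parse_results(sample, min_score=0.5):
    print(f'{r["file"]}:{r["line"]} {r["type"]} {r["name"]} ({r["score"]:.2f})')
```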

How it Works

Query → Embed → Hybrid search (semantic + BM25) → Results
                        ↓
             Requires 'hhg build' first (.hhg/)
             Auto-updates stale files on search
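
The "auto-updates stale files" step can be pictured as comparing each file against a fingerprint recorded at build time. The sketch below uses content hashes in an in-memory manifest. How hygrep actually detects staleness, and what it stores under .hhg/, is not documented here, so the manifest format and the `file_digest` and `find_stale` names are purely hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_stale(root, manifest):
    """Return files under root that are new or changed relative to the manifest.

    manifest maps relative path -> digest recorded at index time (hypothetical format).
    """
    stale = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if manifest.get(rel) != file_digest(path):
            stale.append(rel)
    return stale


with tempfile.TemporaryDirectory() as root:
    (Path(root) / "a.py").write_text("def a(): pass\n")
    (Path(root) / "b.py").write_text("def b(): pass\n")
    # Record digests, as a build step might.
    manifest = {p.name: file_digest(p) for p in Path(root).iterdir()}
    fresh = find_stale(root, manifest)
    # Edit one file; only that file should need re-indexing.
    (Path(root) / "b.py").write_text("def b(): return 1\n")
    stale = find_stale(root, manifest)
    print(fresh, stale)
```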

Supported Files

Code (22 languages): Bash, C, C++, C#, Elixir, Go, Java, JavaScript, JSON, Kotlin, Lua, Mojo, PHP, Python, Ruby, Rust, Svelte, Swift, TOML, TypeScript, YAML, Zig

Text: Markdown, plain text, RST — smart chunking with header context for docs, blog posts, research papers
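
As an illustration of chunking with header context, the sketch below splits a Markdown document at headings and tags each chunk with the path of headings above it. It is an assumption about what such chunking could look like, not hygrep's algorithm; `chunk_markdown` is a hypothetical name, and the sketch ignores fenced code blocks, so a line starting with "# " inside a fence would be misread as a heading.

```python
def chunk_markdown(text):
    """Split Markdown into chunks, each tagged with its heading path."""
    chunks = []
    heading_path = []  # stack of (level, title) for the current position
    body = []

    def flush():
        content = "\n".join(body).strip()
        if content:
            context = " > ".join(title for _, title in heading_path)
            chunks.append({"context": context, "content": content})
        body.clear()

    for line in text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("#"):
            level = len(stripped) - len(stripped.lstrip("#"))
            title = stripped[level:].strip()
            # ATX heading: 1-6 hashes, then a space, then a non-empty title.
            if 1 <= level <= 6 and stripped[level:level + 1] == " " and title:
                flush()
                # Drop headings at the same or deeper level before pushing.
                while heading_path and heading_path[-1][0] >= level:
                    heading_path.pop()
                heading_path.append((level, title))
                continue
        body.append(line)
    flush()
    return chunks


doc = "# Guide\nIntro text.\n## Install\nRun pip.\n## Usage\nRun hhg.\n"
chunks = chunk_markdown(doc)
for c in chunks:
    print(c["context"], "|", c["content"])
```

Carrying the heading path with each chunk means a short paragraph such as "Run pip." can still match a query about installation.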

Development

git clone https://github.com/nijaru/hygrep && cd hygrep
pixi install && pixi run build-ext && pixi run test

License

MIT

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

hygrep-0.0.13-cp313-cp313-manylinux_2_17_x86_64.whl (100.3 kB)
Uploaded: CPython 3.13, manylinux (glibc 2.17+), x86-64

hygrep-0.0.13-cp313-cp313-macosx_11_0_arm64.whl (90.4 kB)
Uploaded: CPython 3.13, macOS 11.0+, ARM64

hygrep-0.0.13-cp312-cp312-manylinux_2_17_x86_64.whl (100.3 kB)
Uploaded: CPython 3.12, manylinux (glibc 2.17+), x86-64

hygrep-0.0.13-cp312-cp312-macosx_11_0_arm64.whl (90.4 kB)
Uploaded: CPython 3.12, macOS 11.0+, ARM64

hygrep-0.0.13-cp311-cp311-manylinux_2_17_x86_64.whl (100.3 kB)
Uploaded: CPython 3.11, manylinux (glibc 2.17+), x86-64

hygrep-0.0.13-cp311-cp311-macosx_11_0_arm64.whl (90.2 kB)
Uploaded: CPython 3.11, macOS 11.0+, ARM64

File details

Details for the file hygrep-0.0.13-cp313-cp313-manylinux_2_17_x86_64.whl.

File hashes

Hashes for hygrep-0.0.13-cp313-cp313-manylinux_2_17_x86_64.whl
Algorithm    Hash digest
SHA256       c5be34fe4198d59578e196685eb29c271d6fe590394592c447688a2c07b6a7e0
MD5          d3f43ef41d6e6c8726281cf54e2b776e
BLAKE2b-256  ee4dfceb4425c867fdd92f2519f614af482f90a7479a167031df74f69fa28435

Provenance

The following attestation bundles were made for hygrep-0.0.13-cp313-cp313-manylinux_2_17_x86_64.whl:

Publisher: release.yml on nijaru/hygrep

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file hygrep-0.0.13-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for hygrep-0.0.13-cp313-cp313-macosx_11_0_arm64.whl
Algorithm    Hash digest
SHA256       496ad676a3a91edf171aeb6bcc54461fe15b1d20a4f6287527dd7d225d4cb92b
MD5          f35fe13c18981fdf88fc963600228af6
BLAKE2b-256  6bbb46d4432e8f78525a7a79eaf3ff992b8ea3c07a2d0e76d70c8032279ec6e7

Provenance

The following attestation bundles were made for hygrep-0.0.13-cp313-cp313-macosx_11_0_arm64.whl:

Publisher: release.yml on nijaru/hygrep

File details

Details for the file hygrep-0.0.13-cp312-cp312-manylinux_2_17_x86_64.whl.

File hashes

Hashes for hygrep-0.0.13-cp312-cp312-manylinux_2_17_x86_64.whl
Algorithm    Hash digest
SHA256       7e6b6b9f9ce9935d4d918bcdcbbcce1038bde5762362b1b67eb8687f22d6241e
MD5          06aaff78e696e1561aa2d6df9d3e8483
BLAKE2b-256  6fcc1b20820c6b8a00283ca5d65e820f3c9ddbe92fc8a8be51196d15984d983e

Provenance

The following attestation bundles were made for hygrep-0.0.13-cp312-cp312-manylinux_2_17_x86_64.whl:

Publisher: release.yml on nijaru/hygrep

File details

Details for the file hygrep-0.0.13-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for hygrep-0.0.13-cp312-cp312-macosx_11_0_arm64.whl
Algorithm    Hash digest
SHA256       dfb69e0e3543ff4dc20b9876eab02b1cc5d64d541b477b2d44753bb4d1ed1dbd
MD5          bef1fec2f11f8346bd97390cfdde60f0
BLAKE2b-256  8f507dda8a0975f28e3c9e93b813fc796dfaf57f264bb17764aa9642973e21c2

Provenance

The following attestation bundles were made for hygrep-0.0.13-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: release.yml on nijaru/hygrep

File details

Details for the file hygrep-0.0.13-cp311-cp311-manylinux_2_17_x86_64.whl.

File hashes

Hashes for hygrep-0.0.13-cp311-cp311-manylinux_2_17_x86_64.whl
Algorithm    Hash digest
SHA256       4b1f2b13a913f0325ac02d4eb50c2351e131597d1cb808e6a154839141c4502b
MD5          e6602afc5f61da92a7dc44c74198c726
BLAKE2b-256  f3b046f3c22c829a42377c4ffa432a11d362a34e1fb419adf7b788da48facb76

Provenance

The following attestation bundles were made for hygrep-0.0.13-cp311-cp311-manylinux_2_17_x86_64.whl:

Publisher: release.yml on nijaru/hygrep

File details

Details for the file hygrep-0.0.13-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for hygrep-0.0.13-cp311-cp311-macosx_11_0_arm64.whl
Algorithm    Hash digest
SHA256       c9aab2b2dbdbb9a4f5b286f597110ec9a8f28f7e1d9da29a249e58b99111b630
MD5          4ef5d3fb3b9e894c39f98817fb2cda0c
BLAKE2b-256  ce82c087b72732697e6c7cefa7ed804b7d4f9f750abfa9162e0b0d967919b11d

Provenance

The following attestation bundles were made for hygrep-0.0.13-cp311-cp311-macosx_11_0_arm64.whl:

Publisher: release.yml on nijaru/hygrep
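
The SHA256 digests listed for each wheel can be checked against a downloaded file with Python's hashlib. A minimal sketch; `sha256_of` and `verify` are illustrative helper names, and the throwaway file at the end exists only to show the functions running.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file's SHA-256 digest matches expected_hex."""
    return sha256_of(path) == expected_hex.strip().lower()


# Self-check with a throwaway file standing in for a downloaded wheel.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
ok = verify(tmp.name, hashlib.sha256(b"hello").hexdigest())
bad = verify(tmp.name, "0" * 64)
os.remove(tmp.name)
print(ok, bad)
```

For a real download, pass the wheel's path and the SHA256 value from the matching "File hashes" entry, for example verify("hygrep-0.0.13-cp313-cp313-manylinux_2_17_x86_64.whl", "c5be34fe4198d59578e196685eb29c271d6fe590394592c447688a2c07b6a7e0").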
