Hybrid file search — semantic + keyword matching

This project has been archived by its maintainers. No new releases are expected.

hygrep (hhg)

pip install hygrep
hhg build ./src
hhg "authentication flow" ./src

What it does

Search code and text using natural language. hhg combines semantic understanding with BM25 keyword matching to rank results:

$ hhg build ./src                    # Build index first
Found 40 files (0.0s)
Indexed 646 blocks from 40 files (34.2s)

$ hhg "error handling" ./src         # Then search
api_handlers.ts:127 function errorHandler
  function errorHandler(err: Error, req: Request, res: Response, next: NextFunc...

errors.rs:7 class AppError
  pub enum AppError {
      Database(DatabaseError),

2 results (0.52s)

Why hhg over grep?

grep finds exact text. hhg understands what you're looking for.

Query             grep finds                 hhg finds
"error handling"  Comments mentioning it     errorHandler(), AppError
"authentication"  Strings containing "auth"  login(), verify_token()
"database"        Config files, comments     Connection, query(), Db

Hybrid search combines semantic understanding (finds related concepts) with BM25 keyword matching (finds exact terms), so one query can surface both related implementations and literal matches.

Use grep/ripgrep for exact strings (TODO, FIXME, import statements). Use hhg when you want implementations, not mentions.
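
The hybrid idea described above can be sketched in a few lines of Python. This is an illustration only, not hhg's implementation: the README does not say how the two signals are fused, so reciprocal rank fusion (RRF) is an assumption here, and the two-dimensional "embedding" vectors are made-up stand-ins for what a real embedding model would produce.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric/underscore tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with the standard BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    doc_freq = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ranking(scores):
    """Document indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over all rankings."""
    fused = Counter()
    for order in rankings:
        for rank, idx in enumerate(order, start=1):
            fused[idx] += 1.0 / (k + rank)
    return sorted(fused, key=lambda i: -fused[i])


docs = [
    "def login(user, password): verify credentials and create session",
    "# TODO: write authentication docs",
    "def verify_token(token): authentication check for API requests",
    "def parse_config(path): read settings",
]
# Hypothetical embeddings; a real system would compute these with a model.
doc_vectors = [[0.9, 0.3], [0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
query_vector = [1.0, 0.0]

keyword_rank = ranking(bm25_scores("authentication", docs))
semantic_rank = ranking([cosine(query_vector, v) for v in doc_vectors])
fused = rrf_fuse([keyword_rank, semantic_rank])
print([docs[i] for i in fused])
```

In this toy data the TODO comment wins on keywords alone, while verify_token() ranks first once the semantic ranking is fused in, which mirrors the "implementations, not mentions" distinction.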

Install

Requires Python 3.11-3.13 (onnxruntime lacks 3.14 support).

pip install hygrep
# or
uv tool install hygrep --python 3.13
# or
pipx install hygrep

Models are downloaded from HuggingFace on first use (~40MB).
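
A quick way to check the interpreter before installing is sketched below. The helper name is_supported is hypothetical; the version bounds are the ones stated above.

```python
import sys

# Supported range taken from the install requirement: Python 3.11 to 3.13.
SUPPORTED = ((3, 11), (3, 13))


def is_supported(version=None):
    """Return True if the (major, minor) version is within the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED[0] <= major_minor <= SUPPORTED[1]


if __name__ == "__main__":
    print("supported" if is_supported() else "unsupported Python version")
```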

Usage

hhg build [path]                # Build/update index (required first)
hhg "query" [path]              # Semantic search
hhg status [path]               # Check index status
hhg list [path]                 # List all indexes under path
hhg clean [path]                # Delete index
hhg clean [path] -r             # Delete index and all sub-indexes

# Options
hhg -n 5 "error handling" .     # Limit results
hhg --json "auth" .             # JSON output for scripts/agents
hhg -l "config" .               # List matching files only
hhg -t py,js "api" .            # Filter by file type
hhg --exclude "tests/*" "fn" .  # Exclude patterns
hhg --exclude "*.md" "api" .    # Code only (exclude docs)

# Model
hhg model                       # Check if model is installed
hhg model install               # Download model (auto-downloads on first use)

Note: Put options before the positional arguments, or use the -- separator:

hhg --exclude "*.md" "api" .   # Options first
hhg "api" . -- --exclude "*.md" # Or use -- separator
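
When calling hhg from a script, the options-first rule is easy to enforce by building the argument list in one place. The sketch below uses only the flags listed under Usage; the function name and its parameters are hypothetical, and it accepts a single exclude pattern because the README does not say whether --exclude may be repeated.

```python
def build_hhg_command(query, path=".", limit=None, json_output=False,
                      files_only=False, file_types=(), exclude=None):
    """Return an argv list for hhg with all options placed before the positionals."""
    cmd = ["hhg"]
    if limit is not None:
        cmd += ["-n", str(limit)]
    if json_output:
        cmd.append("--json")
    if files_only:
        cmd.append("-l")
    if file_types:
        cmd += ["-t", ",".join(file_types)]
    if exclude is not None:
        cmd += ["--exclude", exclude]
    # Positional arguments always come last: the query, then the path.
    cmd += [query, path]
    return cmd


print(build_hhg_command("api", ".", exclude="*.md"))
```

The resulting list can be passed to subprocess.run without a shell, which also avoids quoting problems with patterns such as "*.md".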

Output

Default:

src/auth.py:42 function login
  def login(user, password):
      """Authenticate user and create session."""
      ...

JSON (--json):

[
  {
    "file": "src/auth.py",
    "type": "function",
    "name": "login",
    "line": 42,
    "end_line": 58,
    "content": "def login(user, password): ...",
    "score": 0.87
  }
]

Compact JSON (--json --compact): Same fields without content.
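
A script or agent might consume the JSON output along these lines. The sample string mirrors the record shown above; the helper name top_hits and the min_score threshold are hypothetical, and the sketch assumes that a higher score means a better match.

```python
import json

# Sample output in the documented shape (one record, as in the example above).
SAMPLE = """[
  {
    "file": "src/auth.py",
    "type": "function",
    "name": "login",
    "line": 42,
    "end_line": 58,
    "content": "def login(user, password): ...",
    "score": 0.87
  }
]"""


def top_hits(raw_json, min_score=0.5):
    """Parse hhg-style JSON and return 'file:line type name' strings, best first."""
    results = json.loads(raw_json)
    hits = [r for r in results if r.get("score", 0.0) >= min_score]
    hits.sort(key=lambda r: r["score"], reverse=True)
    return [f'{r["file"]}:{r["line"]} {r["type"]} {r["name"]}' for r in hits]


print(top_hits(SAMPLE))
```

Only file, line, type, name, and score are read, so the same function should also work on --compact output, which omits content.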

How it Works

Query → Embed → Hybrid search (semantic + BM25) → Results
                        ↓
             Requires 'hhg build' first (.hhg/)
             Auto-updates stale files on search
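
The README does not describe how stale files are detected. One common approach, shown here purely as an assumption, is to record a content hash per file at build time and compare it on each search. The function names snapshot and find_stale are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file under root (skipping the .hhg index directory) to its SHA-256."""
    root = Path(root)
    digests = {}
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if path.is_file() and ".hhg" not in rel.parts:
            digests[rel.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def find_stale(root, recorded):
    """Compare the current tree with a recorded snapshot.

    Returns (changed, deleted): files that are new or modified and need
    re-indexing, and files that were indexed but no longer exist.
    """
    current = snapshot(root)
    changed = sorted(p for p, digest in current.items() if recorded.get(p) != digest)
    deleted = sorted(p for p in recorded if p not in current)
    return changed, deleted
```

Hashing every file on each search is simple but not free; comparing modification times first and hashing only the candidates is a typical refinement.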

Supported Files

Code (22 languages): Bash, C, C++, C#, Elixir, Go, Java, JavaScript, JSON, Kotlin, Lua, Mojo, PHP, Python, Ruby, Rust, Svelte, Swift, TOML, TypeScript, YAML, Zig

Text: Markdown, plain text, RST — smart chunking with header context for docs, blog posts, research papers
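
"Chunking with header context" can be pictured as splitting a document at its headings and attaching the heading trail to each chunk. The sketch below is one minimal way to do that for Markdown; it is not hhg's chunker, and chunk_markdown is a hypothetical name. It does not special-case fenced code blocks, so a line starting with # inside a fence would be misread as a heading.

```python
import re

HEADER = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(text):
    """Split Markdown at headings; each chunk keeps its trail of parent headings."""
    chunks = []
    trail = []  # stack of (level, title) for the headings currently in scope
    body = []   # lines collected since the last heading

    def flush():
        content = "\n".join(body).strip()
        if content:
            context = " > ".join(title for _, title in trail)
            chunks.append({"context": context, "text": content})
        body.clear()

    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or a deeper level before pushing the new one.
            while trail and trail[-1][0] >= level:
                trail.pop()
            trail.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return chunks


doc = "# Guide\nIntro text.\n## Install\nRun pip.\n## Usage\nRun hhg.\n"
for chunk in chunk_markdown(doc):
    print(chunk["context"], "|", chunk["text"])
```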

Development

git clone https://github.com/nijaru/hygrep && cd hygrep
pixi install && pixi run build-ext && pixi run test

License

MIT

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distributions

hygrep-0.0.15-cp313-cp313-manylinux_2_17_x86_64.whl (102.5 kB)
  CPython 3.13, manylinux (glibc 2.17+), x86-64

hygrep-0.0.15-cp313-cp313-macosx_11_0_arm64.whl (92.5 kB)
  CPython 3.13, macOS 11.0+, ARM64

hygrep-0.0.15-cp312-cp312-manylinux_2_17_x86_64.whl (102.5 kB)
  CPython 3.12, manylinux (glibc 2.17+), x86-64

hygrep-0.0.15-cp312-cp312-macosx_11_0_arm64.whl (92.4 kB)
  CPython 3.12, macOS 11.0+, ARM64

hygrep-0.0.15-cp311-cp311-manylinux_2_17_x86_64.whl (101.2 kB)
  CPython 3.11, manylinux (glibc 2.17+), x86-64

hygrep-0.0.15-cp311-cp311-macosx_11_0_arm64.whl (92.5 kB)
  CPython 3.11, macOS 11.0+, ARM64

File details

Details for the file hygrep-0.0.15-cp313-cp313-manylinux_2_17_x86_64.whl.

File hashes

Hashes for hygrep-0.0.15-cp313-cp313-manylinux_2_17_x86_64.whl
Algorithm Hash digest
SHA256 d2d18ade5db604e3c56ea9f8ead030cdd534578de480adc73e27bf00884a4d4e
MD5 495df1e8c97aeb576235dded5ea1812b
BLAKE2b-256 f4e45b572d8d3e1daded6e439434b0eb49bdfe71bb72dd0550131c0adfedba4f
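
A downloaded wheel can be checked against the published SHA256 digest with Python's standard hashlib module. This is a small sketch; the helper names sha256_of and verify_sha256 are hypothetical, and the local file path is whatever you saved the wheel as.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path, expected_hex):
    """Return True if the file's SHA-256 matches the expected hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example (assumes the wheel is in the current directory):
# verify_sha256(
#     "hygrep-0.0.15-cp313-cp313-manylinux_2_17_x86_64.whl",
#     "d2d18ade5db604e3c56ea9f8ead030cdd534578de480adc73e27bf00884a4d4e",
# )
```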

Provenance

The following attestation bundles were made for hygrep-0.0.15-cp313-cp313-manylinux_2_17_x86_64.whl:

Publisher: release.yml on nijaru/hygrep

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file hygrep-0.0.15-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for hygrep-0.0.15-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 b5cfb2256ce944b64b3bc2c2d6dc3ca2d15a27a6746533622adb6ab6e18bc56b
MD5 0e8b4998eea19a07f2c5d82f5e1faf40
BLAKE2b-256 9eaf6e4e99bfa3b0ce6dc8d5441519f627c99845908f1dfafc820ff5fdab571f

Provenance

The following attestation bundles were made for hygrep-0.0.15-cp313-cp313-macosx_11_0_arm64.whl:

Publisher: release.yml on nijaru/hygrep

File details

Details for the file hygrep-0.0.15-cp312-cp312-manylinux_2_17_x86_64.whl.

File hashes

Hashes for hygrep-0.0.15-cp312-cp312-manylinux_2_17_x86_64.whl
Algorithm Hash digest
SHA256 43efa1d5a40b6e39fe06d3e40f1aa14895d79dffa01aa03d7bd276ade7000ef0
MD5 710d1d353813ff77546347f1630b190a
BLAKE2b-256 70c731364f43fa8f77b4a20fac016fda7a998654a13c2bfb414ee0dcb7892b2c

Provenance

The following attestation bundles were made for hygrep-0.0.15-cp312-cp312-manylinux_2_17_x86_64.whl:

Publisher: release.yml on nijaru/hygrep

File details

Details for the file hygrep-0.0.15-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for hygrep-0.0.15-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 db15406807defd4f82cc06f52d3d4035eb3529acdb9c3e87be6321dcfcae1df2
MD5 2aa0b3d3f722cb97a1a03497027f0111
BLAKE2b-256 453447a89be8858f49d256fa0713e2ece8b2b461db5756c65408460444a59750

Provenance

The following attestation bundles were made for hygrep-0.0.15-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: release.yml on nijaru/hygrep

File details

Details for the file hygrep-0.0.15-cp311-cp311-manylinux_2_17_x86_64.whl.

File hashes

Hashes for hygrep-0.0.15-cp311-cp311-manylinux_2_17_x86_64.whl
Algorithm Hash digest
SHA256 bdad443e0243468e0c301f4d16777ad45ada67b21c623a1eb2bc206ef99e55ea
MD5 e42c4530843e6c562a0c2e062fe83d41
BLAKE2b-256 62f4076e3e849e24a5099b43d3ea291aee96fd1501f7fa128089215b1ae01f9c

Provenance

The following attestation bundles were made for hygrep-0.0.15-cp311-cp311-manylinux_2_17_x86_64.whl:

Publisher: release.yml on nijaru/hygrep

File details

Details for the file hygrep-0.0.15-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for hygrep-0.0.15-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 34a9f25dd9b1c0758cf534f46b8a9c3f775e3e9cb808e7d007c40b4fddfe9bd5
MD5 debadbc18e0860e648a829a872fcd9d1
BLAKE2b-256 4ea66d6afeef4291f805caa2ce00a5065b28125276f0505659cc8115d55299f0

Provenance

The following attestation bundles were made for hygrep-0.0.15-cp311-cp311-macosx_11_0_arm64.whl:

Publisher: release.yml on nijaru/hygrep
