Queryable concept map of a codebase for LLM coding agents

combfind

Give an AI agent a codebase. combfind tells it where to look.

combfind builds a local index of a repository so an agent can find the right files and functions for a task with a plain-text query, without reading the entire codebase.

Install

For local LLM inference:

pip3 install "combfind[llm]" \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu

Download the default model (Qwen2.5-Coder-3B-Instruct Q6_K, ~2.5 GB):

combfind download-model

For a remote OpenAI-compatible API instead:

pip3 install "combfind[openai]"

For Apple Silicon (MLX):

pip3 install "combfind[mlx]"

Usage

# Index a repository (local LLM, auto-detected model)
combfind init /path/to/repo --db repo.db

# Exclude test files (recommended for cleaner concepts)
combfind init /path/to/repo --db repo.db --exclude-regex '.*test.*'

# Index using a remote OpenAI-compatible API
COMBFIND_LLM_API_KEY=sk-... COMBFIND_LLM_MODEL=gpt-4o-mini \
  combfind init /path/to/repo --db repo.db --llm-mode openai

# Index using Apple Silicon MLX
combfind init /path/to/repo --db repo.db --llm-mode mlx \
  --llm-model mlx-community/Qwen2.5-7B-Instruct-4bit

# Query it
combfind query "how does authentication work" --db repo.db
combfind query "where are database migrations" --db repo.db --format json

# Inspect a symbol returned by a query
combfind inspect auth.service.AuthService --db repo.db
combfind inspect auth.service.AuthService --db repo.db --format json

Query output (text)

[1] Token Refresh (implementation) — 0.87
    why: Handles session token validation and refresh logic.
    auth/service.py
      auth.service.AuthService.refresh  :42-67
      auth.service.AuthService.validate  :70-91

Query output (JSON)

[
  {
    "rank": 1,
    "concept": "Token Refresh",
    "role": "implementation",
    "score": 0.87,
    "files": [
      {
        "path": "auth/service.py",
        "symbols": [
          {"name": "refresh", "qualified_name": "auth.service.AuthService.refresh", "start_line": 42, "end_line": 67},
          {"name": "validate", "qualified_name": "auth.service.AuthService.validate", "start_line": 70, "end_line": 91}
        ]
      }
    ],
    "why_relevant": "Handles session token validation and refresh logic.",
    "sibling_implementations": []
  }
]
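The JSON form is meant for programmatic consumption. A minimal sketch of how an agent might flatten results into file/line spans to open (the payload below is the example output shown above):

```python
import json

# Example payload as emitted by `combfind query ... --format json`
raw = """
[
  {
    "rank": 1,
    "concept": "Token Refresh",
    "role": "implementation",
    "score": 0.87,
    "files": [
      {
        "path": "auth/service.py",
        "symbols": [
          {"name": "refresh", "qualified_name": "auth.service.AuthService.refresh", "start_line": 42, "end_line": 67},
          {"name": "validate", "qualified_name": "auth.service.AuthService.validate", "start_line": 70, "end_line": 91}
        ]
      }
    ],
    "why_relevant": "Handles session token validation and refresh logic.",
    "sibling_implementations": []
  }
]
"""

def spans(results):
    """Flatten query results into (path, start_line, end_line) tuples an agent can open."""
    out = []
    for concept in results:
        for f in concept["files"]:
            for sym in f["symbols"]:
                out.append((f["path"], sym["start_line"], sym["end_line"]))
    return out

print(spans(json.loads(raw)))
# [('auth/service.py', 42, 67), ('auth/service.py', 70, 91)]
```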

Inspect output (text)

auth.service.AuthService  (class, auth/service.py:10-80)
concept:  Token Refresh  [implementation]
sig:      class AuthService

callers (1):
  auth.mock.MockAuthService  auth/mock.py:5

callees (1):
  auth.service.AuthService.validate  auth/service.py:20

concept siblings (1):
  auth.service.AuthService.validate  [method]  auth/service.py

Init options

Flag             Default                    Description
--db             <repo_path>/.combfind.db   Output path
--llm-model      auto-detected              Path to a GGUF model file (local mode only)
--llm-mode       local                      LLM backend: local (llama.cpp), openai (OpenAI-compatible API), or mlx (Apple Silicon)
--exclude-paths  -                          Paths to skip, relative to the repo root (repeatable)
--exclude-regex  -                          Regex matched against file paths to skip
--llm-workers    1                          Parallel LLM calls (useful with --llm-mode openai)
--docgen         off                        Generate LLM docstrings for undocumented symbols (slow)
--force          off                        Re-run all stages, ignoring the cache

Query options

Flag             Default        Description
--db             .combfind.db   Database to query
--top-k          5              Number of results to return
--format         text           Output format: text or json
--rerank         off            Re-score results with the LLM for better precision (requires --llm-mode)
--agentic        off            Run an iterative query loop: the LLM steers follow-up searches until satisfied (requires --llm-mode)
--agentic-limit  3              Max iterations for --agentic mode
--llm-mode       -              LLM backend for --rerank / --agentic: local or openai

Inspect options

Flag      Default        Description
--db      .combfind.db   Database to query
--format  text           Output format: text or json

Environment variables

Variable               Default           Description
COMBFIND_LOG_LEVEL     info              Log verbosity: debug, info, warning, or error
COMBFIND_MODEL         (auto-detected)   GGUF path for local mode / HF repo for mlx mode; equivalent to --llm-model
COMBFIND_LLM_BASE_URL  -                 Base URL for an OpenAI-compatible API (e.g. https://api.openai.com/v1)
COMBFIND_LLM_API_KEY   -                 API key for the remote LLM
COMBFIND_LLM_MODEL     gpt-4o-mini       Model name to use with --llm-mode openai
HF_HUB_OFFLINE         -                 Set to 1 to skip HuggingFace network checks and use cached embedding models only

Using a remote LLM API

Pass --llm-mode openai to use any OpenAI-compatible API instead of a local model. Configure it with environment variables:

export COMBFIND_LLM_BASE_URL=https://api.openai.com/v1
export COMBFIND_LLM_API_KEY=sk-...
export COMBFIND_LLM_MODEL=gpt-4o-mini

combfind init /path/to/repo --db repo.db --llm-mode openai

Any API that speaks the OpenAI chat completions format works, including:

  • OpenAI — set COMBFIND_LLM_BASE_URL=https://api.openai.com/v1
  • Ollama — set COMBFIND_LLM_BASE_URL=http://localhost:11434/v1 and COMBFIND_LLM_API_KEY=ollama
  • LM Studio — set COMBFIND_LLM_BASE_URL=http://localhost:1234/v1
  • Any other OpenAI-compatible server — point COMBFIND_LLM_BASE_URL at its /v1 endpoint

--llm-model is ignored in openai mode; the model is selected via COMBFIND_LLM_MODEL.
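"OpenAI-compatible" here means any server accepting the standard chat-completions request shape. A minimal illustration of that wire format, driven by the same environment variables (this is not combfind's internal code; the endpoint URL and prompt are illustrative):

```python
import json
import os

# Illustrative defaults only (e.g. a local Ollama server); real values come
# from your environment, as shown in the export commands above.
os.environ.setdefault("COMBFIND_LLM_BASE_URL", "http://localhost:11434/v1")
os.environ.setdefault("COMBFIND_LLM_MODEL", "gpt-4o-mini")

# All OpenAI-compatible servers expose POST <base_url>/chat/completions
url = os.environ["COMBFIND_LLM_BASE_URL"].rstrip("/") + "/chat/completions"
body = {
    "model": os.environ["COMBFIND_LLM_MODEL"],
    "messages": [{"role": "user", "content": "Summarize this module in one line."}],
}

print(url)
print(json.dumps(body, indent=2))
```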

Clustering

combfind groups symbols by their package/directory, then sub-clusters large packages using KMeans (targeting ~20 symbols per concept). This produces stable, interpretable concepts aligned with the codebase structure.
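A sketch of that grouping logic. The symbol names are made up, and a simple round-robin split stands in for the KMeans step (the real implementation sub-clusters embeddings); only the package grouping and the ~20-symbols-per-concept target are taken from the description above:

```python
import math
from collections import defaultdict

TARGET = 20  # approximate symbols per concept

# Hypothetical symbol records: (qualified_name, package)
symbols = [(f"auth.mod{i % 3}.fn{i}", "auth") for i in range(45)] + \
          [(f"billing.fn{i}", "billing") for i in range(8)]

# Stage 1: group symbols by their package/directory
by_package = defaultdict(list)
for name, pkg in symbols:
    by_package[pkg].append(name)

# Stage 2: sub-cluster large packages into k groups of roughly TARGET symbols.
# Round-robin assignment stands in for KMeans over embeddings here.
concepts = {}
for pkg, names in by_package.items():
    k = max(1, math.ceil(len(names) / TARGET))
    for i, name in enumerate(names):
        concepts.setdefault(f"{pkg}#{i % k}", []).append(name)

print({c: len(ns) for c, ns in sorted(concepts.items())})
# {'auth#0': 15, 'auth#1': 15, 'auth#2': 15, 'billing#0': 8}
```

Small packages stay whole (billing keeps one concept), while the 45-symbol auth package splits into three concepts near the target size.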

For best results, exclude test files at index time:

combfind init . --exclude-regex '.*test.*'
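Note that `.*test.*` matches anywhere in the path, so it can also catch non-test files whose names merely contain "test" (the paths below are hypothetical):

```python
import re

# The exclude pattern from the command above, matched against repo-relative paths
pattern = re.compile(r".*test.*")

paths = [
    "auth/service.py",
    "auth/test_service.py",
    "tests/conftest.py",
    "auth/attestation.py",  # contains "test" as a substring!
]
excluded = [p for p in paths if pattern.match(p)]
print(excluded)
# ['auth/test_service.py', 'tests/conftest.py', 'auth/attestation.py']
```

Tighten the regex (e.g. anchor it to path segments or filename prefixes) if your repo has names like that.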

Supported languages

Python, Go, and Java. More languages can be added via tree-sitter grammars.

Optional dependencies

These tools are not required but improve reference edge quality when installed:

Tool         Language  Install                                                       Effect
scip-go      Go        go install github.com/scip-code/scip-go/cmd/scip-go@latest    Type-resolved call/import edges between Go symbols
scip-python  Python    npm install -g @sourcegraph/scip-python                       Type-resolved call/import edges between Python symbols
scip-java    Java      See scip-java releases                                        Type-resolved call/import edges between Java symbols

Without them, combfind falls back to tree-sitter heuristics for import edges, which is less precise but works out of the box.
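To illustrate why the syntactic fallback is less precise: it can see which modules a file imports, but not what the imported names resolve to. A simplified stand-in using Python's `ast` module (combfind itself uses tree-sitter, and this example is Python-only):

```python
import ast

source = """
import os
from auth.service import AuthService
from . import mock
"""

# Collect import edges purely from syntax, with no type resolution
edges = []
for node in ast.walk(ast.parse(source)):
    if isinstance(node, ast.Import):
        edges += [alias.name for alias in node.names]
    elif isinstance(node, ast.ImportFrom):
        edges.append(node.module or ".")  # relative imports have no module name

print(edges)
# ['os', 'auth.service', '.']
```

A SCIP indexer would additionally resolve `AuthService` to a concrete class symbol and link individual call sites, which is what the table above means by "type-resolved" edges.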

Contributing

See CONTRIBUTING.md for dev setup, commit conventions, and how the release pipeline works.
