Token-efficient CLI for indexing and searching code symbols (Python-first, designed for minimal LLM/agent context size)

Sampler

Token-efficient CLI for indexing and searching code symbols across multiple projects.

Current version: 0.4.4

Designed for humans and agents: compact default output, short paths, and low-noise symbol views.

Requirements

  • Python 3.11+

Installation

pip install sampler-cli

Development setup:

pip install -e '.[dev]'

Semantic stack (TF-IDF + local hash fallback):

pip install -e '.[semantic]'

Quick Start

sampler init
sampler project add myproj /absolute/path/to/project --language auto
sampler index myproj
sampler search retry --project myproj
sampler symbols myproj
sampler overview src/main.py

Command Overview

Core:

  • sampler version [--plain]
  • sampler init
  • sampler index <project>
  • sampler search <query> [--project <name>] [--type <t>] [--limit <n>] [--semantic] [--style plain|bars]
  • sampler search-all <query> [--type <t>] [--limit <n>]
  • sampler symbols <project> [--type <t>] [--limit <n>]
  • sampler overview <filepath> [--style plain|bars]

Relationships:

  • sampler callers <symbol> [--project <name>] [--file <path-or-suffix>]
  • sampler usages <symbol> [--project <name>] [--file <path-or-suffix>]
  • sampler related <symbol> [--project <name>] [--file <path-or-suffix>] [--style plain|bars]
  • Alternative selector: <path>:<symbol> (e.g. app/utils/helpers.py:format_kda)
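The `<path>:<symbol>` selector form can be split on its last colon. The sketch below only illustrates that idea: `parse_selector` and `Selector` are hypothetical names, not part of sampler's API, and treating a colon-free string as a bare symbol name is an assumption.

```python
from typing import NamedTuple, Optional


class Selector(NamedTuple):
    path: Optional[str]  # file path or suffix; None when only a symbol was given
    symbol: str


def parse_selector(text: str) -> Selector:
    """Split '<path>:<symbol>' on the last colon (hypothetical helper)."""
    path, sep, symbol = text.rpartition(":")
    if not sep or not path:
        # No colon, or nothing before it: treat the whole text as a symbol name.
        return Selector(None, symbol)
    return Selector(path, symbol)


print(parse_selector("app/utils/helpers.py:format_kda"))
print(parse_selector("format_kda"))
```

Splitting on the last colon rather than the first keeps any colons inside the path portion intact.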

Project management:

  • sampler project add <name> <path> --language <python|go|typescript|javascript|vue|auto>
  • sampler project update <name> [--path <abs-path>] [--language <lang>]
  • sampler project list
  • sampler project deps <name>
  • sampler project remove <name>

Config:

  • sampler config show
  • sampler config embeddings [--provider P] [--model M]

Semantic and analysis:

  • sampler embed <project> [--batch-size <n>]
  • sampler stale-code <project> [--limit <n>]

Embeddings & Semantic Search

sampler search --semantic (and hybrid ranking) supports pluggable providers via the adapter pattern:

  • Default: bge-small (BAAI/bge-small-en-v1.5 via fastembed; lightweight ONNX model, 384 dimensions, runs locally).
  • Other built-ins: hash (always-on deterministic fallback), ollama (e.g. nomic-embed-text), nomic, openai, fastembed.
  • TF-IDF (sklearn, computed on the fly, no pre-embedding step) remains the primary lexical ranker when no provider embeddings have been precomputed for the active model.
  • Hash fingerprint is the final always-available fallback.
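The fallback order described in the list above can be sketched as a small decision function. This is a reading of the README text, not sampler's actual code: `pick_ranker` and its parameters are hypothetical, and hybrid ranking (which may combine several signals) is not modeled.

```python
import importlib.util


def pick_ranker(has_precomputed_vectors: bool, tfidf_available: bool) -> str:
    """Return the ranking backend to use, following the fallback order above."""
    if has_precomputed_vectors:
        return "provider"  # vectors exist for the active embedding model
    if tfidf_available:
        return "tfidf"     # on-the-fly lexical ranking via scikit-learn
    return "hash"          # deterministic fallback that needs no extra packages


# Usage: assume TF-IDF is usable only when scikit-learn can be imported.
tfidf_ok = importlib.util.find_spec("sklearn") is not None
print(pick_ranker(has_precomputed_vectors=False, tfidf_available=tfidf_ok))
```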

Configuration (in ~/.sampler/config.yaml or via sampler config embeddings ...):

embeddings:
  provider: "bge-small"
  # provider: "ollama"
  # model: "nomic-embed-text"
  # base_url: "http://localhost:11434"

Install:

# For default BGE (recommended for most users)
pip install 'sampler-cli[embeddings]'

# Or for Ollama / OpenAI only
pip install 'sampler-cli[ollama-embeddings]'
pip install 'sampler-cli[openai-embeddings]'

sampler embed <project> precomputes vectors with the currently configured provider and shows a progress bar. After changing the provider, update the config and re-run embed; vectors from the previous provider are ignored until the project is re-embedded.

Offline / air-gapped: set provider: hash, or skip the embeddings extra entirely; TF-IDF and hash still work if the [semantic] extra is installed.
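To show why a hash provider needs no model download, here is a minimal feature-hashing sketch. It is an assumption about the general technique, not sampler's implementation: `hash_embed`, the 64-dimension default, and the tokenizer regex are all made up for illustration.

```python
import hashlib
import math
import re


def hash_embed(text: str, dim: int = 64) -> list[float]:
    """Map text to a fixed-size unit vector via feature hashing (illustrative)."""
    vec = [0.0] * dim
    for token in re.findall(r"[A-Za-z0-9_]+", text.lower()):
        # hashlib is stable across runs, unlike the built-in hash() for strings.
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[bucket] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity for vectors that are already L2-normalized."""
    return sum(x * y for x, y in zip(a, b))


query = hash_embed("retry")
print(cosine(query, hash_embed("retry retry")))
```

Because the vector depends only on the input text, the same query always produces the same embedding, which is what makes this kind of fallback deterministic and usable offline.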

Language Support

  • Python parser: stdlib AST (stable)
  • Go parser: tree-sitter-go (real extraction)
  • TypeScript/JavaScript parser: tree-sitter-typescript (real extraction)
  • Vue parser: extracts <script>/<script setup> + delegates to TS/JS parser (supports lang=ts/js etc.)
  • --language auto: per-file language detection for monorepos/multi-language projects (for auto projects, project list shows detected languages + file % breakdown)
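Per-file detection with a percentage breakdown, as mentioned for --language auto, could look roughly like the sketch below. The extension map and the `language_breakdown` helper are assumptions for illustration; sampler's real detection rules may differ.

```python
from collections import Counter
from pathlib import PurePosixPath

# Assumed extension map; not taken from sampler's source.
EXT_TO_LANG = {
    ".py": "python",
    ".go": "go",
    ".ts": "typescript",
    ".tsx": "typescript",
    ".js": "javascript",
    ".jsx": "javascript",
    ".vue": "vue",
}


def language_breakdown(paths: list[str]) -> dict[str, float]:
    """Return {language: percent of recognized files}, rounded to one decimal."""
    counts = Counter(
        EXT_TO_LANG[ext]
        for p in paths
        if (ext := PurePosixPath(p).suffix.lower()) in EXT_TO_LANG
    )
    total = sum(counts.values())
    # Files with unrecognized extensions are left out of the total.
    return {lang: round(100 * n / total, 1) for lang, n in counts.most_common()}


print(language_breakdown(["a.py", "b.py", "c.go", "README.md"]))
```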

Stale Code Detection

sampler stale-code <project> reports functions and methods as stale-code candidates when all of the following hold:

  • the function is called from test files
  • the function has zero non-test callers in the project call graph
  • the symbol is defined in production code (symbols defined in test files are excluded)
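The three conditions above can be sketched as a filter over a call graph. This is a simplified illustration, not sampler's implementation: the `Symbol` dataclass, the `stale_candidates` function, and representing callers as a list of file paths are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Symbol:
    name: str
    file: str                                           # file that defines the symbol
    caller_files: list[str] = field(default_factory=list)  # files that call it


def stale_candidates(
    symbols: list[Symbol], is_test: Callable[[str], bool]
) -> list[Symbol]:
    """Keep symbols defined in production code that only tests call."""
    result = []
    for sym in symbols:
        if is_test(sym.file):
            continue  # symbols defined in test files are excluded
        test_callers = [f for f in sym.caller_files if is_test(f)]
        non_test_callers = [f for f in sym.caller_files if not is_test(f)]
        if test_callers and not non_test_callers:
            result.append(sym)
    return result


symbols = [
    Symbol("retry_request", "src/utils/retry.py", ["tests/test_retry.py"]),
    Symbol("load_config", "src/config.py", ["src/main.py", "tests/test_config.py"]),
    Symbol("unused", "src/old.py", []),
]
found = stale_candidates(symbols, lambda path: path.startswith("tests/"))
print([s.name for s in found])
```

Note that a symbol with no callers at all is not reported under these conditions; it must have at least one test caller.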

Test file detection supports common multi-language patterns:

  • Python: tests/, test_*.py, *_test.py
  • Go: *_test.go
  • TypeScript/JavaScript/Vue: __tests__/, test/, spec/, *.test.*, *.spec.* (incl. *.test.vue)
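A rough sketch of path-based test detection using the patterns listed above. It is a simplification: `is_test_file` is a hypothetical helper, and unlike the per-language list it applies every pattern to every file regardless of language.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Directory names and file globs taken from the list above.
TEST_DIRS = {"tests", "__tests__", "test", "spec"}
TEST_FILE_GLOBS = ("test_*.py", "*_test.py", "*_test.go", "*.test.*", "*.spec.*")


def is_test_file(path: str) -> bool:
    """Return True if the path sits in a test directory or matches a test glob."""
    p = PurePosixPath(path)
    if any(part in TEST_DIRS for part in p.parts[:-1]):
        return True
    # fnmatchcase keeps matching case-sensitive on every platform.
    return any(fnmatchcase(p.name, glob) for glob in TEST_FILE_GLOBS)


for candidate in ("tests/test_retry.py", "pkg/retry_test.go", "src/utils/retry.py"):
    print(candidate, is_test_file(candidate))
```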

This is a heuristic signal, not proof of dead code.

Examples

$ sampler search worker --project myproj
myproj:src/tasks/celery_app.py:70 function on_worker_ready  def on_worker_ready(sender)

$ sampler related ConfigManager --project myproj --style bars
myproj:src/config.py:24-105 class ConfigManager  [parent]
...

$ sampler stale-code myproj
myproj:src/utils/retry.py:12-28 function retry_request  test_callers=2 non_test_callers=0  [tests.test_retry.test_retry_request]
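For agents that consume this output, the compact lines can be parsed with a single regular expression. The pattern below is inferred only from the three example lines above and is an assumption; the real output format may have variants it does not cover. `Hit` and `parse_hit` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# project:path:start[-end] kind name  trailing details
LINE_RE = re.compile(
    r"^(?P<project>[^:\s]+):(?P<path>[^\s:]+):(?P<start>\d+)(?:-(?P<end>\d+))?"
    r"\s+(?P<kind>\S+)\s+(?P<name>\S+)\s*(?P<rest>.*)$"
)


class Hit(NamedTuple):
    project: str
    path: str
    start: int
    end: Optional[int]  # None when the line shows a single line number
    kind: str
    name: str
    rest: str           # signature, tags, or counters, left unparsed


def parse_hit(line: str) -> Optional[Hit]:
    """Parse one compact result line; return None for lines that do not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    end = m.group("end")
    return Hit(
        m.group("project"),
        m.group("path"),
        int(m.group("start")),
        int(end) if end else None,
        m.group("kind"),
        m.group("name"),
        m.group("rest"),
    )


print(parse_hit("myproj:src/config.py:24-105 class ConfigManager  [parent]"))
```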

Data Location

  • Config: ~/.sampler/config.yaml
  • DB: ~/.sampler/graph.db

Running Tests

pytest -q

Notes

  • Compact output is default by design (token-efficient for agent workflows).
  • For broader roadmap details, see TODO.md and PLAN.md.

Download files

Source Distribution

sampler_cli-0.4.4.tar.gz (61.2 kB view details)

Uploaded Source

Built Distribution

sampler_cli-0.4.4-py3-none-any.whl (58.1 kB view details)

Uploaded Python 3

File details

Details for the file sampler_cli-0.4.4.tar.gz.

File metadata

  • Download URL: sampler_cli-0.4.4.tar.gz
  • Upload date:
  • Size: 61.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for sampler_cli-0.4.4.tar.gz
Algorithm Hash digest
SHA256 fbeb897891073ebc30ffb38c3c9cf2b9e948f5fa5d6a8bd5f91fffd20dfbcca0
MD5 9f36ca7f14768c710d6986c97b0ea53a
BLAKE2b-256 12e1f473be48aff66e92ea5bb954b570663ac5267d66084a843752634ac26933

Provenance

The following attestation bundles were made for sampler_cli-0.4.4.tar.gz:

Publisher: publish.yml on SamuelCarmona83/sampler-cli

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file sampler_cli-0.4.4-py3-none-any.whl.

File metadata

  • Download URL: sampler_cli-0.4.4-py3-none-any.whl
  • Upload date:
  • Size: 58.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for sampler_cli-0.4.4-py3-none-any.whl
Algorithm Hash digest
SHA256 30f377cdb6f0d3e3c788ffb15f59f450febe8656cda5a97608322515379c6536
MD5 8cb144237f4b30c085f5c1fe94c0a4be
BLAKE2b-256 e41cfa3772f5996705f0fad392467e17ba0cb960a1e6c28f6154f6b154bf3e25

Provenance

The following attestation bundles were made for sampler_cli-0.4.4-py3-none-any.whl:

Publisher: publish.yml on SamuelCarmona83/sampler-cli
