Token-efficient CLI for indexing and searching code symbols (Python-first, designed for minimal LLM/agent context size)

Sampler

Token-efficient CLI for indexing and searching code symbols across multiple projects.

Current version: 0.4.3

Designed for humans and agents: compact default output, short paths, and low-noise symbol views.

Requirements

  • Python 3.11+

Installation

pip install sampler-cli

Development setup:

pip install -e '.[dev]'

Semantic stack (TF-IDF + local hash fallback):

pip install -e '.[semantic]'

Quick Start

sampler init
sampler project add myproj /absolute/path/to/project --language auto
sampler index myproj
sampler search retry --project myproj
sampler symbols myproj
sampler overview src/main.py

Command Overview

Core:

  • sampler version [--plain]
  • sampler init
  • sampler index <project>
  • sampler search <query> [--project <name>] [--type <t>] [--limit <n>] [--semantic] [--style plain|bars]
  • sampler search-all <query> [--type <t>] [--limit <n>]
  • sampler symbols <project> [--type <t>] [--limit <n>]
  • sampler overview <filepath> [--style plain|bars]

Relationships:

  • sampler callers <symbol> [--project <name>] [--file <path-or-suffix>]
  • sampler usages <symbol> [--project <name>] [--file <path-or-suffix>]
  • sampler related <symbol> [--project <name>] [--file <path-or-suffix>] [--style plain|bars]
  • Alternative selector: <path>:<symbol> (e.g. app/utils/helpers.py:format_kda)
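
The <path>:<symbol> selector and the --file <path-or-suffix> filter can be pictured with a small parser. This is a minimal sketch, not sampler's actual code: Selector, parse_selector and file_matches are hypothetical names, and matching suffixes only on a path-segment boundary is an assumption.

```python
from typing import NamedTuple, Optional


class Selector(NamedTuple):
    path: Optional[str]  # file path or suffix, None when only a symbol was given
    symbol: str


def parse_selector(text: str) -> Selector:
    """Split '<path>:<symbol>' on the last colon; a bare name has no path."""
    path, sep, symbol = text.rpartition(":")
    if not sep:
        return Selector(None, text)
    return Selector(path, symbol)


def file_matches(indexed_path: str, wanted: Optional[str]) -> bool:
    """True when no filter is given, or when `wanted` is the full path
    or a suffix of it that starts at a path-segment boundary (assumed rule)."""
    if wanted is None:
        return True
    indexed = indexed_path.replace("\\", "/")
    wanted = wanted.replace("\\", "/")
    return indexed == wanted or indexed.endswith("/" + wanted.lstrip("/"))
```

Under these assumptions, parse_selector("app/utils/helpers.py:format_kda") yields the path app/utils/helpers.py and the symbol format_kda, while a plain name such as ConfigManager is treated as a symbol with no file filter.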

Project management:

  • sampler project add <name> <path> --language <python|go|typescript|javascript|auto>
  • sampler project update <name> [--path <abs-path>] [--language <lang>]
  • sampler project list
  • sampler project deps <name>
  • sampler project remove <name>

Config:

  • sampler config show
  • sampler config embeddings [--provider P] [--model M]

Semantic and analysis:

  • sampler embed <project> [--batch-size <n>]
  • sampler stale-code <project> [--limit <n>]

Embeddings & Semantic Search

sampler search --semantic (and hybrid ranking) supports pluggable providers via the adapter pattern:

  • Default: bge-small (BAAI/bge-small-en-v1.5 via fastembed; lightweight ONNX, 384-dim, local).
  • Other built-ins: hash (always-on deterministic fallback), ollama (e.g. nomic-embed-text), nomic, openai, fastembed.
  • TF-IDF (sklearn, on-the-fly, no pre-embed) remains the fast lexical primary when no provider embeddings are precomputed for the active model.
  • Hash fingerprint is the final always-available fallback.
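
To illustrate why a hash-based provider can be "always on", here is a generic feature-hashing sketch using only the standard library. It is an assumption about the general technique, not the fingerprint sampler actually computes; hash_embed, cosine and the 64-dimension default are hypothetical.

```python
import hashlib
import math
import re


def hash_embed(text: str, dim: int = 64) -> list[float]:
    """Feature-hash alphanumeric tokens into a fixed-size, L2-normalized vector.

    Deterministic and dependency-free: the same text always maps to the
    same vector, with no model download and no network access.
    """
    vec = [0.0] * dim
    for token in re.findall(r"[A-Za-z0-9]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim  # which slot to update
        sign = 1.0 if digest[4] % 2 == 0 else -1.0        # signed hashing trick
        vec[bucket] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))
```

Vectors like these capture token overlap rather than meaning, which is consistent with treating hash as a last-resort fallback behind model embeddings and TF-IDF.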

Configuration (in ~/.sampler/config.yaml or via sampler config embeddings ...):

embeddings:
  provider: "bge-small"
  # provider: "ollama"
  # model: "nomic-embed-text"
  # base_url: "http://localhost:11434"

Install:

# For default BGE (recommended for most users)
pip install 'sampler-cli[embeddings]'

# Or for Ollama / OpenAI only
pip install 'sampler-cli[ollama-embeddings]'
pip install 'sampler-cli[openai-embeddings]'

sampler embed <project> precomputes vectors with the currently configured provider and shows a progress bar. After changing the provider, update the config and re-run embed; vectors from the previous provider are ignored until the project is re-embedded.
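
The fallback order described in this section (precomputed vectors for the active provider, then TF-IDF, then hash) can be sketched as a simple selection step. This is an illustrative model only: StoredVector, usable_vectors and pick_ranking are hypothetical names, and keying stored vectors by (provider, model) is an assumption about how stale vectors get ignored.

```python
from typing import NamedTuple


class StoredVector(NamedTuple):
    symbol: str
    provider: str  # provider that produced the vector, e.g. "bge-small"
    model: str     # model name recorded at embed time
    values: list[float]


def usable_vectors(stored: list[StoredVector], provider: str, model: str) -> list[StoredVector]:
    """Keep only vectors produced by the currently configured provider and model."""
    return [v for v in stored if v.provider == provider and v.model == model]


def pick_ranking(stored: list[StoredVector], provider: str, model: str,
                 tfidf_available: bool) -> str:
    """Return which ranking path a semantic query would take in this sketch."""
    if usable_vectors(stored, provider, model):
        return "embeddings"
    if tfidf_available:  # e.g. the [semantic] extra is installed
        return "tfidf"
    return "hash"
```

In this model, switching the configured provider without re-running embed leaves no usable vectors, so queries fall through to TF-IDF or hash until the project is embedded again.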

Offline / air-gapped: set provider: hash, or simply skip the embeddings extra. TF-IDF and hash still work as long as the [semantic] extra is installed.

Language Support

  • Python parser: stdlib AST (stable)
  • Go parser: tree-sitter-go (real extraction)
  • TypeScript/JavaScript parser: tree-sitter-typescript (real extraction)
  • --language auto: per-file language detection for monorepos/multi-language projects
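
Per-file detection for --language auto could plausibly be driven by file extension. The sketch below assumes exactly that; the extension table, detect_language and group_by_language are hypothetical and may not match what sampler does internally.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical extension table; the tool's real mapping may differ.
EXTENSION_LANGUAGE = {
    ".py": "python",
    ".go": "go",
    ".ts": "typescript",
    ".tsx": "typescript",
    ".js": "javascript",
    ".jsx": "javascript",
}


def detect_language(path: str) -> Optional[str]:
    """Map a file path to a parser language, or None if unsupported."""
    return EXTENSION_LANGUAGE.get(PurePosixPath(path).suffix.lower())


def group_by_language(paths: list[str]) -> dict[str, list[str]]:
    """Bucket files by detected language so each bucket goes to one parser."""
    groups: dict[str, list[str]] = {}
    for path in paths:
        language = detect_language(path)
        if language is not None:
            groups.setdefault(language, []).append(path)
    return groups
```

In a monorepo, this kind of grouping would send services/api/main.go to the Go parser and web/src/App.tsx to the TypeScript parser within a single project, while unsupported files such as README.md are skipped.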

Stale Code Detection

sampler stale-code <project> reports functions and methods as stale candidates when all of the following hold:

  • function is called from test files
  • function has zero non-test callers in project call graph
  • symbol is defined in production code (symbols defined in test files are excluded)

Test file detection supports common multi-language patterns:

  • Python: tests/, test_*.py, *_test.py
  • Go: *_test.go
  • TypeScript/JavaScript: __tests__/, test/, spec/, *.test.*, *.spec.*

This is a heuristic signal, not proof of dead code.
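
The three conditions and the test-file patterns above can be combined into a short sketch. It is a simplified model, not sampler's implementation: is_test_file and stale_candidates are hypothetical, the call graph is reduced to (caller_file, callee_symbol) pairs, and the patterns are applied regardless of language for brevity.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns taken from the list above, merged across languages (a simplification).
TEST_DIRS = {"tests", "__tests__", "test", "spec"}
TEST_NAME_GLOBS = ("test_*.py", "*_test.py", "*_test.go", "*.test.*", "*.spec.*")


def is_test_file(path: str) -> bool:
    """True if any parent directory or the file name matches a test pattern."""
    p = PurePosixPath(path)
    if any(part in TEST_DIRS for part in p.parts[:-1]):
        return True
    return any(fnmatchcase(p.name, glob) for glob in TEST_NAME_GLOBS)


def stale_candidates(definitions: dict[str, str],
                     calls: list[tuple[str, str]]) -> list[str]:
    """definitions: symbol -> defining file; calls: (caller_file, callee_symbol)."""
    result = []
    for symbol, def_file in definitions.items():
        if is_test_file(def_file):
            continue  # symbols defined in test files are excluded
        callers = [caller for caller, callee in calls if callee == symbol]
        test_callers = sum(is_test_file(caller) for caller in callers)
        non_test_callers = len(callers) - test_callers
        if test_callers > 0 and non_test_callers == 0:
            result.append(symbol)
    return result
```

Note that a function with no callers at all is not reported under these rules; the heuristic targets code that only tests keep alive.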

Examples

$ sampler search worker --project myproj
myproj:src/tasks/celery_app.py:70 function on_worker_ready  def on_worker_ready(sender)

$ sampler related ConfigManager --project myproj --style bars
myproj:src/config.py:24-105 class ConfigManager  [parent]
...

$ sampler stale-code myproj
myproj:src/utils/retry.py:12-28 function retry_request  test_callers=2 non_test_callers=0  [tests.test_retry.test_retry_request]
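
For agents that consume this compact output, a small parser can turn each line into structured fields. The line shape assumed here (project:path:start[-end] kind name, followed by optional free text) is inferred only from the three examples above and is not a documented format; LINE_RE, SymbolHit and parse_hit are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# Shape inferred from the example output above; not a documented format.
LINE_RE = re.compile(
    r"^(?P<project>[^:\s]+):(?P<path>\S+?):(?P<start>\d+)(?:-(?P<end>\d+))?"
    r"\s+(?P<kind>\S+)\s+(?P<name>\S+)(?:\s+(?P<rest>.*))?$"
)


class SymbolHit(NamedTuple):
    project: str
    path: str
    start: int
    end: int   # equals start when the line shows a single line number
    kind: str
    name: str
    rest: str  # signature, tags, or counters, left unparsed


def parse_hit(line: str) -> Optional[SymbolHit]:
    """Parse one compact result line; return None for anything else (e.g. '...')."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    start = int(m["start"])
    end = int(m["end"]) if m["end"] else start
    return SymbolHit(m["project"], m["path"], start, end,
                     m["kind"], m["name"], (m["rest"] or "").strip())
```

The trailing text is kept as an opaque string because its content differs per command (a signature for search, a tag for related, counters for stale-code).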

Data Location

  • Config: ~/.sampler/config.yaml
  • DB: ~/.sampler/graph.db

Running Tests

pytest -q

Notes

  • Compact output is default by design (token-efficient for agent workflows).
  • For broader roadmap details, see TODO.md and PLAN.md.

Provenance

The following attestation bundles were made for sampler_cli-0.4.3.tar.gz:

Publisher: publish.yml on SamuelCarmona83/sampler-cli