Autocomplete engine built from scratch: pluggable predictors, explainable ranking, offline evaluation, and weight optimisation. Zero required dependencies.
Project description
adaptive-autocomplete
An autocomplete engine that learns from user selections, recovers typos, and can explain every ranking decision as plain numbers.
from aac import create_engine
engine = create_engine("production")
engine.suggest("programing")
# → ['programming'] # SymSpell finds the 1-deletion correction
engine.record_selection("programing", "programming")
engine.suggest("programing")
# → ['programming'] # same result, now history-boosted to the top
engine.explain("programing")[0]
# RankingExplanation(value='programming', base=1.65, boost=+1.50, final=3.15)
Quick start
git clone https://github.com/bonnie-mcconnell/adaptive-autocomplete
cd adaptive-autocomplete
make install
source .venv/bin/activate # Linux/macOS
# .venv\Scripts\activate # Windows PowerShell
make demo # interactive browser demo (first run takes ~5s while SymSpell index builds)
make test-fast # fast unit tests (skips integration, property-based, and perf)
make test # full suite including integration and property-based tests
make benchmark # latency numbers
make record-demo # record docs/demo.gif (requires VHS: https://github.com/charmbracelet/vhs)
Or with Docker (no Python install required):
make demo-docker # opens at http://localhost:5000
Presets
| Preset | Predictors | Ranker | Use when |
|---|---|---|---|
stateless |
frequency | score | no history, fast |
default |
frequency, history | score | history-aware, no decay |
recency |
frequency, history | score + decay | recency matters |
production |
frequency, history, symspell, trigram | score + decay | typos + recency |
robust |
frequency, history, symspell | score + decay | typo recovery only |
bktree |
frequency, history, edit-distance | score + decay | benchmarking only* |
*BK-tree degrades to O(n) at 48k+ words (~60ms/query vs ~0.4ms for SymSpell). Excluded from available_presets() but accessible via create_engine("bktree").
from aac import create_engine
from aac.presets import compare_presets
engine = create_engine("production")
comparison = compare_presets("programing", presets=["default", "production"])
HTTP API
aac serve starts a minimal JSON API server with zero extra dependencies:
aac serve # http://127.0.0.1:8420
aac serve --port 9000
aac serve --host 0.0.0.0 --port 8420 # Docker
GET /suggest?q=prog[&limit=10] → {"suggestions": ["program", ...]}
GET /explain?q=prog[&limit=10] → {"explanations": [{...}, ...]}
POST /record?q=prog&value=programming → {"recorded": true}
GET /health → {"status": "ok", "preset": "production"}
The server is single-threaded (stdlib http.server). For concurrent workloads,
run multiple processes behind a reverse proxy, or use the async API
(engine.suggest_async()) directly.
How it works
Prediction and ranking are separate layers. Predictors are read-only with respect to history: they take a prefix and return scored candidates but never write to History. Rankers are read-write: they reorder candidates based on history and recency, and the engine records selections back into History. The seam exists because the original monolithic function was untestable - writing a learning test required mocking the full prediction pipeline.
text input
↓
CompletionContext - lowercases and normalises prefix
↓
Predictors - frequency (stateless), history (read-only), symspell (stateless), trigram (stateless)
↓
Weighted sum - additive, configurable per predictor
↓
Rankers - reorder and boost; cannot add or remove candidates
↓
Suggestions + explanations
explain() costs the same as suggest() - one pipeline run. Score deltas are captured as each ranker runs rather than by re-running the pipeline after the fact.
RankingExplanation.__post_init__ enforces final_score == base_score + history_boost at construction time. If a ranker produces an inconsistent explanation, you get a ValueError immediately, not a silent wrong answer downstream.
Rankers cannot modify the candidate set. After each ranker step the engine checks that suggestions-in equals suggestions-out; violations raise RuntimeError naming the offender.
Typo recovery
Three approximate-match predictors, all sharing the same scoring formula (internal predictors/_scoring.py):
score = (base / (1 + edit_distance)) * (1 + 0.5 * log_freq_score)
Distance is dominant. The frequency multiplier orders words within each distance bucket but can't promote a distance-2 match above a distance-1 match. The 0.5 weight makes scores directly comparable across predictors - a SymSpellPredictor result and a TrigramPredictor result at the same distance and frequency produce the same number.
SymSpell builds a delete-neighbourhood index at startup: for each dictionary word, every variant reachable by 1-2 deletions is stored. Lookups hash the query's delete-neighbourhood and intersect with the stored set. O(1) average, no BK-tree traversal.
AdaptiveSymSpell uses max_distance=1 for prefixes ≤3 chars. The full 2-deletion index on a 2-character query generates hundreds of candidates, many nonsensical.
Trigram builds a character 3-gram inverted index. Jaccard similarity over shared trigrams filters candidates before computing exact edit distance, so Levenshtein only runs on plausible matches.
Weight optimisation
from aac.evaluation import EvaluationHarness, WeightOptimiser
from aac.evaluation.datasets import make_synthetic_query_log
from aac.data import load_english_frequencies
vocab = list(load_english_frequencies().keys())
query_log = make_synthetic_query_log(vocab, prefix_lengths=[2, 3, 4])
harness = EvaluationHarness(query_log)
opt = WeightOptimiser(harness, verbose=False)
result = opt.coordinate_descent(
"production",
weight_grid={"frequency": [0.5, 1.0, 2.0], "symspell": [0.3, 0.5, 1.0]},
)
print(result.best_weights)
# → {'frequency': 2.0, 'symspell': 0.3, 'history': 1.2, 'trigram': 0.4}
Predictor indexes are built once per preset and cached. Only weight wrappers are recreated per evaluation - 27 evaluations on the production preset cost ~0.05s total instead of ~135s.
Two strategies: grid search (exhaustive, optimal on the grid) and coordinate descent (one weight at a time, converges faster, may find a local optimum).
Concurrency
History is not thread-safe. Pass thread_safe=True for multi-threaded use:
from pathlib import Path
from aac import create_engine, JsonHistoryStore
store = JsonHistoryStore(Path.home() / ".aac_history.json")
engine = create_engine("production", history=store.load(), thread_safe=True)
ThreadSafeHistory uses a threading.Condition for write serialisation and reference-counted readers for read concurrency. Multiple suggest() calls run simultaneously; record_selection() waits for active readers to finish. Safe on CPython, PyPy, and free-threaded 3.13+ - does not rely on GIL guarantees.
Tests
837 tests: invariant correctness, IR metrics, evaluation harness, concurrency, async API, CLI integration, ranker and predictor contracts, HTTP API, and property-based fuzzing with Hypothesis.
- SymSpell brute-force equivalence - every result verified against a linear scan across multiple queries and distance thresholds.
- Explain single-pass -
ranker.rank()called exactly once perexplain()call, verified with a spy. - Ranker invariant -
RuntimeErrorif a ranker adds or removes candidates. - Ranker thread-safety -
LearningRankerandDecayRankeruse thread-local caches; concurrentrank()calls on a shared engine don't corrupt each other's prefix state. - History isolation -
compare_presets()does not modify the caller'sHistory. - Property-based (Hypothesis) - four core invariants across thousands of generated inputs; found two floating-point precision bugs that example-based tests missed.
CI runs on Ubuntu and Windows across Python 3.10-3.13.
Project layout
src/aac/
├── engine/ - AutocompleteEngine, EngineConfig (serialisation + diff)
├── domain/ - History, ThreadSafeHistory, CompletionContext
├── predictors/ - FrequencyPredictor, HistoryPredictor, SymSpell, Trigram, EditDistance
├── ranking/ - ScoreRanker, LearningRanker, DecayRanker, RankingExplanation
├── storage/ - JsonHistoryStore (atomic write, v1→v2 migration)
├── evaluation/ - EvaluationHarness, WeightOptimiser, IR metrics (MRR, NDCG, AP)
├── benchmarks/ - latency benchmarks with baseline comparison
├── cli/ - aac suggest / explain / record / history / serve / demo
└── presets.py - named engine configurations
DESIGN.md- architecture decisions, tradeoffs, and what I'd changeBENCHMARK.md- latency numbers, CI gates, and how to reproduceCHANGELOG.md- version historyexamples/- FastAPI integration, custom vocabulary, contextual history
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file adaptive_autocomplete-1.0.4.tar.gz.
File metadata
- Download URL: adaptive_autocomplete-1.0.4.tar.gz
- Upload date:
- Size: 298.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9e3f84eb60972c85da7d88166f6364040be8dda8356a2ab7b14f11210860fdd3
|
|
| MD5 |
46db20f37564e70b35ca40006233ae31
|
|
| BLAKE2b-256 |
c15fea4826f6ad4b6debd385eebae2217c8510c72d31e41a29aa5060906eb310
|
File details
Details for the file adaptive_autocomplete-1.0.4-py3-none-any.whl.
File metadata
- Download URL: adaptive_autocomplete-1.0.4-py3-none-any.whl
- Upload date:
- Size: 315.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0e9e45984aa51f1da98dd2befee766b9233ad3aae7701bfe1fe87e5835731f00
|
|
| MD5 |
dace9b3cbc833f5aa559185020b0e0d7
|
|
| BLAKE2b-256 |
0f1489943e6775e7cd6e9903c675767941361526296f4c5e44810aa19f513051
|