4 projects
harnais-web-extractor
Two-stage web article extractor (trafilatura + Playwright) with YouTube transcript fetching.
prompt-injection-sanitizer
Deterministic, dependency-free prompt-injection defense: regex detection, Unicode normalization, anti-typoglycemia, anti-leetspeak, data-tag escaping, and risk scoring.
hybrid-retrieval-scoring
Hybrid document retrieval: BM25 + TF-IDF fused with Reciprocal Rank Fusion, with bilingual FR/EN stopwords.
atomic-json-io
Crash-safe atomic JSON / text / JSONL file writes for Linux (stdlib only).
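As an illustration of what "crash-safe atomic" writes usually mean on Linux, here is a minimal sketch using only the standard library. It is not the atomic-json-io API: the function name `atomic_write_json` and its parameters are hypothetical. It assumes the common pattern of writing to a temporary file in the same directory, calling fsync, and then renaming over the target.

```python
import json
import os
import tempfile


def atomic_write_json(path, data):
    """Hypothetical helper: readers see the old file or the new one, never a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # push file contents to disk before the rename
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        # Clean up the temp file if anything failed before the rename completed.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Sync the directory entry so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling `atomic_write_json("state.json", {"a": 1})` replaces any existing `state.json` in a single step. The directory fsync is the part most often skipped, and it is what keeps the rename durable across a power loss.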