28 projects
cortexflowx
Brain-to-image/audio/text reconstruction using Diffusion Transformers and Flow Matching. Decode what someone saw, heard, or thought from fMRI.
lmscan
Detect AI-generated text and fingerprint which LLM wrote it. Open-source GPTZero alternative. Zero dependencies, works offline.
neuroattack
Adversarial robustness testing for neural encoding models — attack, analyze, and defend brain-AI interfaces.
vibescore
Grade your vibe-coded project. One command, instant letter grade across security, quality, dependencies, and testing.
datacruxai
Lightweight data quality toolkit for LLM instruction tuning. Deduplication, PII detection, contamination checking, and quality scoring — no GPU required.
injectionguard
Prompt injection detection for LLM applications and MCP servers.
modeldiffx
Behavioral regression testing for LLMs. Capture outputs, diff behavior, detect drift — pytest for model upgrades.
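The "diff behavior" idea can be sketched with the stdlib alone — this is a generic illustration of diffing two models' outputs, not modeldiffx's actual API:

```python
import difflib

def diff_outputs(baseline, candidate):
    """Unified diff between a baseline model's output and a candidate
    model's output for the same prompt. Illustrative sketch only."""
    return "\n".join(
        difflib.unified_diff(
            baseline.splitlines(),
            candidate.splitlines(),
            fromfile="baseline",
            tofile="candidate",
            lineterm="",
        )
    )
```

An empty result means the two models behaved identically on that prompt; anything else is a candidate regression to review.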
trainpulse
Lightweight training health monitor. Detect loss spikes, gradient explosions, and NaN — 2 lines of code, no server, no signup.
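The health checks named here (NaN detection, spike detection against a trailing average) fit in a few lines — a minimal sketch of the idea, with a hypothetical `check_loss` helper rather than trainpulse's real interface:

```python
import math

def check_loss(history, loss, spike_factor=3.0, window=20):
    """Flag non-finite losses and sudden spikes against a trailing mean.

    `history` is the list of accepted losses; returns "nan", "spike",
    or "ok". Hypothetical helper, not trainpulse's actual API.
    """
    if not math.isfinite(loss):
        return "nan"
    recent = history[-window:]
    if recent:
        baseline = sum(recent) / len(recent)
        if loss > spike_factor * baseline:
            return "spike"  # do not pollute the baseline with the spike
    history.append(loss)
    return "ok"
```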
quantbenchx
Quantization quality analyzer — pure-Python GGUF/safetensors parsing, layerwise analysis, quality prediction. Zero deps.
vibesafex
AI-generated code safety scanner for the vibe coding era.
ckptkit
Inspect, convert, diff, and merge model checkpoints. The missing Swiss Army knife for ML weights.
tokonomix
Universal LLM token counting and cost management. Track, compare, and optimize your LLM API spending.
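Cost tracking reduces to token counts times per-token prices — a sketch of that arithmetic with placeholder model names and rates (the prices below are made up, not real published pricing):

```python
# Per-1M-token prices in USD. Placeholder values for illustration only.
PRICES = {
    "model-a": {"input": 2.50, "output": 10.00},
    "model-b": {"input": 0.15, "output": 0.60},
}

def estimate_cost(model, input_tokens, output_tokens, prices=PRICES):
    """USD cost of one API call given token counts and per-1M prices."""
    p = prices[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
```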
datamix
Dataset mixing & curriculum optimizer — profile, blend, schedule, and budget training data. Zero deps.
toksight
Tokenizer analysis toolkit. Compare vocabulary coverage, compression ratios, and token boundaries across GPT-4o, Llama 3, Mistral, and any HuggingFace tokenizer.
infermark
Benchmark any OpenAI-compatible LLM endpoint. TTFT, inter-token latency, throughput, P50-P99 — in one command.
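The P50–P99 summary comes down to percentiles over collected latency samples — a nearest-rank sketch of that computation, not infermark's implementation:

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a list of samples, pct in [0, 100]."""
    ordered = sorted(samples)
    k = round(pct / 100 * (len(ordered) - 1))
    return ordered[max(0, min(len(ordered) - 1, k))]

# Example: inter-token latencies in milliseconds with one stall.
latencies_ms = [12, 15, 11, 300, 14, 13, 16, 12, 15, 14]
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
```

The tail percentile surfaces the 300 ms stall that the median hides, which is exactly why benchmarks report P99 alongside P50.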
castwright
Generate high-quality synthetic instruction-tuning data from seed examples. Simple API, built-in quality filtering, cost-aware.
llm-throttle-guard
Token-bucket rate limiter for LLM API calls with per-model limits.
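The token-bucket algorithm itself is standard: a bucket of `capacity` tokens refills at `rate` tokens per second, and a call proceeds only if it can pay its cost. A generic sketch of that algorithm, not the project's real interface:

```python
import time

class TokenBucket:
    """Classic token bucket: bursts up to `capacity`, refills at `rate`/sec."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.clock = clock  # injectable for testing
        self.last = clock()

    def allow(self, cost=1):
        """Consume `cost` tokens if available; return False otherwise."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Per-model limits would then be one bucket per model name in a dict.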
model-card-writer
Auto-generate model cards for ML models in Markdown format.
llm-cache-toolkit
Lightweight caching layer for LLM API responses with TTL and size limits.
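"TTL and size limits" is a well-known combination — a minimal sketch of the idea with stdlib `OrderedDict`, assuming a `TTLCache` class that is not the project's actual API:

```python
import time
from collections import OrderedDict

class TTLCache:
    """LRU-evicting cache where each entry also expires after `ttl` seconds."""

    def __init__(self, maxsize=128, ttl=60.0, clock=time.monotonic):
        self.maxsize = maxsize
        self.ttl = ttl
        self.clock = clock  # injectable for testing
        self._data = OrderedDict()  # key -> (expires_at, value)

    def set(self, key, value):
        self._data.pop(key, None)
        self._data[key] = (self.clock() + self.ttl, value)
        while len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # evict least recently used

    def get(self, key, default=None):
        item = self._data.get(key)
        if item is None:
            return default
        expires_at, value = item
        if self.clock() >= expires_at:
            del self._data[key]  # lazy expiry on read
            return default
        self._data.move_to_end(key)  # refresh recency
        return value
```

Keying on a hash of (model, prompt, parameters) makes this a drop-in response cache.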
embedding-similarity
Compute and compare text embeddings with cosine similarity, no heavy dependencies.
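The comparison step is plain cosine similarity, which needs nothing beyond the stdlib — a dependency-free sketch of the formula, independent of the project's API:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```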
llm-ab-test
A/B testing framework for LLM prompts — compare prompt variants with statistical analysis.
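One standard way to compare two prompt variants' pass rates is a two-proportion z-test — a generic statistics sketch, not necessarily the method this project uses:

```python
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Two-proportion z-test on pass rates; returns (z, two-sided p-value)."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Normal CDF via erf; p-value is the two-sided tail probability.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value
```

For example, variant A passing 90/100 evaluations versus variant B's 70/100 gives a large z and a p-value well below 0.01, so the difference is unlikely to be noise.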
prompt-template-engine
Simple and safe templating engine for LLM prompts with variable substitution and guards.
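The stdlib already provides safe substitution with a guard against missing variables — a sketch of the idea using `string.Template`, not the project's own engine:

```python
from string import Template

def render_prompt(template, variables):
    """Substitute $-style variables into a prompt template.

    `Template.substitute` raises KeyError on a missing variable, which
    acts as a guard against silently shipping a broken prompt.
    Hypothetical helper name; not this project's API.
    """
    return Template(template).substitute(variables)
```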
pii-scrubber-lite
Scrub PII from text before sending to LLMs. Detect and redact emails, phones, SSNs, credit cards, names, and more.
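The detect-and-redact step is typically regex-driven — an illustrative sketch with three deliberately narrow patterns (a real scrubber, this project included, needs far broader coverage and name detection):

```python
import re

# Narrow, illustrative patterns only.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def scrub(text):
    """Replace each matched PII span with a [TYPE] placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Scrubbing before the API call means the raw PII never leaves the process.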
llm-spend-tracker
Track and limit LLM API spending in real-time. Drop-in middleware for OpenAI, Anthropic, Google, and any OpenAI-compatible API.
token-overflow
Prevent LLM context window overflow with token counting, smart truncation, and budget management.
llm-response-validator
Validate LLM responses against schemas, types, and constraints. Catch bad JSON, missing fields, and hallucinated formats before they crash your app.
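Schema validation at its simplest is parse-then-check — a minimal sketch where a schema maps field names to expected Python types, not this project's actual schema language:

```python
import json

def validate_response(raw, schema):
    """Parse a JSON string and check required fields and their types.

    Returns a list of error strings; an empty list means valid.
    Hypothetical helper, not this project's interface.
    """
    errors = []
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    for field, expected in schema.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            errors.append(f"wrong type for {field}: {type(data[field]).__name__}")
    return errors
```

Running this before downstream code touches the response turns a crash into a retry.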
llm-fallback
Automatic failover between LLM providers. When OpenAI is down, seamlessly switch to Anthropic, Google, or any backup.
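Failover reduces to trying providers in order and surfacing all errors only if every one fails — a generic loop over provider callables, not the project's real API:

```python
def call_with_fallback(providers, prompt):
    """Try each (name, fn) provider in order; return the first success.

    Raises RuntimeError with all collected errors if every provider
    fails. Illustrative sketch only.
    """
    errors = []
    for name, fn in providers:
        try:
            return fn(prompt)
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```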
prompt-inject-detect
Detect and block prompt injection attacks before they reach your LLM. Zero dependencies.
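At its core, zero-dependency detection means matching known injection phrasings — a deliberately tiny illustration with three canonical patterns (real detectors, this project included, need far broader heuristics):

```python
import re

# A few canonical injection phrasings; purely illustrative.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.I),
    re.compile(r"disregard\s+your\s+system\s+prompt", re.I),
    re.compile(r"you\s+are\s+now\s+in\s+developer\s+mode", re.I),
]

def looks_like_injection(text):
    """True if the text matches any known injection pattern."""
    return any(p.search(text) for p in INJECTION_PATTERNS)
```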