2 projects
squish-ai
Local LLM inference server for Apple Silicon, with a block-level paged KV cache for long-context workloads. 5.4× faster end-to-end than Ollama on 4K-token prompts, with lower RAM use and INT3 support for Qwen3. Exposes an OpenAI-compatible API.
squash-ai
Squash violations, not velocity. Automated EU AI Act compliance for ML teams: CI/CD-native, open-core, and developer-first.