8 projects
tuetoken
The fastest tokenizer for modern LLMs, up to 20x faster. Drop-in replacement for transformers.AutoTokenizer, byte-exact, never quadratic.
gtboost
Rust/Python gradient boosting for tabular data
be-great
Generating Realistic Tabular Data using Large Language Models
chugchug
Next-generation progress bars — event-driven, multiprocessing-safe, pipeline-aware
samey
Dataset diversity scoring for synthetic instruction data
augini
AI-powered Python framework for tabular data enrichment and analysis using LLMs. Features include intelligent feature engineering, natural language data analysis, and AI agents for automated workflows.
deeptlf
Deep Tabular Learning Framework
amorf
A framework for multi-output regression in Python