12 projects
fastspell
Targeted language identifier, based on FastText and Hunspell.
fastspell-dictionaries
Hunspell dictionaries for FastSpell
heliport
Fast and accurate language identifier
bifixer
None
bicleaner-hardrules
Pre-filtering step that flags obvious noise using rules, poor language using general language modelling, and vulgar language using specific language modelling
bicleaner-ai
Parallel corpus classifier, indicating the likelihood of a pair of sentences being mutual translations or not (neural version)
escape-unk
Escape unknown symbols in SentencePiece vocabularies
bicleaner
Parallel corpus classifier, indicating the likelihood of a pair of sentences being mutual translations or not
sacremoses
SacreMoses
monocleaner
Monolingual corpus fluency filter
bicleaner-ai-glove
glove-python fork for bicleaner-ai
doommoses
DoomMoses