10 projects
fastspell
Targeted language identifier, based on FastText and Hunspell.
fastspell-dictionaries
Hunspell dictionaries for FastSpell
bifixer
None
bicleaner-hardrules
Pre-filtering step for obvious noise (based on rules), poor language (based on general language modelling), and vulgar language (based on specific language modelling)
bicleaner-ai
Parallel corpus classifier, indicating the likelihood of a pair of sentences being mutual translations or not (neural version)
escape-unk
Escape unknown symbols in SentencePiece vocabularies
bicleaner
Parallel corpus classifier, indicating the likelihood of a pair of sentences being mutual translations or not
loomchild-segment
Python wrapper for Loomchild segmenter
doommoses
DoomMoses
binonymizer
Binonymizer is a Python tool that aims to tag personal data in a parallel corpus.