A language detector for short strings and chat messages.

LanguageDetection

Language detector used by Interaction Bot.

Method

Detection combines a dictionary lookup with an n-gram based method using fastText. We also built a language priority list by counting how often Interaction Bot detected each language over 24 hours. On Discord, you can also use the language of the user's Discord interface, as sent by the API, as an extra signal when a detection is flagged as unreliable (when reliable is false).
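The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `DICTIONARY` contents, the function names, the order in which the dictionary and the model are tried, and the `model_predict` callback (standing in for the fastText model) are all assumptions.

```python
from collections import Counter

# Hypothetical word-to-language dictionary; the project's real dictionary is not shown.
DICTIONARY = {
    "hi": "en", "hello": "en", "thanks": "en",
    "salut": "fr", "bonjour": "fr", "merci": "fr",
    "hola": "es", "gracias": "es",
}


def build_priority(recent_detections):
    """Count how often each language was detected (e.g. over the last 24 hours)."""
    return Counter(recent_detections)


def detect_by_dictionary(text, priority):
    """Return the language with the most dictionary hits, or None if no word matches."""
    votes = Counter()
    for word in text.lower().split():
        lang = DICTIONARY.get(word.strip(".,!?"))
        if lang is not None:
            votes[lang] += 1
    if not votes:
        return None
    # Ties between languages are broken by the priority counts.
    return max(votes, key=lambda lang: (votes[lang], priority[lang]))


def detect(text, priority, model_predict=None):
    """Try the dictionary first, then an optional model, then the most frequent language."""
    lang = detect_by_dictionary(text, priority)
    if lang is not None:
        return lang, 1.0
    if model_predict is not None:
        return model_predict(text)  # assumed to return (language code, confidence)
    if priority:
        return priority.most_common(1)[0][0], 0.0
    return None, 0.0


priority = build_priority(["en", "en", "fr", "en", "es", "fr"])
print(detect("Bonjour!", priority))   # ('fr', 1.0)
print(detect("hi salut", priority))   # ('en', 1.0), tie broken by priority
print(detect("qwerty", priority))     # ('en', 0.0), fallback to most frequent language
```

Passing the model as a callback keeps the sketch runnable without the fastText model file; in practice the callback would wrap the model's prediction call.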

Usage

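Download the fastText language identification model and put it in the model directory, then run the main.py file or call the detector from your own code:

wget https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin  # put it in the model directory

import detector


def detect(text):
    # pred[0] is the language code, pred[1] is the confidence score
    pred = detector.detect(text)
    # Flag the prediction as reliable only when the confidence is above 0.4
    return pred[0], pred[1] > 0.4


print(detect('Hi'))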
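The internals of the `detector` module are not shown here. As a rough sketch of how predictions from lid.176.bin could be read with the `fasttext` Python package, the example below uses `fasttext.load_model` and `model.predict`; the `model/lid.176.bin` path and the helper names `load_model`, `parse_prediction`, and `fasttext_detect` are assumptions made for illustration.

```python
from functools import lru_cache


def parse_prediction(labels, probabilities):
    """Turn fastText output such as ('__label__en',), [0.98] into ('en', 0.98)."""
    code = labels[0].replace("__label__", "", 1)
    return code, float(probabilities[0])


@lru_cache(maxsize=1)
def load_model(model_path="model/lid.176.bin"):
    """Load the model once and reuse it (the path is an assumption)."""
    import fasttext  # imported lazily so parse_prediction works without the package

    return fasttext.load_model(model_path)


def fasttext_detect(text):
    """Return (language code, confidence) for a single short text."""
    model = load_model()
    # predict() rejects newlines in the input, so replace them with spaces first.
    labels, probabilities = model.predict(text.replace("\n", " "), k=1)
    return parse_prediction(labels, probabilities)


# parse_prediction can be checked without downloading the model:
print(parse_prediction(("__label__en",), [0.98]))  # ('en', 0.98)
```

Caching the loaded model avoids reading the file from disk again on every call, which matters when the detector runs on every chat message.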

Output

Language code | reliability (True when the confidence is above 0.4)
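One possible way to act on the reliability flag is to fall back on the user's Discord interface language when the detection is not reliable. The sketch below assumes the locale string (for example `en-US` or `fr`) has already been read from the Discord API; `resolve_language` is a hypothetical helper, not part of this package.

```python
def resolve_language(detected, reliable, user_locale=None):
    """Keep a reliable detection; otherwise prefer the user's interface language."""
    if reliable or not user_locale:
        return detected
    # Reduce a locale such as "en-US" to its language part, "en".
    return user_locale.split("-")[0].lower()


print(resolve_language("en", True, "fr"))      # en
print(resolve_language("so", False, "en-US"))  # en
print(resolve_language("so", False))           # so
```

When no locale is available, the sketch simply keeps the unreliable detection rather than guessing.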

Future

We plan to add a command for reporting bad detections, so that we can update the detection dictionary accordingly. (We will also update the code, because it is honestly bad right now 😂.)

Download files

Source Distribution: ShortLanguageDetection-0.0.2.tar.gz (3.9 kB)

Built Distribution: ShortLanguageDetection-0.0.2-py3-none-any.whl (4.9 kB, Python 3)
