PicoLang

Simple, fast dictionary-based language detector for short texts.

Installation

pip install picolang

Usage

from picolang.detector import detect

print(detect("bonjour")) # ('fr', 0.45)
print(detect("学中文")) # ('zh', 0.45)
print(detect("ciao mondo")) # ('it', 0.9)
print(detect("El gato doméstico")) # ('es', 0.45)

# Optionally, specify a subset of languages to consider
print(detect("ciao", languages=["de", "ro"])) # ('de', 0.45)

Signature: detect(text, languages=[]) returns a tuple (iso_639_1, confidence), where iso_639_1 is the two-letter ISO 639-1 code of the detected language.
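
Since detect returns a (code, confidence) pair, callers may want to discard low-confidence guesses. The following is a minimal sketch of one way to do that, not part of PicoLang itself: the helper name detect_or_none, the detector parameter, and the 0.5 threshold are all assumptions made for illustration. The detector is passed in as an argument so the helper works with any callable shaped like detect.

```python
from typing import Callable, Optional, Sequence, Tuple

# Any callable shaped like detect(text, languages=[...]) -> (code, confidence).
Detector = Callable[..., Tuple[str, float]]


def detect_or_none(
    text: str,
    detector: Detector,
    min_confidence: float = 0.5,
    languages: Sequence[str] = (),
) -> Optional[str]:
    """Return the detected language code, or None when confidence is too low.

    Hypothetical helper, not part of PicoLang; the 0.5 default is arbitrary.
    """
    code, confidence = detector(text, languages=list(languages))
    return code if confidence >= min_confidence else None
```

With PicoLang installed, pass the detect function imported from picolang.detector as the detector argument, for example detect_or_none("ciao mondo", detect). Going by the outputs shown above, "ciao mondo" (0.9) would pass the default threshold while "bonjour" (0.45) would yield None.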

Supported Languages

  • Afrikaans
  • Albanian
  • Arabic
  • Basque
  • Bengali
  • Bulgarian
  • Catalan
  • Chinese
  • Czech
  • Danish
  • Dutch
  • English
  • Esperanto
  • Estonian
  • Finnish
  • French
  • German
  • Greek
  • Hebrew
  • Hindi
  • Hungarian
  • Indonesian
  • Italian
  • Japanese
  • Kabyle
  • Kazakh
  • Korean
  • Latvian
  • Lithuanian
  • Macedonian
  • Norwegian
  • Occitan
  • Polish
  • Portuguese
  • Romanian
  • Russian
  • Serbian
  • Slovak
  • Slovenian
  • Spanish
  • Swedish
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese
  • Farsi

Limitations

This detector is designed for short texts (under 150 characters) and will probably not work reliably on longer ones. Because it relies on dictionaries, detection fails when a word is misspelled or missing from the dictionary.
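
To illustrate why a dictionary-based approach is sensitive to missing or misspelled words, here is a toy sketch of the general technique. It is not PicoLang's actual algorithm or scoring: the names TOY_DICTIONARIES and toy_detect, the tiny word lists, and the "fraction of words found" score are all assumptions made for this example.

```python
import re
from typing import Dict, Optional, Set, Tuple

# Toy word lists for illustration only; real dictionaries would be far larger.
TOY_DICTIONARIES: Dict[str, Set[str]] = {
    "fr": {"bonjour", "le", "monde", "chat"},
    "it": {"ciao", "mondo", "il", "gatto"},
    "es": {"hola", "el", "mundo", "gato"},
}


def toy_detect(text: str) -> Tuple[Optional[str], float]:
    """Score each language by the fraction of words found in its word list."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return None, 0.0
    best_code, best_score = None, 0.0
    for code, vocabulary in TOY_DICTIONARIES.items():
        score = sum(word in vocabulary for word in words) / len(words)
        if score > best_score:
            best_code, best_score = code, score
    return best_code, best_score


print(toy_detect("ciao mondo"))  # ('it', 1.0)
print(toy_detect("caio mnodo"))  # (None, 0.0): misspelled words match nothing
```

In this sketch a single typo removes a word from the score entirely, so very short inputs can drop from a confident match to no match at all.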

Contributing

To add a new language or improve an existing one, add words to the corresponding dictionary in the dictionaries folder.

License

AGPLv3
