PicoLang

Simple, fast dictionary-based language detector for short texts.

Installation

pip install picolang

Usage

from picolang.detector import detect

print(detect("bonjour")) # ('fr', 0.45)
print(detect("学中文")) # ('zh', 0.45)
print(detect("ciao mondo")) # ('it', 0.9)
print(detect("El gato doméstico")) # ('es', 0.45)

# Optionally, specify a subset of languages to consider
print(detect("ciao", languages=["de", "ro"])) # ('de', 0.45)

detect(text, languages=[]) returns a tuple of (IETF BCP 47 language tag, confidence). If languages is non-empty, only those languages are considered.
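
As a minimal sketch of consuming that return value, the helper below unpacks the tuple and falls back to a default tag when the confidence is below a threshold. The helper name detect_or_default, the 0.5 threshold, and the "und" fallback (the BCP 47 tag for an undetermined language) are assumptions for illustration and are not part of PicoLang. The detector is passed in as a parameter, so the sketch relies only on the call shape shown above.

```python
from typing import Callable, Sequence, Tuple

# Any callable with the documented shape:
# detect(text, languages=[]) -> (tag, confidence)
Detector = Callable[..., Tuple[str, float]]


def detect_or_default(
    text: str,
    detector: Detector,
    languages: Sequence[str] = (),
    min_confidence: float = 0.5,  # assumed threshold, tune for your data
    default: str = "und",         # BCP 47 tag for "undetermined"
) -> str:
    """Return the detected tag, or `default` when confidence is too low."""
    tag, confidence = detector(text, languages=list(languages))
    return tag if confidence >= min_confidence else default
```

With the package installed, this would be called as detect_or_default("ciao mondo", detect) after from picolang.detector import detect. Going by the sample outputs in the usage section, a 0.5 threshold would accept "ciao mondo" (0.9) and reject "bonjour" (0.45).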

Supported Languages

  • Afrikaans
  • Albanian
  • Arabic
  • Basque
  • Bengali
  • Bulgarian
  • Catalan
  • Chinese
  • Czech
  • Danish
  • Dutch
  • English
  • Esperanto
  • Estonian
  • Finnish
  • French
  • German
  • Greek
  • Hebrew
  • Hindi
  • Hungarian
  • Indonesian
  • Italian
  • Japanese
  • Kabyle
  • Kazakh
  • Korean
  • Latvian
  • Lithuanian
  • Macedonian
  • Norwegian
  • Occitan
  • Polish
  • Portuguese
  • Romanian
  • Russian
  • Serbian
  • Slovak
  • Slovenian
  • Spanish
  • Swedish
  • Thai
  • Turkish
  • Ukrainian
  • Vietnamese
  • Farsi

Limitations

This detector is designed for short texts (< 150 characters) and will probably not work reliably on longer ones. Because it relies on dictionaries, detection will fail if a word is missing from the dictionary or misspelled.
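
One possible workaround for longer inputs, sketched here under assumptions, is to split the text into chunks that stay under the 150-character limit, detect each chunk, and pick the tag with the highest summed confidence. The function names split_into_chunks and detect_long_text are hypothetical, the voting scheme is not something PicoLang provides, and whether it actually improves accuracy is untested. Splitting on whitespace also does not help for languages written without spaces (such as Chinese, Japanese, or Thai), where the whole text ends up in one chunk.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Any callable that maps a text to (tag, confidence), e.g. picolang's detect.
Detector = Callable[[str], Tuple[str, float]]


def split_into_chunks(text: str, max_chars: int = 150) -> List[str]:
    """Group whitespace-separated words into chunks of up to max_chars.

    A single word longer than max_chars is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def detect_long_text(text: str, detector: Detector, max_chars: int = 150) -> str:
    """Return the tag with the highest total confidence across all chunks."""
    scores: Dict[str, float] = defaultdict(float)
    for chunk in split_into_chunks(text, max_chars):
        tag, confidence = detector(chunk)
        scores[tag] += confidence
    if not scores:
        raise ValueError("text contains no words")
    return max(scores, key=scores.get)
```

With the package installed, detect_long_text(long_text, detect) would pass PicoLang's own detect as the detector argument.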

Contributing

To add a new language or improve an existing one, add more words to the respective dictionary in the dictionaries folder.

License

AGPLv3
