Language detection module based on the GiellaLT models, specifically aimed at minority and indigenous languages
Makes the language classification script from GiellaLT's corpus tools available as a Python module (GiellaLT's website, original repo).
The source code as well as the language model files are released under the GPL-3.0 license.
Installation
pip install gielladetect
Usage
import gielladetect
text = "Lurer du på hva som rører seg innenfor veggene til Nasjonalbiblioteket på Solli plass i Oslo?"
gielladetect.detect(text)
# Result: 'nob'
# To restrict detection to a subset of languages:
gielladetect.detect(text, ['nob', 'nno', 'eng'])
# Result: 'nob'
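To classify many texts at once, the `detect` call shown above can be wrapped in a small helper. The following is a minimal sketch and not part of gielladetect: `count_languages` is a hypothetical name, and the detector is passed in as a parameter so the helper relies only on the two call forms documented above (with and without a language subset).

```python
from collections import Counter
from typing import Callable, Iterable, Optional, Sequence


def count_languages(
    texts: Iterable[str],
    detect: Callable[..., str],
    langs: Optional[Sequence[str]] = None,
) -> Counter:
    """Tally the code that `detect` returns for each non-blank text."""
    counts: Counter = Counter()
    for text in texts:
        text = text.strip()
        if not text:
            continue  # skip blank entries instead of classifying them
        # Mirror the two documented call forms: with or without a subset.
        code = detect(text, list(langs)) if langs else detect(text)
        counts[code] += 1
    return counts


# Usage, assuming gielladetect is installed:
#   import gielladetect
#   lines = open("corpus.txt", encoding="utf-8")  # hypothetical input file
#   print(count_languages(lines, gielladetect.detect, ["nob", "nno", "eng"]))
```

Passing the detector as an argument also makes the helper easy to test with a stand-in function.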
Supported languages
Languages are identified by their ISO 639-3 codes.
| Code | Name |
|---|---|
| ara | Arabic |
| bxr | Russia Buriat |
| ckb | Central Kurdish |
| dan | Danish |
| deu | German |
| eng | English |
| est | Estonian |
| fao | Faroese |
| fas | Persian |
| fin | Finnish |
| fit | Tornedalen Finnish |
| fkv | Kven Finnish |
| fra | French |
| hbs | Serbo-Croatian |
| isl | Icelandic |
| ita | Italian |
| kal | Kalaallisut |
| kmr | Northern Kurdish |
| koi | Komi-Permyak |
| kpv | Komi-Zyrian |
| krl | Karelian |
| mdf | Moksha |
| mhr | Eastern Mari |
| mns | Mansi |
| mrj | Western Mari |
| myv | Erzya |
| nno | Norwegian Nynorsk |
| nob | Norwegian Bokmål |
| olo | Livvi |
| pol | Polish |
| rmf | Kalo Finnish Romani |
| rmn | Balkan Romani |
| rmu | Tavringer Romani |
| rmy | Vlax Romani |
| ron | Romanian |
| rus | Russian |
| sma | Southern Sami |
| sme | Northern Sami |
| smj | Lule Sami |
| smn | Inari Sami |
| sms | Skolt Sami |
| som | Somali |
| spa | Spanish |
| swe | Swedish |
| tur | Turkish |
| udm | Udmurt |
| urd | Urdu |
| vep | Veps |
| vie | Vietnamese |
| yid | Yiddish |
| yrk | Nenets |
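Since the optional second argument to `detect` takes a list of codes, it can help to check a subset against the table above before passing it on. The sketch below is an illustration only: `SUPPORTED_CODES` and `check_subset` are hypothetical names, the set is copied by hand from the table, and how gielladetect itself treats unknown codes is not stated here.

```python
from typing import Iterable, List

# Codes copied from the "Supported languages" table (an assumption that the
# table is complete for the installed version).
SUPPORTED_CODES = frozenset(
    """
    ara bxr ckb dan deu eng est fao fas fin fit fkv fra hbs isl ita kal
    kmr koi kpv krl mdf mhr mns mrj myv nno nob olo pol rmf rmn rmu rmy
    ron rus sma sme smj smn sms som spa swe tur udm urd vep vie yid yrk
    """.split()
)


def check_subset(langs: Iterable[str]) -> List[str]:
    """Return `langs` as a list, or raise ValueError naming unknown codes."""
    langs = list(langs)
    unknown = sorted(set(langs) - SUPPORTED_CODES)
    if unknown:
        raise ValueError("unsupported language codes: " + ", ".join(unknown))
    return langs


# Usage, assuming gielladetect is installed:
#   gielladetect.detect(text, check_subset(["nob", "nno", "eng"]))
```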
File details
Details for the file gielladetect-1.0.4.tar.gz.
File metadata
- Download URL: gielladetect-1.0.4.tar.gz
- Upload date:
- Size: 3.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 31c9f63c48d484fcdf6f5956019a3b0501528d2db232f1ff4c62910fba3a826f |
| MD5 | fd741094f4f9ac782db9ebb76b25c7ec |
| BLAKE2b-256 | d1f46a382cfe62a52b15f8bb7d7b1d9587c9e95ac274f52f49b0e0f18c61e296 |
File details
Details for the file gielladetect-1.0.4-py3-none-any.whl.
File metadata
- Download URL: gielladetect-1.0.4-py3-none-any.whl
- Upload date:
- Size: 4.0 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6c3da513ecf30f85701c933acfc42918636a620732a6ca1607428571fdefa357 |
| MD5 | 50b7c68dd1a7220e45f188e6bb7714c4 |
| BLAKE2b-256 | 10cafa53205200a5629add01d60fea543d8ab0839a5e9786943365747f61b630 |
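A manually downloaded file can be checked against the SHA256 digests listed above. This is a minimal sketch using only the standard library; `sha256_of` and `matches_published_hash` are hypothetical helper names, and the local file path is whatever location the file was saved to.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage, assuming the wheel was saved to the current directory:
#   matches_published_hash(
#       "gielladetect-1.0.4-py3-none-any.whl",
#       "6c3da513ecf30f85701c933acfc42918636a620732a6ca1607428571fdefa357",
#   )
```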