Multilingual text embeddings for Northeast Indian languages
ne-embed
10 languages · 768 dimensions · Built on LaBSE · CC-BY-4.0
Install
pip install ne-embed
Usage
from ne_embed import NEEmbed
model = NEEmbed() # downloads from HuggingFace on first run
sentences = [
    'Where is the nearest hospital?',
    'Ngi la pynjot ia ki shnong baroh',  # Khasi
    'Pilakchin an senganiko man na.',  # Garo
]
embeddings = model.encode(sentences)
print(embeddings.shape) # (3, 768)
model.languages() # print supported languages
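A common next step is to compare sentences by cosine similarity. The sketch below assumes that `model.encode` returns a NumPy array of shape `(n, 768)`, as the printed shape above suggests. `cosine_similarity_matrix` is a hypothetical helper, not part of ne-embed, and random vectors stand in for real embeddings so the snippet runs without downloading the model.

```python
import numpy as np


def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Return pairwise cosine similarities of row vectors (hypothetical helper)."""
    embeddings = np.asarray(embeddings, dtype=np.float64)
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    unit = embeddings / np.clip(norms, 1e-12, None)
    return unit @ unit.T


# Stand-in data with the documented shape. With the real model this would be
#   embeddings = model.encode(sentences)
# which is assumed here to return a NumPy array.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(3, 768))

similarity = cosine_similarity_matrix(embeddings)
print(similarity.shape)  # (3, 3)
```

Entry `[i, j]` of the result is the similarity between sentence `i` and sentence `j`, so the diagonal is 1 and the matrix is symmetric.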
Supported Languages
| Code | Language | Tier |
|---|---|---|
| asm | Assamese | Supported |
| brx | Bodo | Supported |
| grt | Garo | Supported |
| kha | Khasi | Supported |
| lus | Mizo | Supported |
| mni | Meitei | Supported |
| njz | Nyishi | Supported |
| trp | Kokborok | Limited |
| pbv | Pnar | Limited |
| nag | Nagamese | Limited |
Download files
Source Distribution
ne_embed-1.0.0.tar.gz (2.6 kB)
Built Distribution
ne_embed-1.0.0-py3-none-any.whl (2.8 kB)
File details
Details for the file ne_embed-1.0.0.tar.gz.
File metadata
- Download URL: ne_embed-1.0.0.tar.gz
- Upload date:
- Size: 2.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | beb1bcffca3a96c2d2895e4a5f155a29bfcab7d75139a039791e6d5aba18eef6 |
| MD5 | f7d7cc74f67d692e7605e715f1b78a02 |
| BLAKE2b-256 | 84c16f6dbd24aedc4ceb49314fe76e95b565ff921acec254915edbd8e6d7a5a6 |
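A downloaded file can be checked against the SHA256 digest listed above. This is a minimal sketch using only the standard library. The helper names `sha256_of_file` and `matches_published_digest` are illustrative and are not provided by ne-embed or PyPI, and the local file path in the usage comment is an assumption.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True if the file's SHA256 equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (assumes the source distribution was saved to the current directory):
# matches_published_digest(
#     "ne_embed-1.0.0.tar.gz",
#     "beb1bcffca3a96c2d2895e4a5f155a29bfcab7d75139a039791e6d5aba18eef6",
# )
```

A `False` result means the local file differs from the one published, so it should be downloaded again before installing.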
File details
Details for the file ne_embed-1.0.0-py3-none-any.whl.
File metadata
- Download URL: ne_embed-1.0.0-py3-none-any.whl
- Upload date:
- Size: 2.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8c9658d5ce227bb6008cdc6a22022e0286bdebab72102a07cd54932038efec71 |
| MD5 | 1c060a37e1e02e0626db8d4f5291dd91 |
| BLAKE2b-256 | 918097427e12f38025f391c7e9858b0ee4d807e6f4e27085f3e8d3a929e6e42d |