ne-embed

Multilingual text embeddings for Northeast Indian languages.

10 languages · 768 dimensions · Built on LaBSE · CC-BY-4.0

Install

pip install ne-embed

Usage

from ne_embed import NEEmbed

model = NEEmbed()  # downloads from HuggingFace on first run

sentences = [
    'Where is the nearest hospital?',
    'Ngi la pynjot ia ki shnong baroh',  # Khasi
    'Pilakchin an senganiko man na.',     # Garo
]

embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 768)

model.languages()  # print supported languages
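
A common way to use sentence embeddings like these is to rank candidate sentences against a query by cosine similarity. The sketch below is not part of ne-embed: `cosine_similarity` and `rank_by_similarity` are hypothetical helper names, and the toy 3-dimensional vectors stand in for rows of `model.encode(...)`, which are assumed to be 768-element numeric sequences as the shape comment above suggests. The README does not say whether the vectors are length-normalized, so the sketch normalizes explicitly.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], candidates: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs, best match first."""
    scores = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings (assumption: real rows have 768 values).
query = [1.0, 0.0, 0.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction as the query, different length
    [1.0, 1.0, 0.0],  # 45 degrees from the query
]

ranking = rank_by_similarity(query, candidates)
print([i for i, _ in ranking])  # [1, 2, 0]
```

With real embeddings, the query would be one row of the encoded output and the candidates the remaining rows; for large collections a vectorized implementation would be faster than this pure-Python loop.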

Supported Languages

Code  Language   Tier
asm   Assamese   Supported
brx   Bodo       Supported
grt   Garo       Supported
kha   Khasi      Supported
lus   Mizo       Supported
mni   Meitei     Supported
njz   Nyishi     Supported
trp   Kokborok   Limited
pbv   Pnar       Limited
nag   Nagamese   Limited
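
Because three of the ten languages are marked Limited, it can help to check a language code before relying on its embeddings. The following is a minimal sketch that hard-codes the table above; `LANGUAGE_TIERS`, `check_language`, and `limited_languages` are hypothetical names, not part of the ne-embed API, and the README does not define what "Limited" means in practice.

```python
from typing import Dict, List, Tuple

# Copied from the Supported Languages table: code -> (language name, tier).
LANGUAGE_TIERS: Dict[str, Tuple[str, str]] = {
    "asm": ("Assamese", "Supported"),
    "brx": ("Bodo", "Supported"),
    "grt": ("Garo", "Supported"),
    "kha": ("Khasi", "Supported"),
    "lus": ("Mizo", "Supported"),
    "mni": ("Meitei", "Supported"),
    "njz": ("Nyishi", "Supported"),
    "trp": ("Kokborok", "Limited"),
    "pbv": ("Pnar", "Limited"),
    "nag": ("Nagamese", "Limited"),
}


def check_language(code: str) -> str:
    """Return the tier for a language code, or raise if it is not listed."""
    try:
        _name, tier = LANGUAGE_TIERS[code]
    except KeyError:
        raise ValueError(f"{code!r} is not in the ne-embed language table") from None
    return tier


def limited_languages() -> List[str]:
    """Return the codes marked Limited, in alphabetical order."""
    return sorted(
        code for code, (_name, tier) in LANGUAGE_TIERS.items() if tier == "Limited"
    )


print(check_language("kha"))  # Supported
print(limited_languages())    # ['nag', 'pbv', 'trp']
```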
