
Fast and Lightweight Text Embedding

Project description

LightEmbed

LightEmbed is a lightweight, fast, and efficient tool for generating sentence embeddings. It does not rely on heavy dependencies such as PyTorch and Transformers, making it suitable for environments with limited resources.

Benefits

1. Lightweight

  • Minimal Dependencies: LightEmbed does not depend on PyTorch or Transformers.
  • Low Resource Requirements: Runs smoothly on minimal hardware: 1 GB RAM, 1 CPU, and no GPU.

2. Fast (as light)

  • ONNX Runtime: Uses ONNX Runtime, which is significantly faster than Sentence Transformers running on PyTorch (see the timing sketch below).
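
As a rough illustration only (not a published benchmark; absolute numbers depend entirely on your hardware), a minimal timing sketch looks like this:

import time
from light_embed import TextEmbedding

# Hypothetical micro-benchmark: encode a small batch and time it
model = TextEmbedding(model_name_or_path='sentence-transformers/all-MiniLM-L6-v2')
sentences = ["This is an example sentence"] * 100

start = time.perf_counter()
embeddings = model.encode(sentences)
print(f"Encoded {len(sentences)} sentences in {time.perf_counter() - start:.3f}s")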

3. Consistent with Sentence Transformers

  • Consistency: Incorporates all modules from a Sentence Transformer model, including normalization and pooling.
  • Accuracy: Produces embedding vectors identical to those from Sentence Transformers, as the check below illustrates.
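
A minimal way to check the equivalence yourself, assuming sentence-transformers is also installed for the comparison (LightEmbed itself does not require it):

import numpy as np
from light_embed import TextEmbedding
from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence"]

onnx_model = TextEmbedding(model_name_or_path='sentence-transformers/all-MiniLM-L6-v2')
st_model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

# The two embedding matrices should agree up to floating-point tolerance
print(np.allclose(onnx_model.encode(sentences), st_model.encode(sentences), atol=1e-5))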

4. Supports models not managed by LightEmbed

LightEmbed can work with any Hugging Face repository, not only the mirrors under the onnx-models account, as long as the repository includes ONNX files.

5. Local Model Support

LightEmbed can load models from the local file system, enabling faster loading times and functionality in environments without internet access, such as AWS Lambda or EC2 instances in private subnets.

Installation

pip install -U light-embed

Usage

After installing, you can specify the original model name like this:

from light_embed import TextEmbedding
sentences = ["This is an example sentence", "Each sentence is converted"]

model = TextEmbedding(model_name_or_path='sentence-transformers/all-MiniLM-L6-v2')
embeddings = model.encode(sentences)
print(embeddings)

Alternatively, you can specify the ONNX model name like this:

from light_embed import TextEmbedding
sentences = ["This is an example sentence", "Each sentence is converted"]

model = TextEmbedding(model_name_or_path='onnx-models/all-MiniLM-L6-v2-onnx')
embeddings = model.encode(sentences)
print(embeddings)
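
The returned embeddings can be used directly, e.g., for cosine similarity. A minimal sketch, assuming encode returns a NumPy array as in Sentence Transformers:

import numpy as np

a, b = embeddings  # the two sentence vectors from the example above
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"Cosine similarity: {cosine:.4f}")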

Using a Non-Managed Model: To use a model from its original repository rather than from an onnx-models mirror, simply specify the model name and provide a model_config, assuming the original repository includes ONNX files.

from light_embed import TextEmbedding
sentences = ["This is an example sentence", "Each sentence is converted"]

model_config = {
    "onnx_file": "onnx/model.onnx",
    "pooling_config_path": "1_Pooling",
    "normalize": False
}
model = TextEmbedding(
    model_name_or_path='sentence-transformers/all-MiniLM-L6-v2',
    model_config=model_config
)
embeddings = model.encode(sentences)
print(embeddings)

Using a Local Model: To use a local model, specify the path to the model's folder and provide the model_config.

from light_embed import TextEmbedding
sentences = ["This is an example sentence", "Each sentence is converted"]

model_config = {
    "onnx_file": "onnx/model.onnx",
    "pooling_config_path": "1_Pooling",
    "normalize": False
}
model = TextEmbedding(
    model_name_or_path='/path/to/the/local/model/all-MiniLM-L6-v2-onnx',
    model_config=model_config
)
embeddings = model.encode(sentences)
print(embeddings)

The model_config is a dictionary that provides details about the model, such as the location of the ONNX file and whether pooling or normalization is needed. Pooling is required if it hasn't been incorporated into the ONNX file itself.

model_config = {
    "onnx_file": "relative path to the ONNX file, e.g., model.onnx or onnx/model.onnx",
    "pooling_config_path": "relative path to the pooling config folder, e.g., 1_Pooling",
    "normalize": True  # or False
}

If pooling has been incorporated into the ONNX file, you can omit "pooling_config_path". Similarly, if normalization is already included in the ONNX file, you can omit the "normalize" entry.
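
For example, a hypothetical config for an ONNX file that already incorporates pooling and normalization could be as small as:

model_config = {
    # pooling and normalization are part of the ONNX graph itself,
    # so "pooling_config_path" and "normalize" can be omitted
    "onnx_file": "model.onnx"
}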

Citing & Authors

Binh Nguyen / binhcode25@gmail.com

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

light_embed-1.0.8.tar.gz (14.7 kB)


Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

light_embed-1.0.8-py3-none-any.whl (16.8 kB)


File details

Details for the file light_embed-1.0.8.tar.gz.

File metadata

  • Download URL: light_embed-1.0.8.tar.gz
  • Upload date:
  • Size: 14.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for light_embed-1.0.8.tar.gz
  • SHA256: fe5d64b4319d48e0bf24881a665c9ac96b9eb02b2beff505b7def9ab8e7a8cf5
  • MD5: 0344150adf4454e5d3e8e8f24fbf8e56
  • BLAKE2b-256: 51256abe1cadf34131541da2d3fa6ec4926a792b6d31bb74b72c65da72f2c0d3


Provenance

The following attestation bundles were made for light_embed-1.0.8.tar.gz:

Publisher: publish.yml on nguyenthaibinh/light-embed

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file light_embed-1.0.8-py3-none-any.whl.

File metadata

  • Download URL: light_embed-1.0.8-py3-none-any.whl
  • Upload date:
  • Size: 16.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for light_embed-1.0.8-py3-none-any.whl
  • SHA256: 1100718fa39d96389d36a469842cf68f5981dac355abe6cf8797fad711050fd0
  • MD5: c54f8b2eebf6c0ca0bbc2ec421589cf9
  • BLAKE2b-256: 7d46ddfb281f83fb249c2ed275713ed49c348dbeb6945239936cf35641acacbe


Provenance

The following attestation bundles were made for light_embed-1.0.8-py3-none-any.whl:

Publisher: publish.yml on nguyenthaibinh/light-embed

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
