EmbedSelection

Embedding selection: A tool for selecting the best embedding model for your use case

Project description

Embedding Selector Framework

- This framework helps you automatically select the most suitable text embedding model for a given downstream use case.

- It analyzes task requirements (e.g., retrieval, classification, summarization), matches them against available embedding models, and evaluates performance on relevant benchmarks.

Features

- Use Case–Driven Selection: Takes a natural-language description of a use case and extracts structured metadata (e.g., languages, token limits, complexity).
- Metadata Extraction: Uses an LLM to normalize requirements into a standardized schema (parameters, memory, licensing, etc.).
- Model Matching: Filters embedding models based on attributes like size, efficiency, license, and language coverage.
- Task Alignment: Selects relevant evaluation tasks from MTEB (Massive Text Embedding Benchmark).
- Performance Evaluation: Loads benchmark results and computes average scores per candidate model.
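The metadata schema and the model-matching step can be pictured with a small sketch. This is not the package's actual code: the class names, field names, and the example models below are hypothetical and chosen only to illustrate how extracted requirements might be compared against model attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Hypothetical schema for requirements extracted from a use case."""
    languages: frozenset[str]
    max_tokens: int             # longest input the use case needs to embed
    max_params_millions: float  # upper bound on model size
    max_memory_mb: float        # upper bound on memory footprint
    open_license_only: bool


@dataclass(frozen=True)
class ModelCard:
    """Hypothetical attributes of one candidate embedding model."""
    name: str
    languages: frozenset[str]
    max_tokens: int
    params_millions: float
    memory_mb: float
    open_license: bool


def matches(model: ModelCard, req: Requirements) -> bool:
    """Return True if the model satisfies every requirement."""
    return (
        req.languages <= model.languages          # all required languages covered
        and model.max_tokens >= req.max_tokens
        and model.params_millions <= req.max_params_millions
        and model.memory_mb <= req.max_memory_mb
        and (model.open_license or not req.open_license_only)
    )


def filter_models(models: list[ModelCard], req: Requirements) -> list[ModelCard]:
    """Keep only the candidates that satisfy the requirements."""
    return [m for m in models if matches(m, req)]


# Placeholder data, not real models or real benchmark attributes.
EXAMPLE_REQ = Requirements(
    languages=frozenset({"en", "de"}),
    max_tokens=512,
    max_params_millions=500,
    max_memory_mb=2000,
    open_license_only=True,
)
EXAMPLE_MODELS = [
    ModelCard("model-a", frozenset({"en", "de", "fr"}), 512, 300, 1200, True),
    ModelCard("model-b", frozenset({"en"}), 8192, 100, 400, True),
    ModelCard("model-c", frozenset({"en", "de"}), 512, 7000, 26000, True),
    ModelCard("model-d", frozenset({"en", "de"}), 1024, 400, 1500, False),
]
```

With this placeholder data, only model-a passes: model-b lacks German, model-c exceeds the size limits, and model-d is not openly licensed.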

How It Works

The pipeline runs in sequential steps:

- Use Case Selection: Choose from predefined scenarios (chatbots, legal retrieval, recommendations, sentiment analysis, summarization, etc.) or provide your own description.
- Requirement Extraction (LLM Agent): GPT-4o parses the description into structured metadata, including:
  - Supported languages
  - Max token length
  - Memory usage and parameter limits
  - Task/domain classification
- Model Filtering: Candidate models from MTEB are filtered according to the extracted attributes.
- Task Evaluation: Candidate models are benchmarked on the most relevant MTEB tasks (retrieval, classification, summarization, etc.).
- Ranking & Export: Models are ranked by performance (with ties broken by efficiency) and exported to CSV for inspection.
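The scoring, ranking, and export steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's implementation: the function names and CSV columns are hypothetical, "efficiency" is assumed to mean fewer parameters, and the scores in the usage note are made-up placeholders rather than MTEB results.

```python
import csv
import io
from statistics import mean


def rank_models(
    scores: dict[str, dict[str, float]],
    params_millions: dict[str, float],
    tasks: list[str],
) -> list[tuple[str, float, float]]:
    """Average each model's scores over the relevant tasks and rank them.

    Models are sorted by average score (highest first); ties are broken
    by parameter count (smallest first), an assumed proxy for efficiency.
    Models with no score on any relevant task are skipped.
    """
    rows = []
    for name, task_scores in scores.items():
        relevant = [task_scores[t] for t in tasks if t in task_scores]
        if not relevant:
            continue
        rows.append((name, mean(relevant), params_millions[name]))
    rows.sort(key=lambda row: (-row[1], row[2]))
    return rows


def to_csv(rows: list[tuple[str, float, float]]) -> str:
    """Serialize the ranking as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["rank", "model", "avg_score", "params_millions"])
    for rank, (name, avg, params) in enumerate(rows, start=1):
        writer.writerow([rank, name, f"{avg:.2f}", params])
    return buf.getvalue()
```

For example, given placeholder scores where model-a and model-b both average 65.0 on retrieval and classification, the smaller of the two is listed first, and `to_csv(...)` returns text that can be written to a file for inspection.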

Usage

To use the tool, follow these steps:

  pip install EmbedSelection

  EmbedSelection

Contributing

Contributions to improve the tool are welcome! Feel free to open issues for bugs or feature requests, or submit pull requests for enhancements.

Acknowledgements

This project uses the MTEB benchmark leaderboard hosted on Hugging Face: https://huggingface.co/spaces/mteb/leaderboard

Project details

Release history

This version: 1.0 (Aug 28, 2025)

Download files

Download the file for your platform.

Source Distribution

embedselection-1.0.tar.gz (11.1 kB)

Uploaded Aug 28, 2025 Source

Built Distribution

embedselection-1.0-py3-none-any.whl (11.4 kB)

Uploaded Aug 28, 2025 Python 3

File details

Details for the file embedselection-1.0.tar.gz.

File metadata

Download URL: embedselection-1.0.tar.gz
Upload date: Aug 28, 2025
Size: 11.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for embedselection-1.0.tar.gz
Algorithm	Hash digest
SHA256	`fe681977f34c7c7b6b5be21cf92ea799072ed9f22564f3e61191d5ee34bc0232`
MD5	`53d29c40339a3101ac625ad83181968a`
BLAKE2b-256	`9358f56df2a462c9063d6f3b64ee346728e1b1f3f8087b9196d47826da27e98b`

File details

Details for the file embedselection-1.0-py3-none-any.whl.

File metadata

Download URL: embedselection-1.0-py3-none-any.whl
Upload date: Aug 28, 2025
Size: 11.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for embedselection-1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`710360796aa25694d5755464d01a2622f8e1784a10b3129750e5eb89ef49585f`
MD5	`31c88f5912a03fb9ab59685aeca26af2`
BLAKE2b-256	`226d3119e980ed82fd1ed5e983c7aa17475c64e5d28e7fc676a4aa7c3561f333`
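The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using Python's standard hashlib module; the SHA256 values are copied from the listings above, while the helper names and the assumption that the file has been downloaded to a local path are illustrative.

```python
import hashlib
from pathlib import Path

# SHA256 digests as published in the file listings above.
EXPECTED_SHA256 = {
    "embedselection-1.0.tar.gz":
        "fe681977f34c7c7b6b5be21cf92ea799072ed9f22564f3e61191d5ee34bc0232",
    "embedselection-1.0-py3-none-any.whl":
        "710360796aa25694d5755464d01a2622f8e1784a10b3129750e5eb89ef49585f",
}


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path) -> bool:
    """Return True only if the file name is known and its digest matches."""
    expected = EXPECTED_SHA256.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

Calling `verify(Path("embedselection-1.0.tar.gz"))` on a downloaded copy returns True only when the file's digest matches the published value. pip can perform a similar check itself through its hash-checking mode with a requirements file.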
