Simple self-hosted semantic search API

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3.10

Project description

Easy Embed App

Simple self-hosted semantic search API.

GitHub: https://github.com/rafaelolal/csci-3485-final

Demo: https://github.com/rafaelolal/csci-3485-final/blob/main/CSCI_3485_Final_Project_Presentation.pptx

Installation

pip install easy-embed-rafaelolal

Usage

from easy_embed import App

app = App()

# Optional: choose a SentenceTransformer model by passing the arguments it
# needs
# app.set_model(**kwargs)
#
# Optional: set the device the model runs on
# app.set_model_device(my_device)
#
# Optional: a model that needs further setup before use can be reached
# through `app.model`
# app.model.prepare()
#
# Optional: for a model with its own encoding function, override `app.encode`
# with a custom function that uses default parameters and your own logic
# app.encode = lambda text, new="hi": app.model.transform(text, new)

app.run(host="0.0.0.0", port=8000, allow_origins=["*"])

Documentation

API Endpoints

Visit the /docs URL for more information and quick testing.

Below are example values for the request body of each endpoint:

/create

{
  "doc": "string",
  "index": 0,
  "collection": "string"
}
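As a rough sketch, the /create body above could be sent from Python with only the standard library. The base URL (taken from the port in the usage example), the POST method, and the helper names `build_create_request` and `send` are assumptions made for illustration; check /docs for the actual method and response format.

```python
import json
import urllib.request

# Assumption: the app was started with app.run(host="0.0.0.0", port=8000, ...)
BASE_URL = "http://localhost:8000"


def build_create_request(doc: str, index: int, collection: str,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a request carrying the /create body."""
    body = {"doc": doc, "index": index, "collection": collection}
    return urllib.request.Request(
        url=f"{base_url}/create",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: confirm the HTTP method on the /docs page
    )


def send(request: urllib.request.Request) -> str:
    """Send a prepared request and return the raw response text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, `send(build_create_request("The quick brown fox", 0, "animals"))` would submit one document to the hypothetical `animals` collection.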

/read

Use the values below only for small, simple semantic search tasks. For larger document sets or frequent queries, consider using the collection value instead.

{
  "q": "string",
  "docs": [
    "string"
  ],
  "k": 0
}

Use the values below if you have already used the /create endpoint to precompute the embedding vectors of the documents.

{
  "q": "string",
  "collection": "string",
  "k": 0
}
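The two /read body shapes above can be assembled by one small helper. This is a minimal sketch: the function name `build_read_body` is hypothetical, and treating `docs` and `collection` as mutually exclusive is an assumption based on the two examples, not documented behavior.

```python
from typing import Optional


def build_read_body(q: str, k: int,
                    docs: Optional[list[str]] = None,
                    collection: Optional[str] = None) -> dict:
    """Return a /read body using either inline docs or a stored collection."""
    # Assumption: exactly one of `docs` or `collection` should be supplied.
    if (docs is None) == (collection is None):
        raise ValueError("pass exactly one of `docs` or `collection`")
    body: dict = {"q": q, "k": k}
    if docs is not None:
        body["docs"] = docs  # ad hoc search over a few documents
    else:
        body["collection"] = collection  # search precomputed embeddings
    return body
```

For example, `build_read_body("fast animals", 2, collection="animals")` produces the second shape, while passing `docs=[...]` produces the first.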

/update

The main purpose of this endpoint is to keep the embeddings up to date with your data. Consider writing a script that automatically calls this endpoint whenever a data point is edited in your database.

{
  "index": 0,
  "collection": "string",
  "doc": "string"
}
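One way to structure the sync script suggested above is a hook that your data layer calls after each edit. The sketch below is illustrative only: `make_update_hook` and the `send` callback are hypothetical names, and it assumes `index` identifies the edited document within its collection.

```python
from typing import Callable

# A sender takes an endpoint path and a JSON-serializable body.
Sender = Callable[[str, dict], None]


def make_update_hook(collection: str, send: Sender) -> Callable[[int, str], None]:
    """Return a callback that forwards edits to the /update endpoint."""

    def on_edit(index: int, new_text: str) -> None:
        # Body fields follow the /update example above.
        send("/update", {"index": index, "collection": collection, "doc": new_text})

    return on_edit
```

Injecting `send` keeps the hook independent of any HTTP client, so it can be exercised with a stub before being wired to the real server.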

/delete

{
  "index": 0,
  "collection": "string"
}

Custom Embedding Model

Important note: a custom encode function must return list[float] | list[list[float]], because the similarity computation expects embeddings in that form.

Refer to the usage example above.
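To make the return-type requirement concrete, here is a toy encode function together with a similarity ranking that consumes its output. Everything in it is an assumption for illustration: the character-count embedding stands in for a real model, cosine similarity stands in for whatever the library computes internally, and returning a flat list for one string and a nested list for several strings is a guess at how the two return shapes are used.

```python
import math

DIM = 8  # toy embedding size, chosen arbitrarily


def _embed(text: str) -> list[float]:
    """Toy embedding: bucketed character counts (not a real model)."""
    vec = [0.0] * DIM
    for ch in text.lower():
        vec[ord(ch) % DIM] += 1.0
    return vec


def custom_encode(text):
    """Return list[float] for one string, list[list[float]] for a list."""
    if isinstance(text, str):
        return _embed(text)
    return [_embed(t) for t in text]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(q: str, docs: list[str], k: int) -> list[int]:
    """Return the indices of the k documents most similar to the query."""
    q_vec = custom_encode(q)       # list[float]
    doc_vecs = custom_encode(docs)  # list[list[float]]
    order = sorted(range(len(docs)),
                   key=lambda i: cosine(q_vec, doc_vecs[i]),
                   reverse=True)
    return order[:k]
```

A function shaped like `custom_encode` could then be assigned to `app.encode` in the same way as the lambda in the usage example.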

Main Dependencies

Python version: python==3.12.8

fastapi==0.115.6
sentence-transformers==3.3.1
sqlmodel==0.0.22

Citations

How to publish to PyPI: https://youtu.be/5KEObONUkik

Default model used: Solatorio, Aivin V. "GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning." arXiv preprint arXiv:2402.16829 (2024).

Release history

This version

1.0.0

Dec 17, 2024

Download files

Source Distribution

easy_embed_rafaelolal-1.0.0.tar.gz (9.5 kB)

Uploaded Dec 17, 2024 Source

Built Distribution

easy_embed_rafaelolal-1.0.0-py3-none-any.whl (8.0 kB)

Uploaded Dec 17, 2024 Python 3

File details

Details for the file easy_embed_rafaelolal-1.0.0.tar.gz.

File metadata

Download URL: easy_embed_rafaelolal-1.0.0.tar.gz
Upload date: Dec 17, 2024
Size: 9.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for easy_embed_rafaelolal-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`e8bd4e515db7eb17a0a34992cbed24c94a54207fe0132e8ad16ded1af2eea1ad`
MD5	`c18a2b8485dc7a66ffaff28281d5d260`
BLAKE2b-256	`a38f05be81225079580bc317b7938ba7a30bcd7c3bec07330577fccd44492f5a`

File details

Details for the file easy_embed_rafaelolal-1.0.0-py3-none-any.whl.

File metadata

Download URL: easy_embed_rafaelolal-1.0.0-py3-none-any.whl
Upload date: Dec 17, 2024
Size: 8.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.0.1 CPython/3.12.8

File hashes

Hashes for easy_embed_rafaelolal-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`128e08539266252111958de7fa147a1c0bc58d2424e4b6881c07fe08657def72`
MD5	`f62bf1542c7ee466532ac44f8e9f926e`
BLAKE2b-256	`83ea95b09cffa8568a8de624d6ebe4f63eb093b8662e8310e8ff908694443a79`
