
The Superlinked vector computing library


Superlinked


Superlinked is a compute framework for your information retrieval and feature engineering systems, focused on turning complex data into vector embeddings within your RAG, Search, RecSys and Analytics stack. Integrate Superlinked into your machine learning stack for custom model performance with pre-trained model convenience.

If you like what we do, give us a star! ⭐

The screenshot below shows how to build multimodal vectors from your data and define weights at query time, which avoids the need for post-processing and reranking.

If the image does not render, you can check the notebook here: https://github.com/superlinked/superlinked/blob/main/notebook/recommendations_e_commerce.ipynb
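
To illustrate why query-time weights can replace a rerank step, here is a conceptual sketch in plain Python. It is not Superlinked's implementation: the two "spaces" (text and price), their embeddings, and the helper names are all hypothetical. It shows only that when per-space vectors are normalized and concatenated, scaling the query's segments by weights gives a dot product equal to the weighted sum of per-space similarities, so the stored item vector never has to change.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical per-space embeddings for one stored item and one query.
item_text, item_price = normalize([0.2, 0.9]), normalize([1.0, 0.1])
query_text, query_price = normalize([0.1, 1.0]), normalize([0.9, 0.3])

# The item vector is built once, at ingestion time, by concatenating its spaces.
item_vector = item_text + item_price


def score(text_weight, price_weight):
    """Weight only the query segments, then take a single dot product."""
    weighted_query = [text_weight * x for x in query_text] + [
        price_weight * x for x in query_price
    ]
    return dot(weighted_query, item_vector)


# Per-space similarities, computed separately for comparison.
text_similarity = dot(query_text, item_text)
price_similarity = dot(query_price, item_price)
```

Calling `score(1.0, 0.5)` returns the same value as `1.0 * text_similarity + 0.5 * price_similarity`, so changing the weights changes the ranking without re-embedding any stored data.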

Our current release allows you to explore our computational model in simple scripts and Python notebooks. Our next major release will focus on helping you run Superlinked in production, with built-in data infra and vector database integrations.

Visit Superlinked for more information about the company behind this product and our other initiatives.

Try it out

The example below shows how to use Superlinked to experiment with a semantic search use case.

Pre-requisites

In a notebook

Install the superlinked library:

%pip install superlinked

As a script

Ensure your Python version is at least 3.10 and no newer than 3.12.

$> python -V
Python 3.10.9

If your Python version is not >=3.10 and <=3.12, you can use pyenv to install a supported one.
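
The same check can be done from inside Python, for example at the top of a script. This is a minimal sketch: the helper name `is_supported` is hypothetical, and it assumes that "not newer than 3.12" covers every 3.12.x patch release.

```python
import sys

# Supported range as stated above (major, minor); patch versions are ignored.
MIN_VERSION = (3, 10)
MAX_VERSION = (3, 12)


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version falls inside the supported range."""
    return MIN_VERSION <= tuple(version_info[:2]) <= MAX_VERSION


if not is_supported():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} "
        "is outside the supported range 3.10 to 3.12"
    )
```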

Upgrade pip and install the superlinked library.

$> python -m pip install --upgrade pip
$> python -m pip install superlinked

Run the example

The first run takes slightly longer because it has to download the embedding model.

from superlinked.framework.common.schema.schema import schema
from superlinked.framework.common.schema.schema_object import String
from superlinked.framework.common.schema.id_schema_object import IdField
from superlinked.framework.dsl.space.text_similarity_space import TextSimilaritySpace
from superlinked.framework.dsl.index.index import Index
from superlinked.framework.dsl.query.param import Param
from superlinked.framework.dsl.query.query import Query
from superlinked.framework.dsl.source.in_memory_source import InMemorySource
from superlinked.framework.dsl.executor.in_memory.in_memory_executor import InMemoryExecutor


@schema # Describe your schemas.
class Document:
    id: IdField  # Each schema should have exactly one `IdField`.
    body: String # Use `String` for text fields.

document = Document()

relevance_space = TextSimilaritySpace(text=document.body, model="sentence-transformers/all-mpnet-base-v2") # Select your semantic embedding model.
document_index = Index([relevance_space]) # Combine your spaces to a queryable index.

query = Query(document_index).find(document).similar(relevance_space.text, Param("query_text")) # Define your query with dynamic parameters.

source: InMemorySource = InMemorySource(document) # Connect a data source to your schema.

executor = InMemoryExecutor(sources=[source], indices=[document_index]) # Tie it all together to run your configuration.
app = executor.run()

source.put([{"id": "happy_dog", "body": "That is a happy dog"}])
source.put([{"id": "happy_person", "body": "That is a very happy person"}])
source.put([{"id": "sunny_day", "body": "Today is a sunny day"}])

print(app.query(query, query_text="Who is a positive friend?")) # Run your query.
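
The `source.put` calls above each pass a list of dictionaries whose keys match the schema fields (`id` and `body`). When your data starts out in another shape, it has to be mapped to that form first. The sketch below does this with the standard library only; the CSV column names `doc_id` and `text` and the helper `to_records` are hypothetical, chosen purely for illustration.

```python
import csv
import io

# Hypothetical input whose column names differ from the schema's field names.
CSV_TEXT = """doc_id,text
happy_dog,That is a happy dog
sunny_day,Today is a sunny day
"""


def to_records(csv_text):
    """Map CSV rows to dictionaries keyed by the schema fields `id` and `body`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{"id": row["doc_id"], "body": row["text"]} for row in reader]


records = to_records(CSV_TEXT)
# With the example above already running, the result could be loaded with:
# source.put(records)
```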

Ready to go to production? We have released Superlinked Server, which lets you host your Superlinked instance behind REST APIs and comes with the following Vector DB connectors:

  • Redis
  • MongoDB

Tell us which Vector DB we should support next!

Use-cases

You can check a full list of examples here.

Logging

Contextual information, such as the process ID and package scope, is automatically included in log messages. Personally Identifiable Information (PII) is filtered out by default but can be exposed by setting the SUPERLINKED_EXPOSE_PII environment variable to true.
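
One way to set the variable from Python is shown below. This is a sketch under an assumption: the text above does not say when the library reads the variable, so the example sets it before any Superlinked import to be safe.

```python
import os

# Assumption: the variable must be set before Superlinked configures its logging,
# so set it before importing the library.
os.environ["SUPERLINKED_EXPOSE_PII"] = "true"

# Import superlinked modules only after this point.
```

Setting the variable in the shell that launches your script or notebook has the same effect and keeps the setting out of your code.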

Reference

  1. Describe your data using Python classes with the @schema decorator.
  2. Describe your vector embeddings from building blocks with Spaces.
  3. Combine your embeddings into a queryable Index.
  4. Define your search with dynamic parameters and weights as a Query.
  5. Load your data using a Source.
  6. Define your transformations with a Parser (e.g. from a pd.DataFrame).
  7. Run your configuration with an Executor.

You can check a list of our features or head to our documentation.

Articles

  • Vector DB Comparison: An open-source, collaborative comparison of vector databases by Superlinked.
  • Vector Hub: VectorHub is a free and open-source learning hub for people interested in adding vector retrieval to their ML stack.

Support

If you encounter any challenges during your experiments, feel free to create an issue, request a feature, or start a discussion. Please group your feedback into separate issues and discussions by topic. Thank you for your feedback!
