The Superlinked vector computing library

Why use Superlinked

Improve your vector search relevance by encoding your metadata together with your data into your vector embeddings.

What is Superlinked

Superlinked is both a framework and a self-hostable REST API server that helps you build better vectors. It sits between your data, your vector database, and your backend services.

How does it work

Superlinked makes it easy to construct custom data and query embedding models from pre-trained encoders. See the feature and use-case notebooks below for examples.
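To make the idea concrete, here is a toy sketch in plain Python. It is not Superlinked's implementation: the helpers `encode_rating` and `combine`, the quarter-circle rating encoding, and the 3-dimensional "text embeddings" are all assumptions made up for illustration. It only shows why concatenating a text vector with an encoded number lets a single dot product balance text similarity against a rating.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (returns a copy if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def encode_rating(rating, min_value=1, max_value=5):
    """Hypothetical numeric encoder: place the rating on a quarter circle,
    so the dot product with the encoding of max_value grows with the rating."""
    t = (rating - min_value) / (max_value - min_value)  # scale to [0, 1]
    angle = t * math.pi / 2
    return [math.sin(angle), math.cos(angle)]


def combine(text_vec, rating, text_weight=1.0, rating_weight=1.0):
    """Concatenate the weighted text part and rating part into one vector."""
    text_part = [text_weight * x for x in normalize(text_vec)]
    rating_part = [rating_weight * x for x in encode_rating(rating)]
    return text_part + rating_part


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Made-up 3-dimensional vectors standing in for a real text model's output.
products = {
    "budget": ([0.9, 0.1, 0.0], 1),
    "high_end": ([0.8, 0.2, 0.1], 5),
    "smart": ([0.7, 0.3, 0.2], 3),
}
stored = {name: combine(vec, rating) for name, (vec, rating) in products.items()}


def search(query_text_vec, text_weight, rating_weight):
    """Return the product whose stored vector scores highest against the query."""
    query = combine(query_text_vec, rating=5,
                    text_weight=text_weight, rating_weight=rating_weight)
    return max(stored, key=lambda name: dot(query, stored[name]))


# The query text is closest to the budget product's text vector.
best = search([0.9, 0.1, 0.0], text_weight=1.0, rating_weight=1.0)
text_only_best = search([0.9, 0.1, 0.0], text_weight=1.0, rating_weight=0.0)
print(best, text_only_best)
```

In this sketch the weights are applied on the query side only, so the same stored vectors can serve queries that care about rating and queries that ignore it.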

If you like what we do, give us a star! ⭐

Visit Superlinked for more information about the company behind this product and our other initiatives.

Features

You can check a full list of our features or head to our reference section for more information.

Use-cases

Dive deeper with our notebooks into how each use-case benefits from the Superlinked framework.

You can check a full list of examples here.

Experiment in a notebook

An example of combining text and numerical encoders to get correct results with LLMs.

Install the superlinked library

%pip install superlinked

Run the example:

The first run takes slightly longer because it has to download the embedding model.

import json
import os

from superlinked import framework as sl


class Product(sl.Schema):
    id: sl.IdField
    description: sl.String
    rating: sl.Integer


product = Product()

description_space = sl.TextSimilaritySpace(
    text=product.description, model="Alibaba-NLP/gte-large-en-v1.5"
)
rating_maximizer_space = sl.NumberSpace(
    number=product.rating, min_value=1, max_value=5, mode=sl.Mode.MAXIMUM
)
index = sl.Index([description_space, rating_maximizer_space], fields=[product.rating])

# fill this with your API key - this will drive param extraction
openai_config = sl.OpenAIClientConfig(
    api_key=os.environ["OPEN_AI_API_KEY"], model="gpt-4o"
)

# it is possible now to add descriptions to a `Param` to aid the parsing of information from natural language queries.
text_similar_param = sl.Param(
    "query_text",
    description="The text in the user's query that refers to product descriptions.",
)

# Define your query using dynamic parameters for query text and weights.
# we will have our LLM fill them based on our natural language query
query = (
    sl.Query(
        index,
        weights={
            description_space: sl.Param("description_weight"),
            rating_maximizer_space: sl.Param("rating_maximizer_weight"),
        },
    )
    .find(product)
    .similar(
        description_space,
        text_similar_param,
        sl.Param("description_similar_clause_weight")
    )
    .limit(sl.Param("limit"))
    .with_natural_query(sl.Param("natural_query"), openai_config)
)

# Run the app.
source = sl.InMemorySource(product)
executor = sl.InMemoryExecutor(sources=[source], indices=[index])
app = executor.run()

# Define a small in-memory dataset.
data = [
    {"id": 1, "description": "Budget toothbrush in black color.", "rating": 1},
    {"id": 2, "description": "High-end toothbrush created with no compromises.", "rating": 5},
    {"id": 3, "description": "A toothbrush created for the smart 21st century man.", "rating": 3},
]

# Ingest data to the framework.
source.put(data)

result = app.query(query, natural_query="best toothbrushes", limit=1)

# examine the extracted parameters from your query
print(json.dumps(result.knn_params, indent=2))
# the result is the 5 star rated product
result.to_pandas()

Run in production

Superlinked Server allows you to leverage the power of Superlinked in deployable projects. With a single script, you can deploy a Superlinked-powered app instance that creates REST endpoints and connects to external Vector Databases. This makes it an ideal solution for those seeking an easy-to-deploy environment for their Superlinked projects.

If you are interested in learning more about running at scale, book a demo for early access to our managed cloud.

Supported VDBs

We have started partnering with vector database providers to allow you to use Superlinked with your VDB of choice. If you are unsure which VDB to choose, check out our Vector DB Comparison.

Missing your favorite VDB? Tell us which vector database we should support next!

Reference

  1. Describe your data using Python classes with the @schema decorator (or, as in the example above, by subclassing sl.Schema).
  2. Describe your vector embeddings from building blocks with Spaces.
  3. Combine your embeddings into a queryable Index.
  4. Define your search with dynamic parameters and weights as a Query.
  5. Load your data using a Source.
  6. Define your transformations with a Parser (e.g. from a pd.DataFrame).
  7. Run your configuration with an Executor.

You can check all references here.

Logging

Contextual information, such as the process ID and package scope, is automatically included in log messages. Personally Identifiable Information (PII) is filtered out by default but can be exposed by setting the SUPERLINKED_EXPOSE_PII environment variable to true.
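As a minimal sketch, the variable can be set from Python before the library is used. Two assumptions here: that the value is read when the library is imported or configured (so it should be set first), and that the lowercase string "true" is the accepted value, as the sentence above suggests.

```python
import os

# Assumption: set this before importing superlinked so the logging
# configuration sees it. PII will then appear in log messages, so only
# enable this in environments where that is acceptable.
os.environ["SUPERLINKED_EXPOSE_PII"] = "true"

pii_exposed = os.environ.get("SUPERLINKED_EXPOSE_PII", "false")
print(pii_exposed)
```

Setting the variable in the shell or in your deployment configuration works the same way and avoids depending on import order.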

Resources

  • Vector DB Comparison: Open-source collaborative comparison of vector databases by Superlinked.
  • Vector Hub: A free and open-source learning hub for people interested in adding vector retrieval to their ML stack.

Support

If you encounter any challenges during your experiments, feel free to create an issue, request a feature, or start a discussion. Please group your feedback into separate issues and discussions by topic. Thank you for your feedback!
