The Superlinked vector computing library

Superlinked

Superlinked is a declarative Python SDK that enables you to turn complex data into vectors, in a way that fits the modern data stack and works with your favorite Vector Databases.

3 key areas of focus:

  1. Custom embedding model creation that fits your complex data entities.
  2. ETL for your vector index for both streaming and batch use-cases.
  3. Vector-native query language that helps you convert hybrid search queries to pure vector queries.
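To make the third point more concrete, here is a minimal, library-free sketch of one way a multi-attribute ("hybrid") query can be reduced to a single vector comparison. It is an illustration under stated assumptions, not Superlinked's actual implementation: the helper names (`normalize`, `combine`, `rank`), the two-dimensional vectors and the item names are all hypothetical. Each attribute gets its own normalized embedding, the parts are concatenated, and query-time weights scale each part, so one dot product equals the weighted sum of the per-attribute similarities.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def combine(parts, weights):
    """Concatenate normalized per-attribute vectors, each scaled by its weight.

    Attributes are visited in sorted order so that stored items and queries
    share the same layout. Missing weights default to 1.0.
    """
    combined = []
    for name in sorted(parts):
        weight = weights.get(name, 1.0)
        combined.extend(weight * x for x in normalize(parts[name]))
    return combined


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Hypothetical per-attribute embeddings for two stored items.
items = {
    "fresh_but_off_topic": {"text": [0.1, 0.9], "recency": [1.0, 0.0]},
    "old_but_on_topic": {"text": [0.9, 0.1], "recency": [0.0, 1.0]},
}

# Hypothetical query: "on topic" in the text part, "recent" in the recency part.
query_parts = {"text": [1.0, 0.0], "recency": [1.0, 0.0]}


def rank(query_weights):
    """Rank stored items by a single dot product against the weighted query."""
    query_vector = combine(query_parts, query_weights)
    stored = {name: combine(parts, {}) for name, parts in items.items()}
    return sorted(stored, key=lambda name: dot(query_vector, stored[name]), reverse=True)


print(rank({"text": 1.0, "recency": 0.0}))  # text relevance only
print(rank({"text": 0.2, "recency": 1.0}))  # recency dominates
```

With only the text weight set, the on-topic item ranks first; once recency is weighted more heavily, the fresh item overtakes it. The stored vectors never change, only the weights applied to the query vector do.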

Visit Superlinked for more information about the company behind this product and our other initiatives.

Use-cases

Reference

  1. Describe your data using Python classes with the @schema decorator.
  2. Describe your vector embeddings from building blocks with Spaces.
  3. Combine your embeddings into a queryable Index.
  4. Define your search with dynamic parameters and weights as a Query.
  5. Load your data using a Source.
  6. Define your transformations with a Parser (e.g. from a pd.DataFrame).
  7. Run your configuration with an Executor.

Example code

An example of using Superlinked in a notebook to experiment with the semantic search use case.

from superlinked.framework.common.schema.schema import schema
from superlinked.framework.common.schema.schema_object import String
from superlinked.framework.common.schema.id_schema_object import IdField
from superlinked.framework.dsl.space.text_similarity_space import TextSimilaritySpace
from superlinked.framework.dsl.index.index import Index
from superlinked.framework.dsl.query.param import Param
from superlinked.framework.dsl.query.query import Query
from superlinked.framework.dsl.source.in_memory_source import InMemorySource
from superlinked.framework.dsl.executor.in_memory.in_memory_executor import InMemoryExecutor


@schema  # Describe your schemas.
class Document:
    id: IdField  # Each schema should have exactly one `IdField`.
    body: String # Use `String` for text fields.

document = Document()

relevance_space = TextSimilaritySpace(text=document.body, model="sentence-transformers/all-mpnet-base-v2") # Select your semantic embedding model.
document_index = Index([relevance_space]) # Combine your spaces into a queryable index.

query = Query(document_index).find(document).similar(relevance_space.text, Param("query_text")) # Define your query with dynamic parameters.

source: InMemorySource = InMemorySource(document) # Connect a data source to your schema.

executor = InMemoryExecutor(sources=[source], indices=[document_index]) # Tie it all together to run your configuration.
app = executor.run()

source.put([{"id": "happy_dog", "body": "That is a happy dog"}])
source.put([{"id": "happy_person", "body": "That is a very happy person"}])
source.put([{"id": "sunny_day", "body": "Today is a sunny day"}])

print(app.query(query, query_text="Who is a positive friend?")) # Run your query.

Ready to go to production? We are launching our first Vector DB connectors soon! Tell us which Vector DB we should support!

Articles

  • Vector DB Comparison: Open-source collaborative comparison of vector databases by Superlinked.
  • Vector Hub: VectorHub is a free and open-source learning hub for people interested in adding vector retrieval to their ML stack.

Support

If you encounter any challenges during your experiments, feel free to create an issue, request a feature, or start a discussion. Please keep each issue or discussion to a single topic. Thank you for your feedback!
